Geophysical process simulations play a crucial role in the understanding of the subsurface. This understanding is required to provide, for instance, clean energy sources such as geothermal energy. However, the calibration and validation of the physical models heavily rely on state measurements such as temperature. In this work, we demonstrate that focusing analyses purely on measurements introduces a high bias. This is illustrated through global sensitivity studies. The extensive exploration of the parameter space becomes feasible through the construction of suitable surrogate models via the reduced basis method, where the bias is found to result from very unequal data distribution. We propose schemes to compensate for parts of this bias. However, the bias cannot be entirely compensated. Therefore, we demonstrate the consequences of this bias with the example of a model calibration.

Understanding the subsurface is as important in the field of geosciences as understanding climatic processes. In this paper, we focus on the subsurface temperature field, which is of major importance for geothermal applications, and we use numerical process simulations to improve our understanding of it. These simulations are based on both geological and physical models; in this paper, we primarily investigate the latter. The physical model has two major sources of uncertainties arising from the physical processes themselves (i.e., neglected processes, generalizations)

To compensate for both sources of uncertainties, one commonly performs model calibrations, either deterministically

The first problem related to the data distribution is the depth location of the individual measurements. Our geothermal models have a depth in the magnitude of 100 km. In contrast, our deepest thermal measurements are commonly at a depth of 5 to 7 km. The second problem is also related to data density. Focusing on the horizontal data distribution, we face the problem of data sparsity and unequal data distribution. In certain model areas, we have very few temperature measurements and in other areas, we have a much larger data density. This inequality can be compensated by using data-weighting schemes

The problem of data sparsity is widely recognized

In this paper, we aim to provide a systematic investigation of the bias induced by measurement distribution. To this end, we perform global sensitivity analyses to determine the influence of the model parameters (i.e., thermal conductivity, radiogenic heat production) on the model response (i.e., temperature) within the spatial extent of the Alpine models. Sensitivity analyses can be subdivided into local and global analyses. We choose a global sensitivity analysis to investigate not only the influence of the parameters themselves but also the parameter correlations. Note that a local sensitivity analysis assumes that all parameters are independent of each other
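
As an illustration of what a variance-based global sensitivity analysis computes, the following minimal sketch estimates first-order and total-order Sobol indices with plain NumPy. It is not the setup used in this study: the Ishigami benchmark function stands in for the thermal forward model, and the function names, sample size, and the choice of the Saltelli (first-order) and Jansen (total-order) estimators are assumptions made for the example. The gap between total-order and first-order indices reflects parameter interactions, which a local analysis cannot capture.

```python
import numpy as np


def ishigami(x, a=7.0, b=0.1):
    """Ishigami benchmark function (stand-in for a forward model); inputs uniform on [-pi, pi]."""
    return np.sin(x[:, 0]) + a * np.sin(x[:, 1]) ** 2 + b * x[:, 2] ** 4 * np.sin(x[:, 0])


def sobol_indices(model, n_params, n_samples, lower, upper, seed=0):
    """Estimate first-order and total-order Sobol indices for independent uniform inputs."""
    rng = np.random.default_rng(seed)
    # Two independent sample matrices of shape (n_samples, n_params).
    A = rng.uniform(lower, upper, size=(n_samples, n_params))
    B = rng.uniform(lower, upper, size=(n_samples, n_params))
    fA, fB = model(A), model(B)
    var = np.var(np.concatenate([fA, fB]))
    first = np.empty(n_params)
    total = np.empty(n_params)
    for i in range(n_params):
        AB = A.copy()
        AB[:, i] = B[:, i]  # column i taken from B, all other columns from A
        fAB = model(AB)
        first[i] = np.mean(fB * (fAB - fA)) / var        # Saltelli-type first-order estimator
        total[i] = 0.5 * np.mean((fA - fAB) ** 2) / var  # Jansen total-order estimator
    return first, total


# Total cost: n_samples * (n_params + 2) forward evaluations.
first, total = sobol_indices(ishigami, n_params=3, n_samples=2**14, lower=-np.pi, upper=np.pi)
for i, (s1, st) in enumerate(zip(first, total)):
    print(f"x{i + 1}: first-order = {s1:.3f}, total-order = {st:.3f}")
```

For this benchmark the analytical total-order indices are approximately 0.56, 0.44, and 0.24, so the estimates can be checked directly. The evaluation count of `n_samples * (n_params + 2)` also shows why such analyses quickly require many thousands of forward simulations.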

Global sensitivity analyses have the disadvantage of being computationally very demanding since they require several thousand to several hundred thousand forward simulations. This makes these analyses infeasible even for state-of-the-art finite element problems. To compensate for the expensive nature of the method, we employ the reduced basis method to construct suitable surrogate models. The principal idea is to replace the original high-dimensional model with a low-dimensional model while keeping the key characteristics of the problem

In this paper, we investigate the problems related to data distribution for the case study of the Alpine region. The geological model, covering the Alpine orogen and its forelands, is taken from a previous study

In the following, we briefly introduce the concepts of global sensitivity analyses and the reduced basis method. Furthermore, we introduce the physical model and the temperature data used throughout this study.

In this study, we investigate the measurement bias and hence need to know which parameters the temperature distribution is sensitive to. For this purpose, we employ a sensitivity analysis (SA). We distinguish two types of sensitivity analyses: local and global. The local sensitivity analysis investigates the influence of the model parameters with respect to a user-defined reference parameter set. All parameter variations are considered independently of each other and only the vicinity of the input parameters is explored, e.g., in a variation range of

For this case study, we are using a conductive heat transfer problem

In this work, we require a surrogate model that is representative of the entire temperature state to ensure the feasibility of the study. Therefore, we use the reduced-basis (RB) method for the surrogate model construction, a projection-based model order reduction technique. It aims to replace the original high-dimensional model with a low-dimensional representation while keeping the input–output relationship the same. Hence, the method preserves the underlying physics. One limitation of the RB method is that it is restricted to underlying low-dimensional parameter spaces. With higher-dimensional parameter spaces the complexity of the parameter space tends to increase, leading to longer construction times and surrogate model dimensions that are too large. The RB method destroys the sparsity pattern of the system, meaning that a large surrogate model will require a longer execution time than the original finite element model due to its dense nature. To overcome this issue, we use a hierarchical sensitivity study, as we will discuss in Sect.

The RB method comprises two stages: the offline stage and the online stage. During the offline stage, we construct our surrogate model. This stage is computationally expensive but needs to be performed only once. In the online stage, we use the low-dimensional surrogate model. This stage is computationally fast and therefore ideal for expensive outer-loop processes such as the global sensitivity analysis. In previous studies, we showed that the RB method yields a speed-up of several orders of magnitude for the physical problem described here
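
The offline–online split can be sketched for a toy problem. The example below is an assumption-laden illustration, not the DwarfElephant implementation: it uses a one-dimensional, two-layer steady-state conduction problem with homogeneous Dirichlet conditions, a proper orthogonal decomposition of snapshots to build the basis (chosen here purely for brevity), and hypothetical function names. The layer conductivities `mu = (k1, k2)` enter the stiffness matrix affinely, which is what allows all expensive projections to be done once in the offline stage.

```python
import numpy as np


def assemble(n_elem=200, heat_source=1.0):
    """Linear finite elements for -(k T')' = H on (0, 1) with T(0) = T(1) = 0.

    Returns one stiffness matrix per layer (unit conductivity) and the load vector.
    """
    h = 1.0 / n_elem
    local = np.array([[1.0, -1.0], [-1.0, 1.0]]) / h
    A_layers = [np.zeros((n_elem + 1, n_elem + 1)) for _ in range(2)]
    for e in range(n_elem):
        layer = 0 if e < n_elem // 2 else 1  # upper half / lower half of the column
        A_layers[layer][e:e + 2, e:e + 2] += local
    interior = slice(1, n_elem)  # drop the two Dirichlet nodes
    A_layers = [A[interior, interior] for A in A_layers]
    f = np.full(n_elem - 1, heat_source * h)
    return A_layers, f


def solve_full(mu, A_layers, f):
    """High-dimensional solve: A(mu) = mu[0] * A_1 + mu[1] * A_2."""
    A = sum(m * Aq for m, Aq in zip(mu, A_layers))
    return np.linalg.solve(A, f)


def offline(A_layers, f, training_set, tol=1e-10):
    """Expensive stage, done once: snapshots, basis, projected operators."""
    snapshots = np.column_stack([solve_full(mu, A_layers, f) for mu in training_set])
    U, s, _ = np.linalg.svd(snapshots, full_matrices=False)
    r = int(np.sum(s / s[0] > tol))  # keep modes above a relative tolerance
    V = U[:, :r]
    A_red = [V.T @ Aq @ V for Aq in A_layers]  # small dense matrices
    f_red = V.T @ f
    return V, A_red, f_red


def online(mu, V, A_red, f_red):
    """Cheap stage: assemble and solve the r x r system, then lift to the full space."""
    A_r = sum(m * Aq for m, Aq in zip(mu, A_red))
    return V @ np.linalg.solve(A_r, f_red)


A_layers, f = assemble()
training = [(k1, k2) for k1 in np.linspace(1.0, 5.0, 5) for k2 in np.linspace(1.0, 5.0, 5)]
V, A_red, f_red = offline(A_layers, f, training)

mu_test = (2.3, 4.1)  # parameter pair not contained in the training set
T_full = solve_full(mu_test, A_layers, f)
T_rb = online(mu_test, V, A_red, f_red)
rel_error = np.linalg.norm(T_full - T_rb) / np.linalg.norm(T_full)
print(f"basis size: {V.shape[1]}, relative error: {rel_error:.2e}")
```

The reduced matrices are dense, so the benefit in the online stage depends on the basis staying small relative to the full model.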

All reduced models are generated with the software package DwarfElephant

We present the temperature data set in the form of a histogram in Fig.

Distribution of the measurements according to the geological layers. For the layer IDs, please refer to Table

Spatial distribution of the temperature measurements

The spatial distribution of measurements varies widely across the region: it is sparse in the Molasse Basin (103) and Alps (83) and dense in the Po Basin (7619). To alleviate a significant bias and to improve the efficiency of the presented methods, the data set was filtered to give a more uniform measurement density across the region, with a significant reduction in the Po Basin (2028) whilst retaining the measurements in the Molasse Basin (103) and Alps (83). Deeper measurements (

A common issue of the temperature data for the calibration of thermal models is their unequal distribution. To compensate for this inequality, we introduce a weighting scheme in this paper. There are different possibilities to weight the measurement data. In this paper, we use a regional weighting scheme that combines quantitative measures and our knowledge about the geophysical setting and the data quality. As previously mentioned, the data set was reduced to 2388 data points in total. We subdivide the model into the following four regions:

the Alps with 83 measurements,

the URG with 177 measurements,

the Molasse with 103 measurements,

and the Po Basin with 2025 measurements.

The regions are weighted as follows:

the Po Basin is not weighted,

the Molasse is weighted by a factor of 20 since the Po Basin contains 20 times more data points,

and the Upper Rhine Graben and the Alps are weighted by a factor of 0.5.
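
The regional counts and weights listed above can be checked with a few lines of arithmetic. The sketch below also shows one possible way to apply the weights to a misfit; multiplying each residual by its regional weight before taking the Euclidean norm is an assumption of this example, as are the function name and the synthetic temperatures.

```python
import numpy as np

# Measurement counts per region and the regional weights listed above.
counts = {"Alps": 83, "URG": 177, "Molasse": 103, "Po Basin": 2025}
weights = {"Alps": 0.5, "URG": 0.5, "Molasse": 20.0, "Po Basin": 1.0}

total = sum(counts.values())
po_to_molasse = counts["Po Basin"] / counts["Molasse"]
print(f"total data points: {total}, Po Basin / Molasse ratio: {po_to_molasse:.1f}")


def weighted_l2_misfit(t_sim, t_meas, regions):
    """Euclidean norm of the residuals, each scaled by its regional weight (assumed form)."""
    w = np.array([weights[r] for r in regions])
    return np.linalg.norm(w * (np.asarray(t_sim) - np.asarray(t_meas)))


# Synthetic example: one measurement per region, each simulated value 1 K too warm.
regions = ["Po Basin", "Molasse", "Alps", "URG"]
t_meas = np.array([60.0, 45.0, 30.0, 80.0])
t_sim = t_meas + 1.0
unweighted = np.linalg.norm(t_sim - t_meas)
weighted = weighted_l2_misfit(t_sim, t_meas, regions)
print(f"unweighted misfit: {unweighted:.2f}, weighted misfit: {weighted:.2f}")
```

With identical residuals in every region, the weighted misfit is dominated by the Molasse point, which illustrates how strongly the chosen factors shift the emphasis of the analysis.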

The weighting scheme is applied to the quantity of interest of the global sensitivity analyses. Here, we consider the L2 norm of the difference between the measured (

In this paper, we study two versions of the Alps model.

The first one focuses on the sediments and the lithospheric mantle. This model has been presented in

The second model concentrates on the upper crust and is denoted as the “Crustal-Focus Alps” model. This model contains 34 geological layers, and again each layer has a homogeneous and isotropic thermal conductivity and radiogenic heat production. For this second model, we have a higher number of geological layers because several layers of the “General-Focus Alps” model have been further subdivided, as demonstrated in Table

Both models have an extent of 640 km in the

At the top of both models we apply a Dirichlet boundary condition representing the annual average surface temperatures

For the reference thermal conductivity, we use a value of 3.0 W m

In this paper, in addition to the General-Focus Alps model already presented in

Schematic overview of the models used in this paper.

To avoid the problem of the parameter space dimension becoming too large, we perform a hierarchical global sensitivity analysis. The setup of the hierarchical sensitivity analysis is shown in Figs.

unconsolidated sediments and the lower crust (red rectangle of Fig.

unconsolidated and consolidated sediments (gray rectangle of Fig.

and the upper crust (blue rectangles of Fig.

Representation of the hierarchical process-focused sensitivity analysis of the General-Focus Alps model. For the layer IDs and symbols, please refer to Table

Schematic representation of the hierarchical global sensitivity analysis.

Each of these additional sensitivity analyses also contains a thermal parameter from the top-level sensitivity analysis to enable a comparison between all analyses. We investigate all thermal properties of the upper crust instead of only those that are above the threshold since the upper crust has been the primary interest in previous studies

In this paper, we want to investigate how much our analyses are influenced by focusing on measurements. This is important since we calibrate and validate our analyses with, for instance, temperature measurements. The sensitivity analysis investigates the relative changes that are induced by changes in the model parameters (i.e., thermal conductivity and radiogenic heat production). For the sensitivity analysis, we need to define a quantity of interest, which allows us to define with respect to what measure the changes are investigated. To investigate the influence of the measurements, we perform the hierarchical sensitivity analyses with two different quantities of interest for the General-Focus Alps model (branch 1.1 and 1.2 of Fig.

The first quantity of interest is defined as the sum of the absolute temperature values of the entire model. This results in a sensitivity analysis that is representative of the physical processes since all regions in the model are treated equally.

The second quantity of interest is defined as the absolute misfit between the simulated and measured temperature values. Hence, the resulting sensitivity analysis is focused on the temperature measurements.
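
The difference between the two quantities of interest can be made concrete with a small sketch. The one-dimensional temperature profile, the measurement depths, and the use of a Euclidean norm for the misfit are assumptions chosen for illustration. The point is that a change confined to depths without measurements alters the process-focused quantity but leaves the measurement-focused one untouched.

```python
import numpy as np


def qoi_process(t_sim):
    """Process-focused quantity of interest: sum of absolute temperatures over all nodes."""
    return np.sum(np.abs(t_sim))


def qoi_measurement(t_sim, t_meas, meas_idx):
    """Measurement-focused quantity of interest: misfit at the measurement nodes only."""
    return np.linalg.norm(t_sim[meas_idx] - t_meas)


depth_km = np.linspace(0.0, 100.0, 101)  # one node per kilometer
t_ref = 10.0 + 12.0 * depth_km           # hypothetical reference temperature profile
meas_idx = np.arange(0, 8)               # measurements only at nodes from 0 to 7 km depth
t_meas = t_ref[meas_idx] + 1.0           # hypothetical observations, 1 degree warmer

# Perturb the temperatures below 50 km, where no measurements exist.
t_deep = t_ref.copy()
t_deep[depth_km > 50.0] += 25.0

print("process-focused:    ", qoi_process(t_ref), "->", qoi_process(t_deep))
print("measurement-focused:", qoi_measurement(t_ref, t_meas, meas_idx),
      "->", qoi_measurement(t_deep, t_meas, meas_idx))
```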

In the following, we focus on the difference in the total order sensitivity indices between those two hierarchical sensitivity analyses (branch 1.1 and 1.2 of Fig.

Top-level sensitivity analysis (focusing on the entire Alps model) with different quantities of interest of the hierarchical global sensitivity analysis for the General-Focus Alps model. For the layer IDs and symbols, please refer to Table

Sensitivity analysis of the unconsolidated sediments and lower crust with different quantities of interest of the hierarchical global sensitivity analysis for the General-Focus Alps model. For the layer IDs and symbols, please refer to Table

Sensitivity analysis of the unconsolidated and consolidated sediments with different quantities of interest of the hierarchical global sensitivity analysis for the General-Focus Alps model. For the layer IDs and symbols, please refer to Table

Sensitivity analysis of the upper crust with different quantities of interest of the hierarchical global sensitivity analysis for the General-Focus Alps model. For the layer IDs and symbols, please refer to Table

Focusing on the difference between the hierarchical sensitivity analyses, we make two key observations.

We observe a tendency towards larger differences for the thermal conductivities of deeper geological layers. This is highlighted in Fig.

The differences in the sensitivity indices tend to be larger for the radiogenic heat production than for the thermal conductivity. This is highlighted in Figs.

Furthermore, in the case of the process-focused analyses, the model is sensitive to more parameters and we obtain a slightly higher parameter correlation.

Here we focus on the differences observable in the analysis of the unconsolidated and consolidated sediments. We obtain large differences in the sensitivities for both sediment types. For the thermal conductivities of the unconsolidated sediments, the measurement-focused analysis tends to return higher influences, whereas for the consolidated sediments the process-focused analysis tends to yield higher influences of the thermal conductivities.

Finally, we switch our focus to the analysis of the upper crust. For the upper crust, we observe six thermal conductivities with a significant difference in the sensitivity indices:

Note that the layers of the upper crust (

The consequences of introducing a weighting scheme have already been partly addressed in

Analogous to the previous section, we focus on the differences in the total order sensitivity indices. Again, we perform the hierarchical analyses (Figs.

Top-level sensitivity analysis (focusing on the entire Alps model) with different weighting schemes of the hierarchical global sensitivity analysis for the General-Focus Alps model. For the layer IDs and symbols, please refer to Table

Sensitivity analysis of the unconsolidated sediments and lower crust with different weighting schemes of the hierarchical global sensitivity analysis for the General-Focus Alps model. For the layer IDs and symbols, please refer to Table

Sensitivity analysis of the unconsolidated and consolidated sediments with different weighting schemes of the hierarchical global sensitivity analysis for the General-Focus Alps model. For the layer IDs and symbols, please refer to Table

Sensitivity analysis of the upper crust with different weighting schemes of the hierarchical global sensitivity analysis for the General-Focus Alps model. For the layer IDs and symbols, please refer to Table

In contrast, we observe for the thermal conductivities of the Upper Rhine Graben layers a closer resemblance of the non-weighted scenario to the process-focused analysis (blue rectangle of Fig.

We also observe for the radiogenic heat production that for most layers the indices of the weighted case are closer to the process-focused analysis than the non-weighted case (red rectangles of Fig.

Before discussing the results of this paper, we briefly present the surrogate models obtained through the RB method in terms of cost and accuracy. In total, we consider five different surrogate models, as listed in Table

Overview of the various RB models. Here we present the focus of the different surrogate models and their dimensions.

We observe from Table

In the following, we discuss the consequences of focusing a study on measurements. To this end, we discuss the changes in the sensitivities for the different quantities of interest and weighting schemes. Furthermore, we demonstrate the consequences through a deterministic model calibration example.

The different quantities of interest represent the bias introduced by the unequal distribution of the measurement locations. Hence, we can use the difference in the sensitivity analysis to discuss the bias that is induced by the temperature measurements. So far, we have made two key observations for the study of the different quantities of interest:

the difference in the indices for the thermal conductivities is higher for deeper layers,

the differences are higher for the radiogenic heat productions than for the thermal conductivities.

Both of these observations can be explained by having a closer look at the depth distribution of the temperature measurements (Fig.

Distribution of the measurements according to depth.

We investigate the phenomenon more closely for the analysis of the unconsolidated and consolidated sediments. Here, we have a prominent overestimation of the influences of the unconsolidated sediments and an underestimation of the consolidated sediments, delineated as follows:

384 data points in the unconsolidated sediments of the Upper Rhine Graben above 1 km (

755 data points in the unconsolidated sediments of the Upper Rhine Graben below 1 km (

516 data points in the unconsolidated sediments of the Po Basin below 2 km (

318 data points in the consolidated sediments outside of sedimentary basins (

18 data points in the consolidated sediments of the Molasse Basin (

63 data points in the consolidated sediments of the Po Basin (

The behavior is more pronounced for the radiogenic heat production for lithological reasons. The highest influences of the radiogenic heat productions arise from the upper crust (Fig.

The consequence of the data distribution becomes obvious once we look at the analysis of the unconsolidated sediments and lower crust (Fig.

In addition, for the analysis of the upper crust (Fig.

The influence of the radiogenic heat production of the Istrea and Ivrea upper crust is underestimated in the measurement-focused study due to the lack of data, whereas the influence of the radiogenic heat production of the Po, northeastern Adria, and southeastern Adria upper crust is overestimated. This is likely caused by the measurements available for both the Po and northeastern Adria upper crust layers.

We also observed slightly higher parameter correlations for the process-focused analysis. This is probably related to the fact that the model is sensitive to more parameters.

We observed that the weighted measurement-focused analysis tends to be closer to the process-focused analysis. This becomes understandable by looking at the applied weighting scheme. We applied a regional weighting scheme to compensate for the unequal data distribution in the four regions of our model. Hence, we can compensate partly for the measurement bias. However, we are not able to fully compensate for the data sparsity. The main reason for this is that we can compensate for fewer data points but not for regions without data points since no measurements are available to which we could apply a higher weight. This can be observed, for instance, in the properties related to the layers of the Molasse.

We observed that the sensitivity indices of the thermal properties related to the layers inside the Upper Rhine Graben are further apart in the comparison of the weighted and process-focused analyses than in the comparison of the non-weighted and process-focused analyses. This is related to the choice of the weighting scheme. We chose to put less weight on the temperature data from the Upper Rhine Graben since we do not account for convective effects in this paper. Analogously, the properties of the Apennine upper crust layers have too small an influence in the weighted scenario. As a reminder, we downgraded the importance of the temperature data in this region since the data consist of minimum temperature values.

Through the weighting we are able to compensate for the underestimation of the unconsolidated sediments of the Po Basin. Hence, the bias most likely induced by the high data density of the other layers can be reduced.

For the thermal conductivities of the Saxothuringian, Vosges, and Molasse upper crust (gray rectangle of Fig.

Note that the weighting scheme is specific to the case study and its aim. Depending on our knowledge about data quality, regions of interest, and other aspects, the weighting scheme can be designed in a case-specific manner. In this paper, we do not aim to provide “the ideal” weighting scheme for the Alpine region. Instead, we demonstrate the impact of a weighting scheme for thermal modeling. Because this impact is high, the weighting scheme needs to be chosen carefully: an incorrect weighting scheme will increase the bias.

So far, we have presented that we obtain significantly differing sensitivities for the process-focused and measurement-focused study. In the following, we demonstrate the consequences of this difference through a deterministic model calibration. We choose the example of a model calibration because this is a typical inverse process that relies on observation data.

Model calibration aims to compensate for existing model errors by adjusting the model parameters in accordance with our temperature measurements. Analogous to
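
How a calibration compensates for model errors by adjusting parameters can be illustrated with a toy example. The sketch below is not the calibration procedure of this study: it uses the analytical one-dimensional steady-state conductive geotherm T(z) = T0 + q0 z / k - H z^2 / (2 k), synthetic noise-free "measurements" down to 5 km, and a linear least-squares fit, with all parameter values and function names chosen for illustration. When the assumed surface heat flow `q0` is deliberately wrong, the fit still matches the data but returns shifted values of the thermal conductivity `k` and radiogenic heat production `H`.

```python
import numpy as np


def geotherm(z, k, H, T0=10.0, q0=0.065):
    """1D steady-state conductive geotherm with constant k and H (z in m, positive downward)."""
    return T0 + q0 * z / k - H * z**2 / (2.0 * k)


def calibrate(z, t_meas, T0, q0):
    """Least-squares estimate of k and H; the model is linear in p = (1/k, H/k)."""
    G = np.column_stack([q0 * z, -0.5 * z**2])
    p, *_ = np.linalg.lstsq(G, t_meas - T0, rcond=None)
    k = 1.0 / p[0]
    H = p[1] / p[0]
    return k, H


# Synthetic measurements between 0.5 and 5 km depth (illustrative true values).
z = np.linspace(500.0, 5000.0, 10)
t_meas = geotherm(z, k=3.0, H=1.0e-6)

# Calibration with the correct and with a deliberately wrong surface heat flow.
k_ok, H_ok = calibrate(z, t_meas, T0=10.0, q0=0.065)
k_bad, H_bad = calibrate(z, t_meas, T0=10.0, q0=0.070)
print(f"correct q0: k = {k_ok:.3f}, H = {H_ok:.3e}")
print(f"wrong q0:   k = {k_bad:.3f}, H = {H_bad:.3e}")
```

In this toy setting the calibrated parameters absorb the error in the assumed heat flow, so a discrepancy between initial and calibrated values points to a problem elsewhere in the model.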

Comparison of the initial thermal properties and the calibrated thermal properties for different geological models and different weighting schemes. The parameter that is not considered in the model calibration due to sensitivities that are too low is denoted with n/a.

In the following, we discuss the results of the automated model calibration and its consequences. Note that in this work we use the model calibration in a slightly different way. Usually, it is used to compensate for model errors. That means of course that it also identifies the problematic model areas. In this work, we employ the model calibration as an identification tool for model errors. Therefore, we use the calibrated values by

The first model problem that we can identify is the measurement bias through an unequal data distribution (General-Focus – unweighted). This can be at least partly removed through data weighting (General-Focus – weighted), yielding smaller differences between initial and calibrated values. Nonetheless, we observe a low radiogenic heat production in the upper crust, meaning that our model is non-ideal in the description of the upper crust. This also leads to thermal conductivities that are too low in the sediments and too high in the lithospheric mantle.

Therefore, we introduce a second model, the Crustal-Focus model. For this model, we obtain a good agreement for the upper crust but greater discrepancies in unconsolidated sediments (below 1 km) and the lithospheric mantle. Hence, we can remove the error in the upper crust but at the same time introduce new error sources.

For the calibration of the unweighted General-Focus model, we achieve a

Note that we do not aim to present the “optimal” model in this paper. Instead, we want to demonstrate various components that influence the model. Generating an optimal model is not possible since all models are by definition wrong

We have discussed the consequences of the model change for the calibrated thermal conductivities. Now we want to briefly discuss the consequences for the sensitivities. To this end, we repeat the process-focused and measurement-focused sensitivity analysis for the Crustal-Focus model. Note that we consider only the weighted scenario (branches 2.1.1 and 2.2.1 of Fig.

For the Crustal-Focus model, we thinned the upper crust. This can be clearly observed in the decreased sensitivities of the model to the upper crust layers (red box of Fig.

Comparison of the sensitivities of the process-focused study for both the General-Focus and Crustal-Focus Alps model. The solid black line denotes the threshold value for determining if the parameters are influencing the model response.

Comparison of the sensitivities of the measurement-focused study for both the General-Focus and Crustal-Focus Alps model. The solid black line denotes the threshold value for determining if the parameters are influencing the model response.

The radiogenic heat production of most of the lower crust is more influential for the Crustal-Focus model as the upper crust was thinned by thickening the lower crust. The only exception is the Saxothuringian lower crust (

Furthermore, we observe a higher influence of the unconsolidated sediments in the Upper Rhine Graben (gray box of Fig.

The model change is observable in both the model calibration for the thermal properties and the corresponding sensitivities. However, if we look at the gravity residuals (Fig.

Gravity residual of

In this paper, we have seen that the measurements induce a significant bias. This motivates subsequent projects: we would like to investigate how this bias can be decreased by incorporating further data sources that give only an indirect measure of the temperature. Furthermore, it would be interesting to further explore the field of joint inversion to incorporate various geophysical data sources already used during model construction.

In this paper, we have demonstrated the bias that a measurement-focused study can cause. This bias can be partly removed through automated and customized data-weighting schemes. However, as is typical for geoscientific applications, many areas of the model do not have any associated data. Unfortunately, it is not possible to compensate for the bias arising from these areas. This shows the importance of focusing on regions where data are present whenever possible.

However, many inverse processes such as deterministic and stochastic model calibrations depend on measurement data. In this case the bias is unavoidable. Nonetheless, we need to be aware of which kind of bias we are introducing through this procedure so that its effects can be taken into account in all further analyses. In particular, the data are often only informative for the shallower layers. Hence, we lose the information about deeper layers and at the same time overestimate the influence of the shallower layers. This also means that we are unable to calibrate and validate the lower parts of our geological models. Nonetheless, these regions are important to avoid influences from, for instance, the lower boundary condition.

We have also seen the importance of considering various data sources. The changes from the General-Focus to the Crustal-Focus model were only visible in the thermal studies but not in the gravity residuals.

Note that although we performed the analyses for the case study of the Alps, these aspects hold in general since the data distribution shown here is typical for geoscientific applications.

Symbols and layer IDs for both the General-Focus model and Crustal-Focus model.

For the construction of the reduced models, we used the software package DwarfElephant (

The structural model 3D-ALPS constrained in

DD was responsible for the conceptualization, methodology, and software and writing the original draft. CS was responsible for the data curation and for writing, reviewing, and editing the manuscript. MSW and MC were responsible for the reviewing of the concepts and the reviewing and editing of the manuscript.

The authors declare that they have no conflict of interest.

Publisher’s note: Copernicus Publications remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

The authors gratefully acknowledge the Earth System Modelling Project (ESM) for funding this work by providing computing time on the ESM partition of the supercomputer JUWELS

This research has been supported by the ESM consortium. The article processing charges for this open-access publication were covered by the Helmholtz Centre Potsdam – GFZ German Research Centre for Geosciences.

This paper was edited by Rohitash Chandra and reviewed by two anonymous referees.