Sub3DNet1.0: a deep-learning model for regional-scale 3D subsurface structure mapping

Abstract. This study introduces an efficient deep-learning model based on convolutional
neural networks with joint autoencoder and adversarial structures for 3D
subsurface mapping from 2D surface observations. The method was applied to
delineate paleovalleys in an Australian desert landscape. The neural network
was trained on a 6400 km² domain using land surface topography
as 2D input and an airborne electromagnetic (AEM)-derived probability map of
paleovalley presence as 3D output. The trained neural network has a squared
error <0.10 across 99 % of the training domain and produces a
squared error <0.10 across 93 % of the validation domain,
demonstrating that it is reliable in reconstructing 3D paleovalley patterns
beyond the training area. Owing to its generic structure, the neural network
designed in this study and its training algorithm have broad application
potential for constructing 3D geological features (e.g., ore bodies,
aquifers) from 2D land surface observations.



Neural network structure optimization
The neural network involves multiple hyperparameters, e.g. network depth, width, and filter size, that define the network structure. It is almost impossible to obtain the best combination of these hyperparameters in a way similar to optimising physics-based process models. In this work, the hyperparameters are instead selected through a large number of trial-and-error tests, in which one hyperparameter is varied among three to five candidate values while the other five hyperparameters are fixed. For each calculation, the performance of the neural network is expressed by (1) the computation time cost and (2) the peak signal-to-noise ratio (PSNR) between the generated 3D images of the palaeovalley aquifer index (PAI) and the real image in both the training and validation areas (80 km west of the training area), expressed as (Wang and Bovik, 2002):

\mathrm{PSNR} = 10 \log_{10} \frac{\mathrm{MAX}^2}{\frac{1}{M}\sum_{i=1}^{M}\left(\hat{y}_i - y_i\right)^2}

where M is the number of voxels in the 3D image, \hat{y}_i represents the PAI generated by the neural network, y_i is the PAI calculated from AEM-derived electrical conductivity, which is considered the ground truth, and MAX is the maximum possible value of the PAI. PSNR is a traditional approach to image quality assessment; a high PSNR represents a high-quality PAI generation. A neural network structure that yields a high PSNR at a low computation time cost is desirable.
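The PSNR criterion above can be computed directly from two 3D arrays. The sketch below uses plain NumPy; `max_val=1.0` is an assumption made here because the PAI is a probability-like index, and is not stated in this section.

```python
import numpy as np

def psnr(generated, truth, max_val=1.0):
    """Peak signal-to-noise ratio (Wang and Bovik, 2002) between a
    generated 3D PAI volume and the AEM-derived ground truth.
    max_val is the dynamic range of the signal; 1.0 is assumed here
    for a probability-like PAI in [0, 1]."""
    mse = np.mean((np.asarray(generated) - np.asarray(truth)) ** 2)  # mean over all M voxels
    if mse == 0.0:
        return np.inf  # identical images
    return 10.0 * np.log10(max_val ** 2 / mse)
```

For example, a uniform voxel-wise error of 0.1 gives MSE = 0.01 and hence a PSNR of 20 dB.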
Following the test results, it was found that four hyperparameters (the output layer size of the encoder, the filter size, the width, and the loss function of the generator) affect the performance of the neural network most significantly.

The details are given below.

Output layer size of encoder
As shown in Fig. S1a, when a large output layer size of 50×50×10 is used for the encoder, a low PSNR is obtained.

Filter size
Filter size controls the spatial correlation scale that the neural network can resolve, and is one of the most important hyperparameters. As demonstrated in Fig. S3, the PSNR of the generated PAI increases with the filter size in the decoder and stabilizes at 12 to 14 once the filter size exceeds 4. Because the number of weights to be optimized in the neural network grows with the filter size, the computation time also increases significantly with it. We therefore select a filter size of 5 (in the horizontal directions) in the decoder, which yields a high PSNR in both the training and validation areas within an acceptable computation time (~30 s per calculation).
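The trade-off above comes from the weight count of a convolution layer growing quadratically with the horizontal filter size. A minimal illustration, using a hypothetical decoder layer with 64 input and 128 output channels (the actual channel counts are given in Table S1, not assumed here to match):

```python
def conv3d_weight_count(fh, fw, fd, c_in, c_out):
    """Trainable weights in one 3D convolution layer:
    kernel entries plus one bias per output channel."""
    return fh * fw * fd * c_in * c_out + c_out

# Hypothetical decoder layer with 64 input and 128 output channels:
# enlarging the horizontal filter from 3x3 to 5x5 almost triples the weights.
w3 = conv3d_weight_count(3, 3, 2, 64, 128)  # 147584 weights
w5 = conv3d_weight_count(5, 5, 2, 64, 128)  # 409728 weights
```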

Width
Increasing the width and depth of the neural network increases the number of weights and thus enhances the network's capability to capture complex nonlinear relationships between the input and output images. However, training too many weights with a limited number of training datasets carries a high risk of overfitting. As demonstrated in Fig. S4, as the decoder width increases, the resulting PSNR in both the training and validation areas increases. It is also noted that when the maximum width in the decoder is set to 256, the PSNR in the training area varies between 12 and 14, because the stochastic weight updating by the adaptive moment estimation (ADAM) algorithm (Kingma and Ba, 2014) can produce different weight sets even under the same network structure. The PSNR in the validation area, however, concentrates at 12 without fluctuation. This indicates that the generated PAI is insensitive to the variation of the weights, i.e. the network contains a large number of redundant weights. A maximum width of 128 is therefore preferred in this study, which allows the neural network to generate the 3D PAI with relatively high accuracy and low computation time.

Loss function
One advantage of the network employed in this study is the ability to update the weights in an adversarial way, based on both a voxel-wise independent criterion (mean square error) and an adversarial criterion. By adjusting the coefficients in the generator loss function in Eq. 6, the PSNR in the training and validation areas is calculated in Fig. S5. As shown, as the coefficient b of the voxel-wise independent criterion increases from 0 to 0.1, the performance of the neural network improves: the PSNR of the generated PAI increases from 12 to 14 in the training area and from 12 to 12.5 in the validation area. Further increasing the coefficient of the voxel-wise independent criterion to 0.5 raises the PSNR in the training area but lowers it in the validation area, indicating that overfitting occurs. We therefore use a voxel-wise independent criterion coefficient of 0.1 and an adversarial criterion coefficient of 0.9. To convert the 2D input image to 3D PAI images with 10 layers, the 2D input image was first replicated across 10 layers before convolution, and 3D filters were then employed for convolution and deconvolution.
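The replication step described above, which lifts the 2D input into a 10-layer volume before the 3D convolutions, can be sketched with NumPy. The 50×50 input size is a hypothetical value for illustration; the actual input dimensions are not restated in this section.

```python
import numpy as np

# Hypothetical 2D topography input (50 x 50 pixels, one channel).
topo_2d = np.random.rand(50, 50, 1)

# Replicate the 2D image across 10 vertical layers so that 3D filters
# can then convolve/deconvolve it into a 10-layer PAI volume.
topo_3d = np.repeat(topo_2d[np.newaxis, ...], 10, axis=0)  # shape (10, 50, 50, 1)
```

Every vertical layer of `topo_3d` is an identical copy of the 2D input; the vertical structure is introduced only later by the 3D filters.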

Figure S5. (a) PSNR of 3D PAI in the training and validation areas (80 km west of the training area) generated by a neural network with different loss functions for updating the weights in the generator, and (b) the run cost of each calculation.
Leaky ReLU: f(x) = max(x, 0.2x); Sigmoid: f(x) = 1/(1 + e^{-x}). Filter and stride sizes expressed as e.g. 4×4×2 give the values in the eastward, northward, and vertical directions, respectively.
Coefficients in the generator loss function (Eq. 6) are 0.9, 0.1 and 5, respectively. Moreover, increasing the coefficient a of the Kullback-Leibler divergence from 1 to 100 gives priority to weight updating in the encoder. The encoder fuses the information of the input images into a low-dimensional code following the standard normal distribution, while the decoder reconstructs the 3D PAI from these low-dimensional codes. When a large coefficient of a = 100 is applied, the network may update only the weights in the encoder rather than those in the decoder. The capability of the trained network in PAI generation is then weak, with a low PSNR in both the training and validation areas.
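The weighting scheme discussed above can be summarized as a three-term weighted sum. The sketch below is an illustration of how the coefficients (0.9 adversarial, 0.1 voxel-wise, 5 KL) combine the terms; the exact functional form of each term is defined by Eq. 6 in the main text, not here.

```python
def generator_loss(adv_term, mse_term, kl_term,
                   c_adv=0.9, b_mse=0.1, a_kl=5.0):
    """Hypothetical weighted sum illustrating the three-term generator
    loss of Eq. 6: adversarial criterion (c_adv), voxel-wise MSE
    criterion (b_mse), and Kullback-Leibler divergence (a_kl)."""
    return c_adv * adv_term + b_mse * mse_term + a_kl * kl_term
```

With a very large `a_kl` (e.g. 100), the KL term dominates the loss, which is consistent with the observation that weight updates then concentrate in the encoder.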
Following the trial-and-error tests above, we select a set of hyperparameters (Table S1) that allows the neural network to produce an acceptable 3D PAI (the PSNR reaches 16 in the training area and 14 in the validation area, and the computation time is less than 30 s per calculation). However, these hyperparameters are not necessarily the theoretically best ones.

Neural network training
We monitor the loss functions when training the neural network to verify that the network is being adequately trained. For generative adversarial training, there are no standard target values for the generator and discriminator loss functions. Nevertheless, it is reasonable to expect that, during training, the discriminator and generator loss functions vary in an adversarial way and both stabilize at constant values, indicating that the network has been adequately trained. As shown in Fig. S6, the generator loss decreases from about 1.5 to 0.5, while the discriminator loss increases from 0.5 to 1.5, and both stabilize after 5800 iterations. The decrease of the generator loss corresponds to the increase of the discriminator loss, suggesting that the network was adequately trained.
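The stabilization criterion can be made operational with a simple flatness check on the recorded loss curve. This helper is an illustration, not a procedure from the paper; the window size and tolerance are arbitrary assumptions.

```python
def has_stabilized(losses, window=200, tol=0.05):
    """Illustrative convergence check (not from the paper): the loss is
    considered stable when the spread of its last `window` values is
    within a fraction `tol` of their mean magnitude."""
    tail = losses[-window:]
    mean = sum(tail) / len(tail)
    return (max(tail) - min(tail)) <= tol * max(abs(mean), 1e-12)
```

A generator loss oscillating tightly around 0.5 passes this check, while a loss that is still steadily declining does not.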