Recognizing spatial geochemical anomaly patterns using deformable convolutional networks guided with geological knowledge

Zhang, Xinyu; Xiong, Yihui; Chen, Zhiyi

doi:10.5194/gmd-19-2219-2026

Articles | Volume 19, issue 5

Development and technical paper

17 Mar 2026

Abstract

This study addresses two limitations of deep learning models in geochemical anomaly recognition: the limited quantification of irregular spatial geochemical anomaly patterns and weak interpretability. We propose a hybrid approach that integrates geological knowledge (GK) into deformable convolutional networks (DCN), creating a model termed GK-DCN, with the aim of enhancing both the performance and transparency of geochemical anomaly recognition. The model introduces learnable parameters that allow the convolution kernel to adaptively adjust its shape according to the characteristics of its sampling position, enabling it to more accurately capture complex and irregular geochemical anomaly patterns caused by mineralization. To enhance geological consistency, ore-controlling faults are incorporated as a geological knowledge constraint, guiding the network to prioritize spatial correlations between deposits and faults. Experimental results in the southern Tianshan Au-Cu polymetallic ore district demonstrate that the GK-DCN, verified across multiple evaluation metrics, significantly enhances the accuracy and reliability of geochemical anomaly recognition. In addition, by adaptively adjusting the receptive field, it produces more distinct spatial anomaly patterns and higher consistency with known mineral deposits. Visualization of the kernel offsets reveals the model's adaptive spatial sampling mechanism. Furthermore, feature significance heatmaps generated by Grad-CAM (Gradient-weighted Class Activation Mapping) highlight the key features that the model focuses on during geochemical anomaly recognition. These visualizations significantly improve interpretability and demonstrate the model's effectiveness in capturing complex geochemical patterns. This work provides an effective intelligent method for spatial geochemical pattern recognition and offers a reference for interpretable deep learning in geochemical exploration through multi-angle visualization.

Received: 02 Oct 2025 – Discussion started: 15 Oct 2025 – Revised: 08 Feb 2026 – Accepted: 03 Mar 2026 – Published: 17 Mar 2026

1 Introduction

Geo-anomalies, detected through various observational datasets (geological, geochemical, geophysical, and remote sensing), play a vital role in identifying mineralization-related geological processes. Their significance lies in the fact that these anomalies reveal underlying causative features or events that are not directly observable (Cheng and Zhao, 2011). Hydrothermal mineralization is a systematic yet complex geological phenomenon, involving the migration of ore-bearing hydrothermal fluids, interactions between fluids and host rocks, mineral precipitation, and the eventual concentration of economic minerals (Pirajno, 2008). Since these processes result from the interplay of multiple geological factors operating across different spatial and temporal scales, the associated geo-anomalies display considerable complexity (Cheng, 2012). Analyzing their spatial distribution can significantly refine and advance geological understanding of numerous scientific questions. Geochemical anomalies associated with mineralization represent one of the most significant types of geo-anomalies for mineral exploration (Zuo et al., 2021). These anomalies often exhibit anisotropic spatial distributions that are controlled by ore-forming geological structures (e.g., strata, faults, and magmatic intrusions), which provide essential space, heat, fluid, and material conditions for mineralization (Pirajno, 2008). For example, hydrothermal mineralization frequently presents as linear-trending geochemical anomalies along fault zones, where fault systems act as pathways for the transport and deposition of ore-forming materials (Wang et al., 2013). Consequently, recognizing the spatial anisotropy of geochemical anomaly patterns is crucial for accurately identifying significant anomalies, which in turn can greatly enhance the success rate of mineral exploration (Cheng, 2012; Zuo, 2017; Xiao et al., 2018).

Long-term research and practice have demonstrated that integrating spatial structure through spatial autocorrelation analysis, spatial decomposition, moving window statistics, and spatially aware machine learning offers a more geologically realistic and robust framework for recognizing geochemical anomalies. Spatial autocorrelation analysis, using methods like local Moran's I, helps detect statistically significant spatial clusters, such as high-high clusters (indicating potential anomalies surrounded by other high values) and high-low clusters (representing isolated high values) (Yin et al., 2021). Spatial decomposition primarily employs three methods: Kriging, trend surface analysis, and multifractal filtering. Kriging allows for the estimation of values at unsampled locations, generating a spatially continuous model of the geochemical background. Anomalies are identified where measured values significantly exceed the Kriging predictions, indicated by large prediction errors (Jimenez-Espinosa et al., 1993). Additionally, incorporating directional variograms into Kriging methods, such as anisotropic ordinary Kriging, enables explicit accounting for directional trends and local heterogeneity in anomalies (Reis et al., 2003). Trend surface analysis involves fitting polynomial surfaces, using either global or local regression, to model regional trends. The residuals derived from this surface represent local deviations, which can serve to highlight anomalies against the broader regional pattern (Wang and Zuo, 2015). Multifractal filtering methods mainly include the Concentration-Area (C-A) model and the Spectrum-Area (S-A) model (Cheng et al., 1994, 2000). These methods plot element concentration against area and identify breaks in the observed power-law (scaling) behavior. These breakpoints are used to separate background populations from anomalous ones.
This approach explicitly models the scale-dependent heterogeneity of spatial patterns and establishes thresholds based on deviations from fractal behavior across different scales (Cheng, 2012). Moving window statistics methods, such as local singularity analysis and the local gap statistic, calculate local statistics (e.g., mean, median, standard deviation) within a defined spatial window (Cheng, 2007; Wang and Zuo, 2016). Values that significantly exceed the local background within their respective window are identified as geochemical anomalies. This technique effectively captures local spatial context and non-stationarity, although the choice of window size is a critical and subjective step. To account for anisotropy, these methods are further modified by incorporating elliptical or directionally weighted windows (Xiao et al., 2018, 2020; Wang et al., 2018). The final category for analyzing geochemical spatial patterns is spatially aware machine learning. This approach primarily includes two types: models that integrate spatial features or components (Cheng et al., 2011; Wang et al., 2015), and models with inherent capabilities to capture spatial structures (LeCun and Bengio, 1998). In the first type, spatial characteristics are incorporated into traditional statistical methods through distance-based kernels or spatial weighting schemes to address geographic heterogeneity and non-stationarity. Commonly used spatially weighted machine learning techniques for identifying geochemical anomaly patterns include geographically weighted regression (GWR) (Wang et al., 2015; Tian et al., 2018), spatially weighted principal component analysis (SWPCA) (Cheng et al., 2011; Xiao et al., 2012), density-based spatial clustering of applications with noise (DBSCAN) (Zhang et al., 2019; Hajihosseinlou et al., 2024), and geographical random forest (GRF) (Soltani et al., 2024). 
The second type involves machine learning architectures specifically designed to handle spatial data, such as convolutional neural networks (CNN) (LeCun and Bengio, 1998) and graph neural networks (GNN) (Scarselli et al., 2008). For spatial geochemical anomaly recognition, various advanced deep learning models have been proposed, mainly graph-based models, Transformers, spatial-spectrum dual-branch models, and informed models (Xu et al., 2025a). These methods are capable of characterizing complex spatial relationships in geochemical survey data (Zhang et al., 2021; Xu et al., 2023, 2024b, 2025a; Liang et al., 2025; Chen and Zuo, 2025); modelling both spatial and spectrum characteristics simultaneously while accounting for the inherent heterogeneity of the data (Zuo and Xu, 2024; Xu and Zuo, 2025; Xu et al., 2025b); and incorporating geological knowledge to enhance interpretability (Xiong et al., 2022; Luo et al., 2023; Zuo et al., 2024, 2025; Xu et al., 2025a). CNN and GNN are most often adopted as the base framework in these models. CNN learns local spatial features and mineralization-related patterns through convolutional and pooling operations (Chen et al., 2019; Zhang et al., 2021; Huang et al., 2022; Yang et al., 2023). However, a key limitation is its reliance on fixed, regular convolution kernels (e.g., 3×3 grids), which restricts its ability to adequately model the anisotropic nature of geochemical patterns (Dai et al., 2017). In contrast, GNN directly represents non-Euclidean spatial relationships using nodes (e.g., sample points with geochemical attributes) and edges (encoding spatial proximity or geological links), allowing anomaly recognition based on complex neighborhood interactions (Xu et al., 2023, 2024b, 2025a; Liang et al., 2025; Chen and Zuo, 2025). Nevertheless, GNN requires high-quality data and substantial domain knowledge to define meaningful graph structures.
In particular, defining appropriate edges based on spatial distance or geological similarity is crucial yet challenging. Improper edge definitions may introduce noise, mask genuine anomalies, and ultimately impair model performance (Gong and Cheng, 2019; Zhou et al., 2020).

Deformable Convolutional Networks (DCN) address a fundamental constraint of traditional CNN: the fixed geometric structure of their convolution kernels (Dai et al., 2017). By introducing learnable spatial offsets for each sampling point in the kernel, DCN adaptively adjust the sampling locations, effectively warping the kernel's receptive field to align with irregular and complex patterns (Dai et al., 2017; Zhu et al., 2018). This flexibility allows the kernel to conform to non-rigid and deformed structures, enabling more precise feature extraction from key regions of irregular shapes (Dai et al., 2017; Zhu et al., 2019). As a result, DCN exhibit greater robustness to geometric variations such as changes in orientation, scale, or deformation, maintaining consistent feature representation across diverse pattern states. These capabilities make DCN valuable in tasks involving irregular spatial structures. By offering this spatial adaptability, DCN provide a powerful tool for analyzing the complex and often irregular geometries encountered in various domains, including irregular seismic data interpolation (Zhao et al., 2023; Luo et al., 2024; Sun et al., 2024), earthquake crack detection (Yu et al., 2022), flood boundary detection (Yu et al., 2023), surface wave suppression (Gao et al., 2024), underwater image enhancement (Tian et al., 2023), atmospheric forecasting (Nielsen et al., 2022), precipitation forecasting (Xu et al., 2024a), modelling of cloud morphological characteristics (Liu et al., 2021), image denoising (Guan et al., 2022; Liu et al., 2024), hyperspectral image classification (Zhu et al., 2018; Zhao et al., 2021), identification of anomalous deformation areas (Zhang et al., 2022), hyperspectral anomaly detection (Wu et al., 2023), and soil moisture monitoring (Na et al., 2025). By capturing spatial deformations, DCN offer a promising approach for extracting meaningful metrics from the inherent irregularity of geoscientific data.

In this study, we utilize a DCN as the foundational model for recognizing and extracting complex anisotropic geochemical spatial patterns. Like purely data-driven deep learning methods such as CNN, the DCN faces interpretability issues, and its inner workings are regarded as a complex “black box”. While such models achieve high prediction accuracy, understanding why they make a specific prediction, which features in the input data were decisive, or how their learned representations map to established geological concepts is extremely difficult (Gilpin et al., 2018; Rudin, 2019). Current approaches to enhance the interpretability of deep learning models primarily operate at three stages: model input, model construction, and model output (Zuo et al., 2024). At the model input stage, interpretability is enhanced through metallogenic models, feature engineering, and geologically constrained data augmentation methods (Zuo et al., 2024). At the model construction stage, key ore-controlling factors are integrated into the hidden layers, while the spatial coupling relationship between known mineral deposit locations and these factors is incorporated into the loss function (Xiong et al., 2022; Luo et al., 2023; Zuo et al., 2025). At the model output stage, visualization techniques are employed to examine the outputs of each hidden layer, providing insight into the extraction and integration processes of prospecting information. Meanwhile, attribution techniques are applied to assess the importance of input variables, helping to quantify their contributions to the formation of mineral deposits (Luo et al., 2023; Xu et al., 2025a). In this study, we enhance the interpretability of the DCN at both the model construction and output stages. During model construction, a governing equation representing the spatial correlation between known mineral deposits and ore-controlling factors is embedded into the loss function (Xiong et al., 2022; Zuo et al., 2024).
This approach introduces conceptual models and expert knowledge into the training process, encouraging the model's results to be consistent with established geological principles (Zuo et al., 2024). At the model output stage, we utilize class activation mapping (CAM) (Jung and Oh, 2021) and its variant, Grad-CAM (Selvaraju et al., 2016), to visualize the regions within the input data that most influence the model's predictions. CAM visually identifies the most discriminative regions in an input image responsible for a specific class prediction of a CNN or DCN. It leverages the weights of the final fully connected layer to compute a weighted sum of the activation maps from the last convolutional layer, generating a heatmap (class activation map) that makes the CNN or DCN less of a “black box” and more transparent. In addition, the learned offsets are visualized to reveal how the DCN dynamically adapts its sampling locations, enhancing the understanding of model behavior for spatial pattern quantification. Finally, the constructed model is applied to the southern Tianshan Au-Cu polymetallic ore district to verify its effectiveness and interpretability in identifying geochemical anomalies.

2 Geological setting and Datasets

2.1 Geological setting

The South Tianshan Metallogenic Belt, extending across Central Asia from Uzbekistan through Tajikistan, Kyrgyzstan, and into western China (Xinjiang), is one of the world's most significant gold and copper provinces (Fig. 1). Its formation is intrinsically linked to the protracted and complex tectonic history of the Central Asian Orogenic Belt, specifically the final closure of the Paleo-Asian Ocean (Gao et al., 2009; Han et al., 2011). The regional geology is dominated by the collage of multiple terranes, including Precambrian continental blocks, early Paleozoic oceanic crust fragments, and island arcs, which were accreted and subsequently deformed during the Late Paleozoic collision between the Tarim Craton to the south and the Kazakhstan-Yili Block to the north (Gao et al., 2009). This continental collision, culminating in the Late Carboniferous to Early Permian, created a major suture zone characterized by extensive thrusting, folding, and large-scale strike-slip fault systems. These structures provided crucial conduits for subsequent fluid migration and mineralization.

The primary mineralization events are temporally and genetically associated with this collisional orogeny and the post-collisional extensional phase. Two major mineralization styles prevail: (1) Orogenic gold deposits, often hosted in shear zones within Neoproterozoic to Paleozoic metamorphic rocks (e.g., the giant Muruntau deposit in Uzbekistan). These deposits formed from metamorphic fluids released during devolatilization of subducted slabs or thickened crust. (2) Copper-gold skarn and porphyry-style mineralization, frequently associated with Late Carboniferous to Permian post-collisional I-type granitoids intruding carbonate-rich sequences. These intrusions provided the heat and magmatic fluids responsible for widespread hydrothermal alteration and metal deposition. The conjunction of fertile source rocks (often black shales), ideal structural traps (fault jogs, shear zones, lithological contacts), and the timing of magmatism relative to tectonic stress changes created the perfect conditions for the formation of world-class gold and copper deposits. The Chinese segment of the South Tianshan, such as the Sawayaerdun gold belt, continues this metallogenic trend, hosting numerous deposits with similar genetic models (Chen et al., 2012; Goldfarb et al., 2014; Seltmann et al., 2014).

https://gmd.copernicus.org/articles/19/2219/2026/gmd-19-2219-2026-f01

Figure 1. Simplified geological map of the Southern Tianshan showing the main tectonic units and Au-Cu deposits (modified from Xue et al., 2014; Zhao et al., 2020).

2.2 Datasets

The 1:200 000 scale geochemical samples in this study area were sourced from the Chinese National Geochemical Mapping Project (Xie et al., 1997; Wang et al., 2007). The standard sampling density was 1–2 samples per square kilometer, with every 4 km² constituting one analytical unit. Sampling density was appropriately reduced in areas where fieldwork was difficult to conduct (1 sample per 20–50 km²). Multiple sub-samples were collected within a certain range (20–50 m) around the sampling point and combined into a single composite sample. The sample was sieved through a 60-mesh stainless steel screen, with the final sample weight exceeding 200 g. A total of 32 elements and 7 oxides were analyzed: Bi, Cu, P, La, Li, Ag, Sn, Au, Mo, Th, U, Y, W, Sb, Hg, Mn, Cr, Sr, Nb, Pb, Ni, Ti, Cd, Co, Ba, Be, V, Zn, B, As, Zr, F, as well as Fe₂O₃, K₂O, CaO, MgO, Na₂O, Al₂O₃, and SiO₂. The detection limits and analytical methods for each element are listed in Table 1.

Table 1. Elements, analytical methods, and detection limits from the Chinese National Geochemical Mapping Project.

Note: XRF: X-ray fluorescence spectrometry; ICP-AES: Inductively coupled plasma-atomic emission spectrometry; ICP–MS: Inductively coupled plasma–mass spectrometry; ES: Emission spectrometry; HG-AFS: Hydride generation atomic fluorescence spectrometry; GF-AAS: Graphite furnace atomic absorption spectrometry; CV-AFS: Cold vapor atomic fluorescence spectroscopy; ISE: Ion selective electrode.

3 Methods

3.1 Deformable convolutional networks (DCN)

Deformable convolutional networks (Dai et al., 2017; Zhu et al., 2019) enable adaptive adjustment of the receptive field positions by incorporating learnable offset parameters for each sampling point within the convolutional kernel. Figure 2 illustrates the distinction between the sampling points of standard convolution and those of deformable convolution. This approach overcomes the limitations imposed by a fixed grid structure, thereby facilitating more flexible and precise extraction of image features exhibiting complex geometric deformations.

https://gmd.copernicus.org/articles/19/2219/2026/gmd-19-2219-2026-f02

Figure 2. Illustration of the sampling locations in 3×3 standard and deformable convolutions. (a) Regular sampling grid of standard convolution; (b–d) deformed sampling locations of deformable convolution with augmented offsets. The red areas are the sampling locations in 3×3 standard convolution. The grey areas and the blue areas are the initial sampling locations and final sampling locations of the deformable convolution, respectively. The yellow arrow points from the initial sampling location to the corresponding final sampling location.

The computation involved in deformable convolution remains a form of two-dimensional convolution, with an emphasis on spatial interactions across all channels. The fundamental aspect of this method lies in learning the offsets of sampling points via a parallel branch network, allowing the convolutional kernel to dynamically adjust its sampling locations based on the content of the input feature map. This mechanism directs convolutional operations to concentrate on regions of interest, substantially enhancing the network's capacity to represent features associated with geometric transformations.

In this study, a standard 3×3 two-dimensional convolutional kernel, denoted as R, is employed as an illustrative example.

\begin{matrix} (1) & R = \{ (- 1, - 1), (- 1, 0), \dots, (0, 1), (1, 1) \} \end{matrix}

In conventional convolutional kernels, the weight matrix is denoted by w, the input feature map by x, and p_n represents any pixel within the convolutional window R. For each output position p₀ in the feature map, the convolution operation can be mathematically expressed as follows:

\begin{matrix} (2) & y (p_{0}) = \sum_{p_{n} \in R} w (p_{n}) x (p_{0} + p_{n}) \end{matrix}

In the context of deformable convolution, the regular grid R is augmented with offsets $\{ Δ p_{n} \mid n = 1, \dots, N \}$, where $N = | R |$, which transforms the original formulation into:

\begin{matrix} (3) & y (p_{0}) = \sum_{p_{n} \in R} w (p_{n}) x (p_{0} + p_{n} + Δ p_{n}) \end{matrix}

This adjustment results in sampling points that are spatially shifted, with the offset positions denoted as p_n+Δp_n. Since the offset Δp_n generally assumes non-integer values, the computation of the convolution must be performed using bilinear interpolation, as described by:

\begin{matrix} (4) & x (p) = \sum_{q} G (q, p) x (q) \end{matrix}

Here $p = p_{0} + p_{n} + Δ p_{n}$ denotes an arbitrary (generally fractional) sampling location, q enumerates all integer spatial locations in the input feature map x, and $G (\cdot, \cdot)$ is the bilinear interpolation kernel. Notably, the two-dimensional interpolation kernel is separable and can be decomposed into the product of two one-dimensional kernels, which improves computational efficiency:

\begin{matrix} (5) & G (q, p) = g (q_{x}, p_{x}) g (q_{y}, p_{y}) \end{matrix}

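where $g (a, b) = \max (0, 1 - | a - b |)$. Because g vanishes whenever $| a - b | \ge 1$, only the (at most four) grid points surrounding p contribute to the sum in Eq. (4).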
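As an illustration of Eqs. (3)–(5), the following is a minimal sketch in plain Python of a deformable 3×3 convolution evaluated at a single output position. It is not the authors' implementation: the function names, the toy feature map, the uniform weights, the hand-set offsets, and the zero-padding behaviour outside the map are all assumptions made for the example. In a real DCN the offsets are predicted by a parallel convolutional branch.

```python
import math


def g(a, b):
    """1-D bilinear kernel of Eq. (5): g(a, b) = max(0, 1 - |a - b|)."""
    return max(0.0, 1.0 - abs(a - b))


def bilinear_sample(x, py, px):
    """Eq. (4): x(p) = sum_q G(q, p) x(q), using only the four neighbours of p.

    Locations outside the feature map contribute zero (assumed zero padding).
    """
    height, width = len(x), len(x[0])
    y0, x0 = math.floor(py), math.floor(px)
    value = 0.0
    for qy in (y0, y0 + 1):
        for qx in (x0, x0 + 1):
            if 0 <= qy < height and 0 <= qx < width:
                value += g(qy, py) * g(qx, px) * x[qy][qx]
    return value


# Regular 3x3 sampling grid R of Eq. (1), stored as (dy, dx) pairs.
R = [(dy, dx) for dy in (-1, 0, 1) for dx in (-1, 0, 1)]


def deformable_conv_at(x, weights, offsets, p0):
    """Eq. (3): y(p0) = sum_n w(p_n) * x(p0 + p_n + delta_p_n)."""
    y0, x0 = p0
    total = 0.0
    for (dy, dx), w_n, (oy, ox) in zip(R, weights, offsets):
        total += w_n * bilinear_sample(x, y0 + dy + oy, x0 + dx + ox)
    return total


# Toy 5x5 feature map with values 5*row + col (hypothetical data).
feature_map = [[float(5 * r + c) for c in range(5)] for r in range(5)]
weights = [1.0 / 9.0] * 9                 # uniform averaging kernel
zero_offsets = [(0.0, 0.0)] * 9           # reduces to standard convolution, Eq. (2)
shifted_offsets = [(0.0, 0.5)] * 9        # every sampling point moved half a cell in x

standard = deformable_conv_at(feature_map, weights, zero_offsets, (2, 2))
deformed = deformable_conv_at(feature_map, weights, shifted_offsets, (2, 2))
print(standard, deformed)
```

With all offsets set to zero the function reproduces the standard convolution of Eq. (2); non-zero offsets move the sampling points off the integer grid, which is where the bilinear kernel becomes necessary.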

Figure 3 delineates the detailed implementation procedure of deformable convolutional layers. Initially, the learned offset vectors are applied to the fixed sampling grid of the input feature map, enabling adaptive adjustment of each sampling point's position. Subsequently, bilinear interpolation is utilized to estimate feature values at the offset, non-integer coordinate locations, thereby ensuring that the sampled feature distribution effectively concentrates on the target region. Figure 4 provides a comparative visualization between standard convolution and deformable convolution with respect to their receptive fields for geochemical pattern recognition. By incorporating offsets, the receptive field in deformable convolution transcends the constraints imposed by the fixed, regular grid of standard convolution. This flexibility allows the receptive field to adaptively assume irregular spatial configurations that better correspond to the actual geometric structure of the target object, thereby substantially enhancing the accuracy of feature extraction.

https://gmd.copernicus.org/articles/19/2219/2026/gmd-19-2219-2026-f03

Figure 3. The framework of the proposed geological knowledge guided deformable convolutional network.

https://gmd.copernicus.org/articles/19/2219/2026/gmd-19-2219-2026-f04

Figure 4. Illustration of the sampling locations for (a) normal convolution and (b) deformable convolution. The maps show irregular geochemical patterns. Deformable convolution adaptively extracts input features by shifting the sampling points of the kernel so that its shape follows the actual patterns, whereas normal convolution is restricted to a fixed receptive field.

3.2 Geologically-constrained DCN

This study introduces a soft constraint on deformable convolutional networks to enhance geochemical anomaly detection by incorporating geological prior knowledge. In typical geochemical anomaly recognition tasks, deformable convolutional networks optimize their parameters by minimizing the cross-entropy loss, which measures the divergence between predicted and true label distributions. To improve this optimization process, the present work augments the loss function with an additional penalty term derived from established geological principles, thereby guiding the model to learn feature representations that better conform to geological laws (Semenov et al., 2019).

The conventional loss function L_DCN is defined as follows:

\begin{matrix} (6) & L_{DCN} (p, \hat{p}) = - \sum_{x} p (x) \log \hat{p} (x) \end{matrix}

where p(x) and $\hat{p} (x)$ denote the true and predicted distributions, respectively. Building upon this, a novel penalty term grounded in geological knowledge is formulated and integrated into the loss function. Following the approach proposed by Zuo (2016) for penalty term construction, the relationship between the distance control factor and the spatial distribution of mineral deposits is modeled by a power-law function w, expressed as:

\begin{matrix} (7) & w = \frac{m}{m_{max}} = \frac{N d^{k}}{m_{max}} \end{matrix}

where N is a constant, d represents the distance between the control factor and the mineral deposit, m denotes the density of mineral deposits at distance d, m_max is the maximum deposit density, and k is the slope of the straight line fitted to log m versus log d. The function w serves as a control equation embedding prior geological knowledge to characterize the spatial coupling between known mineral occurrences and their controlling factors. As the DCN progressively learns the spatial distribution patterns between mineral deposits and their surrounding grid units, it becomes essential to extract the spatial structural features encapsulated by the weight function w and incorporate them into the training process. Consequently, a geology-informed penalty term L_geology is constructed, formulated as:

\begin{matrix} (8) & L_{geology} = \left\| f_{softmax} (\hat{p} (x)) - f_{sigmoid} \left( \sum_{i = 1}^{n} a w_{i} + b \right) \right\|_{2}, \end{matrix}

where a and b are trainable parameters within the combined kernel used for feature aggregation, with a representing weights and b denoting bias terms; n is the number of feature maps. The aggregated features are normalized by the function f_sigmoid, and the network output is transformed into mineral potential prediction values through the mapping function f_softmax.
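The power-law weight of Eq. (7) follows from a straight-line fit in log-log space, since log m = log N + k log d. The sketch below shows one way to do this in plain Python under stated assumptions: the distance and density values are synthetic numbers constructed to follow an exact power law, and the names `fit_power_law` and `fault_weight` are hypothetical, not taken from the paper.

```python
import math


def fit_power_law(distances, densities):
    """Least-squares line through (log d, log m); returns (N, k) for m = N * d**k."""
    xs = [math.log(d) for d in distances]
    ys = [math.log(m) for m in densities]
    count = len(xs)
    mean_x = sum(xs) / count
    mean_y = sum(ys) / count
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    k = sxy / sxx                          # slope of the log-log line
    big_n = math.exp(mean_y - k * mean_x)  # intercept gives the constant N
    return big_n, k


def fault_weight(d, big_n, k, m_max):
    """Eq. (7): w = N * d**k / m_max."""
    return big_n * d ** k / m_max


# Synthetic example (not data from the study): deposit density decaying with
# distance to the nearest ore-controlling fault as m = 8 * d**(-1.5).
distances = [1.0, 2.0, 4.0, 8.0]
densities = [8.0 * d ** -1.5 for d in distances]

big_n, k = fit_power_law(distances, densities)
m_max = max(densities)
weights = [fault_weight(d, big_n, k, m_max) for d in distances]
print(big_n, k, weights)
```

A negative fitted k means the weight decays with distance from the fault, so grid cells close to ore-controlling faults receive values of w near 1 and distant cells receive values near 0.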

Finally, a comprehensive loss function L_total is developed for the deformable convolutional network, integrating prior geological knowledge, and is expressed as:

\begin{matrix} (9) & L_{total} = L_{DCN} + λ L_{geology} . \end{matrix}

This formulation constrains the model to produce predictions that are consistent with both data-driven learning and established geological understanding (Fig. 3). The regularization parameter λ has a crucial effect on training and prediction performance because it controls the balance between the data loss and the geological constraint loss. A larger value of λ places more weight on the geological penalty term, typically at the expense of a higher data (cross-entropy) loss, and vice versa. Specifically, a larger value of λ gives the model a better chance of learning geology-related patterns, while a smaller value of λ gives it a better chance of learning general patterns from the data. Thus, the regularization parameter λ should be carefully chosen (Xiong et al., 2022). To balance the data loss and the geological constraint loss, λ is set to 1 in this study.
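Equations (6), (8), and (9) can be combined as in the following sketch, written in plain Python for a handful of grid cells. Several simplifications are assumptions made for illustration only: each cell has two classes (0 = background, 1 = anomaly), a single fault-derived weight w is used per cell (n = 1 in Eq. 8), the cross-entropy is averaged over cells, and a and b are fixed numbers rather than parameters updated during training. All names and values are hypothetical.

```python
import math


def softmax(logits):
    """Numerically stable softmax over one cell's class logits."""
    peak = max(logits)
    exps = [math.exp(z - peak) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def cross_entropy(labels, logits_per_cell):
    """Eq. (6) with one-hot labels, averaged over cells."""
    total = 0.0
    for label, logits in zip(labels, logits_per_cell):
        total -= math.log(softmax(logits)[label])
    return total / len(labels)


def geology_penalty(logits_per_cell, fault_weights, a, b):
    """Eq. (8) with n = 1: L2 norm of softmax anomaly probability minus sigmoid(a*w + b)."""
    squared = 0.0
    for logits, w in zip(logits_per_cell, fault_weights):
        diff = softmax(logits)[1] - sigmoid(a * w + b)
        squared += diff * diff
    return math.sqrt(squared)


def total_loss(labels, logits_per_cell, fault_weights, a, b, lam=1.0):
    """Eq. (9): L_total = L_DCN + lambda * L_geology."""
    data_term = cross_entropy(labels, logits_per_cell)
    geo_term = geology_penalty(logits_per_cell, fault_weights, a, b)
    return data_term + lam * geo_term


# Hypothetical mini-batch: two grid cells, one near a fault and one far away.
labels = [1, 0]
logits = [[0.2, 1.5], [1.0, -0.5]]
fault_weights = [0.9, 0.1]   # w from Eq. (7)
a, b = 4.0, -2.0             # illustrative values, not fitted

loss_lam1 = total_loss(labels, logits, fault_weights, a, b, lam=1.0)
loss_lam0 = total_loss(labels, logits, fault_weights, a, b, lam=0.0)
print(loss_lam1, loss_lam0)
```

Setting `lam=0.0` recovers the purely data-driven loss, which makes it straightforward to compare a plain DCN against the geologically constrained variant under otherwise identical settings.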

3.3 Gradient-weighted Class Activation Mapping

To improve the interpretability and discriminative localization capabilities of deformable convolutional networks, class activation mapping (CAM) techniques are integrated into the model. The conventional CAM approach leverages the weights from the global average pooling (GAP) layer and the final classification layer to visualize the discriminative regions utilized by the CNN during classification. By projecting the output layer's weights back onto the convolutional feature maps, the relative importance of different image regions is identified.

First, all fully connected layers following the last convolutional block are removed, as CAM requires a fully convolutional architecture to maintain spatial information up to the final layer. A GAP layer is then placed after the last deformable convolutional layer to replace the fully connected layers (Jung and Oh, 2021). This GAP layer computes the spatial average value F_k of each feature map in the final convolutional layer, which can be mathematically expressed as:

\begin{matrix} (10) & F_{k} = \frac{1}{X \cdot Y} \sum_{x = 1}^{X} \sum_{y = 1}^{Y} f_{k} (x, y), \end{matrix}

where f_k(x,y) denotes the activation at spatial location (x,y) in the kth channel of the feature map output by the last deformable convolutional layer, and X and Y represent the width and height of the feature map, respectively.

Following the GAP layer, a single fully connected layer with a softmax activation function is appended. For a given class c, this layer assigns a weight $w_{k}^{c}$ to each averaged feature map value F_k. The linear classification logit score S_c for class c is then computed as:

\begin{matrix} (11) & S_{c} = \sum_{k} w_{k}^{c} F_{k} = \frac{1}{X \cdot Y} \sum_{x, y} \sum_{k} w_{k}^{c} f_{k} (x, y), \end{matrix}

where S_c is a scalar representing the classification score. To generate the class activation map, each weight $w_{k}^{c}$ multiplies the corresponding feature map f_k(x,y), and the weighted maps are summed across all channels:

\begin{matrix} (12) & M_{c} (x, y) = \sum_{k} w_{k}^{c} f_{k} (x, y) . \end{matrix}

This operation preserves spatial information along the width and height dimensions. Subsequently, bilinear interpolation is applied to upsample the matrix M_c to the original input image size, thereby producing the complete CAM visualization.
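Equations (10)–(12) can be traced on a toy example. The sketch below, in plain Python, uses two hypothetical 2×2 feature maps and hand-picked class weights (none of these values come from the study) and omits the final bilinear upsampling step. It also checks the identity implied by Eq. (11): the class score S_c equals the spatial mean of the class activation map M_c.

```python
def global_average_pool(feature_maps):
    """Eq. (10): F_k is the spatial mean of the k-th feature map."""
    return [
        sum(sum(row) for row in fmap) / (len(fmap) * len(fmap[0]))
        for fmap in feature_maps
    ]


def class_score(feature_maps, class_weights):
    """Eq. (11): S_c = sum_k w_k^c * F_k."""
    pooled = global_average_pool(feature_maps)
    return sum(w * f for w, f in zip(class_weights, pooled))


def class_activation_map(feature_maps, class_weights):
    """Eq. (12): M_c(x, y) = sum_k w_k^c * f_k(x, y)."""
    height, width = len(feature_maps[0]), len(feature_maps[0][0])
    cam = [[0.0] * width for _ in range(height)]
    for w, fmap in zip(class_weights, feature_maps):
        for i in range(height):
            for j in range(width):
                cam[i][j] += w * fmap[i][j]
    return cam


# Hypothetical activations of the last deformable convolutional layer (2 channels).
feature_maps = [
    [[1.0, 0.0], [0.0, 0.0]],   # channel 1 responds in the top-left cell
    [[0.0, 0.0], [0.0, 2.0]],   # channel 2 responds in the bottom-right cell
]
class_weights = [0.5, -1.0]     # w_k^c for the class of interest (illustrative)

cam = class_activation_map(feature_maps, class_weights)
score = class_score(feature_maps, class_weights)
cam_mean = sum(sum(row) for row in cam) / (len(cam) * len(cam[0]))
print(cam, score, cam_mean)
```

In this example the top-left cell supports the class (positive contribution) while the bottom-right cell argues against it, which is the kind of spatial attribution the heatmap is meant to expose.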

In summary, each feature map channel corresponds to a specific class of visual features extracted by a convolutional kernel from the input image. The weights $w_{k}^{c}$ implicitly indicate the significance of these features for the classification of category c, reflecting the degree of attention that the model allocates to each feature with respect to that class.

However, the CAM technique requires replacing the fully connected layers with a GAP layer and can analyse only the final convolutional layer. To overcome these constraints, we adopted the Gradient-weighted CAM (Grad-CAM) approach, which derives the required weights from gradients rather than from a GAP layer and softmax classifier (Selvaraju et al., 2016). Consequently, it can generate class-specific activation heatmaps for convolutional layers at various depths within the network.

The Grad-CAM algorithm computes the gradient of the target score, which typically corresponds to the class of interest, with respect to the feature maps of a selected convolutional layer. From these gradients, the importance weight $α_{k}^{c}$ for each channel k is obtained as follows:

\begin{matrix} (13) & α_{k}^{c} = \frac{1}{Z} \sum_{i = 1}^{u} \sum_{j = 1}^{v} \frac{\partial y^{c}}{\partial A_{i j}^{k}}, \end{matrix}

where c denotes the target class, $α_{k}^{c}$ represents the weight of the kth channel for class c, and y^c is the linear classification logit score for class c. The partial derivative $\frac{\partial y^{c}}{\partial A_{i j}^{k}}$ is the sensitivity of the output score y^c to the activation at spatial location (i,j) in the kth feature map A^k. u and v indicate the width and height of the feature map, respectively, and Z = u · v is the number of spatial locations used for normalization.

Mathematically, the weight $α_{k}^{c}$ serves a role analogous to the weight $w_{k}^{c}$ in the original CAM formulation. By linearly combining these weights with the corresponding feature maps, the Grad-CAM map $L_{Grad-CAM}^{c}$ can be computed as follows:

\begin{matrix} (14) & L_{Grad-CAM}^{c} = ReLU (\sum_{k} α_{k}^{c} A^{k}) . \end{matrix}

The ReLU function ensures that only features exerting a positive influence on class c are retained. Finally, the resulting heatmap $L_{Grad-CAM}^{c}$ is upsampled to the input image dimensions using bilinear interpolation, thereby facilitating effective visualization of the class-discriminative regions (Fig. 5).
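
The weighting and ReLU steps of Eqs. (13) and (14) can be sketched as follows. The function `grad_cam`, the random activations, and the GAP-plus-linear head used to produce analytic gradients are illustrative assumptions; in practice the gradients would come from backpropagation through the trained network.

```python
import numpy as np

def grad_cam(activations, gradients):
    """Grad-CAM map from activations A^k and gradients dy^c/dA^k, both shaped (K, u, v)."""
    Z = activations.shape[1] * activations.shape[2]
    alpha = gradients.sum(axis=(1, 2)) / Z                        # Eq. (13)
    weighted = np.tensordot(alpha, activations, axes=([0], [0]))  # sum_k alpha_k^c A^k
    return np.maximum(weighted, 0.0)                              # Eq. (14): ReLU

rng = np.random.default_rng(1)
K, u, v = 3, 4, 4
A = rng.normal(size=(K, u, v))
w_c = rng.normal(size=K)   # hypothetical class-c weights of a GAP + linear head

# For that head, y^c = sum_k w_k^c * mean(A^k), so dy^c/dA^k_ij = w_k^c / (u * v).
grads = np.broadcast_to((w_c / (u * v))[:, None, None], A.shape)

heatmap = grad_cam(A, grads)
cam = np.tensordot(w_c, A, axes=([0], [0]))   # CAM map of Eq. (12) for the same head
```

For this special case the Grad-CAM map equals the rectified CAM map divided by Z, which illustrates why $α_{k}^{c}$ plays the role of $w_{k}^{c}$.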

https://gmd.copernicus.org/articles/19/2219/2026/gmd-19-2219-2026-f05

Figure 5. The workflow diagram for obtaining Grad-CAM maps from convolutional neural networks and deformable convolutional networks.


4 Results and Discussions

The workflow begins with preprocessing of the geochemical data. Each of the 39 elements is interpolated onto a 1 km × 1 km grid using kriging. Small cubes are then cropped from the resulting 3D grid and fed into the GK-DCN for feature extraction and anomaly recognition. As a supervised algorithm, the GK-DCN requires a dataset labelled with known anomalies (positive samples) and background (negative samples). Geochemical anomalies that deviate from regional patterns are key indicators of mineral deposits (Cheng, 2012). To model these anomalies, favorable areas were defined as 3×3 grid blocks centered on known Au-Cu deposits. From each central grid, a 9×9 cell patch was extracted, generating 84 positive samples representing mineralized areas. Negative samples were selected according to the following criteria (Carranza et al., 2008; Nykänen et al., 2015): (i) random non-deposit locations should be distant from known mineral deposit positions and favorable ore-controlling factors (e.g., faults); (ii) the number of selected non-deposit sites should match the number of known mineral deposit sites. A similar strategy was used for negative sample augmentation, generating 84 patches. The dataset was split 8:2 for training and validation, resulting in a final training data cube of dimensions $134 \times 9 \times 9 \times 39$ (67 patches per class).
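
The patch extraction and the 8:2 split can be sketched as below. The synthetic grid, the random sample centres, the helper `extract_patch`, and the per-class (stratified) split are assumptions for illustration; only the patch size, the 39 channels, and the sample counts come from the text.

```python
import numpy as np

def extract_patch(grid, row, col, size=9):
    """Crop a size-by-size window centred on (row, col) from an (H, W, E) grid."""
    half = size // 2
    return grid[row - half:row + half + 1, col - half:col + half + 1, :]

rng = np.random.default_rng(0)
grid = rng.random((60, 60, 39))   # synthetic stand-in for the interpolated 39-element grid

# Hypothetical sample centres; real ones would be deposit and non-deposit locations.
centres = rng.integers(4, 56, size=(168, 2))
labels = np.array([1] * 84 + [0] * 84)   # 84 positive and 84 negative samples

patches = np.stack([extract_patch(grid, r, c) for r, c in centres])

# Assumed stratified 8:2 split: 67 of the 84 patches in each class go to training.
train_idx = np.concatenate([
    rng.permutation(np.flatnonzero(labels == cls))[:67] for cls in (0, 1)
])
X_train, y_train = patches[train_idx], labels[train_idx]
```

With these counts, `X_train` has the shape 134 × 9 × 9 × 39 quoted above.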

4.1 Recognizing geochemical anomalies by GK-DCN

DCN and CNN exhibit significant differences in the extraction of geochemical anomalies. By introducing deformable convolution modules, DCN gains the ability to adaptively adjust the shape and size of receptive fields. Through the incorporation of learnable offset parameters, the convolutional kernels of DCN can dynamically deform based on the characteristics of the input data, learning the complex spatial distribution and structural features of geochemical elements. This allows the model to actively focus on the spatial anisotropy of geochemical anomalies, effectively capturing irregular anomaly patterns controlled by geological factors such as faults. The extracted anomaly boundaries show higher consistency with known ore-forming geological bodies and exhibit stronger spatial continuity (Fig. 6a). In contrast, CNN is constrained by its fixed geometric structure, leading to insufficient responsiveness to irregular boundaries. Its extraction results tend to be overly smooth, with significant loss of anomaly information (Fig. 6b). Comparative results demonstrate that DCN holds clear advantages in improving the spatial positioning accuracy of anomalies and their relevance to geological factors, providing more reliable geochemical indicators for mineral exploration. In summary, DCN significantly enhances the ability to represent the nonlinear and anisotropic characteristics of geochemical spatial distributions through its adaptive mechanism.

https://gmd.copernicus.org/articles/19/2219/2026/gmd-19-2219-2026-f06

Figure 6. Geochemical anomalies associated with mineralization obtained by (a) DCN and (b) CNN.

Reflecting the geological setting in which faults provided fluid pathways and deposition sites for Au-Cu mineralization, the quantified spatial relationship between ore-controlling faults and known deposits (Fig. 7) was incorporated into the loss functions of the DCN and CNN (Fig. 3). A non-linear controlling function between prospectivity density ρ and distance d was fitted: $ρ = 3.28 d^{- 0.53}$, where d is the distance from faults and ρ was normalized to build a geologically constrained loss term. Using geological constraints constructed from prior knowledge of fault-related mineralization to guide the training of both DCN and CNN yields the GK-DCN and GK-CNN models. These models not only learn the spatial distribution patterns and combinatorial relationships of geochemical elements but also account for the geological background, which suppresses background and noise interference unrelated to mineralization. The results show that, compared to traditional methods, the anomalies extracted by the geologically constrained models exhibit higher spatial structural consistency with known mineralized fault structures, and the anomaly concentration centers are more prominent (Fig. 8). This approach significantly reduces the multiplicity of solutions in anomaly recognition and enhances the reliability and geological interpretability of the anomaly results.
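
The fitted relation can be explored with a short sketch. The coefficients are those reported above; the distance values (and their units) are hypothetical, and the min-max scaling is only one possible reading of "normalized", since the exact normalization is not stated here.

```python
import numpy as np

a, b = 3.28, -0.53                         # coefficients reported in the text

d = np.array([0.5, 1.0, 2.0, 4.0, 8.0])    # hypothetical distances from faults
rho = a * d ** b                           # rho = 3.28 * d^(-0.53)

# A power law is a straight line in log-log space: log(rho) = log(a) + b * log(d).
slope, intercept = np.polyfit(np.log(d), np.log(rho), deg=1)

# Assumed min-max scaling to [0, 1] before use in a loss term.
rho_norm = (rho - rho.min()) / (rho.max() - rho.min())
```

The linear fit in log-log space recovers the exponent and prefactor, which is the relation displayed in the log-log plot of Fig. 7a.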

https://gmd.copernicus.org/articles/19/2219/2026/gmd-19-2219-2026-f07

Figure 7. (a) Log–log plots between the density of mineral deposits ρ and the distance from faults. (b) Density value for faults in the case area.

https://gmd.copernicus.org/articles/19/2219/2026/gmd-19-2219-2026-f08

Figure 8. Geochemical anomalies associated with mineralization obtained by (a) GK-DCN and (b) GK-CNN.

To compare the feature extraction capabilities of CNN and DCN in the identification of geochemical anomalies, this study visualizes the offsets and employs Grad-CAM technology to visualize the spatial features learned by both types of models. These features are compared with the geochemical patterns, which are obtained by integrating multiple geochemical variables via Robust Principal Component Analysis (RPCA). PC1 vs. PC2 plots for the 39 elements (Fig. 9a) reveal two distinct compositional assemblages. The assemblage characterized by positive loadings of PC2 (Au, Cu, As, Hg, Bi, Mo, W, Co, Pb, Zn and Ni) (Fig. 9a) corresponds to Au–Cu mineralization in the region. The spatial distribution of PC2 scores (Fig. 9b) shows that high values, associated with this mineralization-related assemblage, correlate with areas of Au–Cu mineralization.

https://gmd.copernicus.org/articles/19/2219/2026/gmd-19-2219-2026-f09

Figure 9. (a) Biplots of PC1 and PC2 obtained by the RPCA method. (b) Map showing the spatial distribution of the PC2 score related to mineralization.

As mentioned above, offsets are the core idea of deformable convolution. By introducing a parallel “offset prediction” branch, the network learns the shape and size of the receptive field on its own. For each sampling point of a standard convolution, the network additionally learns two values (Δx, Δy), representing its offsets in the x and y directions. The actual sampling positions are therefore no longer regular grid points but the original locations plus the predicted offsets. These new positions may follow the actual contours of the target object, thereby capturing more precise features. Figure 10 illustrates the geochemical patterns corresponding to ten mineral deposits clipped from PC2 score maps, together with the direction and magnitude of the offsets learned by the DCN for these ten irregular spatial patterns. For such patterns, the offsets pull the originally regular sampling points of the convolution kernel to more effective locations. The arrows pointing from the original grid points to the new sampling points represent the direction and magnitude of the offsets. Both indicate that, during training, the actual sampling positions of the deformable convolution shift markedly toward areas with higher concentrations of geochemical elements. This suggests that the network adapts well to the quantification and extraction of irregular geochemical spatial patterns.
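
The sampling mechanism described above can be sketched for a single 3×3 kernel at one output location. The helper names, the toy image, the uniform weights, and the fixed offsets are hypothetical; in a trained DCN the offsets are predicted by a convolutional branch, and this sketch stores each offset as a (dy, dx) pair.

```python
import numpy as np

def bilinear(img, y, x):
    """Bilinearly interpolate a 2D array at fractional (y, x); outside the array counts as 0."""
    y0, x0 = int(np.floor(y)), int(np.floor(x))
    value = 0.0
    for yi in (y0, y0 + 1):
        for xi in (x0, x0 + 1):
            if 0 <= yi < img.shape[0] and 0 <= xi < img.shape[1]:
                value += img[yi, xi] * (1 - abs(y - yi)) * (1 - abs(x - xi))
    return value

def deformable_response(img, weights, centre, offsets):
    """3x3 deformable convolution response at one location; offsets has shape (9, 2)."""
    cy, cx = centre
    grid = [(dy, dx) for dy in (-1, 0, 1) for dx in (-1, 0, 1)]   # regular 3x3 grid
    out = 0.0
    for (gy, gx), (oy, ox), w in zip(grid, offsets, weights.ravel()):
        # Sampling position = regular grid point + learned offset.
        out += w * bilinear(img, cy + gy + oy, cx + gx + ox)
    return out

img = np.arange(49, dtype=float).reshape(7, 7)   # toy image with img[y, x] = 7*y + x
weights = np.full((3, 3), 1.0 / 9.0)             # averaging kernel

zero = np.zeros((9, 2))                          # no offsets: standard convolution
shifted = np.tile([0.0, 0.5], (9, 1))            # every point moved half a cell in x

plain = deformable_response(img, weights, (3, 3), zero)
moved = deformable_response(img, weights, (3, 3), shifted)
```

With zero offsets the response reduces to an ordinary 3×3 convolution; non-zero offsets move the sampling points off the grid, and bilinear interpolation supplies values at the fractional positions.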

https://gmd.copernicus.org/articles/19/2219/2026/gmd-19-2219-2026-f10

Figure 10. Comparison of offset direction and magnitude maps obtained by GK-CNN and GK-DCN with the geochemical patterns of ten mineral deposits clipped from PC2 score maps in this study. Yellow in the maps represents high concentration and high offset magnitude. The longer the arrow in the offset direction maps, the greater the offset.


CAM is a visualization technique used to reveal the image regions that DCN and CNN focus on when making decisions. It generates a heatmap as a weighted sum of the feature maps of the last convolutional layer, in which bright areas indicate regions critical for predicting a specific class. Its limitation is that the network architecture must include a global average pooling layer. Grad-CAM generalizes CAM and removes this structural constraint by computing the gradients of the target class score with respect to the feature maps of the last convolutional layer to obtain the weights, generating a heatmap that localizes key regions of the image. This heatmap shows which features the model focuses on when making predictions, making it possible to understand the basis of the model's decisions and to verify whether it focuses on reasonable features. Figure 11 displays the geochemical patterns corresponding to ten mineral deposits clipped from PC2 score maps, along with the Grad-CAM maps generated by the GK-CNN and GK-DCN models. As can be seen, GK-DCN, with its ability to adaptively adjust receptive fields, generates Grad-CAM maps that align more closely with the actual geochemical spatial patterns. This indicates that adjusting the sampling locations of the convolution through learned offsets allows the deformable network to capture complex and irregular geochemical patterns effectively. Consequently, the deformable network demonstrates greater flexibility and accuracy in identifying geochemical spatial patterns. Its heatmaps clearly outline the spatially anisotropic distribution of geochemical fields, exhibit higher spatial coupling with the actual geochemical patterns, and enhance the interpretability of model decisions (Fig. 11).

https://gmd.copernicus.org/articles/19/2219/2026/gmd-19-2219-2026-f11

Figure 11. Comparison of Grad-CAM maps obtained by GK-CNN and GK-DCN with the geochemical patterns of ten mineral deposits clipped from PC2 score maps in this study. Yellow in the geochemical patterns of mineral deposits represents high concentration. The red highlighted regions in the Grad-CAM maps are the parts to which the models give more weight and which contribute more to the final classification. The black crosses represent the known deposits.


4.2 Comparative experiments

This section assesses the performance of our proposed model using seven metrics – Accuracy (ACC), Area Under the Curve (AUC), Kappa, Matthews Correlation Coefficient (MCC), Precision, Recall, and F₁, and compares it with models that are either non-geologically constrained or do not employ deformable convolution operation. The aim is to identify and interpret the performance differences (Chicco and Jurman, 2020; Powers, 2020). The metrics are defined as follows:

\begin{matrix} (15) & ACC = \frac{TP + TN}{TP + FP + TN + FN}, \\ (16) & \begin{aligned} Kappa &= \frac{ACC - P_{e}}{1 - P_{e}}; \\ P_{e} &= \frac{\frac{n}{2} (TP + FP) + \frac{n}{2} (FN + TN)}{n^{2}}, \end{aligned} \\ (17) & MCC = \frac{TP \times TN - FP \times FN}{\sqrt{(TP + FP) \times (TP + FN) \times (TN + FP) \times (TN + FN)}}, \\ (18) & Precision = \frac{TP}{TP + FP}, \\ (19) & Recall = \frac{TP}{TP + FN}, \\ (20) & F_{1} = \frac{2 \times TP}{2 \times TP + FP + FN} . \end{matrix}

Here, true positives (TP), true negatives (TN), false positives (FP), and false negatives (FN) count the samples for which the classifier's prediction (positive or negative) agrees (true) or disagrees (false) with the actual label, and n denotes the total number of samples. The factor n/2 in P_e reflects the equal numbers of positive and negative samples. The AUC corresponds to the area under the Receiver Operating Characteristic (ROC) curve, expressed as a proportion of the total area of the unit square. The ROC curve plots the true positive rate (TPR, or sensitivity) against the false positive rate (FPR, or 1 − specificity) (Fawcett, 2006).
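
The count-based metrics of Eqs. (15) to (20) can be computed directly from a confusion matrix, as in the sketch below. The function name and the confusion counts are hypothetical and are not results from this study; P_e is written for a balanced sample set, with the last term taken as FN + TN.

```python
import math

def binary_metrics(tp, fp, tn, fn):
    """Return ACC, Kappa, MCC, Precision, Recall and F1 from confusion counts."""
    n = tp + fp + tn + fn
    acc = (tp + tn) / n
    # Chance agreement for a balanced set (n/2 positives and n/2 negatives).
    p_e = ((n / 2) * (tp + fp) + (n / 2) * (fn + tn)) / n ** 2
    kappa = (acc - p_e) / (1 - p_e)
    mcc = (tp * tn - fp * fn) / math.sqrt(
        (tp + fp) * (tp + fn) * (tn + fp) * (tn + fn)
    )
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    f1 = 2 * tp / (2 * tp + fp + fn)
    return {"acc": acc, "kappa": kappa, "mcc": mcc,
            "precision": precision, "recall": recall, "f1": f1}

# Hypothetical balanced test fold: 17 positives and 17 negatives.
m = binary_metrics(tp=15, fp=2, tn=15, fn=2)
```

Because the two bracketed sums in P_e add up to n, this balanced form always gives P_e = 0.5, so Kappa reduces to 2 × ACC − 1.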

The augmented training set (134 samples) may be insufficient to support the generalization capability of the DCN and CNN given the high-dimensional input (39 channels), which increases the risk of overfitting. To verify whether these training data are sufficient to train the models without overfitting, we employed five-fold cross-validation (Kim et al., 2003; Lessmann et al., 2008). The dataset was randomly partitioned into five equally sized subsets. In each iteration, one subset served as the testing set, while the remaining four were used for training. This process was repeated five times so that every subset was used exactly once for testing. The final performance was evaluated by averaging the results from all five iterations. The results show that the predictive performance of the models on the testing sets remains consistently high across all folds, with minimal fluctuations (Fig. 12). This indicates that the models are well trained without overfitting and that the data are effective and sufficient for model training.
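
The partitioning scheme can be sketched as follows. The generator name, the random seed, and the sample count of 168 (the 84 + 84 patches described earlier) are assumptions; the text does not state which subset of the data was cross-validated, and 168 samples yield folds of nearly, not exactly, equal size.

```python
import numpy as np

def five_fold_indices(n_samples, n_folds=5, seed=0):
    """Yield (train_idx, test_idx) pairs; every sample is tested exactly once."""
    rng = np.random.default_rng(seed)
    folds = np.array_split(rng.permutation(n_samples), n_folds)
    for i, test_idx in enumerate(folds):
        # The remaining four folds form the training set for this iteration.
        train_idx = np.concatenate([f for j, f in enumerate(folds) if j != i])
        yield train_idx, test_idx

splits = list(five_fold_indices(168))
fold_sizes = [len(test_idx) for _, test_idx in splits]
```

A model would be trained on each `train_idx` and scored on the matching `test_idx`, and the five scores would then be averaged.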

https://gmd.copernicus.org/articles/19/2219/2026/gmd-19-2219-2026-f12

Figure 12. Results of five-fold cross-validation for models CNN, DCN, GK-CNN, and GK-DCN across different performance metrics.


Figure 13 compares the performance of CNN and DCN in geochemical anomaly recognition based on seven performance metrics. The radar chart shows that the DCN outperforms the standard CNN on the vast majority of metrics, demonstrating superior overall performance. In terms of recognition accuracy and reliability, its higher accuracy and precision indicate better overall prediction correctness and greater certainty in positive class predictions. In terms of discriminative ability and error control, its larger AUC indicates a stronger ability to distinguish between positive and negative samples. Additionally, DCN's lower false positive rate (FPR) means fewer false alarms in which background samples are misclassified as anomalies, which is crucial in practical applications emphasizing safety and efficiency. In summary, owing to its deformable convolutional structure, DCN can adaptively adjust the receptive field and more accurately capture the irregular and complex spatial features of anomalies. This enables it to outperform the traditional CNN across most metrics, particularly in reducing missed detections (high recall) and lowering false alarms (low FPR), and demonstrates DCN's stronger applicability and robustness for complex anomaly recognition tasks.

https://gmd.copernicus.org/articles/19/2219/2026/gmd-19-2219-2026-f13

Figure 13. Evaluation of model performance between CNN and DCN in Accuracy, Precision, Recall, F₁-score, AUC, Kappa, and MCC.


The model must prioritize not only accuracy but also geological consistency. The radar plot in Fig. 14 compares the geologically constrained counterparts, GK-CNN and GK-DCN. While all models demonstrate excellent predictive capabilities, GK-CNN and GK-DCN, which incorporate geological knowledge through the loss function, outperformed the unconstrained CNN and DCN. This is evident in key metrics, where the knowledge-enhanced models achieved higher performance while integrating geological constraints. The experimental results demonstrate that incorporating geological knowledge as a physics-based regularization term within the loss function significantly boosts pattern recognition performance and model interpretability. The geologically constrained models identify potential mineral deposits more effectively because training is guided to recognize anomalies associated with ore-controlling faults, which enhances learning and generalization capabilities.

https://gmd.copernicus.org/articles/19/2219/2026/gmd-19-2219-2026-f14

Figure 14. Evaluation of model performance between GK-CNN and GK-DCN in Accuracy, Precision, Recall, F₁-score, AUC, Kappa, and MCC.


A primary limitation of this work lies in the interpretability and completeness of the incorporated geological knowledge. The current model uses observed ore-controlling faults as a geological constraint within the loss function. This approach, while enhancing the model's focus on anomalies related to observable structures, inherently introduces a significant bias: it assumes that the spatial patterns of geochemical anomalies are predominantly controlled by these mapped faults. Consequently, in exploration scenarios for covered areas or deep-seated mineral deposits, such as in the southern Tianshan Au-Cu polymetallic ore district, the model's performance is constrained. It may suppress or fail to recognize valid geochemical anomalies that are spatially associated with blind or concealed faults, which are not observable at the surface but are equally crucial ore-controlling factors. This limits the model's generalizability and predictive power in covered regions where observed structures are not fully indicative of subsurface controls.

5 Conclusions

This study introduces deformable convolutional neural networks into the field of geochemical anomaly identification to address the difficulty of capturing irregularly shaped anomalies within complex geological backgrounds. The adaptive receptive field adjustment capability of deformable convolution units enables more precise capture of the spatial distribution characteristics of geochemical anomalies in complex geological settings. This enhances the model's ability to learn and represent geochemical spatial distribution features, thereby achieving superior anomaly identification results. Experimental results demonstrate that, compared to conventional convolutional neural networks, this method significantly improves accuracy and spatial continuity in anomaly identification, allowing more effective separation of mineralization-related anomalous information from high-dimensional, nonlinear geochemical data.

Prior knowledge of ore-controlling faults is incorporated into the model's loss function as a constraint. The fault-constrained loss function effectively guides the network's learning process, resulting in the identified geochemical anomalies exhibiting higher spatial alignment with known ore-controlling faults. This enhances the geological significance of the anomalies, reduces interference from the geochemical background field, and improves the accuracy of anomaly identification.

The interpretability of the model is further examined through visualizations of the learned offsets and Gradient-weighted Class Activation Mapping. First, the visualization of the offset fields learned by the deformable convolution kernels clearly reveals the network's adaptive receptive field adjustment behavior. The learned offset vectors point to key anomalous spatial structures and irregular trends in the geological mineralization process, serving as important quantitative indicators of anomaly irregularity. Second, Gradient-weighted Class Activation Mapping shows the key regions on which the model focused during decision-making. The highlighted areas in the heatmap show strong overlap with known mineral deposits and high anomaly zones, providing evidence about the “black-box” decision-making process and demonstrating the model's focus on geochemical features related to mineralization. In summary, this study not only validates the effectiveness of combining deformable convolution with geological prior knowledge in geochemical anomaly identification but also provides a window into the model's decision-making process through offset and Grad-CAM visualizations, significantly enhancing the accuracy and interpretability of AI models in geochemical data processing. This method offers a new tool for deep learning-driven geochemical data analysis and holds practical value for future geochemical exploration.

Code and data availability

The code and data used for geochemical anomaly pattern recognition based on the geological knowledge guided deformable convolution network are archived on Zenodo (https://doi.org/10.5281/zenodo.18454129, Aleksiyu, 2026; Zhang et al., 2026).

Author contributions

XZ: Conceptualization, Methodology, Writing – original draft. YX: Conceptualization, Resources, Methodology, Writing – original draft. ZC: Methodology.

Competing interests

The contact author has declared that none of the authors has any competing interests.

Disclaimer

Publisher's note: Copernicus Publications remains neutral with regard to jurisdictional claims made in the text, published maps, institutional affiliations, or any other geographical representation in this paper. The authors bear the ultimate responsibility for providing appropriate place names. Views expressed in the text are those of the authors and do not necessarily reflect the views of the publisher.

Acknowledgements

We thank two anonymous reviewers for their valuable comments and suggestions, which improved this study.

Financial support

This research was supported by the Key Research and Development Program of Xinjiang Uygur Autonomous Region, China (grant no. 2024B03010-3), and the National Key Research and Development Program of China (grant nos. 22023YFC2906400, 22023YFC2906404).

Review statement

This paper was edited by Evangelos Moulas and reviewed by two anonymous referees.

References

Aleksiyu: Aleksiyu/Recognizing-geochemical-spatial-patterns-using-DCN-guided-with-geological-knowledge: v.1.1.1 (v.1.1.1), Zenodo [code], https://doi.org/10.5281/zenodo.18454129, 2026.

Carranza, E. J. M., Hale, M., and Faassen, C.: Selection of coherent deposit-type locations and their application in data-driven mineral prospectivity mapping, Ore Geol. Rev., 33, 536–558, https://doi.org/10.1016/j.oregeorev.2007.07.001, 2008.

Chen, L., Guan, Q., Feng, B., Yue, H., Wang, J., and Zhang, F.: A multi-convolutional autoencoder approach to multivariate geochemical anomaly recognition, Minerals, 9, 270, https://doi.org/10.3390/min9050270, 2019.

Chen, Y., Pirajno, F., Li, N., Guo, D., and Lai, Y.: Epithermal deposits in North Xinjiang, NW China, Int. J. Earth Sci., 101, 889–917, https://doi.org/10.1007/s00531-011-0689-4, 2012.

Chen, Z. and Zuo, R.: Geological-knowledge-guided graph self-supervised pretraining framework for identifying mineralization-related geochemical anomalies, Comput. Geosci., 199, 105913, https://doi.org/10.1016/j.cageo.2025.105913, 2025.

Cheng, Q.: Mapping singularities with stream sediment geochemical data for prediction of undiscovered mineral deposits in Gejiu, Yunnan Province, China, Ore Geol. Rev., 32, 314–324, https://doi.org/10.1016/j.oregeorev.2006.10.002, 2007.

Cheng, Q.: Singularity theory and methods for mapping geochemical anomalies caused by buried sources and for predicting undiscovered mineral deposits in covered areas, J. Geochem. Explor., 122, 55–70, https://doi.org/10.1016/j.gexplo.2012.07.007, 2012.

Cheng, Q. and Zhao, P.: Singularity theories and methods for characterizing mineralization processes and mapping geo-anomalies for mineral deposit prediction, Geosci. Front., 2, 67–79, https://doi.org/10.1016/j.gsf.2010.12.003, 2011.

Cheng, Q., Agterberg, F. P., and Ballantyne, S. B.: The separation of geochemical anomalies from background by fractal methods, J. Geochem. Explor., 51, 109–130, https://doi.org/10.1016/0375-6742(94)90013-2, 1994.

Cheng, Q., Xu, Y., and Grunsky, E.: Integrated spatial and spectrum method for geochemical anomaly separation, Nat. Resour. Res., 9, 43–52, https://doi.org/10.1023/A:1010109829861, 2000.

Cheng, Q., Bonham-Carter, G., Wang, W., Zhang, S., Li, W., and Xia, Q.: A spatially weighted principal component analysis for multi-element geochemical data for mapping locations of felsic intrusions in the Gejiu mineral district of Yunnan, China, Comput. Geosci., 37, 662–669, https://doi.org/10.1016/j.cageo.2010.11.001, 2011.

Chicco, D. and Jurman, G.: The advantages of the Matthews correlation coefficient (MCC) over F1 score and accuracy in binary classification evaluation, BMC Genom., 21, 6, https://doi.org/10.1186/s12864-019-6413-7, 2020.

Dai, J., Qi, H., Xiong, Y., Li, Y., Zhang, G., Hu, H., and Wei, Y.: Deformable convolutional networks, in: Proceedings of the IEEE International Conference on Computer Vision, 764–773, https://doi.org/10.48550/arXiv.1703.06211, 2017.

Fawcett, T.: An introduction to ROC analysis, Pattern Recogn. Lett., 27, 861–874, https://doi.org/10.1016/j.patrec.2005.10.010, 2006.

Gao, J., Long, L., Klemd, R., Qian, Q., Liu, D., Xiong, X., Su, W., Liu, W., Wang, Y., and Yang, F.: Tectonic evolution of the South Tianshan orogen and adjacent regions, NW China: geochemical and age constraints of granitoid rocks, Int. J. Earth Sci., 98, 1221–1238, https://doi.org/10.1007/s00531-008-0370-8, 2009.

Gao, L., Hong, H., Liang, D., and Min, F.: Surface wave suppression through deformable convolutional wavelet transform network with residual dense blocks, Acta Geophys., 72, 4151–4167, https://doi.org/10.1007/s11600-024-01339-x, 2024.

Gilpin, L. H., Bau, D., Yuan, B. Z., Bajwa, A., Specter, M., and Kagal, L.: Explaining explanations: An overview of interpretability of machine learning, in: 2018 IEEE 5th International Conference on data science and advanced analytics (DSAA), 80–89, https://doi.org/10.1109/DSAA.2018.00018, 2018.

Guan, J., Lai, R., Lu, Y., Li, Y., Li, H., Feng, L., Yang, Y., and Gu, L.: Memory-efficient deformable convolution based joint denoising and demosaicing for UHD images, IEEE T. Circ. Syst. Vid., 32, 7346–7358, https://doi.org/10.1109/TCSVT.2022.3182990, 2022.

Goldfarb, R. J., Taylor, R. D., Collins, G. S., Goryachev, N. A., and Orlandini, O. F.: Phanerozoic continental growth and gold metallogeny of Asia, Gondwana Res., 25, 48–102, https://doi.org/10.1016/j.gr.2013.03.002, 2014.

Gong, L. and Cheng, Q.: Exploiting edge features for graph neural networks, in: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 9211–9219, https://doi.org/10.48550/arXiv.1809.02709, 2019.

Hajihosseinlou, M., Maghsoudi, A., and Ghezelbash, R.: Intelligent mapping of geochemical anomalies: Adaptation of DBSCAN and mean-shift clustering approaches, J. Geochem. Explor., 258, 107393, https://doi.org/10.1016/j.gexplo.2024.107393, 2024.

Huang, D., Zuo, R., and Wang, J.: Geochemical anomaly identification and uncertainty quantification using a Bayesian convolutional neural network model, Appl. Geochem., 146, 105450, https://doi.org/10.1016/j.apgeochem.2022.105450, 2022.

Jimenez-Espinosa, R., Sousa, A. J., and Chica-Olmo, M.: Identification of geochemical anomalies using principal component analysis and factorial kriging analysis, J. Geochem. Explor., 46, 245–256, https://doi.org/10.1016/0375-6742(93)90024-G, 1993.

Jung, H. and Oh, Y.: Towards better explanations of class activation mapping, in: Proceedings of the IEEE/CVF International Conference on Computer Vision, 1336–1344, https://doi.org/10.48550/arXiv.2102.05228, 2021.

Han, B., He, G., Wang, X., and Guo, Z.: Late Carboniferous collision between the Tarim and Kazakhstan–Yili terranes in the western segment of the South Tian Shan Orogen, Central Asia, and implications for the Northern Xinjiang, western China, Earth-Sci. Rev., 109, 74–93, https://doi.org/10.1016/j.earscirev.2011.09.001, 2011.

Kim, E., Kim, W., and Lee, Y.: Combination of multiple classifiers for the customer's purchase behavior prediction, Decis. Support Syst., 34, 167–175, https://doi.org/10.1016/S0167-9236(02)00079-9, 2003.

LeCun, Y. and Bengio, Y.: Convolutional networks for images, speech, and time series, The handbook of brain theory and neural networks, https://doi.org/10.7551/mitpress/3413.001.0001, 1998.

Lessmann, S., Baesens, B., Mues, C., and Pietsch, S.: Benchmarking classification models for software defect prediction: A proposed framework and novel findings, IEEE T. Softw. Eng., 34, 485–496, https://doi.org/10.1109/TSE.2008.35, 2008.

Liang, Z., Xiong, Y., and Zuo, R.: Coupling Graph Attention Networks with Causal Discovery for Geochemical Anomaly Recognition, Math. Geosci., 57, 1665–1693, https://doi.org/10.1007/s11004-025-10218-0, 2025.

Liu, H., Li, Z., Lin, S., and Cheng, L.: Remote sensing image denoising based on deformable convolution and attention-guided filtering in progressive framework, Signal, Image Video Process., 18, 8195–8205, https://doi.org/10.1007/s11760-024-03461-1, 2024.

Liu, Y., Wang, W., Li, Q., Min, M., and Yao, Z.: DCNet: A deformable convolutional cloud detection network for remote sensing imagery, IEEE Geosci. Remote Sens. Lett., 19, 1–5, https://doi.org/10.1109/LGRS.2021.3086584, 2021.

Luo, Y., Liu, X., Meng, H., Ye, Y., and Wu, B.: Multitask Seismic Inversion Based on Deformable Convolution and Generative Adversarial Network, IEEE Geosci. Remote Sens. Lett., 21, 1–5, https://doi.org/10.1109/LGRS.2024.3388213, 2024.

Luo, Z., Zuo, R., Xiong, Y., and Zhou, B.: Metallogenic-factor variational autoencoder for geochemical anomaly detection by ad-hoc and post-hoc interpretability algorithms, Nat. Resour. Res., 32, 835–853, https://doi.org/10.1007/s11053-023-10200-9, 2023.

Na, Z., Guo, Z., and Zhu, Y.: Soil Moisture Monitoring Based on Deformable Convolution Unit Net Algorithm Combined with Water Area Changes, Electronics, 14, 1011, https://doi.org/10.3390/electronics14051011, 2025.

Nielsen, A. H., Iosifidis, A., and Karstoft, H.: Forecasting large-scale circulation regimes using deformable convolutional neural networks and global spatiotemporal climate data, Sci. Rep., 12, 8395, https://doi.org/10.1038/s41598-022-12167-8, 2022.

Nykänen, V., Lahti, I., Niiranen, T., and Korhonen, K.: Receiver operating characteristics (ROC) as validation tool for prospectivity models – A magmatic Ni–Cu case study from the Central Lapland Greenstone Belt, Northern Finland, Ore Geol. Rev., 71, 853–860, https://doi.org/10.1016/j.oregeorev.2014.09.007, 2015.

Pirajno, F.: Hydrothermal processes and mineral systems, Springer Science & Business Media, https://doi.org/10.1007/978-1-4020-8613-7, 2008.

Powers, D. M.: Evaluation: from precision, recall and F-measure to ROC, informedness, markedness and correlation, arXiv [preprint], https://doi.org/10.48550/arXiv.2010.16061, 2020.

Reis, A. P., Sousa, A. J., and Fonseca, E. C.: Application of geostatistical methods in gold geochemical anomalies identification (Montemor-O-Novo, Portugal), J. Geochem. Explor., 77, 45–63, https://doi.org/10.1016/S0375-6742(02)00269-8, 2003.

Rudin, C.: Stop explaining black box machine learning models for high stakes decisions and use interpretable models instead, Nat. Mach. Intell., 1, 206–215, https://doi.org/10.1038/s42256-019-0048-x, 2019.

Scarselli, F., Gori, M., Tsoi, A. C., Hagenbuchner, M., and Monfardini, G.: The graph neural network model, IEEE T. Neural Networ., 20, 61–80, https://doi.org/10.1109/TNN.2008.2005605, 2008.

Selvaraju, R. R., Das, A., Vedantam, R., Cogswell, M., Parikh, D., and Batra, D.: Grad-CAM: Why did you say that?, arXiv [preprint], https://doi.org/10.48550/arXiv.1611.07450, 2016.

Seltmann, R., Porter, T. M., and Pirajno, F.: Geodynamics and metallogeny of the central Eurasian porphyry and related epithermal mineral systems: A review, J. Asian Earth Sci., 79, 810–841, https://doi.org/10.1016/j.jseaes.2013.03.030, 2014.

Semenov, A., Boginski, V., and Pasiliao, E. L.: Neural networks with multidimensional cross-entropy loss functions, International Conference on Computational Data and Social Networks, Cham, Springer International Publishing, 57–62, https://doi.org/10.1007/978-3-030-34980-6_5, 2019.

Soltani, Z., Hassani, H., and Esmaeiloghli, S.: A deep autoencoder network connected to geographical random forest for spatially aware geochemical anomaly detection, Comput. Geosci., 190, 105657, https://doi.org/10.1016/j.cageo.2024.105657, 2024.

Sun, X., Mo, T., Song, J., and Wang, B.: Deformable convolution kernel and residual learning assisted irregular seismic data interpolation, IEEE T. Geosci. Remote, 62, 1–11, https://doi.org/10.1109/TGRS.2024.3360449, 2024.

Tian, J., Guo, X., Liu, W., Tao, D., and Liu, B.: Deformable convolutional network constrained by contrastive learning for underwater image enhancement, IEEE Geosci. Remote Sens. Lett., 20, 1–5, https://doi.org/10.1109/LGRS.2023.3299613, 2023.

Tian, M., Wang, X., Nie, L., and Zhang, C.: Recognition of geochemical anomalies based on geographically weighted regression: A case study across the boundary areas of China and Mongolia, J. Geochem. Explor., 190, 381–389, https://doi.org/10.1016/j.gexplo.2018.04.003, 2018.

Wang, H. and Zuo, R.: A comparative study of trend surface analysis and spectrum–area multifractal model to identify geochemical anomalies, J. Geochem. Explor., 155, 84–90, https://doi.org/10.1016/j.gexplo.2015.04.013, 2015.

Wang, H., Cheng, Q., and Zuo, R.: Quantifying the spatial characteristics of geochemical patterns via GIS-based geographically weighted statistics, J. Geochem. Explor., 157, 110–119, https://doi.org/10.1016/j.gexplo.2015.06.004, 2015.

Wang, J. and Zuo, R.: An extended local gap statistic for identifying geochemical anomalies, J. Geochem. Explor., 164, 86–93, https://doi.org/10.1016/j.gexplo.2016.01.002, 2016.

Wang, W., Zhao, J., and Cheng, Q.: Fault trace-oriented singularity mapping technique to characterize anisotropic geochemical signatures in Gejiu mineral district, China, J. Geochem. Explor., 134, 27–37, https://doi.org/10.1016/j.gexplo.2013.07.009, 2013.

Wang, W., Cheng, Q., Zhang, S., and Zhao, J.: Anisotropic singularity: A novel way to characterize controlling effects of geological processes on mineralization, J. Geochem. Explor., 189, 32–41, https://doi.org/10.1016/j.gexplo.2017.07.019, 2018.

Wang, X., Zhang, Q., and Zhou, G.: National-scale geochemical mapping projects in China, Geostand. Geoanal. Res., 31, 311–320, https://doi.org/10.1111/j.1751-908X.2007.00128.x, 2007.

Wu, Z., Paoletti, M. E., Su, H., Tao, X., Han, L., Haut, J. M., and Plaza, A.: Background-guided deformable convolutional autoencoder for hyperspectral anomaly detection, IEEE T. Geosci. Remote, 61, 1–16, https://doi.org/10.1109/TGRS.2023.3334562, 2023.

Xiao, F., Chen, J., Zhang, Z., Wang, C., Wu, G., and Agterberg, F. P.: Singularity mapping and spatially weighted principal component analysis to identify geochemical anomalies associated with Ag and Pb-Zn polymetallic mineralization in Northwest Zhejiang, China, J. Geochem. Explor., 122, 90–100, https://doi.org/10.1016/j.gexplo.2012.04.010, 2012.

Xiao, F., Chen, J., Hou, W., Wang, Z., Zhou, Y., and Erten, O.: A spatially weighted singularity mapping method applied to identify epithermal Ag and Pb-Zn polymetallic mineralization associated geochemical anomaly in Northwest Zhejiang, China, J. Geochem. Explor., 189, 122–137, https://doi.org/10.1016/j.gexplo.2017.03.017, 2018.

Xiao, F., Wang, K., Hou, W., and Erten, O.: Identifying geochemical anomaly through spatially anisotropic singularity mapping: A case study from silver-gold deposit in Pangxidong district, SE China, J. Geochem. Explor., 210, 106453, https://doi.org/10.1016/j.gexplo.2019.106453, 2020.

Xie, X., Mu, X., and Ren, T.: Geochemical mapping in China, J. Geochem. Explor., 60, 99–113, https://doi.org/10.1016/S0375-6742(97)00029-0, 1997.

Xiong, Y., Zuo, R., Luo, Z., and Wang, X.: A physically constrained variational autoencoder for geochemical pattern recognition, Math. Geosci., 54, 783–806, https://doi.org/10.1007/s11004-021-09979-1, 2022.

Xu, L., Zhang, X., Yu, H., Chen, Z., Du, W., and Chen, N.: Incorporating spatial autocorrelation into deformable ConvLSTM for hourly precipitation forecasting, Comput. Geosci., 184, 105536, https://doi.org/10.1016/j.cageo.2024.105536, 2024a.

Xu, Y. and Zuo, R.: Spatial-spectrum two-branch model based on a superpixel graph convolutional network and 1DCNN for geochemical anomaly identification, Math. Geosci., 57, 307–331, https://doi.org/10.1007/s11004-024-10158-1, 2025.

Xu, Y., Zuo, R., and Zhang, G.: The graph attention network and its post-hoc explanation for recognizing mineralization-related geochemical anomalies, Appl. Geochem., 155, 105722, https://doi.org/10.1016/j.apgeochem.2023.105722, 2023.

Xu, Y., Shi, L., and Zuo, R.: Geologically constrained unsupervised dual-branch deep learning algorithm for geochemical anomalies identification, Appl. Geochem., 174, 106137, https://doi.org/10.1016/j.apgeochem.2024.106137, 2024b.

Xu, Y., Zuo, R., Chen, Z., Shi, Z., and Kreuzer, O. P.: Recent advances and future research directions in deep learning as applied to geochemical mapping, Earth-Sci. Rev., 105209, https://doi.org/10.1016/j.earscirev.2025.105209, 2025a.

Xu, Y., Zuo, R., and Bai, Y.: Geological knowledge-guided dual-branch deep learning model for identification of geochemical anomalies related to mineralization, J. Geophys. Res.-Mach. Learn., 2, e2024JH000468, https://doi.org/10.1029/2024JH000468, 2025b.

Xue, C., Zhao, X., Mo, X., Gu, X., Zhang, Z., Wang, X., Zu, B., Zhang, G., Feng, B., Liu, J., Dong, L., Bakhtiar, N., and Nikolay, P.: Asian Gold Belt in western Tianshan and its dynamic setting, metallogenic control and exploration, Earth Sci. Front., 21, 128–155, https://doi.org/10.19762/j.cnki.dizhixuebao.2023114, 2014.

Yang, F., Zuo, R., Xiong, Y., Wang, J., and Zhang, G.: An interpretable attention branch convolutional neural network for identifying geochemical anomalies related to mineralization, J. Geochem. Explor., 252, 107274, https://doi.org/10.1016/j.gexplo.2023.107274, 2023.

Yin, B., Zuo, R., Xiong, Y., Li, Y., and Yang, W.: Knowledge discovery of geochemical patterns from a data-driven perspective, J. Geochem. Explor., 231, 106872, https://doi.org/10.1016/j.gexplo.2021.106872, 2021.

Yu, D., Ji, S., Li, X., Yuan, Z., and Shen, C.: Earthquake crack detection from aerial images using a deformable convolutional neural network, IEEE T. Geosci. Remote, 60, 1–12, https://doi.org/10.1109/TGRS.2022.3183157, 2022.

Yu, H., Wang, R., Li, P., and Zhang, P.: Flood detection in polarimetric SAR data using deformable convolutional vision model, Water, 15, 4202, https://doi.org/10.3390/w15244202, 2023.

Zhang, C., Zuo, R., and Xiong, Y.: Detection of the multivariate geochemical anomalies associated with mineralization using a deep convolutional neural network and a pixel-pair feature method, Appl. Geochem., 130, 104994, https://doi.org/10.1016/j.apgeochem.2021.104994, 2021.

Zhang, S., Xiao, K., Carranza, E. J. M., Yang, F., and Zhao, Z.: Integration of auto-encoder network with density-based spatial clustering for geochemical anomaly detection for mineral exploration, Comput. Geosci., 130, 43–56, https://doi.org/10.1016/j.cageo.2019.05.011, 2019.

Zhang, T., Zhang, W., Cao, D., Yi, Y., and Wu, X.: A new deep learning neural network model for the identification of InSAR anomalous deformation areas, Remote Sens., 14, 2690, https://doi.org/10.3390/rs14112690, 2022.

Zhang, X., Xiong, Y., and Chen, Z.: Recognizing geochemical spatial patterns using deformable convolutional networks guided with geological knowledge, GitHub [code], https://github.com/Aleksiyu/Recognizing-geochemical-spatial-patterns-using-DCN-guided-with-geological-knowledge (last access: 8 February 2026), 2026.

Zhao, C., Zhu, W., and Feng, S.: Hyperspectral image classification based on kernel-guided deformable convolution and double-window joint bilateral filter, IEEE Geosci. Remote Sens. Lett., 19, 1–5, https://doi.org/10.1109/LGRS.2021.3084203, 2021.

Zhao, H., Zhou, Y., Bai, T., and Chen, Y.: A U-Net based multi-scale deformable convolution network for seismic random noise suppression, Remote Sens., 15, 4569, https://doi.org/10.3390/rs15184569, 2023.

Zhao, X., Xue, C., Seltmann, R., Dolgopolova, A., Andersen, J. C., and Zhang, G.: Volcanic–plutonic connection and associated Au–Cu mineralization of the Tulasu ore district, Western Tianshan, NW China: Implications for mineralization potential in Palaeozoic arc terranes, Geol. J., 55, 2318–2341, https://doi.org/10.1002/gj.3750, 2020.

Zhou, J., Cui, G., Hu, S., Zhang, Z., Yang, C., Liu, Z., Wang, L., Li, C., and Sun, M.: Graph neural networks: A review of methods and applications, AI Open, 1, 57–81, https://doi.org/10.1016/j.aiopen.2021.01.001, 2020.

Zhu, J., Fang, L., and Ghamisi, P.: Deformable convolutional neural networks for hyperspectral image classification, IEEE Geosci. Remote Sens. Lett., 15, 1254–1258, https://doi.org/10.1109/LGRS.2018.2830403, 2018.

Zhu, X., Hu, H., Lin, S., and Dai, J.: Deformable convnets v2: More deformable, better results, in: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 9308–9316, https://doi.org/10.48550/arXiv.1811.11168, 2019.

Zuo, R.: A nonlinear controlling function of geological features on magmatic–hydrothermal mineralization, Sci. Rep., 6, 27127, https://doi.org/10.1038/srep27127, 2016.

Zuo, R.: Machine learning of mineralization-related geochemical anomalies: A review of potential methods, Nat. Resour. Res., 26, 457–464, https://doi.org/10.1007/s11053-017-9345-4, 2017.

Zuo, R. and Xu, Y.: A physically constrained hybrid deep learning model to mine a geochemical data cube in support of mineral exploration, Comput. Geosci., 182, 105490, https://doi.org/10.1016/j.cageo.2023.105490, 2024.

Zuo, R., Wang, J., Xiong, Y., and Wang, Z.: The processing methods of geochemical exploration data: past, present, and future, Appl. Geochem., 132, 105072, https://doi.org/10.1016/j.apgeochem.2021.105072, 2021.

Zuo, R., Cheng, Q., Xu, Y., Yang, F., Xiong, Y., Wang, Z., and Kreuzer, O. P.: Explainable artificial intelligence models for mineral prospectivity mapping, Sci. China Earth Sci., 67, 2864–2875, https://doi.org/10.1007/s11430-024-1309-9, 2024.

Zuo, R., Yang, F., Cheng, Q., and Kreuzer, O. P.: A novel data-knowledge dual-driven model coupling artificial intelligence with a mineral systems approach for mineral prospectivity mapping, Geology, 53, 284–288, https://doi.org/10.1130/G52970.1, 2025.

Short summary

Geochemical anomalies associated with mineralization represent one of the most significant types of geo-anomalies for mineral exploration. This study develops an AI method that combines geological knowledge with a flexible deep learning model. It helps identify geochemical anomaly patterns more accurately and reliably by focusing on key features like ore-controlling faults. The model's decisions are easier to understand through visual explanations, increasing transparency and trust in the results.