Synergistic application of digital outcrop characterization techniques and deep learning algorithms in geological exploration

UAV oblique photography technology

In geological exploration, Unmanned Aerial Vehicle Tilt Photography technology (UTP) has improved the efficiency of data collection as well as the quality and safety of the exploration process 31,32. Drones carrying high-resolution cameras capture images of the Earth’s surface from multiple angles. A sequence of post-processing steps (multi-view image joint adjustment, dense matching of multi-view images, point cloud construction, and texture mapping) then produces high-precision three-dimensional models. These models support accurate interpretation of surface and subsurface geological structures, particularly in applications such as mineral development and disaster monitoring, where rapid assessment and response are needed.

UTP is particularly suited to areas that are difficult or unsafe to enter, such as mountains, canyons, and other harsh environments 33. Because drones acquire key data without direct contact, the risk of injury and health hazards to personnel is significantly reduced. Drones also lower the direct costs of geological exploration projects by reducing reliance on traditional ground equipment and manpower while increasing the speed and frequency of data collection.

UTP also shows considerable potential in mineral resource assessment. Tilt photography identifies surface mineral indications, and multi-view image joint adjustment corrects for occlusions and distortions. The resulting detailed three-dimensional surface models aid the construction of ore body models and support more accurate resource estimates. Dense matching of multi-view images improves matching precision, while point cloud construction and texture mapping convert the data into detailed digital terrain models. The final three-dimensional models therefore reproduce both the shape of the terrain and its features and their realistic surface textures (Fig. 1).

Fig. 1. Roadmap of Unmanned Aerial Vehicle Tilt Photography Technology.

Cesium digital outcrop characterization technology

The development of Cesium technology began in 2011, initially as a web-based 3D mapping engine. With the popularization of WebGL technology, Cesium started to support the rendering of more complex 3D terrain and building models. In the field of geology, particularly in creating 3D digital models of geological outcrops, Cesium has demonstrated its unique advantages 34,35.

Creating a Cesium digital outcrop begins with the collection of geological data, including terrain and geological structures, usually through remote sensing and ground surveys. These data are converted into the 3D Tiles format and assembled into three-dimensional geological models on the Cesium platform. The models include terrain elevation information and also present the color and texture of rock layers 36,37. By combining 3D visualization with geological data processing, the technology lets geologists observe and analyze complex geological structures, such as rock layers, faults, and folds, in a virtual environment. Cesium is currently applied in geological exploration, geological education, and environmental science, where it helps students and professionals understand geological structures and processes. Challenges remain in data precision, large-scale data handling, and model realism. Even so, supported by advances in big data and artificial intelligence, Cesium digital outcrop characterization improves the efficiency and accuracy of geological structure analysis and opens new pathways for teaching, research, and practical application in geology 38,39. It is expected to play an increasingly important role in the geological sciences (Fig. 2).

Fig. 2. Cesium Architecture Diagram.

VGG19 lithology identification algorithm

VGG19 is a deep Convolutional Neural Network (CNN) that has performed strongly in image recognition, including lithology identification 40,41,42. Its core is a deep convolutional structure built by stacking convolutional layers, activation functions (the ReLU function in particular), pooling layers, and fully connected layers. In the convolutional layers, multiple convolution kernels extract local image features at successive depths, gradually building up a comprehensive feature map. The ReLU activation function sets negative responses to zero and passes positive responses unchanged, which introduces the non-linearity the model needs to represent complex image features. The pooling layers downsample the feature map, which reduces the computational load and the number of parameters required by later layers, and also makes the model more robust to minor variations in the image. The fully connected layers at the end of the network combine the extracted features and pass them to a classifier. The design philosophy of VGG19 is to increase expressiveness through greater network depth, which makes it well suited to rock images with complex structures and rich textures 43,44. With these feature extraction and classification capabilities, VGG19 has become an important tool for lithology identification and provides technical support for digital exploration in geology and related fields. Among its components, the convolutional layer is the most important, since it extracts the features from the image. The convolution operation can be represented by the following formula:

$$(I * K)(i,j) = \sum_{m} \sum_{n} I(m,n) \cdot K(i - m, j - n)$$

(1)

\(I\) represents the input image.

\(K\) represents the convolution kernel (or filter).

\((i,j)\) represents the pixel position in the image.

\(m\) and \(n\) are the summation indices; they run over the pixel positions at which the image and the convolution kernel overlap.
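As a minimal sketch of Eq. (1), the following pure-Python function evaluates the double sum directly for small nested lists. The function name `conv2d_full` and the choice to return the "full" output (every position where image and kernel overlap) are assumptions made for illustration; deep learning frameworks typically compute the unflipped variant (cross-correlation) over valid positions only, which differs from Eq. (1) only in how the kernel is indexed.

```python
def conv2d_full(image, kernel):
    """Evaluate (I * K)(i, j) = sum_m sum_n I(m, n) * K(i - m, j - n).

    `image` and `kernel` are lists of rows. Kernel entries outside its
    bounds are treated as zero, so the output has size
    (H + kh - 1) x (W + kw - 1).
    """
    H, W = len(image), len(image[0])
    kh, kw = len(kernel), len(kernel[0])
    out = [[0.0] * (W + kw - 1) for _ in range(H + kh - 1)]
    for i in range(H + kh - 1):
        for j in range(W + kw - 1):
            total = 0.0
            for m in range(H):
                for n in range(W):
                    u, v = i - m, j - n  # kernel index K(i - m, j - n)
                    if 0 <= u < kh and 0 <= v < kw:
                        total += image[m][n] * kernel[u][v]
            out[i][j] = total
    return out


image = [[1, 2], [3, 4]]
kernel = [[1, 0], [0, -1]]
result = conv2d_full(image, kernel)
```

The quadruple loop is deliberately naive so that each term of the formula is visible; it is not intended for images of realistic size.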

In VGG19, the ReLU (Rectified Linear Unit) is the activation function used to introduce non-linearity into the network. The ReLU function sets all negative values to zero while retaining positive values. Its formula is expressed as:

$$\text{ReLU}(x) = \max(0, x)$$

(2)

\(x\) represents the output of the convolutional layer.
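The behaviour described for ReLU (negative values become zero, positive values are kept) can be sketched in a few lines. The helper name `relu` and the flat-list input are illustrative assumptions, not part of the original method.

```python
def relu(values):
    """Apply ReLU element-wise: negatives become 0, positives are unchanged."""
    return [max(0.0, v) for v in values]


activations = relu([-2.0, 0.0, 3.5])
```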

The pooling layer is used to reduce the spatial dimensions of the feature map, decreasing the number of parameters and computational load. The most commonly used type of pooling is max pooling, which can be represented as:

$$P(i,j) = \max_{k,l \in [0, M - 1]} I(i + k, j + l)$$

(3)

\(P\) represents the output after pooling.

\(I\) represents the input feature map.

\(M\) represents the size of the pooling window.

\((i,j)\) represents the pixel position.
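A minimal sketch of max pooling over an \(M \times M\) window follows. The `stride` parameter is an assumption that does not appear in the formula above: it controls how far the window's top-left corner \((i,j)\) moves between outputs, and the default of 2 with a 2 x 2 window matches the pooling configuration commonly used in VGG networks.

```python
def max_pool2d(fmap, M=2, stride=2):
    """Max pooling: each output is the maximum of an M x M window of `fmap`.

    (i, j) below is the top-left corner of the window, so the output
    value is max over k, l in [0, M - 1] of fmap[i + k][j + l].
    """
    H, W = len(fmap), len(fmap[0])
    pooled = []
    for i in range(0, H - M + 1, stride):
        row = []
        for j in range(0, W - M + 1, stride):
            row.append(max(fmap[i + k][j + l]
                           for k in range(M) for l in range(M)))
        pooled.append(row)
    return pooled


feature_map = [
    [1, 3, 2, 0],
    [4, 2, 1, 5],
    [0, 1, 7, 2],
    [3, 2, 1, 8],
]
pooled = max_pool2d(feature_map)
```

With these defaults a 4 x 4 feature map is reduced to 2 x 2, which illustrates how pooling shrinks the spatial dimensions.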

The fully connected layer is located at the end of the network and is primarily used for classification tasks. The mathematical representation of a fully connected layer is:

$$y = Wx + b$$

(4)

\(y\) is the output vector.

\(W\) is the weight matrix.

\(x\) is the input vector.

\(b\) is the bias vector.
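With \(W\), \(x\), and \(b\) defined as above, a fully connected layer computes a weighted sum of the inputs plus a bias for each output. The sketch below uses plain lists; the function name `fully_connected` and the example numbers are illustrative assumptions.

```python
def fully_connected(W, x, b):
    """Compute W x + b for a weight matrix W (list of rows),
    input vector x, and bias vector b."""
    return [sum(w * xi for w, xi in zip(row, x)) + bi
            for row, bi in zip(W, b)]


W = [[1, 2], [0, -1], [3, 1]]   # 3 outputs, 2 inputs
x = [2, 1]
b = [0.5, 0, -1]
y = fully_connected(W, x, b)
```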

In VGG19, the output of the last fully connected layer is classified using the softmax function, whose formula is:

$$\sigma(z)_{j} = \frac{e^{z_{j}}}{\sum_{k} e^{z_{k}}}$$

(5)

\(z\) is the input vector.

\(j\) and \(k\) are the indices of classes.

\(\sigma(z)_{j}\) is the predicted probability for class \(j\).
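The softmax function can be sketched as follows. Subtracting the largest score before exponentiating is an added numerical-stability step that is not part of the formula as written; it leaves the result unchanged because the common factor cancels between numerator and denominator. The name `softmax` and the example scores are illustrative.

```python
import math


def softmax(z):
    """Convert a list of class scores z into probabilities that sum to 1."""
    shift = max(z)                           # stability shift, cancels out
    exps = [math.exp(v - shift) for v in z]
    total = sum(exps)
    return [e / total for e in exps]


probs = softmax([1.0, 2.0, 3.0])
predicted_class = probs.index(max(probs))
```

The class with the highest score receives the highest probability, so taking the index of the largest entry gives the predicted class.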

Through successive convolution, non-linear activation, pooling, and fully connected operations, VGG19 extracts deep features from the original images, which can then be used for lithology identification, classification, and other visual tasks.
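To show how these operations stack up, the sketch below traces the feature-map size and parameter count through the standard VGG19 layer configuration (3 x 3 convolutions with padding that preserves spatial size, 2 x 2 max pooling with stride 2, and three fully connected layers). The 224 x 224 RGB input and the 1000 output classes are the usual ImageNet defaults and are assumptions here; a lithology classifier would replace `num_classes` with its own number of rock types.

```python
# Standard VGG19 configuration: numbers are output channels of a 3x3
# convolution, "M" marks a 2x2 max-pooling layer with stride 2.
VGG19_CFG = [64, 64, "M", 128, 128, "M", 256, 256, 256, 256, "M",
             512, 512, 512, 512, "M", 512, 512, 512, 512, "M"]


def trace_vgg19(input_size=224, in_channels=3, num_classes=1000):
    """Return layer counts, final feature-map size, and parameter total."""
    size, channels = input_size, in_channels
    params, conv_layers = 0, 0
    for item in VGG19_CFG:
        if item == "M":
            size //= 2                             # pooling halves height/width
        else:
            params += channels * item * 3 * 3 + item   # weights + biases
            channels = item
            conv_layers += 1
    fc_sizes = [channels * size * size, 4096, 4096, num_classes]
    for n_in, n_out in zip(fc_sizes, fc_sizes[1:]):
        params += n_in * n_out + n_out
    return {
        "conv_layers": conv_layers,
        "fc_layers": len(fc_sizes) - 1,
        "final_size": size,
        "params": params,
    }


summary = trace_vgg19()
```

The 16 convolutional layers and 3 fully connected layers together give the 19 weight layers from which the network takes its name.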
