Deep shale gas production prediction method based on geological engineering mechanism guidance and multi-model fusion

By employing a dynamic feature recalibration network and a multi-model fusion method, combined with guidance from geological engineering mechanisms, the accuracy and reliability issues of deep shale gas production capacity prediction were resolved, achieving adaptive and efficient production capacity prediction and fracturing parameter optimization.

CN122241588A · Pending · Publication Date: 2026-06-19 · CHENGDU UNIVERSITY OF TECHNOLOGY

Patent Information

Authority / Receiving Office
CN · China
Patent Type
Applications(China)
Current Assignee / Owner
CHENGDU UNIVERSITY OF TECHNOLOGY
Filing Date
2026-03-20
Publication Date
2026-06-19

AI Technical Summary

Technical Problem

Existing technologies struggle to accurately predict production capacity in deep shale gas reservoirs, especially due to insufficient model adaptability under geological heterogeneity and complex geological conditions, and the lack of guidance from geological engineering mechanisms, resulting in low reliability of prediction results.

Method used

By employing a dynamic feature recalibration network and a multi-model fusion method, guided by geological engineering mechanisms, feature weights and model contributions are dynamically adjusted, and interpretability verification is combined to achieve adaptive prediction.

Benefits of technology

It improves the accuracy and reliability of deep shale gas production capacity prediction, can adaptively adjust according to geological conditions, enhances the interpretability and engineering application value of prediction results, and supports the optimization of fracturing parameters.



Abstract

This invention discloses an intelligent prediction method and system for deep shale gas production capacity based on geological engineering mechanisms and multi-model fusion. The method includes: collecting and preprocessing geological, engineering, and production data; learning adaptive weights for static features through a dynamic feature recalibration network and outputting recalibrated mechanism features; inputting these features and production time-series data into TCN, LSTM, Transformer, and XGBoost models to capture long-term declining trends, dynamic pressure propagation, inter-well interference, and nonlinear property mapping, respectively; inputting the recalibrated features into a gating network to generate fusion weights and fusing the model outputs by weighted summation to obtain the predicted production capacity; and performing interpretability verification. The invention also provides a fracturing optimization method based on the production capacity prediction. Compared with existing technologies, this invention improves prediction accuracy and interpretability through feature recalibration, multi-model division of labor, and adaptive fusion, achieving a closed loop from production capacity prediction to fracturing optimization.

Description

Technical Field

[0001] This invention proposes an intelligent prediction method and system for deep shale gas production capacity based on geological engineering mechanism guidance and multi-model fusion, belonging to the field of deep shale gas development engineering technology.

Background Technology

[0002] As an important unconventional natural gas resource, deep shale gas production forecasting is crucial for optimizing development plans and evaluating economic benefits. However, the accuracy of current deep shale gas production forecasting still faces significant challenges.

[0003] Taking the Sichuan Basin as an example, deep shale gas reservoirs (usually buried at depths greater than 3500m) are characterized by high formation temperature, high pressure, high closure stress, complex structures, and large differences in horizontal principal stresses, making their geological and engineering conditions far more complex than those of shallow and medium-depth shale gas reservoirs. Existing development theories and engineering technologies applicable to shallow and medium-depth shale gas (buried at depths less than 3500m) are difficult to directly apply to deep shale gas reservoirs.

[0004] Specifically, the production capacity of deep shale gas is influenced by a complex interplay of static geological factors such as reservoir properties and rock mechanics parameters, and dynamic engineering factors such as fracturing scale and construction sequence, all of which jointly control the distribution of production capacity. Existing methods, at the feature processing level, often employ fixed weighting to construct mechanistic features, failing to dynamically adjust the contribution of each parameter according to specific formation conditions and thus struggling to adapt to the highly heterogeneous nature of deep shale gas reservoirs.

[0005] At the model construction level, existing methods often employ a single model (such as a neural network or a support vector machine) or a simple stacking of multiple models for prediction. However, deep shale gas production capacity is affected by the coupling of multi-scale factors, including long-term declining trends, dynamic pressure propagation, inter-well interference, and nonlinear property mapping. A single model cannot simultaneously capture all of these physical processes, resulting in insufficient adaptability to complex operating conditions. At the fusion strategy level, existing multi-model methods mostly use fixed-weight averaging or simple voting mechanisms, failing to dynamically adjust the contribution of each model based on formation characteristics. This static fusion approach leads to significant fluctuations in prediction accuracy under different geological scenarios, making it difficult to achieve adaptive prediction for different reservoir conditions.

[0006] Furthermore, existing predictive models largely rely on superficial correlations with data, lacking effective guidance from geological and engineering mechanisms. This leads to learned patterns that often contradict geological understanding, resulting in poor interpretability. Predictive results are frequently difficult to verify against common geological and engineering knowledge, lacking reliable geological support and limiting their credibility in major development decisions.

[0007] Therefore, there is an urgent need for an intelligent prediction method for shale gas production capacity guided by geological engineering mechanisms. Such a method should use geological conditions as a priori guidance for machine learning, weight features dynamically, divide the prediction task among specialized models, and support adaptive fusion and interpretability verification. This would enable accurate predictions with a geological basis and meet the technical requirements for efficient development of deep shale gas.

Summary of the Invention

[0008] This invention is achieved through the following technical solution:

[0009] Firstly, the invention provides an intelligent prediction method for deep shale gas production capacity based on geological engineering mechanisms and multi-model fusion. Figure 1 shows the overall flowchart of the method, which includes:

[0010] Geological parameters, engineering parameters, and historical production data of the target well and its associated neighboring wells are obtained, and the data is preprocessed to form a standardized dataset.

[0011] The static features from the standardized dataset are input into the dynamic feature recalibration network (Figure 2 shows the structure of the dynamic feature recalibration module). The dynamic feature recalibration network learns the adaptive weight of each static feature through a dimensionality reduction-dimensionality increase structure, and outputs the recalibrated mechanistic feature set;

[0012] The recalibrated mechanistic feature set and production time series data are respectively input into multiple machine learning prediction models. The multiple machine learning prediction models include at least a TCN network, an LSTM network, a Transformer network, and an XGBoost model. The TCN network is used to capture the long-term declining trend of production capacity, the LSTM network is used to capture dynamic pressure propagation and production fluctuations, the Transformer network is used to capture inter-well interference and regional correlation, and the XGBoost model is used to capture nonlinear interactions between features. The preliminary production capacity prediction values output by each model are obtained respectively.

[0013] The recalibrated mechanism feature set is input into a gated network, which outputs the dynamic fusion weights of each model. The preliminary capacity prediction values are then weighted and fused to generate the final capacity prediction value.

[0014] The interpretability of the final production capacity prediction is verified, including: analyzing the correlation between the weights output by the dynamic feature recalibration network and geological parameters; analyzing the correlation between the fusion weights output by the gating network and geological parameters; and analyzing the inter-well correlations revealed by the attention weights output by the Transformer network. Figure 3 is a schematic of the dynamic fusion gating network structure of the present invention.

[0015] Furthermore, the structure of the dynamic feature recalibration network is as follows: the input layer receives a static feature vector, the hidden layer adopts a dimensionality reduction-dimensionality increase structure, the output layer uses the Sigmoid activation function to output a weight vector, and the weight vector is multiplied element-wise with the input feature vector to obtain the recalibrated features.

[0016] Furthermore, the structure of the gated network is as follows: the input layer receives the recalibrated mechanism feature set, the hidden layer adopts a fully connected layer, and the output layer adopts the Softmax activation function to output the fusion weights of each model.

[0017] Furthermore, in the interpretability verification, Spearman rank correlation coefficient is used to quantify the correlation between the weights of the dynamic feature recalibration network output and the geological parameters, as well as the correlation between the fusion weights of the gated network output and the geological parameters.

[0018] Furthermore, the method also includes the step of: based on the final production capacity prediction value verified by interpretability, optimizing and inverting the fracturing operation parameters under engineering constraints with the goal of maximizing the estimated ultimate recovery (EUR), and outputting the fracturing parameter optimization scheme.

[0019] Secondly, an intelligent shale gas production capacity prediction system is provided for implementing the above method. Figure 4 shows a block diagram of the system of the present invention, the system comprising:

[0020] The data acquisition and preprocessing module is used to perform the data acquisition and preprocessing steps.

[0021] The dynamic feature recalibration module is used to perform the step of inputting static features into the dynamic feature recalibration network and obtaining the recalibrated mechanism feature set;

[0022] The multi-model division of labor prediction module is used to perform the step of inputting the recalibrated mechanism feature set and production time series data into multiple machine learning prediction models respectively and obtaining preliminary capacity prediction values.

[0023] The dynamic fusion module is used to perform the step of inputting the recalibrated mechanism feature set into the gating network and obtaining the final production capacity prediction value;

[0024] An interpretability verification module is used to perform the steps of the interpretability verification.

[0025] Furthermore, the system also includes a fracturing optimization decision module, used to perform the step of outputting the fracturing parameter optimization scheme.

[0026] Beneficial effects of this invention:

[0027] This invention introduces a dynamic feature recalibration network, enabling the weights of static features to adaptively adjust according to formation conditions, thus solving the problem that traditional static weighting of mechanistic features cannot adapt to geological heterogeneity. Through multi-model collaborative modeling, multi-scale influencing factors of shale gas production capacity are specifically captured by models such as TCN, LSTM, Transformer, and XGBoost, improving the model's ability to characterize complex physical processes. By dynamically generating fusion weights through a gating network, adaptive prediction based on formation characteristics is achieved, offering greater flexibility and accuracy than fixed-weight fusion. Furthermore, this invention designs a multi-level interpretability verification method, revealing the model's prediction logic from dimensions such as feature weights, fusion weights, and attention weights, verifying its consistency with geological engineering laws, and significantly enhancing the credibility and engineering application value of the prediction results. The final production capacity prediction results can be directly used for fracturing parameter optimization and development scheme adjustment.

Attached Figure Description

[0028] Figure 1: Flowchart of the method of the present invention;

[0029] Figure 2: Schematic diagram of the dynamic feature recalibration module;

[0030] Figure 3: Schematic diagram of the dynamic fusion gating network structure;

[0031] Figure 4: Block diagram of the system of the present invention.

Detailed Implementation

[0032]-[0033] The preferred embodiments of the present invention will be described below with reference to the accompanying drawings. It should be understood that the preferred embodiments described herein are provided for illustration and explanation only and are not intended to limit the scope of the invention.

[0034] This embodiment takes a shale gas platform in a certain block as an example, and selects 12 production wells that have been in production for 24 months and have complete logging, fracturing and production data.

[0035] Step 1: In one embodiment of the present invention, geological and engineering data are collected and preprocessed to obtain preprocessed geological and engineering data, including:

[0036] Missing values are handled. For continuous parameters such as permeability, the geometric mean of the other wells within the same sub-layer is used to fill the missing value. For example, if the permeability of a certain well in the XX layer is missing, and the permeabilities of the other 5 wells in the same layer are 0.32, 0.28, 0.41, 0.35, and 0.30 mD, then the fill value is (0.32 × 0.28 × 0.41 × 0.35 × 0.30)^(1/5) ≈ 0.33 mD.
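The fill rule above can be checked with a few lines of Python. This is only an illustrative sketch: the function name geometric_mean_fill is hypothetical, and the five permeability values are the ones quoted in the paragraph.

```python
import math

def geometric_mean_fill(values):
    """Geometric mean of positive values taken from other wells in the same sub-layer."""
    if not values or any(v <= 0 for v in values):
        raise ValueError("geometric mean needs a non-empty list of positive values")
    # Average in log space, which is numerically safer than multiplying then rooting.
    return math.exp(sum(math.log(v) for v in values) / len(values))

neighbor_perm_md = [0.32, 0.28, 0.41, 0.35, 0.30]  # permeabilities of the other 5 wells, mD
fill_value = geometric_mean_fill(neighbor_perm_md)
print(f"fill value = {fill_value:.2f} mD")
```

The geometric mean is a natural choice for permeability because the parameter typically spans orders of magnitude, so averaging in log space keeps a single high value from dominating the fill.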

[0037] The data is standardized, and continuous parameters are standardized using Z-scores. The mean and standard deviation are calculated based on data from neighboring wells. Taking TOC as an example, the TOC mean of 12 wells is 3.2%, and the standard deviation is 0.8%. Therefore, the original TOC value of 4.5% is standardized to (4.5-3.2) / 0.8=1.625. The standardization coefficients are saved for target well processing.
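A minimal sketch of how the saved standardization coefficients might be reused for the target well. The dictionary layout and function names are assumptions for illustration; only the TOC mean, standard deviation, and the 4.5% example come from the paragraph above.

```python
def zscore(value, mean, std):
    """Standardize one value with coefficients fitted on the neighboring wells."""
    if std <= 0:
        raise ValueError("standard deviation must be positive")
    return (value - mean) / std

def inverse_zscore(z, mean, std):
    """Map a standardized value back to physical units."""
    return z * std + mean

# Coefficients saved after fitting on the 12 wells (TOC in percent).
saved_coeffs = {"TOC": {"mean": 3.2, "std": 0.8}}

toc_z = zscore(4.5, **saved_coeffs["TOC"])
toc_back = inverse_zscore(toc_z, **saved_coeffs["TOC"])
print(f"standardized TOC = {toc_z:.3f}")
```

Keeping the coefficients separate from the data means the target well is scaled with statistics it did not contribute to, which avoids leaking target information into preprocessing.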

[0038] The time-series data are aligned: the production capacity data are aggregated monthly, giving 24 months in total; the construction displacement (pumping rate) data, collected minute by minute, are resampled to daily averages, giving 720 time points in total.
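The minute-to-day resampling can be sketched as a simple block average. The function below is a hypothetical helper that assumes a gap-free series with 1440 samples per day; real field data would also need timestamp alignment and gap handling, which the patent text does not describe.

```python
def resample_daily_mean(minute_values, minutes_per_day=1440):
    """Average consecutive blocks of minute samples into one value per day."""
    daily = []
    for start in range(0, len(minute_values), minutes_per_day):
        block = minute_values[start:start + minutes_per_day]
        daily.append(sum(block) / len(block))
    return daily

# Synthetic three-day example: a constant rate within each day.
minute_series = [10.0] * 1440 + [12.0] * 1440 + [14.0] * 1440
daily_series = resample_daily_mean(minute_series)
print(daily_series)
```

Applied to 720 days of minute data, the same function would yield the 720 time points mentioned above.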

[0039] After preprocessing, the static feature matrix, the temporal feature tensor, and the label vector (estimated ultimate recovery, EUR) are obtained.

[0040] The effects of the above technical solution are as follows: data integrity is ensured by filling in missing values, dimensional differences are eliminated by standardization, and data format is unified by temporal alignment, providing high-quality input for subsequent modeling.

[0041] Table 1 shows the characteristic parameters collected from well A in this block for various dimensions:

[0042]

[0043] Step 2: In one embodiment of the present invention, the feature weights corresponding to the geological and engineering data are adjusted based on the data association strength of the current geological and engineering data, including:

[0044] A dynamic feature recalibration network is constructed to adaptively weight 7-dimensional static features. The network structure is as follows: 7 nodes in the input layer, 2 nodes in the hidden layer, and 7 nodes in the output layer; the activation function of the hidden layer is ReLU, and the activation function of the output layer is Sigmoid. The calculation process is as follows:

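[0045] $$w = \sigma\left(W_2\,\mathrm{ReLU}(W_1 x + b_1) + b_2\right)$$

[0046] $$\tilde{x} = w \odot x$$

[0047] where $x$ is the standardized static feature vector; $W_1$, $b_1$, $W_2$, and $b_2$ are network parameters; $\sigma$ is the Sigmoid function; $\odot$ denotes element-wise multiplication; $w$ is the dynamic weight vector; and $\tilde{x}$ is the recalibrated feature vector.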

[0048] To make the initial weights close to 1, Each element is initialized to 1.0.

[0049] In this embodiment, the standardized feature vector of a certain well

[0050] The weight vector is obtained after network calculation. ;

[0051] Recalibrated features .
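The forward pass of the 7-2-7 recalibration network can be sketched in plain Python as follows. Everything numeric here is an assumption for illustration: the input vector is hypothetical, the weights are small random values, and treating the output-layer bias as the parameter initialized to 1.0 is only one reading of paragraph [0048], which does not name the parameter in this text.

```python
import math
import random

def linear(W, x, b):
    """Dense layer: W has one row per output node."""
    return [sum(wij * xj for wij, xj in zip(row, x)) + bi for row, bi in zip(W, b)]

def relu(v):
    return [max(0.0, a) for a in v]

def sigmoid(v):
    return [1.0 / (1.0 + math.exp(-a)) for a in v]

def recalibrate(x, W1, b1, W2, b2):
    """Bottleneck 7 -> 2 -> 7, then element-wise reweighting of the input."""
    hidden = relu(linear(W1, x, b1))
    w = sigmoid(linear(W2, hidden, b2))
    x_tilde = [wi * xi for wi, xi in zip(w, x)]
    return w, x_tilde

rng = random.Random(0)
W1 = [[rng.gauss(0.0, 0.1) for _ in range(7)] for _ in range(2)]
b1 = [0.0] * 2
W2 = [[rng.gauss(0.0, 0.1) for _ in range(2)] for _ in range(7)]
b2 = [1.0] * 7  # assumption: the parameter initialized to 1.0 is the output bias

x = [1.625, -0.4, 0.8, 0.1, -1.2, 0.5, 0.0]  # hypothetical standardized features
w, x_tilde = recalibrate(x, W1, b1, W2, b2)
print([round(v, 3) for v in w])
```

Because the Sigmoid keeps every weight strictly between 0 and 1, each feature is attenuated rather than removed, which matches the "soft weighting" described for this step.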

[0052] The network is jointly trained with the subsequent prediction model using the Adam optimizer with an initial learning rate of 0.001, a weight decay of 1e-5, and 200 training epochs. An early stopping mechanism is set to stop training if the validation set loss does not decrease for 10 consecutive epochs.
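The early stopping rule can be expressed as a small helper. This is a sketch under the assumption that "does not decrease" means the validation loss fails to improve on its best value so far; the function name and the loss sequence are hypothetical.

```python
def early_stopping_epoch(val_losses, patience=10):
    """Return the 1-based epoch at which training would stop, or None if it never does."""
    best = float("inf")
    stale = 0  # consecutive epochs without improvement
    for epoch, loss in enumerate(val_losses, start=1):
        if loss < best:
            best = loss
            stale = 0
        else:
            stale += 1
            if stale >= patience:
                return epoch
    return None

# Loss improves for 3 epochs, then plateaus above the best value.
losses = [1.0, 0.8, 0.7] + [0.75] * 12
stop_epoch = early_stopping_epoch(losses, patience=10)
print(stop_epoch)
```

In practice the parameters from the best epoch would normally be restored after stopping, although the patent text does not state this.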

[0053] The above technical solution achieves the following effects: Through dynamic feature recalibration, sample-adaptive adjustment of feature weights is realized. In wells with high TOC, the TOC feature weight is automatically increased; in wells with high brittleness, the brittleness feature weight is automatically increased. This allows subsequent models to focus on different features under different geological conditions, avoiding the "one-size-fits-all" shortcoming of traditional static weighting methods. At the same time, the recalibrated features retain all the information of the original features and only adjust their contribution, achieving soft feature weighting, which is more adaptable to geological heterogeneity than hard feature selection.

[0054] Step 3: In one embodiment of the present invention, four parallel prediction models are constructed to capture the impact of different physical processes on production capacity. These include:

[0055] The TCN network is used to capture the long-term declining trend of production capacity. The network structure is as follows: the input layer receives the temporal features; three causal convolution layers follow, each with a kernel size of 3 and 64 channels, with dilation coefficients of 1, 2, and 4; a global average pooling layer is then applied, followed by fully connected layers (64→32→1) that output the predicted value. In this embodiment, the TCN network's prediction for a certain well is 2.05×10⁸ m³.
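To make the causal, dilated structure concrete, the sketch below computes the receptive field implied by kernel size 3 and dilations 1, 2, 4, and applies a single-channel causal dilated convolution to an impulse. It is a simplification: the 64 channels, nonlinearities, pooling, and fully connected layers are omitted, and the all-ones kernel is purely illustrative.

```python
def receptive_field(kernel_size, dilations):
    """Number of past-and-present time steps one output of the stacked layers can see."""
    return 1 + (kernel_size - 1) * sum(dilations)

def causal_dilated_conv1d(x, kernel, dilation):
    """y[t] = sum_k kernel[k] * x[t - k * dilation], with implicit zero padding on the left."""
    y = []
    for t in range(len(x)):
        acc = 0.0
        for k, wk in enumerate(kernel):
            idx = t - k * dilation
            if idx >= 0:
                acc += wk * x[idx]
        y.append(acc)
    return y

rf = receptive_field(3, [1, 2, 4])

# Impulse at month 5 of a 24-month series, passed through the three layers.
signal = [0.0] * 24
signal[5] = 1.0
out = signal
for d in (1, 2, 4):
    out = causal_dilated_conv1d(out, [1.0, 1.0, 1.0], d)

first_nonzero = next(i for i, v in enumerate(out) if v != 0.0)
last_nonzero = max(i for i, v in enumerate(out) if v != 0.0)
print(rf, first_nonzero, last_nonzero)
```

The impulse influences only outputs at or after its own time step, and for exactly as many steps as the receptive field, which is why this stack covers a large share of a 24-month monthly series.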

[0056] The LSTM network is used to capture dynamic pressure propagation. The network structure is as follows: the input layer receives the temporal features; two LSTM layers follow, with a hidden dimension of 64 and dropout = 0.2; the hidden state at the last time step is passed through fully connected layers (64→32→1) to output the predicted value. In this embodiment, the LSTM network's prediction for the well is 1.92×10⁸ m³.

[0057] The Transformer network is used to capture inter-well interference. The recalibrated features of the target well and its three nearest neighboring wells are organized as a sequence, learnable positional embeddings are added, and the sequence is input into a Transformer encoder. The encoder has two layers, each with four attention heads, a feedforward network dimension of 128, and dropout = 0.1. The output vector at the target well position is passed through fully connected layers (7→32→1) to output the predicted value. In this embodiment, the Transformer network's prediction for the well is 2.05×10⁸ m³.
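The attention mechanism that relates the target well to its neighbors can be illustrated with a single-head, projection-free version of scaled dot-product attention. This is a deliberately reduced sketch, not the encoder described above: there are no learned query/key projections, no positional embeddings, and only one head, and all feature vectors are hypothetical.

```python
import math

def softmax(scores):
    m = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def target_attention(sequence, target_index=0):
    """Attention weights of the target well over every well in the sequence (itself included)."""
    query = sequence[target_index]
    scale = math.sqrt(len(query))
    scores = [sum(q * k for q, k in zip(query, key)) / scale for key in sequence]
    return softmax(scores)

# Hypothetical recalibrated 7-dimensional features: target well first, then 3 neighbors.
wells = [
    [0.9, 0.2, -0.1, 0.4, 0.0, 0.3, -0.5],
    [0.8, 0.1, -0.2, 0.5, 0.1, 0.2, -0.4],
    [-0.6, 0.4, 0.7, -0.3, 0.2, -0.1, 0.5],
    [0.1, -0.2, 0.0, 0.1, -0.3, 0.0, 0.1],
]
attn = target_attention(wells)
print([round(a, 3) for a in attn])
```

Rows of weights like this one, averaged over heads and layers, are what an inter-well attention heatmap would display.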

[0058] The XGBoost model is used to capture nonlinear interactions between features. Input features include: the recalibrated features (7 dimensions), constructed interaction features (TOC × brittleness, porosity × TOC, brittleness × Young's modulus; 3 dimensions), and statistical features (average EUR of adjacent wells, average porosity of the sub-layer; 2 dimensions), for a total of 12 dimensions. Model parameters: 200 trees, maximum depth 6, learning rate 0.05, subsampling ratio 0.8, feature sampling ratio 0.8. The model is trained independently with 5-fold cross-validation and outputs the predicted value. In this embodiment, the XGBoost prediction for the well is 2.01×10⁸ m³.
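The 12-dimensional input can be assembled as sketched below. The feature names and values are hypothetical (the text does not list all seven static features here), and the parameter dictionary uses the names of the XGBoost scikit-learn interface, where mapping "feature sampling ratio" to colsample_bytree is an assumption.

```python
def build_xgb_features(recal, neighbor_eur_mean, sublayer_porosity_mean):
    """Concatenate 7 recalibrated, 3 interaction, and 2 statistical features."""
    base = list(recal.values())
    interactions = [
        recal["toc"] * recal["brittleness"],
        recal["porosity"] * recal["toc"],
        recal["brittleness"] * recal["youngs_modulus"],
    ]
    stats = [neighbor_eur_mean, sublayer_porosity_mean]
    return base + interactions + stats

# Hypothetical recalibrated features for one well (names are placeholders).
recal_features = {
    "toc": 1.19, "porosity": 0.42, "brittleness": -0.31, "youngs_modulus": 0.27,
    "pressure_coeff": 0.65, "thickness": -0.12, "gas_content": 0.88,
}
row = build_xgb_features(recal_features, neighbor_eur_mean=1.98, sublayer_porosity_mean=0.35)

# Hyperparameters from the paragraph above, keyed by XGBoost parameter names.
xgb_params = {
    "n_estimators": 200,
    "max_depth": 6,
    "learning_rate": 0.05,
    "subsample": 0.8,
    "colsample_bytree": 0.8,  # assumption: "feature sampling ratio" is per tree
}
print(len(row))
```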

[0059] The advantages of the above technical solution are as follows: By employing multiple models for specialized modeling, the complex problem of shale gas production prediction is decoupled into several sub-problems, each handled by the most suitable model. TCN excels at capturing long-term trends, LSTM excels at handling dynamic changes, Transformer excels at modeling multi-well correlations, and XGBoost excels at handling nonlinear mappings. Each model complements the others, overcoming the limitations of a single model in characterizing multi-scale physical processes. Furthermore, the outputs of each model have clear physical meaning, laying the foundation for subsequent interpretability analysis.

[0060] Step 4: In one embodiment of the present invention, a gating network is constructed to dynamically generate the fusion weights of the four models. This includes:

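[0061] The gating network structure is as follows: the input layer receives the recalibrated features; the hidden layer has 3 nodes (ReLU activation); and the output layer has 4 nodes (Softmax activation). The calculation process is as follows:

[0062] $$\alpha = \mathrm{Softmax}\left(W_{g2}\,\mathrm{ReLU}(W_{g1}\tilde{x} + b_{g1}) + b_{g2}\right)$$

[0063] $$\hat{y} = \sum_{i=1}^{4} \alpha_i\,\hat{y}_i$$

[0064] where $W_{g1}$, $b_{g1}$, $W_{g2}$, and $b_{g2}$ are network parameters; $\tilde{x}$ is the recalibrated feature vector; $\alpha$ is the fusion weight vector; $\hat{y}_i$ is the preliminary prediction of the i-th model; and $\hat{y}$ is the final predicted value.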
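A plain-Python sketch of the 7-3-4 gating network and the weighted fusion follows. The gate parameters and the input vector are random or hypothetical placeholders; the four preliminary predictions are the values reported for the example well in Step 3, in units of 10⁸ m³.

```python
import math
import random

def linear(W, x, b):
    return [sum(wij * xj for wij, xj in zip(row, x)) + bi for row, bi in zip(W, b)]

def relu(v):
    return [max(0.0, a) for a in v]

def softmax(z):
    m = max(z)
    exps = [math.exp(a - m) for a in z]
    total = sum(exps)
    return [e / total for e in exps]

def gate(x_tilde, Wg1, bg1, Wg2, bg2):
    """Map recalibrated features to one non-negative fusion weight per model."""
    return softmax(linear(Wg2, relu(linear(Wg1, x_tilde, bg1)), bg2))

def fuse(weights, predictions):
    return sum(a * p for a, p in zip(weights, predictions))

rng = random.Random(1)
Wg1 = [[rng.gauss(0.0, 0.5) for _ in range(7)] for _ in range(3)]
bg1 = [0.0] * 3
Wg2 = [[rng.gauss(0.0, 0.5) for _ in range(3)] for _ in range(4)]
bg2 = [0.0] * 4

x_tilde = [1.19, -0.29, 0.58, 0.07, -0.88, 0.37, 0.0]  # hypothetical recalibrated features
preds = [2.05, 1.92, 2.05, 2.01]  # TCN, LSTM, Transformer, XGBoost (1e8 m3)

alpha = gate(x_tilde, Wg1, bg1, Wg2, bg2)
y_hat = fuse(alpha, preds)
print([round(a, 3) for a in alpha], round(y_hat, 3))
```

Because the Softmax weights are non-negative and sum to 1, the fused value is a convex combination and always lies between the smallest and largest preliminary prediction.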

[0065] Training employs a two-stage strategy. In the first stage (epochs 0-100), the parameters of TCN, LSTM, and Transformer are fixed, and only the gating network and the recalibration module are trained. In the second stage (epochs 101-200), all parameters are jointly fine-tuned with the learning rate reduced to 0.0001. The optimizer is Adam, and the loss function is MSE plus an L2 regularization term with a coefficient of 0.01.

[0066] The effects of the above steps are as follows: by dynamically generating fusion weights based on stratigraphic characteristics through a gating network, adaptive fusion that is "tailored to local conditions" is achieved. Compared to fixed-weight fusion, dynamic fusion can automatically adjust the contribution of each model according to different geological conditions. In high TOC areas, it may rely more on LSTM, while in highly brittle areas, it may rely more on TCN, thereby improving prediction accuracy. At the same time, the fusion weights themselves are interpretable, and the controlling factors under different geological conditions can be revealed by analyzing the relationship between the weights and geological parameters.

[0067] Step 5: In one embodiment of the present invention, the Spearman rank correlation coefficient is used to quantify the correlation between the weights output by the dynamic feature recalibration network and geological parameters, as well as the correlation between the fusion weights output by the gating network and geological parameters. This includes:

[0068] Extract the recalibration weights of all wells and calculate their Spearman rank correlation coefficients with the corresponding geological parameters. The formula for calculating the Spearman rank correlation coefficient is as follows:

[0069] $$\rho = 1 - \frac{6\sum_{i=1}^{n} d_i^2}{n(n^2 - 1)}$$

[0070] where $d_i$ is the rank difference between the geological parameter value and the corresponding weight value of the i-th sample, and $n$ is the number of samples. In this embodiment, the correlation coefficient between TOC and its weights is 0.73 (p<0.01), indicating that the higher the TOC, the greater the weight assigned by the model; the correlation coefficient between the brittleness index and its weights is 0.21 (p>0.05), indicating that brittleness is not a major controlling factor. This result is consistent with the geological understanding of the block.
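The Spearman coefficient can be computed directly from rank differences, as in the sketch below. The rank-difference formula is exact only when there are no tied values; the helper assigns average ranks to ties, in which case the result is an approximation. The sample data are made up for illustration and are not the patent's well data.

```python
def ranks(values):
    """1-based ranks; tied values share the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = average_rank
        i = j + 1
    return result

def spearman_rho(x, y):
    """rho = 1 - 6 * sum(d_i^2) / (n * (n^2 - 1)), with d_i the rank difference."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    d_squared = sum((rx - ry) ** 2 for rx, ry in zip(ranks(x), ranks(y)))
    return 1.0 - 6.0 * d_squared / (n * (n * n - 1))

toc = [2.1, 2.8, 3.0, 3.6, 4.5]                    # hypothetical TOC values, %
toc_weight = [0.55, 0.52, 0.70, 0.66, 0.81]        # hypothetical recalibration weights
rho = spearman_rho(toc, toc_weight)
print(round(rho, 2))
```

The significance levels quoted in the text (p-values) require an additional test that this sketch does not include.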

[0071] Using the Spearman rank correlation coefficient formula mentioned above, the correlation coefficient between TOC and LSTM weights was calculated to be -0.52 (p<0.05), indicating that in high TOC areas the model relies more on TCN than LSTM, which is consistent with shale gas development experience.

[0072] The multi-head attention weight matrix of the Transformer was extracted to construct a heatmap of inter-well correlation. In this embodiment, the mean attention weight between the target well and one of its neighboring wells was 0.38, significantly higher than the average of 0.12. Geological data confirmed the existence of a natural fracture zone between the two wells, and their production fluctuations were synchronized (cross-correlation coefficient 0.65). The Spearman correlation coefficient between the attention weight and the distance between wells was -0.41 (p<0.05), demonstrating that the model learned the principle that "the closer the wells are, the greater their mutual influence."

[0073] The effects of the above steps are as follows: through triple interpretability verification, it is proven that the laws learned by the model are consistent with geological understanding, and it can reveal some unlabeled geological information (such as inter-well fracture connectivity). This not only enhances the credibility of the model, but also makes the model a tool to assist in geological understanding.

[0074] Step 6: In one embodiment of the present invention, based on the final production capacity prediction value verified through interpretability, and with the objective of maximizing the estimated ultimate recovery (EUR), the fracturing operation parameters are optimized and inverted under engineering constraints to output an optimized fracturing parameter scheme. This includes:

[0075] With the objective of maximizing the final predicted value $\hat{y}$, a genetic algorithm is used to optimize the fracturing parameters. The optimization problem can be expressed as:

[0076] $$\max_{s,\,q}\ \hat{y}(s, q)$$

[0077] The constraints are:

[0078] $$s_{\min} \le s \le s_{\max}, \quad q_{\min} \le q \le q_{\max}$$

[0079] $$C(s, q) \le C_{\max}$$

[0080] where $s$ is the sand addition intensity (t/m) and $q$ is the injection intensity (m³/m); the constraints correspond to the value ranges of the variables and the upper limit $C_{\max}$ of the total cost $C$ (unit: 10,000 yuan). Genetic algorithm parameters: population size 50, number of generations 100, crossover probability 0.8, mutation probability 0.1, elite retention 5.
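A compact genetic algorithm with the stated settings (population 50, 100 generations, crossover 0.8, mutation 0.1, 5 elites) might look like the sketch below. Several pieces are assumptions introduced only so the example runs: the variable bounds, the cost model and ceiling, and the quadratic predicted_eur surrogate standing in for the fused prediction model are all hypothetical, and tournament selection, arithmetic crossover, and Gaussian mutation are generic operator choices that the patent text does not specify.

```python
import random

# Hypothetical bounds, cost model and surrogate: none of these numbers come from the patent.
SAND_RANGE = (1.5, 3.5)     # sand addition intensity, t/m
FLUID_RANGE = (10.0, 30.0)  # injection intensity, m3/m
COST_MAX = 3000.0           # total-cost ceiling, 10,000 yuan
LIMITS = [SAND_RANGE, FLUID_RANGE]

def predicted_eur(sand, fluid):
    """Stand-in for the fused model's EUR prediction (units of 1e8 m3)."""
    return 2.3 - 0.2 * (sand - 2.5) ** 2 - 0.001 * (fluid - 20.0) ** 2

def total_cost(sand, fluid):
    return 600.0 * sand + 60.0 * fluid

def fitness(individual):
    sand, fluid = individual
    if total_cost(sand, fluid) > COST_MAX:
        return float("-inf")  # infeasible designs are never preferred
    return predicted_eur(sand, fluid)

def tournament(population, rng, k=3):
    return max(rng.sample(population, k), key=fitness)

def run_ga(pop_size=50, generations=100, p_cross=0.8, p_mut=0.1, n_elite=5, seed=0):
    rng = random.Random(seed)
    population = [[rng.uniform(lo, hi) for lo, hi in LIMITS] for _ in range(pop_size)]
    for _ in range(generations):
        population.sort(key=fitness, reverse=True)
        next_gen = [ind[:] for ind in population[:n_elite]]  # elite retention
        while len(next_gen) < pop_size:
            a, b = tournament(population, rng), tournament(population, rng)
            child = a[:]
            if rng.random() < p_cross:  # arithmetic crossover
                t = rng.random()
                child = [t * ai + (1.0 - t) * bi for ai, bi in zip(a, b)]
            for j, (lo, hi) in enumerate(LIMITS):
                if rng.random() < p_mut:  # Gaussian mutation
                    child[j] += rng.gauss(0.0, 0.05 * (hi - lo))
                child[j] = min(max(child[j], lo), hi)  # enforce variable ranges
            next_gen.append(child)
        population = next_gen
    return max(population, key=fitness)

best = run_ga()
print([round(v, 2) for v in best], round(predicted_eur(*best), 3))
```

In a real workflow, predicted_eur would be replaced by a call to the trained prediction model, with the fracturing parameters substituted into the engineering inputs of the well under design.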

[0081] The optimized recommended scheme is: sand addition intensity 2.4 t/m, injection intensity 18 m³/m, and predicted EUR 2.25×10⁸ m³. The result verified by a numerical simulator is 2.21×10⁸ m³, a relative error of 1.8%.
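As a check on the reported figure, the relative error can be worked out from the two EUR values (both in units of 10⁸ m³). The text does not say which value serves as the reference; the calculation below assumes the simulator result, and using the predicted value instead gives about 1.78%, which also rounds to 1.8%.

```latex
\[
\frac{\lvert 2.25 - 2.21 \rvert}{2.21} = \frac{0.04}{2.21} \approx 0.0181 \approx 1.8\%
\]
```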

[0082] The effect of the above steps is to combine interpretable predictive models with optimization algorithms to achieve closed-loop optimization of fracturing parameters. The optimization scheme has been verified by numerical simulation, demonstrating high reliability and providing a scientific basis for on-site construction.

[0083] The above description is merely a specific embodiment of the present invention, but the scope of protection of the present invention is not limited thereto. Any simple modifications or substitutions based on the technical concept of the present invention, as long as they do not exceed the scope of the claims, should be included in the protection scope of the present invention.

Claims

1. A method for intelligent prediction of shale gas production capacity based on geological engineering mechanisms and multi-model fusion, characterized in that it includes the following steps:
Step S1: Obtain a standardized input dataset, which includes geological parameters, engineering parameters, and historical production data;
Step S2: Construct a dynamic feature recalibration network to adaptively weight the static features in the input dataset and output the recalibrated mechanism feature set;
Step S3: Input the recalibrated mechanism feature set and production time series data into multiple machine learning prediction models respectively, and obtain the preliminary production capacity prediction values output by each model;
Step S4: Input the recalibrated mechanism feature set into the gated network, dynamically generate the fusion weights of each machine learning prediction model, and perform weighted fusion on each preliminary production capacity prediction value to generate the final shale gas production capacity prediction value;
Step S5: Verify the interpretability of the final shale gas production capacity prediction and analyze the consistency between the prediction results and geological engineering laws.

2. A method for constructing shale gas production capacity prediction features based on dynamic feature recalibration, characterized in that it includes: basic geological parameters and engineering parameters are input into a learnable feature recalibration network; the network outputs dynamic weights corresponding to each basic parameter through a dimensionality reduction-dimensionality increase structure and a nonlinear activation function; and the dynamic weights are multiplied element-wise with the corresponding basic parameters to obtain the recalibrated mechanism feature vector. The computation process of the feature recalibration network is represented as follows:
$$w = \sigma\left(W_2\,\mathrm{ReLU}(W_1 x + b_1) + b_2\right) \tag{1-1}$$
$$\tilde{x} = w \odot x \tag{1-2}$$
where $x$ is the input basic parameter vector; $W_1$ and $b_1$ are the weight matrix and bias of the first fully connected layer; $W_2$ and $b_2$ are the weight matrix and bias of the second fully connected layer; $\mathrm{ReLU}$ is the linear rectification activation function; $\sigma$ is the Sigmoid activation function; $\odot$ denotes element-wise multiplication; $w$ is the learned dynamic weight vector; and $\tilde{x}$ is the recalibrated mechanism feature vector.

3. A shale gas production capacity prediction method based on multi-model division of labor, characterized in that it includes: time series data are simultaneously input into the TCN network and the LSTM network, where the TCN network is used to capture the long-term declining trend of production capacity and the LSTM network is used to capture the dynamic changes caused by pressure propagation and production system adjustments; the static and temporal characteristics of multiple wells are organized into a multi-well data format and input into a Transformer network to capture inter-well interference and regional correlation features; and static features are input into the XGBoost model to capture the nonlinear interaction relationships between features.

4. A dynamic fusion method for shale gas production capacity prediction based on gated networks, characterized in that it includes: the static features of the current sample are input into a gating network, which consists of fully connected layers and a Softmax activation function and outputs a normalized weight vector; each element in the weight vector corresponds to the contribution of one machine learning prediction model. The dynamic weight generation process of the gated network is represented as follows:
$$\alpha = \mathrm{Softmax}\left(W_{g2}\,\mathrm{ReLU}(W_{g1} x + b_{g1}) + b_{g2}\right)$$
where $x$ is the input geological feature vector; $W_{g1}$ and $b_{g1}$ are the weight matrix and bias of the first fully connected layer of the gated network; $W_{g2}$ and $b_{g2}$ are the weight matrix and bias of the second fully connected layer of the gated network; $\mathrm{ReLU}$ is the linear rectification activation function; $\mathrm{Softmax}$ is the normalized exponential function, ensuring that the sum of the weights is 1; and $\alpha$ is the output weight vector. The final shale gas production capacity forecast is calculated using the following formula:
$$\hat{y} = \sum_{i=1}^{n} \alpha_i\,\hat{y}_i$$
where $\alpha_i$ is the weight of the i-th model; $\hat{y}_i$ is the preliminary prediction value of the i-th model; $\hat{y}$ is the final capacity forecast; and $n$ is the number of models (in this patent, n=4, corresponding to TCN, LSTM, Transformer, and XGBoost).

5. The method according to claim 4, characterized in that the weight vector output by the gated network changes dynamically with the input geological features; by analyzing the correlation between the weight vector and key geological parameters, the importance of each model under different geological conditions can be revealed, including: the weight of the LSTM network is relatively high in high TOC reservoirs, and the weight of the TCN network is relatively high in highly brittle reservoirs.

6. A method for verifying the interpretability of shale gas production capacity prediction results, characterized in that it includes at least one of the following: the dynamic weights output by the feature recalibration network are visualized and analyzed to verify whether their changing trends are consistent with geological understanding; statistical analysis is performed on the fusion weights output by the gated network to verify whether their correlation with geological parameters is reasonable; and the attention weights of the Transformer network are visualized to verify whether they reflect the actual inter-well interference relationships.

7. A multi-model fusion deep shale gas production capacity prediction system guided by geological engineering mechanisms, characterized in that it includes: a data acquisition and preprocessing module, used to perform step S1 as described in claim 1; a dynamic feature recalibration module, used to perform step S2 as described in claim 1; a multi-model division of labor prediction module, used to perform step S3 as described in claim 1; an adaptive fusion control module, used to perform step S4 as described in claim 1; and an interpretability verification module, used to perform step S5 as described in claim 1.

8. The system according to claim 7, characterized in that the multi-model division of labor prediction module deploys multiple machine learning prediction models, including a TCN network submodule, an LSTM network submodule, a Transformer network submodule, and an XGBoost model submodule; the multi-model division of labor prediction module is configured to simultaneously route time series data to the TCN network submodule and the LSTM network submodule, route multi-well data to the Transformer network submodule, and route static features to the XGBoost model submodule.

9. The system according to claim 7, characterized in that the adaptive fusion control module includes a gated network unit and a weighted fusion unit; the gated network unit takes the static features of the current sample as input and outputs the dynamic weights of each machine learning prediction model; and the weighted fusion unit performs a weighted summation of the preliminary prediction values output by each submodule according to the dynamic weights to generate the final shale gas production capacity prediction value.

10. The system according to any one of claims 7-9, characterized in that it also includes a fracturing optimization decision module, which receives the verified deep shale gas production capacity prediction results and, with the estimated ultimate recovery (EUR) as the optimization objective, outputs an optimized fracturing parameter scheme.