A shale gas well fracturing effect analysis method and system based on data driving

By employing a data-driven approach combining LightGBM feature sorting and K-means clustering, the problem of unreasonable fracturing parameter design in shale gas wells was solved. This enabled accurate assessment of geological conditions and improved production capacity in shale gas wells, optimized fracturing effects, and unleashed the potential of the formation.

CN117332668B · Active · Publication Date: 2026-06-12 · PETROCHINA CO LTD

Patent Information

Authority / Receiving Office
CN · China
Patent Type
Patents(China)
Current Assignee / Owner
PETROCHINA CO LTD
Filing Date
2022-06-24
Publication Date
2026-06-12


Abstract

The application discloses a data-driven method and system for analyzing the fracturing effect of shale gas wells. The method comprises: acquiring a plurality of geological engineering parameters through data collection, processing them, and taking EUR as the measure of natural gas production capacity, wherein the geological engineering parameters comprise geological characteristic parameters and fracturing characteristic parameters; performing correlation analysis on the geological engineering parameters and EUR based on the Pearson algorithm, and eliminating redundant geological engineering parameters; calculating the influence weights of a plurality of influencing factors with the LightGBM algorithm, ranking the weights, and identifying the main controlling factors affecting EUR, wherein the influencing factors are the geological engineering parameters remaining after elimination; and using the K-means clustering algorithm to perform productivity potential determination on the remaining geological engineering parameters. By assessing the geological conditions of a shale gas well in this way, the reservoir can be fracture-stimulated and the potential of the formation developed.

Description

Technical Field

[0001] This invention relates to the field of shale gas exploration and development technology, and in particular to a data-driven method and system for analyzing the fracturing effect of shale gas wells.

Background Technology

[0002] Shale gas is an important unconventional natural gas resource found in shale reservoirs and has enormous exploration potential. With the North American shale gas revolution and the continuous optimization of the energy structure, shale gas has become one of the important pillars for increasing oil and gas reserves and production in China and worldwide. Shale gas reservoirs are characterized by low porosity, low permeability, and heterogeneity, and flow within the reservoir matrix no longer follows Darcy flow. Because of this ultra-low porosity and permeability, horizontal drilling and multi-stage hydraulic fracturing are required for shale gas extraction.

[0003] The estimated ultimate recovery (EUR) of a shale gas well determines its development lifecycle and is one of the most critical parameters for achieving economic benefits. Shale gas well productivity is influenced by factors such as fracturing fluid energy utilization and the closure patterns of primary and induced fractures. The challenge of shale gas development lies in the inherent problem of multi-parameter decision-making, particularly the substantial uncertainties that may exist in subsurface conditions and key economic factors.

[0004] Shale gas well fracturing parameters are complex, and numerous factors influence single-well production. Due to differences in reservoir geological characteristics, the design of fracturing operation parameters is often unreasonable, failing to fully realize the formation's potential. Therefore, it is essential to assess the geological conditions of shale gas wells before fracturing them.

Summary of the Invention

[0005] The purpose of this invention is to provide a data-driven method and system for analyzing the fracturing effect of shale gas wells. Based on the characteristics of shale gas well development, this invention assesses the geological conditions of shale gas wells using LightGBM feature ranking, feature construction, and K-means clustering, thus solving the inherent problem of multi-parameter decision-making in the challenges faced by shale gas development.

[0006] To achieve the above objectives, this invention provides a data-driven method for analyzing the fracturing effect of shale gas wells, comprising:

[0007] Multiple geological engineering parameters are acquired based on data collection, and these parameters are processed to use EUR as the data for determining natural gas production capacity; wherein, the geological engineering parameters include geological characteristic parameters and fracturing characteristic parameters;

[0008] The Pearson algorithm is used to analyze the multiple geological engineering parameters and EUR, and redundant geological engineering parameters are removed.

[0009] The influence weights of multiple influencing factors are calculated using the LightGBM algorithm, and the results of the influence weights are sorted and analyzed to determine the main controlling factors affecting EUR; wherein, the influencing factors are the geological engineering parameters after elimination.

[0010] The K-means clustering algorithm is used to perform productivity potential determination analysis on the geological engineering parameters remaining after elimination.

[0011] Furthermore, based on the data acquisition, multiple geological engineering parameters are obtained, and these parameters are processed. Specifically,

[0012] The missing data was handled by using the mean substitution method.

[0013] Furthermore, the missing data is processed using a mean substitution method, specifically as follows:

[0014] If the missing values make up only a very small share of the overall data, they can be deleted directly.

[0015] Furthermore, multiple geological engineering parameters are acquired through data collection, and the parameter data is processed, specifically as follows:

[0016] The collected data is normalized.

[0017] Furthermore, the collected data undergoes normalization processing, specifically as follows:

[0018] The dimensional expression is transformed into a dimensionless expression through calculation, and the data is mapped to the range of 0 to 1 for processing.

[0019] Furthermore, the dimensional expression is transformed into a dimensionless expression through calculation, and the data is mapped to the range of 0 to 1 for processing. Specifically,

[0020] $$X_{norm} = \frac{X - X_{min}}{X_{max} - X_{min}}$$

[0021] Where Xnorm represents the data after normalization, X is the original data, Xmax represents the maximum value of the original dataset, and Xmin represents the minimum value of the original dataset.

[0022] Furthermore, based on the Pearson algorithm, correlation analysis was performed on multiple geological engineering parameters and EUR, and redundant geological engineering parameters were removed. Specifically,

[0023] If two geological engineering parameters have the same correlation value with EUR but their single-factor effects on EUR differ, both parameters are retained.

[0024] If two geological engineering parameters have the same correlation value with EUR and their single-factor effects on EUR are extremely similar, one of the two parameters is eliminated from the analysis.

[0025] Furthermore, the influence weights of multiple influencing factors are calculated using the LightGBM algorithm, and the results of the influence weights are sorted to analyze the main controlling factors affecting EUR. Specifically,

[0026] According to the LightGBM algorithm, the standardized EUR and the geological engineering parameters remaining after elimination are used as input parameters to calculate the influence weights of multiple influencing factors; wherein, the influencing factors are the geological engineering parameters remaining after elimination.

[0027] The calculated influence weights are sorted, and the remaining geological engineering parameters are classified into geological characteristic parameters and fracturing characteristic parameters. The sum of the weights of the geological characteristic parameters is compared with the sum of the weights of the fracturing characteristic parameters to determine the main controlling factors affecting EUR.

[0028] Furthermore, the K-means clustering algorithm is used to perform productivity potential determination analysis on the geological engineering parameters remaining after elimination. Specifically,

[0029] Based on feature construction, the remaining geological engineering parameters are compressed into two dimensions according to feature attributes, and three-dimensional data is established with EUR for cluster analysis; feature construction refers to manually constructing new features from the original data;

[0030] The model parameters were tuned, and a sensitivity analysis was performed on the number of clusters. The clustering results were obtained and analyzed to determine the optimal number of clusters.

[0031] The model is used to determine the geological potential of shale gas wells and to provide the range of variation of geological and fracturing characteristic parameters.

[0032] Furthermore, the model parameters were fine-tuned, and a sensitivity analysis was performed on the number of clusters. The clustering results were obtained and analyzed to determine the optimal number of clusters. Specifically,

[0033] The model parameters were tuned, and the silhouette coefficient (SC) and the Calinski-Harabasz index (CH) were used to perform sensitivity analysis on the number of clusters. The clustering results were obtained and analyzed to find the optimal number of clusters.

[0034] This invention also provides a data-driven shale gas well fracturing effect analysis system, comprising:

[0035] The data acquisition and processing unit acquires multiple geological engineering parameters through data collection, processes these parameters, and uses EUR as the data for determining natural gas production capacity; wherein, the geological engineering parameters include geological characteristic parameters and fracturing characteristic parameters;

[0036] The elimination unit analyzes the multiple geological engineering parameters and EUR using the Pearson algorithm and eliminates redundant geological engineering parameters;

[0037] The analysis unit calculates the influence weights of multiple influencing factors using the LightGBM algorithm, sorts the results of the influence weights, and analyzes the main controlling factors affecting EUR; wherein, the influencing factors are geological engineering parameters after elimination.

[0038] The determination unit uses the K-means clustering algorithm to perform productivity potential determination on the geological engineering parameters remaining after elimination.

[0039] Furthermore, the acquisition and processing unit acquires multiple geological engineering parameters through data collection and processes these parameters, specifically:

[0040] The acquisition and processing unit uses the mean substitution method to process the missing data, and the acquisition and processing unit also performs normalization processing on the collected data.

[0041] Furthermore, the analysis unit calculates the influence weights of multiple influencing factors using the LightGBM algorithm, and sorts the results of the influence weights to analyze the main controlling factors affecting EUR. Specifically,

[0042] The analysis unit uses the LightGBM algorithm to calculate the influence weights of multiple influencing factors, taking the standardized EUR and the geological engineering parameters remaining after elimination as input parameters; wherein, the influencing factors are the geological engineering parameters remaining after elimination.

[0043] The calculated influence weights are sorted, and the remaining geological engineering parameters are classified into geological characteristic parameters and fracturing characteristic parameters. The sum of the weights of the geological characteristic parameters is compared with the sum of the weights of the fracturing characteristic parameters to determine the main controlling factors affecting EUR.

[0044] Furthermore, the determination unit uses the K-means clustering algorithm to perform productivity potential determination on the geological engineering parameters remaining after elimination. Specifically,

[0045] Through feature construction, the determination unit compresses the remaining geological engineering parameters into two dimensions according to feature attributes, and establishes three-dimensional data with EUR for cluster analysis; feature construction refers to manually constructing new features from the original data;

[0046] The model parameters were tuned, and a sensitivity analysis was performed on the number of clusters. The clustering results were obtained and analyzed to determine the optimal number of clusters.

[0047] The model is used to determine the geological potential of shale gas wells and to provide the range of variation of geological and fracturing characteristic parameters.

[0048] The technical effects and advantages of this invention are as follows: based on the characteristics of shale gas well development, the geological conditions of shale gas wells are analyzed and determined using LightGBM feature ranking, feature construction, and K-means clustering. Through this assessment of geological conditions, the fracturing stimulation of shale gas wells can be optimized, thus unlocking the potential of the formation.

[0049] Other features and advantages of the invention will be set forth in the description which follows, and will be apparent in part from the description, or may be learned by practicing the invention. The objects and other advantages of the invention may be realized and obtained by means of the structures pointed out in the description, claims and drawings.

Description of the Drawings

[0050] Figure 1 is a flowchart of a data-driven shale gas well fracturing effect analysis method according to an embodiment of the present invention;

[0051] Figure 2 is a diagram showing the influence of the geological engineering parameter data values on EUR in an embodiment of the present invention;

[0052] Figure 3 is a histogram showing the weights of the geological engineering parameters on EUR in an embodiment of the present invention;

[0053] Figure 4 is the 3D clustering diagram when the optimal number of clusters k is 4 in an embodiment of the present invention;

[0054] Figure 5a is the clustering diagram when the optimal number of clusters k is 4 in an embodiment of the present invention;

[0055] Figure 5b is the clustering diagram when the optimal number of clusters k is 4 in an embodiment of the present invention;

[0056] Figure 5c is the clustering diagram when the optimal number of clusters k is 4 in an embodiment of the present invention;

[0057] Figure 6 is a schematic diagram of a data-driven shale gas well fracturing effect analysis system according to an embodiment of the present invention.

Detailed Implementation

[0058] The technical solutions of the embodiments of the present invention will be clearly and completely described below with reference to the accompanying drawings. Obviously, the described embodiments are only some embodiments of the present invention, and not all embodiments. Based on the embodiments of the present invention, all other embodiments obtained by those skilled in the art without creative effort are within the scope of protection of the present invention.

[0059] To address the shortcomings of existing technologies, this invention discloses a data-driven method and system for analyzing the fracturing effect of shale gas wells.

[0060] Example 1

[0061] This embodiment takes the Weiyuan block as an example. This invention provides a data-driven method for analyzing the fracturing effect of shale gas wells; as shown in Figure 1, the specific steps are as follows:

[0062] Multiple geological engineering parameters are acquired through data collection and processed, and EUR (estimated ultimate recovery) is used as the measure of natural gas production capacity.

[0063] Among them, multiple geological engineering parameters include 7 geological characteristic parameters and 8 fracturing characteristic parameters, as shown in Table 1:

[0064] Table 1 Geological Engineering Parameters

[0065]
Serial No. | Parameter | Unit
1 | Vertical depth | m
2 | TOC content | %
3 | Porosity | %
4 | Thickness of sublayer 1 | m
5 | Gas saturation | %
6 | Pressure coefficient | /
7 | Brittle mineral content | %
8 | Horizontal section length | m
9 | Fracturing section length | m
10 | Type I reservoir drilling length | m
11 | Type I reservoir drilling rate | %
12 | Average segment spacing | m
13 | Fluid intensity | m³/m
14 | Sand intensity | t/m
15 | Average pumping rate (displacement) | m³/min
16 | EUR | /

[0066] The missing data in the geological engineering parameters were processed using the mean substitution method.

[0067] Specifically, when missing values make up only a very small share of the overall data, the rows containing them can be deleted directly. When missing values make up a large share of the overall data, however, deleting them directly loses important information such as the mean, so mean substitution is used instead.
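The choice between deleting rows and substituting means can be sketched as follows. This is a minimal illustration only: the source gives no numeric cutoff for a "very small" share, so `max_drop_fraction` and the function name `handle_missing` are hypothetical, and missing entries are assumed to be marked with `None`.

```python
from statistics import fmean

def handle_missing(rows, max_drop_fraction=0.05):
    """Drop incomplete rows when they are rare; otherwise fill gaps with column means.

    rows: list of equal-length lists; None marks a missing value.
    max_drop_fraction: hypothetical cutoff, not specified in the source.
    """
    incomplete = [r for r in rows if any(v is None for v in r)]
    if len(incomplete) / len(rows) <= max_drop_fraction:
        # Few gaps: deleting the affected rows loses little information.
        return [list(r) for r in rows if all(v is not None for v in r)]
    # Many gaps: replace each missing entry with the mean of its column.
    # (fmean raises StatisticsError if a column has no observed values.)
    col_means = []
    for j in range(len(rows[0])):
        observed = [r[j] for r in rows if r[j] is not None]
        col_means.append(fmean(observed))
    return [[col_means[j] if v is None else v for j, v in enumerate(r)]
            for r in rows]

# Illustrative values only, not data from the source.
wells = [[1.0, 10.0], [None, 20.0], [3.0, None], [5.0, 30.0]]
filled = handle_missing(wells)
```

Here half of the rows are incomplete, so the sketch fills the gaps with the column means rather than discarding the rows.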

[0068] The collected geological engineering parameter data were normalized.

[0069] Because the parameters differ in value range and units, their values need to be normalized. Normalization simplifies calculation by transforming dimensional expressions into dimensionless ones and mapping the data to the range of 0 to 1, using the following formula:

[0070] $$X_{norm} = \frac{X - X_{min}}{X_{max} - X_{min}}$$

[0071] Where Xnorm represents the data after normalization, X is the original data, Xmax represents the maximum value of the original dataset, and Xmin represents the minimum value of the original dataset.
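The normalization defined by Xnorm, X, Xmax, and Xmin maps each parameter column to the range of 0 to 1 by subtracting the column minimum and dividing by the column range. A minimal sketch for one column follows; the function name and the handling of a constant column are assumptions, since the source does not address that case.

```python
def min_max_normalize(values):
    """Map a list of numbers to [0, 1] via (x - min) / (max - min)."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # Constant column: the formula is undefined; returning zeros is an assumption.
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]

# Illustrative vertical depths in metres, not data from the source.
depths_m = [2500.0, 3000.0, 3500.0]
print(min_max_normalize(depths_m))  # [0.0, 0.5, 1.0]
```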

[0072] The Pearson algorithm was used to perform correlation analysis on the 15 geological engineering parameters and EUR, and redundant geological engineering parameters were removed.

[0073] Specifically, when two geological engineering parameters have the same correlation value with EUR, both are retained if their single-factor effects on EUR differ; if their single-factor effects on EUR are extremely similar, one of the two parameters is removed from the analysis.

[0074] As can be seen from Figure 2, the 15 geological engineering parameters affect EUR differently. Pressure coefficient and vertical depth both have a correlation value of 0.86 with EUR, and Type I reservoir drilling length and Type I reservoir drilling rate both have a correlation value of 0.92 with EUR. Because the single-factor effects of these parameters on EUR differ, they are all retained. Horizontal section length and fracturing section length both have a correlation value of 0.8 with EUR, and their single-factor effects on EUR are very similar, so the horizontal section length parameter is removed after analysis.
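The screening rule can be illustrated with a short sketch. The source does not say how "extremely similar single-factor effects" is measured, so this example assumes, purely for illustration, that two parameters count as redundant when their correlation values with EUR agree within `same_tol` and the two parameters are also strongly correlated with each other (`similar_r`). Both thresholds and all names are hypothetical.

```python
from math import sqrt

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)  # raises ZeroDivisionError for a constant column

def redundant_pairs(features, eur, same_tol=0.01, similar_r=0.95):
    """Return parameter pairs flagged as redundant under the assumed rule.

    features: dict mapping parameter name -> list of values per well.
    """
    r_eur = {name: pearson(col, eur) for name, col in features.items()}
    names = list(features)
    flagged = []
    for i, a in enumerate(names):
        for b in names[i + 1:]:
            same_corr = abs(r_eur[a] - r_eur[b]) <= same_tol
            similar = abs(pearson(features[a], features[b])) >= similar_r
            if same_corr and similar:
                flagged.append((a, b))
    return flagged

# Illustrative values only, not data from the source.
features = {
    "horizontal_length": [1.0, 2.0, 3.0, 4.0, 5.0],
    "fractured_length": [0.9, 1.9, 2.9, 3.9, 4.9],
    "porosity": [5.0, 3.0, 4.0, 1.0, 2.0],
}
eur = [1.1, 2.0, 2.9, 4.2, 5.0]
pairs = redundant_pairs(features, eur)
```

Which member of a flagged pair to drop remains an engineering decision; the embodiment drops horizontal section length.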

[0075] The main controlling factors of productivity in the Weiyuan block are quantified with a feature ranking method based on LightGBM (Light Gradient Boosting Machine, a distributed gradient boosting framework built on decision tree algorithms). The influence weights of multiple influencing factors are calculated with the LightGBM algorithm, and the weights are ranked to identify the main controlling factors affecting EUR; the influencing factors are the 14 geological engineering parameters remaining after elimination.

[0076] Specifically, according to the LightGBM algorithm, the standardized EUR and the geological engineering parameters remaining after elimination are used as input parameters to calculate the influence weights of multiple influencing factors; wherein, the influencing factors are the 14 geological engineering parameters remaining after elimination.

[0077] The calculated influence weights are sorted, and the remaining geological engineering parameters are classified into geological characteristic parameters and fracturing characteristic parameters. The sum of the weights of the geological characteristic parameters is compared with the sum of the weights of the fracturing characteristic parameters to determine the main controlling factors affecting EUR.

[0078] As shown in Figure 3, the weight of each geological engineering parameter on EUR is obtained. Summing the weights by category gives a total weight of 0.5 for the geological characteristic parameters and 0.5 for the fracturing characteristic parameters, so the two sums are equal. It can therefore be concluded that geological conditions and fracturing conditions jointly determine the EUR of shale gas wells in this block.

[0079] The K-means clustering algorithm was used to perform productivity potential determination analysis on the geological engineering parameters remaining after elimination.

[0080] The basic idea of K-means clustering is to minimize a clustering performance index. The criterion function used is the sum of squared distances from each sample point in a cluster to the center of that cluster, and this sum is minimized.
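Written out, the criterion described here is the standard K-means objective. The notation is introduced for illustration and does not appear in the source: k is the number of clusters, C_j is the set of samples assigned to cluster j, and μ_j is the center (mean) of that cluster.

```latex
J = \sum_{j=1}^{k} \sum_{x_i \in C_j} \lVert x_i - \mu_j \rVert^2,
\qquad
\mu_j = \frac{1}{|C_j|} \sum_{x_i \in C_j} x_i
```

The algorithm alternates between assigning each sample to its nearest center and recomputing each center as the mean of its assigned samples, and neither step increases J.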

[0081] Specifically, based on feature construction, the 14 geological engineering parameters remaining after elimination are compressed into two dimensions according to their feature attributes.

[0082] Feature construction refers to manually creating new features from the original data. This includes creating new features by mixing or combining attributes, or by decomposing or segmenting the original features. Let the sample set have N samples, M features, and K feature attributes (K ≤ M). The samples can then be represented as $X_{M \times N} = [x_1, x_2, \ldots, x_K]^T$, where $m_K$ denotes the number of features belonging to the K-th feature attribute and $M = m_1 + m_2 + \cdots + m_K$. Let the weight matrix of the sample features be $W = [w_1, w_2, \ldots, w_K]^T$, with $w_1 + w_2 + \cdots + w_K = 1$.

[0083] The new sample matrix $C_{N \times K} = [c_1, c_2, \ldots, c_K]$ can be expressed as:

[0084]

[0085] By constructing features, the 14 features are compressed into two dimensions according to their feature attributes, and three-dimensional data is established using EUR for cluster analysis.
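One possible reading of this compression step is sketched below. Because the construction formula itself is not reproduced in the text, the sketch assumes that each new coordinate is a weighted average of the normalized features in one attribute group (geological or fracturing), with the feature weights renormalized within the group. That assumption, the function name, and the sample values are all illustrative rather than taken from the source.

```python
import numpy as np

def compress_by_attribute(X, weights, groups):
    """Collapse feature columns into one column per attribute group.

    X: (n_wells, n_features) array of normalized features.
    weights: (n_features,) influence weights, e.g. from feature ranking.
    groups: list of column-index lists, one list per feature attribute.
    Returns an (n_wells, len(groups)) array; each column is a within-group
    weighted average (an assumed construction, see the lead-in).
    """
    X = np.asarray(X, dtype=float)
    w = np.asarray(weights, dtype=float)
    cols = []
    for idx in groups:
        w_g = w[idx] / w[idx].sum()   # renormalize weights inside the group
        cols.append(X[:, idx] @ w_g)  # weighted average per well
    return np.column_stack(cols)

# Two wells, three features: columns 0-1 "geological", column 2 "fracturing".
X = np.array([[0.0, 1.0, 0.5],
              [1.0, 0.0, 0.5]])
C = compress_by_attribute(X, [0.3, 0.1, 0.6], [[0, 1], [2]])

# Append normalized EUR as the third coordinate for cluster analysis.
eur_norm = np.array([0.2, 0.9])
points_3d = np.column_stack([C, eur_norm])
```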

[0086] As shown in Table 2, the model parameters were tuned.

[0087] Table 2 Model Parameter Tuning Table

[0088]

[0089]

[0090] Then, a sensitivity analysis was performed on the number of clusters to obtain the clustering results and analyze them to find the optimal number of clusters.

[0091] Specifically, two internal evaluation measures, the silhouette coefficient (SC) and the Calinski-Harabasz index (CH), were used to perform sensitivity analysis on the number of clusters. As shown in Figures 4 and 5, the optimal number of clusters k obtained from the analysis of the clustering results is 4: ① Good geological conditions, high EUR; ② Good geological conditions, low EUR; ③ Moderate geological conditions, low EUR; ④ Average geological conditions, low EUR.
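A sensitivity scan over the number of clusters can be sketched with scikit-learn, which provides both scores as `silhouette_score` and `calinski_harabasz_score`. The candidate range, random seed, and synthetic three-dimensional points below are assumptions for illustration; the tuned model parameters of Table 2 are not reproduced here.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.metrics import calinski_harabasz_score, silhouette_score

def scan_cluster_counts(points, k_values=range(2, 8), seed=0):
    """Return {k: (silhouette coefficient, Calinski-Harabasz index)} for each k."""
    results = {}
    for k in k_values:
        labels = KMeans(n_clusters=k, n_init=10,
                        random_state=seed).fit_predict(points)
        results[k] = (silhouette_score(points, labels),
                      calinski_harabasz_score(points, labels))
    return results

# Synthetic 3D points: (geological score, fracturing score, EUR), four tight groups.
rng = np.random.default_rng(0)
centers = np.array([[0.2, 0.2, 0.2], [0.8, 0.2, 0.3],
                    [0.2, 0.8, 0.3], [0.8, 0.8, 0.9]])
points = np.vstack([c + 0.03 * rng.standard_normal((50, 3)) for c in centers])

scores = scan_cluster_counts(points)
best_k = max(scores, key=lambda k: scores[k][0])  # highest silhouette coefficient
for k, (sc, ch) in scores.items():
    print(f"k={k}: SC={sc:.3f}, CH={ch:.1f}")
```

Higher values of both scores indicate more compact, better separated clusters, so the k at which they peak is a natural candidate for the optimal number of clusters.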

[0092] The geological potential of shale gas wells was evaluated using a model, and the range of variation of geological and fracturing characteristic parameters is shown in Table 3.

[0093] Table 3 Parameter Variation Range

[0094]

[0095] Example 2

[0096] This invention also provides a data-driven shale gas well fracturing effect analysis system. As shown in Figure 6, it includes an acquisition and processing unit, an elimination unit, an analysis unit, and a determination unit.

[0097] The acquisition and processing unit is used to acquire multiple geological engineering parameters through data collection, process the multiple geological engineering parameters, and use EUR as the data for determining natural gas production capacity; wherein, the geological engineering parameters include geological characteristic parameters and fracturing characteristic parameters.

[0098] Specifically, the acquisition and processing unit also processes missing data by mean substitution and normalizes the collected data.

[0099] The elimination unit is used to analyze the multiple geological engineering parameters and EUR using the Pearson algorithm and eliminate redundant geological engineering parameters.

[0100] The analysis unit is used to calculate the influence weights of multiple influencing factors using the LightGBM algorithm, and to sort and analyze the influence weights to determine the main controlling factors affecting EUR; wherein, the influencing factors are the geological engineering parameters remaining after elimination.

[0101] Specifically, the analysis unit uses the LightGBM algorithm, taking the standardized EUR and the geological engineering parameters remaining after elimination as input parameters, to calculate the influence weights of multiple influencing factors; wherein, the influencing factors are the geological engineering parameters remaining after elimination. The calculated influence weights are sorted, the remaining geological engineering parameters are classified into geological characteristic parameters and fracturing characteristic parameters, and the sum of the weights of the geological characteristic parameters is compared with the sum of the weights of the fracturing characteristic parameters to obtain the main controlling factors affecting EUR.

[0102] The determination unit is used to perform productivity potential determination on the geological engineering parameters remaining after elimination using the K-means clustering algorithm.

[0103] Specifically, the determination unit is used to compress the remaining geological engineering parameters into two dimensions according to their feature attributes through feature construction, and to establish three-dimensional data with EUR for cluster analysis. Feature construction refers to manually constructing new features from the original data. The model parameters are tuned, and sensitivity analysis is performed on the number of clusters; the clustering results are obtained and analyzed to find the optimal number of clusters. The model is used to determine the geological potential of shale gas wells and to give the range of variation of the geological characteristic parameters and fracturing characteristic parameters.

[0104] Regarding the system in the above embodiments, the specific manner in which each unit performs operations has been described in detail in the embodiments related to the method, and will not be elaborated here.

[0105] In the method described in this embodiment, the geological conditions of shale gas wells are analyzed and determined based on LightGBM feature ranking, feature construction, and K-means clustering, according to the characteristics of shale gas well extraction. By assessing the geological conditions of shale gas wells, fracturing and stimulation of the shale gas formation can be achieved, thus unlocking the formation's potential.

[0106] Finally, it should be noted that the above description is only a preferred embodiment of the present invention and is not intended to limit the present invention. Although the present invention has been described in detail with reference to the foregoing embodiments, those skilled in the art can still modify the technical solutions described in the foregoing embodiments or make equivalent substitutions for some of the technical features. Any modifications, equivalent substitutions, improvements, etc., made within the spirit and principles of the present invention should be included within the protection scope of the present invention.

Claims

1. A data-driven method for analyzing the fracturing effect of shale gas wells, characterized in that it comprises:
acquiring multiple geological engineering parameters through data collection, processing these parameters, and using EUR as the data for determining natural gas production capacity; wherein the geological engineering parameters include geological characteristic parameters and fracturing characteristic parameters;
analyzing the multiple geological engineering parameters and EUR using the Pearson algorithm, and eliminating redundant geological engineering parameters, including: when two geological engineering parameters have the same correlation value with EUR, retaining both parameters if their single-factor influences on EUR are different, and eliminating one of them if their single-factor influences on EUR are extremely close;
calculating the influence weights of multiple influencing factors using the LightGBM algorithm, sorting the influence weights, and analyzing the main controlling factors affecting EUR, including: using the standardized EUR and the geological engineering parameters remaining after elimination as input parameters of the LightGBM algorithm to calculate the influence weights of the multiple influencing factors, wherein the influencing factors are the geological engineering parameters remaining after elimination; sorting the calculated influence weights, classifying the remaining geological engineering parameters into geological characteristic parameters and fracturing characteristic parameters, and comparing the sum of the weights of the geological characteristic parameters with the sum of the weights of the fracturing characteristic parameters to determine the main controlling factors affecting EUR;
using the K-means clustering algorithm to perform productivity potential determination analysis on the geological engineering parameters remaining after elimination, including: compressing, through feature construction, the remaining geological engineering parameters into two dimensions according to their feature attributes, and establishing three-dimensional data with EUR for cluster analysis, wherein feature construction refers to manually constructing new features from the original data; tuning the model parameters, performing sensitivity analysis on the number of clusters using the silhouette coefficient and the Calinski-Harabasz index, and obtaining and analyzing the clustering results to determine the optimal number of clusters; and using the model to determine the geological potential of shale gas wells and to give the range of variation of the geological characteristic parameters and fracturing characteristic parameters;
wherein processing the multiple geological engineering parameters includes handling missing data by mean substitution.

2. The data-driven shale gas well fracturing effect analysis method according to claim 1, characterized in that, The missing data in multiple geological engineering parameters were processed using a mean substitution method. Specifically, If the number of missing values ​​in the data is very small compared to the overall data, the missing values ​​can be deleted directly.

3. The data-driven shale gas well fracturing effect analysis method according to claim 1, characterized in that acquiring the multiple geological engineering parameters through data collection and processing the parameter data specifically comprises: normalizing the collected data.

4. The data-driven shale gas well fracturing effect analysis method according to claim 3, characterized in that normalizing the collected data specifically comprises: transforming the dimensional expression into a dimensionless expression through calculation, and mapping the data to the range of 0 to 1 for processing.

5. The data-driven shale gas well fracturing effect analysis method according to claim 4, characterized in that transforming the dimensional expression into a dimensionless expression through calculation, and mapping the data to the range of 0 to 1 for processing, is specifically performed as:
$$X_{norm} = \frac{X - X_{min}}{X_{max} - X_{min}} \qquad (1)$$
where $X_{norm}$ is the data after normalization, $X$ is the original data, $X_{max}$ is the maximum value of the original dataset, and $X_{min}$ is the minimum value of the original dataset.
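The mapping to the range 0 to 1 described in claims 4 and 5 corresponds to the usual min-max normalization, which can be sketched as below. The function name and the sample porosity values are hypothetical, and returning zeros for a constant column (where the maximum equals the minimum) is an assumption, since the claims do not address that case.

```python
def min_max_normalize(values):
    """Map values to [0, 1] using (X - Xmin) / (Xmax - Xmin)."""
    x_min, x_max = min(values), max(values)
    if x_max == x_min:
        # Constant column: the formula is undefined; zeros are an assumption.
        return [0.0 for _ in values]
    return [(x - x_min) / (x_max - x_min) for x in values]


# Hypothetical porosity values (percent) for four wells.
porosity = [4.0, 6.0, 5.0, 8.0]
normalized = min_max_normalize(porosity)
print(normalized)  # [0.0, 0.5, 0.25, 1.0]
```

After this step every parameter is dimensionless and on the same scale, so parameters with large physical units do not dominate the later analysis.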

6. A data-driven shale gas well fracturing effect analysis system, characterized in that it comprises:
an acquisition and processing unit, used to acquire multiple geological engineering parameters through data collection, process the multiple geological engineering parameters, and use EUR as the data for determining natural gas production capacity, wherein the geological engineering parameters include geological characteristic parameters and fracturing characteristic parameters;
an elimination unit, used to analyze the multiple geological engineering parameters and EUR using the Pearson algorithm and eliminate redundant geological engineering parameters;
an analysis unit, used to calculate the influence weights of multiple influencing factors using the LightGBM algorithm, and to sort and analyze the influence weights to determine the main controlling factors affecting EUR, wherein the influencing factors are the geological engineering parameters remaining after elimination;
a determination unit, used to perform productivity potential determination analysis on the geological engineering parameters remaining after elimination using the K-means clustering algorithm;
wherein processing the multiple geological engineering parameters includes using mean substitution to process missing data;
wherein analyzing the multiple geological engineering parameters and EUR using the Pearson algorithm and eliminating redundant geological engineering parameters includes: when the data of two geological engineering parameters are the same, retaining both parameters if their single-factor influences on EUR differ, and eliminating one of the two parameters if their single-factor influences on EUR are extremely close;
wherein calculating the influence weights of the multiple influencing factors using the LightGBM algorithm, sorting the influence weights, and analyzing the main controlling factors affecting EUR includes: using the standardized EUR and the geological engineering parameters remaining after elimination as input parameters to the LightGBM algorithm to calculate the influence weights of the multiple influencing factors; sorting the calculated influence weights; classifying the remaining geological engineering parameters into geological characteristic parameters and fracturing characteristic parameters; and comparing the sum of the geological characteristic parameters with the sum of the fracturing characteristic parameters to determine the main controlling factors affecting EUR;
wherein using the K-means clustering algorithm to perform productivity potential determination analysis on the geological engineering parameters remaining after elimination includes: compressing the remaining geological engineering parameters into a two-dimensional dataset based on their characteristic attributes, and then combining it with EUR to create a three-dimensional dataset for cluster analysis, wherein feature construction refers to manually constructing new features from the original data; optimizing the model parameters, using the silhouette coefficient and the Calinski-Harabasz index to perform sensitivity analysis on the number of clusters, and obtaining and analyzing the clustering results to determine the optimal number of clusters; and using the model to determine the geological potential of shale gas wells and to provide the range of variation of the geological characteristic parameters and fracturing characteristic parameters.
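The sorting and group comparison performed by the analysis unit might be sketched as follows. The sketch assumes the influence weights have already been computed (for example, as per-feature importance values of the kind a fitted LightGBM model can report) and that the sums being compared are the sums of those weights within each group; the claims do not spell this out. All parameter names and weight values are hypothetical and are not results from the patent.

```python
def main_controlling_group(weights, geological, fracturing):
    """Rank influence weights and compare the two parameter groups.

    weights:    dict mapping parameter name -> influence weight.
    geological: set of names classed as geological characteristic parameters.
    fracturing: set of names classed as fracturing characteristic parameters.
    Returns the ranked list, both group sums, and the dominant group.
    """
    ranked = sorted(weights.items(), key=lambda item: item[1], reverse=True)
    geo_sum = sum(w for name, w in weights.items() if name in geological)
    frac_sum = sum(w for name, w in weights.items() if name in fracturing)
    # Ties are assigned to "fracturing" here; that choice is an assumption.
    dominant = "geological" if geo_sum > frac_sum else "fracturing"
    return ranked, geo_sum, frac_sum, dominant


# Hypothetical influence weights for four parameters.
weights = {"toc": 30, "porosity": 15, "fluid_volume": 35, "proppant_mass": 20}
geological = {"toc", "porosity"}
fracturing = {"fluid_volume", "proppant_mass"}

ranked, geo_sum, frac_sum, dominant = main_controlling_group(
    weights, geological, fracturing
)
print(ranked[0], geo_sum, frac_sum, dominant)
```

With these illustrative numbers the fracturing group has the larger total weight, so fracturing parameters would be read as the main controlling factors for EUR.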