A power load prediction method and system based on correlation of external influencing factors

By combining an optimized k-means clustering model with a correlation analysis method that integrates standard mutual information and convolutional neural network feature extraction, the problem of insufficient nonlinear correlation analysis between historical power load data and external influencing factors is solved, thereby improving the accuracy and stability of short-term power load forecasting.

CN115526420B · Active · Publication Date: 2026-06-19 · NINGBO ELECTRIC POWER DESIGN INST +2

Patent Information

Authority / Receiving Office
CN · China
Patent Type
Patents(China)
Current Assignee / Owner
NINGBO ELECTRIC POWER DESIGN INST
Filing Date
2022-10-19
Publication Date
2026-06-19

AI Technical Summary

Technical Problem

Existing technologies cannot accurately analyze the nonlinear correlation between historical power load data and external influencing factors, leading to inaccurate short-term power load forecasts within the industry.

Method used

An optimized k-means clustering model and a standard mutual information-based correlation analysis method are used, combined with convolutional neural networks for feature extraction, and then input into a support vector regression model for prediction.

Benefits of technology

It improves the accuracy and stability of short-term power load forecasting in the industry and accurately analyzes the nonlinear correlation between historical power load data and external influencing factors.

Generated by Eureka AI based on patent content.

Smart Images

  • Figure CN115526420B_ABST
Patent Text Reader

Abstract

This application discloses a power load forecasting method based on the correlation of external influencing factors. The method includes: acquiring raw data; quantifying the nonlinear correlation between the historical power load data and the external influencing factor data contained in the raw data, using a correlation analysis method that combines standard mutual information with a k-means clustering model optimized by the average silhouette coefficient, to obtain correlation analysis results; and inputting the correlation analysis results, the feature vectors of the historical power load data extracted using a convolutional neural network, and the feature vectors of the external influencing factor data into a pre-trained industry power load forecasting model to obtain the industry power load forecast value at the current moment. This application accurately quantifies the nonlinear correlation between historical power load data and external influencing factor data, and applies the correlation analysis results to power load forecasting, effectively improving the accuracy and stability of short-term power load forecasting within the industry.

Description

Technical Field

[0001] This application relates to the field of power load forecasting, and in particular to a power load forecasting method and system based on the correlation of external influencing factors.

Background Technology

[0002] Short-term power load forecasting for an industry is a crucial basis for dispatch planning and is of great significance for the safe and economical operation of the power system. Numerous studies have been conducted on short-term power load forecasting within an industry. Examples include: decomposing power load into low-frequency and high-frequency components using ensemble empirical mode decomposition, and then forecasting the low-frequency and high-frequency components with linear regression and neural networks respectively; forecasting load separately with Long Short-Term Memory (LSTM) networks and Extreme Gradient Boosting (XGBoost) models, and then combining the two predictions by the inverse error method to obtain the final forecast result; clustering user load data to obtain user groups with different load characteristics, constructing a load forecasting model for each user group, and finally integrating the predicted loads of all user groups into a global forecast result; and using convolutional neural networks and recurrent neural networks to learn and extract features from historical load sequences, with an attention mechanism introduced to improve the accuracy of short-term load forecasting.

[0003] However, the aforementioned studies use only historical power load data for short-term power load forecasting within the industry. Power load is not only inherently correlated with historical power load data but is also significantly influenced by external factors such as weather and day type, so these external factors must be considered to improve forecasting accuracy. With the accumulation of diverse data such as historical power load data and meteorological data across various industries, analyzing load characteristics and external influencing factors with big data and machine learning technologies helps to forecast loads of different natures in a more refined way, further improving the accuracy of short-term power load forecasting. Some studies have therefore considered the correlation between external influencing factor data and historical power load data in short-term load forecasting. For example, the Pearson correlation coefficient has been used to analyze the correlation between power load and external influencing factors, providing a basis for selecting similar load days; however, the Pearson correlation coefficient applies only to linear correlation and cannot capture the nonlinear correlation between power load and external influencing factors. Copula functions have been used to measure this nonlinear correlation, but the specific form of the Copula function must be determined manually, so the accuracy of the correlation analysis is easily affected by subjective factors. Machine learning methods have been used to extract the nonlinear correlation automatically, but they have poor interpretability and, when the amount of data is not sufficiently large, are easily affected by random fluctuations of variables, which leads to overfitting.

[0004] Existing technologies cannot accurately analyze the correlation between historical power load data and external influencing factor data, thus making it impossible to accurately predict short-term power load within the industry.

Summary of the Invention

[0005] In view of this, this application provides a power load forecasting method and system based on the correlation of external influencing factors, to solve the problem in the prior art that the correlation between historical power load data and external influencing factor data cannot be accurately analyzed, resulting in the inability to accurately predict short-term power load within the industry. The specific solution is as follows:

[0006] Firstly, this application provides a power load forecasting method based on the correlation of external influencing factors, including:

[0007] Obtain raw data, which includes historical power load data within a historical time period and data on external factors affecting power load forecasting;

[0008] Based on a correlation analysis method that combines an optimized k-means clustering model with standard mutual information, the nonlinear correlation between the historical power load data and the external influencing factor data is quantitatively analyzed to obtain the correlation analysis results; the optimized k-means clustering model is a k-means clustering model whose initial k value is optimized using the average silhouette coefficient.

[0009] A convolutional neural network is used to extract features from the historical power load data and the external influencing factor data to obtain feature vectors for the historical power load data and external influencing factor data.

[0010] The historical power load data feature vector, the external influencing factor data feature vector, and the correlation analysis results are input into the pre-trained industry power load prediction model to obtain the industry power load prediction value at the current moment.

[0011] In one possible implementation, the correlation analysis method based on an optimized k-means clustering model and standard mutual information is used to quantify the nonlinear correlation between the historical electricity load data and the external influencing factor data, obtaining correlation analysis results, including:

[0012] Based on the optimized k-means clustering model, the historical power load data and the external influencing factor data are clustered respectively to obtain the cluster center values corresponding to multiple clusters of the historical power load data and the cluster center values corresponding to multiple clusters of the external influencing factor data.

[0013] The sampled values of multiple clusters of the historical power load data are replaced with the cluster center values corresponding to the multiple clusters of the historical power load data to obtain the replaced historical power load data;

[0014] The sampled values of multiple clusters of the external influencing factor data are replaced with the cluster center values corresponding to the multiple clusters of the external influencing factor data to obtain the replaced external influencing factor data;

[0015] The standard mutual information is calculated using the replaced historical power load data and the replaced external influencing factor data to obtain the correlation analysis results, which include the standard mutual information value calculated through the standard mutual information.
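As a minimal sketch of the calculation in [0015], the snippet below estimates mutual information by counting how often each pair of cluster center values occurs together. The normalization 2·I(X;Y)/(H(X)+H(Y)) is an assumption, since this section does not give the formula for standard mutual information; the function names and the sample series are hypothetical.

```python
import math
from collections import Counter

def entropy(symbols):
    """Shannon entropy (in nats) of a sequence of discrete symbols."""
    n = len(symbols)
    return -sum((c / n) * math.log(c / n) for c in Counter(symbols).values())

def mutual_information(xs, ys):
    """I(X;Y) estimated from paired discrete samples by frequency counting."""
    n = len(xs)
    px, py, pxy = Counter(xs), Counter(ys), Counter(zip(xs, ys))
    return sum(
        (c / n) * math.log((c / n) / ((px[x] / n) * (py[y] / n)))
        for (x, y), c in pxy.items()
    )

def standard_mutual_information(xs, ys):
    """Assumed normalization: 2 * I(X;Y) / (H(X) + H(Y))."""
    hx, hy = entropy(xs), entropy(ys)
    if hx + hy == 0.0:
        return 0.0  # both series are constant, so nothing can be shared
    return 2.0 * mutual_information(xs, ys) / (hx + hy)

# Hypothetical replaced series: every sample is already a cluster center value.
load = [10.0, 10.0, 20.0, 20.0, 30.0, 30.0]
temp = [5.0, 5.0, 15.0, 15.0, 25.0, 25.0]  # moves in lockstep with load
flat = [1.0, 2.0, 1.0, 2.0, 1.0, 2.0]      # statistically independent of load

print(standard_mutual_information(load, temp))  # close to 1 (fully dependent)
print(standard_mutual_information(load, flat))  # close to 0 (independent)
```

Because the replaced series take only a handful of distinct values, the probabilities can be estimated by simple counting, which is what makes the clustering step useful before this calculation.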

[0016] In one possible implementation, the optimized k-means clustering model performs clustering processing on the historical power load data and the external influencing factor data respectively, obtaining the cluster center values corresponding to multiple clusters of the historical power load data and the cluster center values corresponding to multiple clusters of the external influencing factor data, including:

[0017] Input the sampled values of the historical power load data and of the external influencing factor data within the historical time period;

[0018] Perform one-dimensional clustering on the sampled values to generate k clusters corresponding to the current k value in the k-means clustering model;

[0019] Calculate the silhouette coefficient of the sampled values;

[0020] Calculate the average silhouette coefficient corresponding to the current k value based on the silhouette coefficient;

[0021] Increment the current value of k by 1 to obtain the updated value of k;

[0022] If the updated k value is greater than the upper limit k_max, the k value corresponding to the maximum among all average silhouette coefficients is taken as the optimal number of clusters k0, and the clustering result obtained with k0 clusters is taken as the final clustering result, so as to obtain the cluster center values corresponding to multiple clusters of the historical power load data and the cluster center values corresponding to multiple clusters of the external influencing factor data.

[0023] If the updated k value is not greater than the upper limit k_max, return to and re-execute the step of performing one-dimensional clustering on the sampled values.

[0024] In one possible implementation, the use of a convolutional neural network to extract features from the historical power load data and the external influencing factor data to obtain feature vectors for the historical power load data and the external influencing factor data includes:

[0025] The historical power load data is normalized to obtain normalized historical power load data.

[0026] The t1-dimensional input vector of the convolutional neural network is obtained from the normalized historical power load data, wherein the t1-dimensional input vector is the normalized historical power load data corresponding to the previous t1 time steps.

[0027] Multiple convolutional windows are used to perform convolution operations and max pooling on the t1-dimensional input vector to obtain multiple feature values;

[0028] The multiple feature values are concatenated to obtain the historical power load data feature vector.
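The steps in [0025] to [0028] can be sketched as follows. This is an illustration only: min-max scaling is assumed as the normalization, the kernel weights are hypothetical fixed values (in practice they would be learned), and no activation function is applied because none is mentioned here.

```python
def min_max_normalize(series):
    """Scale a series to [0, 1]; min-max scaling is an assumed choice."""
    lo, hi = min(series), max(series)
    if hi == lo:
        return [0.0 for _ in series]
    return [(v - lo) / (hi - lo) for v in series]

def conv1d_valid(x, kernel):
    """Slide one convolution window over x without padding."""
    w = len(kernel)
    return [sum(x[i + j] * kernel[j] for j in range(w))
            for i in range(len(x) - w + 1)]

def extract_features(x, kernels):
    """One feature value per window: convolve, then max-pool over time.
    The per-window values are concatenated into the feature vector."""
    return [max(conv1d_valid(x, kernel)) for kernel in kernels]

# Hypothetical load history for the previous t1 = 6 time steps.
history = [100.0, 120.0, 150.0, 130.0, 110.0, 90.0]
kernels = [[1.0, -1.0], [0.5, 0.5, 0.5]]  # hypothetical, untrained weights

x = min_max_normalize(history)
features = extract_features(x, kernels)
print(features)  # a 2-dimensional feature vector, one entry per window
```

Using windows of different widths lets each feature respond to load patterns at a different time scale, while max pooling keeps the output length independent of t1.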

[0029] In one possible implementation, the step of using a convolutional neural network to extract features from the historical power load data and the external influencing factor data to obtain feature vectors for the historical power load data and the external influencing factor data further includes:

[0030] The external influencing factor data is normalized to obtain normalized external influencing factor data;

[0031] The t2-dimensional input vector of the convolutional neural network is obtained from the normalized external influencing factor data, wherein the t2-dimensional input vector is the normalized external influencing factor data corresponding to the previous t2 time steps.

[0032] The external influencing factor data of the next t3 time steps after the current time are used as the t3-dimensional input vector of the convolutional neural network;

[0033] The t2-dimensional input vector and the t3-dimensional input vector are combined to form a (t2+t3)-dimensional input vector;

[0034] Multiple convolution windows are used to perform convolution operations on the (t2+t3) dimensional input vector to generate the feature vector of the external influencing factors.
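A short sketch of how the (t2+t3)-dimensional input in [0031] to [0033] might be assembled. It assumes the future values (for example a weather forecast) are already normalized on the same scale as the history; the function name and data are hypothetical.

```python
def build_external_input(history, forecast, t2, t3):
    """Concatenate the last t2 observed values with the next t3 future values."""
    if len(history) < t2 or len(forecast) < t3:
        raise ValueError("not enough samples for the requested window")
    past = history[len(history) - t2:]  # previous t2 time steps
    future = forecast[:t3]              # next t3 time steps
    return past + future

# Hypothetical normalized temperature: four observed values, three forecast values.
observed = [0.1, 0.2, 0.3, 0.4]
forecast = [0.5, 0.6, 0.7]

vector = build_external_input(observed, forecast, t2=3, t3=2)
print(vector)  # [0.2, 0.3, 0.4, 0.5, 0.6]
```

Including the next t3 time steps is possible for external factors because quantities such as weather or holiday flags are typically known or forecast ahead of time, unlike the load itself.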

[0035] In one possible implementation, the pre-trained industry power load forecasting model includes a support vector regression model. The step of inputting the historical power load data feature vector, the external influencing factor feature vector, and the correlation analysis results into the pre-trained industry power load forecasting model to obtain the current industry power load forecast value includes:

[0036] The standard mutual information value is normalized to obtain a normalized standard mutual information value;

[0037] Using the normalized standard mutual information values as the weights of the feature vectors of the external influencing factor data, the feature vectors of the external influencing factor data are weighted and summed to obtain the fused vector;

[0038] The historical power load data feature vector is concatenated with the fused vector to obtain the concatenated vector;

[0039] The support vector regression model is used to make a prediction from the concatenated vector to obtain the industry power load forecast value at the current moment.
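The fusion in [0036] to [0038] can be illustrated with the sketch below. Two points are assumptions: the standard mutual information values are normalized so that they sum to one, and every external factor yields a feature vector of the same length (which a weighted sum requires). All names and numbers are hypothetical.

```python
def normalize_weights(smi_values):
    """Assumed normalization: divide by the total so the weights sum to 1."""
    total = sum(smi_values)
    if total == 0:
        return [1.0 / len(smi_values)] * len(smi_values)
    return [v / total for v in smi_values]

def fuse(factor_features, weights):
    """Element-wise weighted sum of the per-factor feature vectors."""
    dim = len(factor_features[0])
    return [sum(w * f[d] for w, f in zip(weights, factor_features))
            for d in range(dim)]

def build_regressor_input(load_features, factor_features, smi_values):
    """Concatenate the load features with the fused external-factor vector."""
    weights = normalize_weights(smi_values)
    return load_features + fuse(factor_features, weights)

# Hypothetical inputs: two external factors (e.g. temperature, humidity).
load_features = [0.3, 1.1]
factor_features = [[1.0, 0.0], [0.0, 1.0]]
smi_values = [0.6, 0.2]  # temperature is more strongly related to load

spliced = build_regressor_input(load_features, factor_features, smi_values)
print(spliced)  # load features followed by the fused vector
```

The resulting vector is what the support vector regression model of [0039] would consume; a library regressor such as scikit-learn's `sklearn.svm.SVR` is one possible choice, though the source does not name an implementation.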

[0040] Secondly, this application provides a power load forecasting system based on the correlation of external influencing factors, including:

[0041] The acquisition module is used to acquire raw data, which includes historical power load data within a historical time period and data on external factors affecting power load forecasting.

[0042] The quantitative analysis module is used to perform quantitative analysis on the nonlinear correlation between the historical power load data and the external influencing factor data, based on a correlation analysis method that combines an optimized k-means clustering model with standard mutual information, and to obtain the correlation analysis results; the optimized k-means clustering model is a k-means clustering model whose initial k value is optimized using the average silhouette coefficient;

[0043] The feature extraction module is used to extract features from the historical power load data and the external influencing factor data using a convolutional neural network, so as to obtain the feature vector of the historical power load data and the feature vector of the external influencing factor data.

[0044] The prediction module is used to input the historical power load data feature vector, the external influencing factor data feature vector, and the correlation analysis results into a pre-trained industry power load prediction model to obtain the industry power load prediction value at the current moment.

[0045] In one possible implementation, the quantitative analysis module includes:

[0046] The clustering submodule is used to perform clustering processing on the historical power load data and the external influencing factor data based on the optimized k-means clustering model, so as to obtain the cluster center values corresponding to multiple clusters of the historical power load data and the cluster center values corresponding to multiple clusters of the external influencing factor data.

[0047] The first replacement submodule is used to replace the sampled values of multiple clusters of the historical power load data with the cluster center values corresponding to the multiple clusters of the historical power load data respectively, so as to obtain the replaced historical power load data.

[0048] The second replacement submodule is used to replace the sampled values of multiple clusters of the external influencing factor data with the cluster center values corresponding to the multiple clusters of the external influencing factor data respectively, so as to obtain the replaced external influencing factor data.

[0049] The calculation submodule is used to calculate the standard mutual information using the replaced historical power load data and the replaced external influencing factor data to obtain the correlation analysis results, which include the standard mutual information value calculated through the standard mutual information.

[0050] In one possible implementation, the clustering submodule includes:

[0051] The input unit is used to input the sampled values of the historical power load data and the external influencing factor data within the historical time period;

[0052] The clustering unit is used to perform one-dimensional clustering on the sampled values to generate k clusters corresponding to the current k value in the k-means clustering model;

[0053] The first calculation unit is used to calculate the silhouette coefficient of the sampled values;

[0054] The second calculation unit is used to calculate the average silhouette coefficient corresponding to the current k value based on the silhouette coefficient.

[0055] The incrementing unit is used to increment the value of k by 1 to obtain the updated value of k.

[0056] The first judgment unit is used to, if the updated k value is greater than the upper limit k_max, take the k value corresponding to the maximum among all average silhouette coefficients as the optimal number of clusters k0, and take the clustering result obtained with k0 clusters as the final clustering result, so as to obtain the cluster center values corresponding to multiple clusters of the historical power load data and the cluster center values corresponding to multiple clusters of the external influencing factor data.

[0057] The second judgment unit is used to, if the updated k value is not greater than the upper limit k_max, return to and re-execute the step of performing one-dimensional clustering on the sampled values.

[0058] In one possible implementation, the pre-trained industry power load forecasting model includes a support vector regression model, and the forecasting module includes:

[0059] The standard mutual information value normalization submodule is used to normalize the standard mutual information value to obtain a normalized standard mutual information value.

[0060] The weighted summation submodule is used to perform weighted summation on the feature vectors of the external influencing factor data, using the normalized standard mutual information values as the weights, to obtain the fused vector;

[0061] The feature vector concatenation submodule is used to concatenate the feature vector of the historical power load data with the fused vector to obtain the concatenated vector;

[0062] The power load forecasting submodule is used to apply the support vector regression model to the concatenated vector and obtain the industry power load forecast value at the current moment.

[0063] Compared with the prior art, the present invention has the following advantages:

[0064] This application discloses a power load forecasting method and system based on the correlation of external influencing factors. The method includes: acquiring historical power load data and external influencing factor data within a historical time period; quantifying the nonlinear correlation between the historical power load data and the external influencing factor data, using a correlation analysis method that combines standard mutual information with a k-means clustering model optimized by the average silhouette coefficient, to obtain the correlation analysis results; using a convolutional neural network to extract feature vectors for the historical power load data and the external influencing factor data; and inputting the historical power load data feature vectors, the external influencing factor data feature vectors, and the correlation analysis results into a pre-trained industry power load forecasting model to obtain the industry power load forecast value at the current moment. With this correlation analysis method, this application accurately quantifies the nonlinear correlation between historical power load data and external influencing factor data. The correlation analysis results are applied to a short-term industry power load forecasting model based on a convolutional neural network, effectively improving the accuracy and stability of short-term power load forecasting within the industry.

Attached Figure Description

[0065] To more clearly illustrate the technical solutions in the embodiments of this application or the prior art, the drawings used in the description of the embodiments or the prior art will be briefly introduced below. Obviously, the drawings described below are only some embodiments of the present invention. For those skilled in the art, other drawings can be obtained based on these drawings without creative effort.

[0066] Figure 1 is a flowchart of a power load forecasting method based on the correlation of external influencing factors disclosed in an embodiment of this application;

[0067] Figure 2 is a schematic diagram of the industry power load forecasting model disclosed in the embodiments of this application;

[0068] Figure 3 is a block diagram of a power load forecasting system based on the correlation of external influencing factors disclosed in an embodiment of this application.

Detailed Implementation

[0069] The technical solutions of the embodiments of this application will be clearly and completely described below with reference to the accompanying drawings. Obviously, the described embodiments are only some embodiments of the present invention, and not all embodiments. Based on the embodiments of this application, all other embodiments obtained by those of ordinary skill in the art without creative effort are within the scope of protection of this application.

[0070] The above description of the disclosed embodiments enables those skilled in the art to make or use this application. Various modifications to these embodiments will be readily apparent to those skilled in the art, and the general principles defined herein may be implemented in other embodiments without departing from the spirit or scope of this application. Therefore, this application is not to be limited to the embodiments shown herein, but is to be accorded the widest scope consistent with the principles and novel features disclosed herein.

[0071] The inventors discovered that short-term power load forecasting within the industry is not only intrinsically correlated with historical power load data, but is also affected by external influencing factors such as weather and day type. When considering both historical power load data and external influencing factor data, existing technologies cannot accurately analyze the nonlinear correlation between historical power load data and external influencing factor data, which may lead to low accuracy in short-term power load forecasting within the industry.

[0072] This application employs an optimized k-means clustering model combined with a standard mutual information correlation analysis method to quantify the correlation between power load and various external influencing factors. Then, a convolutional neural network is used to extract feature vectors from the original data of historical power load and external influencing factors. Considering the correlation between historical power load and external influencing factors, the feature vectors are fused and predicted to obtain short-term power load forecasting results for the industry. Based on an accurate analysis of the nonlinear correlation between historical power load data and external influencing factor data within the industry, the correlation analysis results are applied to the power load forecasting process to improve the accuracy of short-term power load forecasting within the industry.

[0073] To facilitate understanding of the technical solutions provided in the embodiments of this application, the following description, in conjunction with the accompanying drawings, illustrates a power load forecasting method based on the correlation of external influencing factors provided in the embodiments of this application.

[0074] See Figure 1, which is a flowchart of a power load forecasting method based on the correlation of external influencing factors provided in an embodiment of this application. The method applies to short-term analysis and forecasting of power load in various industries, and includes the following steps S1 to S4:

[0075] S1: Obtain raw data, which includes historical power load data within a historical time period and data on external factors affecting power load forecasting.

[0076] In this embodiment of the application, the raw data can be obtained from the database. The raw data may include historical power load data within a historical time period and data on external influencing factors that affect power load forecasting. The data types of the external influencing factors include temperature, humidity, wind speed, precipitation, whether it is a holiday, etc. The data types of external influencing factors used for short-term power load forecasting in the industry can be selected according to the actual situation, and this application does not limit them.

[0077] S2: Based on the correlation analysis method combining the optimized k-means clustering model and standard mutual information, the nonlinear correlation between the historical power load data and the external influencing factor data is quantitatively analyzed to obtain the correlation analysis results; the optimized k-means clustering model is a k-means clustering model whose initial k value is optimized using the average silhouette coefficient.

[0078] In this embodiment, the k-means clustering model is optimized using the average silhouette coefficient. Based on the optimized k-means clustering model and a correlation analysis method combining standard mutual information, the nonlinear correlation between historical power load data and external influencing factor data is analyzed, so that the obtained correlation analysis results effectively reflect the correlation between industry power load and external influencing factors. The specific implementation steps for obtaining the correlation analysis results in this embodiment are shown in S21 to S23:

[0079] S21: Based on the optimized k-means clustering model, the historical power load data and the external influencing factor data are clustered to obtain the cluster center values corresponding to multiple clusters of the historical power load data and the cluster center values corresponding to multiple clusters of the external influencing factor data.

[0080] The k-means clustering model is used for unsupervised clustering problems. However, it requires the number of clusters k to be specified manually, which makes it weakly adaptive and strongly subjective, and the clustering effect is easily affected by the amount and distribution of the data. In this application, the average silhouette coefficient is used to determine the optimal value of k and thereby optimize the k-means clustering model. The optimized k-means clustering model avoids dependence on human experience in the clustering process and improves the adaptability of the clustering model. The specific process by which the optimized k-means clustering model clusters historical power load data and external influencing factor data is shown in steps S211 to S217:

[0081] S211: Input the sampled values z(c) of the data to be clustered within the historical time period, where c ∈ {1, 2, …, C}.

[0082] The sampled values z(c) of the data to be clustered include historical power load data y(c) and external influencing factor data x(c). In practical applications, clustering is performed separately on the historical power load data y(c) and the external influencing factor data x(c), and the process is the same for both. The following calculation therefore uses the generic sampled values z(c) as an example:

[0083] S212: Perform one-dimensional clustering on the sampled values to generate k clusters corresponding to the current k value in the k-means clustering model.

[0084] Optionally, the absolute value of the difference between the sampled value and the cluster center value can be used as the criterion for dividing the clusters. In the initial clustering, k is set to 2 for subsequent calculations.
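Step S212 can be sketched as a plain one-dimensional k-means pass that assigns each sampled value to the cluster whose center is closest in absolute difference. The deterministic initialization and the iteration cap are assumptions made for the sake of a reproducible example; the source does not specify them.

```python
def kmeans_1d(values, k, max_iter=100):
    """Lloyd-style k-means on scalars, using |value - center| as the distance."""
    ordered = sorted(values)
    # Assumed initialization: spread the initial centers over the sorted samples.
    centers = [ordered[(2 * i + 1) * len(ordered) // (2 * k)] for i in range(k)]
    labels = []
    for _ in range(max_iter):
        # Assignment step: nearest center by absolute difference.
        labels = [min(range(k), key=lambda j: abs(v - centers[j])) for v in values]
        # Update step: each center becomes the mean of its members.
        new_centers = []
        for j in range(k):
            members = [v for v, lab in zip(values, labels) if lab == j]
            new_centers.append(sum(members) / len(members) if members else centers[j])
        if new_centers == centers:
            break  # converged
        centers = new_centers
    return labels, centers

# Hypothetical sampled values with two obvious groups; k starts at 2 as in S212.
samples = [1.0, 1.2, 0.8, 5.0, 5.2, 4.8]
labels, centers = kmeans_1d(samples, k=2)
print(labels)   # [0, 0, 0, 1, 1, 1]
print(centers)  # approximately [1.0, 5.0]
```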

[0085] S213: After clustering is completed, calculate the silhouette coefficient l(c) of the sampled value z(c).

[0086] If the sampled value z(c) is an isolated point, that is, there is only one sampled value z(c) in the cluster where z(c) is located, then its silhouette coefficient l(c) is 0; otherwise, the silhouette coefficient l(c) of the sampled value z(c) is calculated according to the following formula.

[0087]

[0088]

[0089]

[0090]

[0091] In the formula, l1(c) is the intra-cluster cohesion; l2(p,c) is the inter-cluster separation between the cluster containing z(c) and the p-th cluster; l3(c) is the minimum inter-cluster separation; z(j) is another sampled value within the cluster containing z(c); n1(c) is the number of sampled values within the cluster containing z(c); z(q) is a sampled value within the p-th cluster, which is a cluster other than the one containing z(c); and n2(p,c) is the number of sampled values within that p-th cluster.
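The formulas for paragraphs [0087] to [0090] are not reproduced in this text. As a sketch only, assuming the conventional silhouette definition and the absolute difference as the one-dimensional distance, the quantities described above would take the following form. The 1/(n1(c) − 1) factor, which excludes z(c) itself from the average, is an assumption.

```latex
\begin{aligned}
l_1(c)   &= \frac{1}{n_1(c) - 1} \sum_{j \neq c} \lvert z(c) - z(j) \rvert \\
l_2(p,c) &= \frac{1}{n_2(p,c)} \sum_{q} \lvert z(c) - z(q) \rvert \\
l_3(c)   &= \min_{p} \, l_2(p,c) \\
l(c)     &= \frac{l_3(c) - l_1(c)}{\max\{\, l_1(c),\; l_3(c) \,\}}
\end{aligned}
```

Under this form l(c) lies in [-1, 1]: values near 1 mean z(c) sits much closer to its own cluster than to the nearest other cluster, and negative values suggest it would fit better elsewhere.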

[0092] S214: Calculate the average silhouette coefficient corresponding to the current k value based on the silhouette coefficient.

[0093] After calculating the silhouette coefficient l(c) corresponding to each sampled value z(c), the average silhouette coefficient l0(k) corresponding to the current k value is obtained using the following formula:

[0094]
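Steps S213 and S214 can be sketched in code as below. The sketch assumes the conventional silhouette definition with absolute difference as the distance, and takes the average l0(k) to be the plain mean of l(c) over all C sampled values; both are assumptions, since the formulas themselves are not reproduced in this text.

```python
def silhouette_coefficients(values, labels):
    """Per-sample silhouette coefficient l(c) for one-dimensional data."""
    clusters = {}
    for v, lab in zip(values, labels):
        clusters.setdefault(lab, []).append(v)
    coeffs = []
    for v, lab in zip(values, labels):
        own = clusters[lab]
        if len(own) == 1 or len(clusters) < 2:
            coeffs.append(0.0)  # isolated point (or single cluster): defined as 0
            continue
        # l1: mean distance to the other members (the self term adds 0).
        l1 = sum(abs(v - u) for u in own) / (len(own) - 1)
        # l3: smallest mean distance to any other cluster.
        l3 = min(sum(abs(v - u) for u in other) / len(other)
                 for key, other in clusters.items() if key != lab)
        denom = max(l1, l3)
        coeffs.append((l3 - l1) / denom if denom > 0 else 0.0)
    return coeffs

def average_silhouette(values, labels):
    """l0(k): mean of l(c) over all sampled values."""
    coeffs = silhouette_coefficients(values, labels)
    return sum(coeffs) / len(coeffs)

# Hypothetical samples with a sensible and a poor assignment into k = 2 clusters.
samples = [1.0, 2.0, 9.0, 10.0]
good = average_silhouette(samples, [0, 0, 1, 1])
poor = average_silhouette(samples, [0, 1, 0, 1])
print(good, poor)  # the sensible assignment scores higher
```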

[0095] S215: Increment the current value of k by 1 to obtain the updated value of k.

[0096] S216: If the updated k value is greater than the upper limit k_max, the k value corresponding to the maximum among all average silhouette coefficients is taken as the optimal number of clusters k0. The clustering result with k0 clusters is taken as the final clustering result, and the cluster center values corresponding to the multiple clusters of the historical power load data and the multiple clusters of the external influencing factor data are obtained respectively.

[0097] After obtaining the average silhouette coefficient corresponding to the current k value, increment the current k value by 1. If the updated k value is greater than the upper limit k_max, then, among all average silhouette coefficients l0(k) (k ∈ {2, 3, …, k_max}), the value of k corresponding to the maximum is taken as the optimal number of clusters and denoted k0. The clustering result obtained in step S212 with k0 clusters is taken as the final clustering result, and the cluster center values are obtained. In this step, the optimized k-means clustering model forms k0 clusters, and each cluster has one cluster center value, which is the average of all sampled values within the cluster. Optionally, the upper limit k_max is set to 10.

[0098] S217: If the updated k value is not greater than the upper limit k_max, return to step S212 and re-execute the one-dimensional clustering of the sampled values.

[0099] Following steps S211 to S216 above, the historical power load data y(c) and the external influencing factor data x(c) are clustered to obtain the cluster center value for each cluster in different data sets. The sampled values ​​within each cluster are replaced with these cluster center values ​​to obtain the replaced historical power load data and external influencing factor data, which can be used to calculate the correlation analysis results.
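Steps S211 to S217 and the center replacement described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the applicant's implementation: the function names (`kmeans_1d`, `silhouette`, `cluster_and_replace`), the random initialization, the iteration cap, and the n1(c) - 1 denominator in the cohesion term are assumptions. The absolute difference as distance, the start at k = 2, the zero score for isolated points, and the default upper limit of 10 come from the text.

```python
import random
from statistics import mean


def kmeans_1d(values, k, iters=100, seed=0):
    """1-D k-means (Lloyd iterations) with |value - center| as the distance."""
    rng = random.Random(seed)
    centers = rng.sample(sorted(set(values)), k)  # assumed initialization

    def assign(cs):
        return [min(range(k), key=lambda p: abs(v - cs[p])) for v in values]

    for _ in range(iters):
        labels = assign(centers)
        updated = []
        for p in range(k):
            members = [v for v, lab in zip(values, labels) if lab == p]
            updated.append(mean(members) if members else centers[p])
        if updated == centers:
            break
        centers = updated
    return assign(centers)


def silhouette(values, labels):
    """Per-sample silhouette coefficient l(c); isolated points score 0."""
    clusters = {}
    for v, lab in zip(values, labels):
        clusters.setdefault(lab, []).append(v)
    scores = []
    for v, lab in zip(values, labels):
        own = clusters[lab]
        if len(own) == 1 or len(clusters) == 1:
            scores.append(0.0)  # isolated point (or no other cluster to compare)
            continue
        l1 = sum(abs(v - z) for z in own) / (len(own) - 1)  # self term is 0
        l3 = min(mean([abs(v - z) for z in other])
                 for p, other in clusters.items() if p != lab)
        top = max(l1, l3)
        scores.append((l3 - l1) / top if top > 0 else 0.0)
    return scores


def cluster_and_replace(values, k_max=10, seed=0):
    """Pick k0 by the largest average silhouette, then replace samples by centers."""
    upper = min(k_max, len(set(values)))  # cannot form more clusters than distinct values
    best = None
    for k in range(2, upper + 1):
        labels = kmeans_1d(values, k, seed=seed)
        score = mean(silhouette(values, labels))
        if best is None or score > best[0]:
            best = (score, k, labels)
    if best is None:  # fewer than two distinct values: nothing to split
        return 1, list(values)
    _, k0, labels = best
    centers = {lab: mean([v for v, l in zip(values, labels) if l == lab])
               for lab in set(labels)}
    return k0, [centers[lab] for lab in labels]


# Hypothetical samples: two well-separated groups with small decimal differences.
k0, replaced = cluster_and_replace([20.1, 20.2, 19.9, 35.0, 35.2, 34.9])
```

For these hypothetical samples the two groups are far apart, so the average silhouette coefficient peaks at k = 2 and every sample is replaced by the mean of its group.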

[0100] S22: Replace the sampled values ​​of multiple clusters of the historical power load data with the cluster center values ​​corresponding to the multiple clusters of the historical power load data to obtain the replaced historical power load data; replace the sampled values ​​of multiple clusters of the external influencing factor data with the cluster center values ​​corresponding to the multiple clusters of the external influencing factor data to obtain the replaced external influencing factor data.

[0101] In this embodiment, the sampled values in each cluster are replaced with the cluster center value of that cluster to obtain the replaced historical power load data and the replaced external influencing factor data, which are then used to calculate the standard mutual information.

[0102] S23: Calculate the standard mutual information using the replaced historical power load data and the replaced external influencing factor data to obtain the correlation analysis results, which include the standard mutual information value calculated through the standard mutual information.

[0103] In this embodiment, the correlation between historical power load data and external influencing factor data is analyzed with full consideration of its nonlinear nature. The correlation analysis results include the standard mutual information value, which serves as the indicator for quantifying the correlation and enables nonlinear correlations to be captured accurately. The standard mutual information is calculated as follows:

[0104] The sampled values within the historical period include the historical power load data y(c) to be analyzed and the external influencing factor data x(c), where c∈{1,2,…,C}. All the external influencing factor data x(c) constitute the external influencing factor data sequence X, and all the historical power load data y(c) constitute the historical power load data sequence Y. The external influencing factor data x(c) take M possible values (M≤C), denoted x1(m) (m∈{1,2,…,M}). The historical power load data y(c) take N possible values (N≤C), denoted y1(n) (n∈{1,2,…,N}).

[0105] When using standard mutual information to calculate the correlation between historical power load data and external influencing factor data, if the standard mutual information values ​​of external influencing factor data sequence X and historical power load data sequence Y are directly calculated, the formula is as follows:

[0106]

[0107]

[0108]

[0109]

[0110] Where I(X;Y) is the mutual information value between sequence X and sequence Y, H(X) and H(Y) are the information entropy of sequence X and sequence Y, respectively, P(x1(m),y1(n)) represents the proportion of sampling times that simultaneously satisfy x(c)=x1(m) and y(c)=y1(n) among all sampling times, P(x1(m)) represents the proportion of sampling times that satisfy x(c)=x1(m) among all sampling times, and P(y1(n)) represents the proportion of sampling times that satisfy y(c)=y1(n) among all sampling times. J(X;Y) takes values ​​in the range [0,1]. The larger the standard mutual information value, the stronger the correlation between external influencing factor data and historical power load data.
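The formula images for [0106] to [0109] are not reproduced here. The sketch below computes the quantities named in [0110] from empirical proportions. The entropy and mutual information definitions are the standard ones; the normalization J = 2·I/(H(X)+H(Y)) is an assumption (it lies in [0,1] and yields 0 when one sequence is constant), and the source may use a different normalization. The function names are hypothetical.

```python
from collections import Counter
from math import log2


def entropy(seq):
    """H = -sum P(v) * log2 P(v), with P taken as the proportion of sampling times."""
    n = len(seq)
    return -sum((c / n) * log2(c / n) for c in Counter(seq).values())


def mutual_information(xs, ys):
    """I(X;Y) = sum P(x,y) * log2( P(x,y) / (P(x) * P(y)) )."""
    n = len(xs)
    px, py, pxy = Counter(xs), Counter(ys), Counter(zip(xs, ys))
    return sum((c / n) * log2((c / n) / ((px[x] / n) * (py[y] / n)))
               for (x, y), c in pxy.items())


def standard_mutual_information(xs, ys):
    """J(X;Y) in [0, 1]; the 2*I / (H(X) + H(Y)) form is an assumed normalization."""
    hx, hy = entropy(xs), entropy(ys)
    if hx + hy == 0:  # both sequences constant: no information to share
        return 0.0
    return 2 * mutual_information(xs, ys) / (hx + hy)


# Hypothetical sequences: every x maps to a distinct y, then y is constant.
j_distinct = standard_mutual_information([1, 2, 3, 4], [10, 20, 30, 40])
j_constant = standard_mutual_information([1, 2, 3, 4], [5, 5, 5, 5])
```

With these hypothetical inputs, a one-to-one pairing of values gives the maximum J, and a constant y gives J = 0.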

[0111] However, directly using sequences X and Y for standard mutual information calculation can make the correlation analysis susceptible to subtle numerical differences, failing to capture the main trends in variable changes. For example, for the two sets of data in Table 1, the standard mutual information values ​​J(X;Y) calculated using the above standard mutual information method are 1 and 0, respectively, but in reality, the corresponding data only differ slightly in the decimal places. In practice, when the external influencing factor temperature value x(c) changes significantly, while the distribution of historical power load data y(c) remains essentially unchanged, the correlation between temperature and historical power load data should be considered weak. Therefore, taking J(X;Y) = 0 is more reasonable.

[0112]

[0113] Table 1. Examples of Temperature and Load Sampling Data

[0114] To avoid the correlation analysis results being affected by subtle numerical differences, this embodiment of the application uses the cluster center values ​​generated by the optimized k-means clustering model in step S21 to replace the sampled values ​​in the clusters and performs standard mutual information calculation after clustering.

[0115] Let u(c) be the cluster center value of the cluster containing the external influencing factor data x(c); all cluster center values u(c) constitute a sequence U. Let v(c) be the cluster center value of the cluster containing the historical power load data y(c); all cluster center values v(c) constitute a sequence V. u(c) has F possible values (F≤C), denoted u1(f) (f∈{1,2,…,F}); v(c) has G possible values (G≤C), denoted v1(g) (g∈{1,2,…,G}). After the sampled values are replaced with the cluster center values, the standard mutual information value J0(U;V) of the historical power load data and the external influencing factor data is calculated as follows:

[0116]

[0117]

[0118]

[0119]

[0120] Where I0(U;V) is the mutual information value between sequence U and sequence V, H0(U) and H0(V) are the information entropy of sequence U and sequence V, respectively, P(u1(f),v1(g)) represents the proportion of sampling times that simultaneously satisfy u(c)=u1(f) and v(c)=v1(g) among all sampling times, P(u1(f)) represents the proportion of sampling times that satisfy u(c)=u1(f) among all sampling times, and P(v1(g)) represents the proportion of sampling times that satisfy v(c)=v1(g) among all sampling times. J0(U;V) takes values ​​in the range [0,1]. The larger the standard mutual information value, the stronger the correlation between external influencing factor data and historical power load data.

[0121] Table 2 shows the two sets of data from Table 1 after the sampled values have been replaced with cluster center values. For both sets of data in Table 2, the standard mutual information value J0(U;V) calculated after the replacement is 0, which is consistent with the actual situation. The standard mutual information value calculated after cluster center replacement therefore effectively reflects the correlation between historical power load data and external influencing factor data.

[0122]

[0123] Table 2. Examples of temperature and load sampling data after cluster center replacement.
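The effect described in [0111] to [0121] can be illustrated with a small sketch. The contents of Tables 1 and 2 are not reproduced in this text, so the samples and cluster labels below are hypothetical and chosen only to show the mechanism; the normalization 2·I/(H(U)+H(V)) is likewise an assumption, and `nmi` and `replace_with_centers` are illustrative names.

```python
from collections import Counter
from math import log2


def nmi(xs, ys):
    """Standard mutual information with the assumed 2*I / (H(X) + H(Y)) normalization."""
    n = len(xs)
    px, py, pxy = Counter(xs), Counter(ys), Counter(zip(xs, ys))

    def h(counts):
        return -sum((c / n) * log2(c / n) for c in counts.values())

    i = sum((c / n) * log2(c * n / (px[x] * py[y])) for (x, y), c in pxy.items())
    total = h(px) + h(py)
    return 2 * i / total if total > 0 else 0.0


def replace_with_centers(values, labels):
    """Replace each sample by the mean of its cluster (the cluster center value)."""
    groups = {}
    for v, lab in zip(values, labels):
        groups.setdefault(lab, []).append(v)
    centers = {lab: sum(m) / len(m) for lab, m in groups.items()}
    return [centers[lab] for lab in labels]


# Hypothetical samples: the load level does not depend on the temperature level,
# but every raw value differs slightly in the decimal places.
temperature = [10.1, 10.2, 30.1, 30.2]
load = [100.1, 200.1, 100.2, 200.2]
temp_labels = [0, 0, 1, 1]  # assumed output of the clustering step
load_labels = [0, 1, 0, 1]  # assumed output of the clustering step

raw = nmi(temperature, load)
u = replace_with_centers(temperature, temp_labels)
v = replace_with_centers(load, load_labels)
clustered = nmi(u, v)
```

On the raw values every pair is unique, so the measure reports the maximum correlation; after replacement the two center sequences are statistically independent and the measure drops to 0.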

[0124] In the above steps, a correlation analysis method combining an optimized k-means clustering model with standard mutual information is used to accurately analyze and calculate the nonlinear correlation between historical power load data and external influencing factor data. Using a k-means clustering model optimized with the average silhouette coefficient avoids the influence of subtle differences in variable values on the analysis results. On this basis, standard mutual information can capture the nonlinear correlation accurately.

[0125] S3: Use a convolutional neural network to extract features from the historical power load data and the external influencing factor data to obtain the feature vectors of the historical power load data and the external influencing factor data.

[0126] In this embodiment, a convolutional neural network is used to extract features from the original data. The convolutional neural network can simultaneously extract information from multiple time spans through multi-sized windows to uncover key information in the original data for short-term load forecasting.

[0127] The following sections provide a detailed explanation of feature extraction for historical power load data and external influencing factor data.

[0128] The steps for feature extraction from historical power load data are shown in S311 to S314:

[0129] S311: Normalize the historical power load data to obtain normalized historical power load data.

[0130] The historical power load data is normalized using the Min-Max standardization method to obtain the normalized historical power load data, as follows:

[0131] For the historical power load data y(c) (c∈{1,2,…,C}), let the maximum value be y_max and the minimum value be y_min. The normalized value y_norm(c) of each historical power load data y(c) is then:

[0132] $y_{norm}(c) = \dfrac{y(c) - y_{min}}{y_{max} - y_{min}}$

[0133] S312: Obtain the t1-dimensional input vector of the convolutional neural network from the normalized historical power load data, wherein the t1-dimensional input vector is the normalized historical power load data corresponding to the previous t1 times.

[0134] The normalized historical power load data from the previous t1 time points are used as input data to form a t1-dimensional input vector, denoted A. Optionally, t1 corresponds to the previous 72 hours.

[0135] S313: Multiple convolution windows are used to perform convolution operations and max pooling on the t1-dimensional input vector to obtain multiple feature values.

[0136] Multiple h×1 dimensional convolutional windows W are used to perform convolution operations on A, followed by max pooling to obtain the feature value e. Optionally, h can have multiple values, with different values ​​corresponding to different time periods.

[0137]

[0138] Here, A_{a:a+h-1} is the vector consisting of the values of the a-th through (a+h-1)-th dimensions of the input vector A, and b is the bias term.

[0139] S314: The multiple feature values ​​are concatenated to obtain the historical power load data feature vector.

[0140] The feature values ​​obtained from multiple convolution windows are concatenated to obtain the historical power load data feature vector e0.
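Steps S311 to S314 can be sketched in plain Python as follows. This is an illustration under assumptions rather than the trained network of the application: the window weights would normally be learned, so the weights and biases below are hypothetical, and the ReLU activation is an assumption because the formula in [0137] is not reproduced in this text. The function names are illustrative.

```python
def min_max_normalize(values):
    """Min-Max normalization: (y - y_min) / (y_max - y_min)."""
    lo, hi = min(values), max(values)
    if hi == lo:  # constant series: map everything to 0 to avoid dividing by zero
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


def conv_max_pool(a, window, bias=0.0):
    """Slide an h x 1 window W over A, then keep the largest response (max pooling).

    Assumes len(window) <= len(a). ReLU is an assumed activation.
    """
    h = len(window)
    responses = []
    for start in range(len(a) - h + 1):
        segment = a[start:start + h]  # corresponds to A_{a:a+h-1}
        s = sum(w * x for w, x in zip(window, segment)) + bias
        responses.append(max(0.0, s))
    return max(responses)


def load_feature_vector(history, windows, t1=72):
    """Normalize, take the previous t1 samples, and concatenate one feature per window."""
    a = min_max_normalize(history)[-t1:]
    return [conv_max_pool(a, w, b) for w, b in windows]


# Hypothetical hourly loads and two hypothetical windows of different sizes h.
history = [float(i) for i in range(100)]
windows = [([1.0, -1.0], 0.0), ([1 / 3, 1 / 3, 1 / 3], 0.0)]
e0 = load_feature_vector(history, windows)
```

Each window size h looks at a different time span, and max pooling keeps one feature value per window, so the length of e0 equals the number of windows.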

[0141] The steps for feature extraction from external influencing factor data are shown in S321 to S325:

[0142] S321: Normalize the external influencing factor data to obtain normalized external influencing factor data.

[0143] The external influencing factor data were normalized using the Min-Max standardization method to obtain the normalized external influencing factor data, as follows:

[0144] For the external influencing factor data x(c) (c∈{1,2,…,C}), let the maximum value be x_max and the minimum value be x_min. The normalized value x_norm(c) of each external influencing factor data x(c) is then:

[0145] $x_{norm}(c) = \dfrac{x(c) - x_{min}}{x_{max} - x_{min}}$

[0146] S322: Obtain the t2-dimensional input vector of the convolutional neural network from the normalized external influencing factor data, wherein the t2-dimensional input vector is the normalized external influencing factor data corresponding to the previous t2 time steps at the current time step.

[0147] S323: Use the external influencing factor data of the next t3 times after the current time as the t3-dimensional input vector of the convolutional neural network.

[0148] S324: Combine the t2-dimensional input vector with the t3-dimensional input vector to form a (t2+t3)-dimensional input vector.

[0149] The input data consist of all sampled values from the previous t2 time points and all predicted values of the external influencing factors from the current time point over the next t3 time points, forming a (t2+t3)-dimensional input vector. Optionally, t2 refers to the 72 hours before the current time, and t3 refers to the 24 hours after the current time. The predicted values of the external influencing factors are obtained by forecasting the external influencing factor data.

[0150] S325: Multiple convolution windows are used to perform convolution operations on the (t2+t3) dimensional input vector to generate feature vectors of each of the external influencing factors.

[0151] Similar to step S313, multiple convolution windows are used to perform convolution operations on the (t2+t3)-dimensional input vector. Each external influencing factor generates one feature vector; the feature vector of the r-th external influencing factor is denoted e_{1,r}, r∈{1,2,…,R}, where R is the number of external influencing factors.
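The construction of the (t2+t3)-dimensional input in steps S322 to S324 can be sketched as below. The function name is hypothetical, and the sketch assumes that both the observed values and the forecast values have already been normalized on the same scale, which the text does not state explicitly for the forecast part.

```python
def external_input_vector(observed_norm, forecast_norm, t2=72, t3=24):
    """Concatenate the previous t2 observations with the next t3 forecast values."""
    if len(observed_norm) < t2 or len(forecast_norm) < t3:
        raise ValueError("not enough samples for the requested window")
    past = list(observed_norm[-t2:])    # t2-dimensional part (S322)
    future = list(forecast_norm[:t3])   # t3-dimensional part (S323)
    return past + future                # (t2 + t3)-dimensional vector (S324)


# Hypothetical normalized temperature history and forecast.
observed = [i / 100 for i in range(100)]
forecast = [0.5] * 30
vec = external_input_vector(observed, forecast)
```

The resulting vector is then convolved in the same way as the load input, once per external influencing factor.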

[0152] S4: Input the historical power load data feature vector, the external influencing factor data feature vector, and the correlation analysis results into the pre-trained industry power load prediction model to obtain the industry power load prediction value at the current moment.

[0153] The impact of different external influencing factors on industry power load varies. Therefore, when forecasting power load for various industries, the correlation between industry power load data and the data of various external influencing factors should be fully considered. Thus, after generating feature vectors for historical power load data and feature vectors for external influencing factor data, the power load forecast value is calculated based on a short-term industry power load forecasting model, specifically including steps S41 to S44:

[0154] Using the industry power load forecasting model shown in Figure 2, steps S41 to S44 are briefly explained below, following the direction indicated by the arrows in Figure 2. The feature vectors e_{1,r} of the external influencing factors are weighted by the normalized standard mutual information values J_{1,r} and summed to obtain the fusion vector e2; the fusion vector e2 is concatenated with the historical power load data feature vector e0 to obtain the concatenated vector e3; and the SVR model makes a prediction from the concatenated vector e3 to obtain the predicted power load value. Details of each step are provided below.

[0155] S41: Normalize the standard mutual information value to obtain the normalized standard mutual information value.

[0156] When predicting the power load of a certain industry, the standard mutual information value between the historical power load data and the data of each external influencing factor is calculated according to step S22 using historical power load data and external influencing factor data.

[0157] Let J_{0,r} be the standard mutual information value between the historical power load data and the data of the r-th external influencing factor. This value is normalized to obtain the normalized standard mutual information value J_{1,r} between the historical power load data and the data of the r-th external influencing factor:

[0158]

[0159] S42: Using the normalized standard mutual information value as the weight of the feature vector of the external influencing factor data, the feature vector of the external influencing factor data is weighted and summed to obtain the fusion vector.

[0160] Using the normalized standard mutual information value J_{1,r} as the weight of the feature vector e_{1,r} obtained by the convolution operation on the data of the r-th external influencing factor, the feature vectors of all external influencing factors are weighted and summed to obtain the fusion vector e2:

[0161] $e_2 = \sum_{r=1}^{R} J_{1,r}\, e_{1,r}$

[0162] S43: The historical power load data feature vector is concatenated with the fusion vector to obtain the concatenated vector.

[0163] The historical power load data feature vector e0 and the fused vector e2 are concatenated to obtain the concatenated vector e3:

[0164]
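Steps S41 to S43 can be sketched as follows. The normalization formula of [0158] is not reproduced in this text, so dividing each J_{0,r} by the sum over all factors is an assumption, as are the function names and the choice to raise an error when all values are zero. The weighted sum and the concatenation follow the description above, with the order e0 then e2 assumed.

```python
def normalize_weights(j0):
    """Assumed normalization: J_{1,r} = J_{0,r} / sum of all J_{0,r}."""
    total = sum(j0)
    if total == 0:
        raise ValueError("all standard mutual information values are zero")
    return [j / total for j in j0]


def fuse(e0, e1, j0):
    """Return (J1, e2, e3) for load features e0 and per-factor features e1."""
    j1 = normalize_weights(j0)
    dim = len(e1[0])  # all factor feature vectors are assumed to share one length
    # Fusion vector: element-wise weighted sum over the R external factors.
    e2 = [sum(w * vec[d] for w, vec in zip(j1, e1)) for d in range(dim)]
    # Concatenated vector: load features followed by the fusion vector.
    e3 = list(e0) + e2
    return j1, e2, e3


# Hypothetical feature vectors for R = 2 external factors.
j1, e2, e3 = fuse(e0=[0.5, 0.25],
                  e1=[[1.0, 0.0], [0.0, 1.0]],
                  j0=[0.75, 0.25])
```

A factor with a larger standard mutual information value contributes more to e2, which is how the correlation analysis result enters the prediction input.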

[0165] S44: Use the support vector regression model to make a prediction from the concatenated vector to obtain the industry power load forecast value at the current moment.

[0166] In this embodiment of the application, a support vector regression (SVR) model takes the concatenated vector as input and outputs the industry power load forecast value at the current moment.

[0167] Applying the correlation analysis results between historical power load data and external influencing factor data to industry power load forecasting models based on convolutional neural networks can effectively improve the accuracy of short-term power load forecasting within the industry.

[0168] In this embodiment of the application, before making predictions, it is necessary to train the convolutional neural network and the industry power load prediction model as a whole. The training process is as follows:

[0169] Historical power load data and external influencing factor data for a selected historical period (40,000 hours) are used as the training set, and the historical power load data represents the actual power load data within that period. The corresponding input vectors and output values ​​are constructed according to steps S3-S4. The convolutional neural network and the industry power load prediction model are trained using a backpropagation method. During training, the parameters of the convolutional neural network and the SVR model are automatically calculated and adjusted, and all parameters are retained after training. Power load prediction is then performed using the trained convolutional neural network and the industry power load prediction model. For a given prediction time, the calculations are performed according to steps S1-S4 to obtain the predicted power load value for that time.

[0170] This application provides a power load forecasting method based on the correlation of external influencing factors. The method includes: acquiring historical power load data and data on external influencing factors affecting power load forecasting within a historical time period; performing quantitative analysis on the nonlinear correlation between the historical power load data and the external influencing factor data, based on a correlation analysis method combining standard mutual information with a k-means clustering model optimized by the average silhouette coefficient, to obtain correlation analysis results; using a convolutional neural network for feature extraction to obtain feature vectors for the historical power load data and the external influencing factor data; and inputting the feature vectors for historical power load data, the feature vectors for external influencing factor data, and the correlation analysis results into a pre-trained industry power load forecasting model to obtain the industry power load forecast value at the current moment. By combining the k-means clustering model optimized by the average silhouette coefficient with standard mutual information, this application accurately quantifies the nonlinear correlation between historical power load data and external influencing factor data. The correlation analysis results are applied to a short-term industry power load forecasting model based on a convolutional neural network, effectively improving the accuracy and stability of short-term power load forecasting within the industry.

[0171] The foregoing embodiments of this application provide a power load forecasting method based on the correlation of external influencing factors. This application also describes a power load forecasting system based on the correlation of external influencing factors, which performs the method shown in Figure 1. The functions of the system are described below. As shown in the structural block diagram in Figure 3, the system includes:

[0172] An acquisition module 51, a quantitative analysis module 52, a feature extraction module 53, and a prediction module 54.

[0173] Wherein:

[0174] The acquisition module 51 is used to acquire raw data, which includes historical power load data within a historical time period and data on external influencing factors affecting power load forecasting.

[0175] The quantitative analysis module 52 is used to perform quantitative analysis on the nonlinear correlation between the historical power load data and the external influencing factor data based on a correlation analysis method that combines an optimized k-means clustering model and standard mutual information, and obtain the correlation analysis results; the optimized k-means clustering model is a k-means clustering model that optimizes the initial k value using the average silhouette coefficient;

[0176] The feature extraction module 53 is used to extract features from the historical power load data and the external influencing factor data using a convolutional neural network to obtain the feature vector of the historical power load data and the feature vector of the external influencing factor data.

[0177] The prediction module 54 is used to input the historical power load data feature vector, the external influencing factor data feature vector, and the correlation analysis results into the pre-trained industry power load prediction model to obtain the industry power load prediction value at the current moment.

[0178] In one possible implementation, the quantitative analysis module 52 may include:

[0179] The clustering submodule is used to perform clustering processing on the historical power load data and the external influencing factor data based on the optimized k-means clustering model, so as to obtain the cluster center values ​​corresponding to multiple clusters of the historical power load data and the cluster center values ​​corresponding to multiple clusters of the external influencing factor data.

[0180] The first replacement submodule is used to replace the sampled values ​​of multiple clusters of the historical power load data with the cluster center values ​​corresponding to the multiple clusters of the historical power load data respectively, so as to obtain the replaced historical power load data.

[0181] The second replacement submodule is used to replace the sampled values ​​of multiple clusters of the external influencing factor data with the cluster center values ​​corresponding to the multiple clusters of the external influencing factor data respectively, so as to obtain the replaced external influencing factor data.

[0182] The calculation submodule is used to calculate the standard mutual information using the replaced historical power load data and the replaced external influencing factor data to obtain the correlation analysis results, which include the standard mutual information value calculated through the standard mutual information.

[0183] In one possible implementation, the clustering submodule may include:

[0184] The input unit is used to input the sampled values ​​of the historical power load data and the external influencing factor data within the historical time period;

[0185] Clustering unit, used to perform one-dimensional clustering on the sampled values ​​to generate k clusters corresponding to the current k value in the k-means clustering model;

[0186] The first calculation unit is used to calculate the silhouette coefficient of each sampled value;

[0187] The second calculation unit is used to calculate the average silhouette coefficient corresponding to the current k value based on the silhouette coefficients.

[0188] The incrementing unit is used to increment the current k value by 1 to obtain the updated k value;

[0189] The first judgment unit is used to, if the updated k value is greater than the upper limit k_max, take the k value corresponding to the maximum among all average silhouette coefficients as the optimal number of clusters k0, and take the clustering result with k0 clusters as the final clustering result, so as to obtain the cluster center values corresponding to multiple clusters of the historical power load data and the cluster center values corresponding to multiple clusters of the external influencing factor data.

[0190] The second judgment unit is used to, if the updated k value is not greater than the upper limit k_max, return to and re-execute the step of performing one-dimensional clustering on the sampled values.

[0191] In one possible implementation, the feature extraction module 53 may include:

[0192] The historical power load data normalization submodule is used to normalize the historical power load data to obtain normalized historical power load data.

[0193] The first acquisition submodule is used to acquire the t1-dimensional input vector of the convolutional neural network from the normalized historical power load data, wherein the t1-dimensional input vector is the normalized historical power load data corresponding to the previous t1 times at the current time.

[0194] The first convolutional submodule is used to perform convolution operations and max pooling on the t1-dimensional input vector using multiple convolutional windows to obtain multiple feature values;

[0195] The feature value splicing submodule is used to splice the multiple feature values ​​to obtain the feature vector of the historical power load data.

[0196] In one possible implementation, the feature extraction module 53 may further include:

[0197] The external influencing factor data normalization submodule is used to normalize the external influencing factor data to obtain normalized external influencing factor data.

[0198] The second acquisition submodule is used to acquire the t2-dimensional input vector of the convolutional neural network from the normalized external influencing factor data, wherein the t2-dimensional input vector is the normalized external influencing factor data corresponding to the previous t2 times at the current time.

[0199] The third acquisition submodule is used to take the external influencing factor data of the next t3 times after the current time as the t3-dimensional input vector of the convolutional neural network;

[0200] The combination submodule is used to combine the t2-dimensional input vector and the t3-dimensional input vector into a (t2+t3)-dimensional input vector;

[0201] The second convolution submodule is used to perform convolution operations on the (t2+t3) dimensional input vector using multiple convolution windows to generate the feature vector of the external influencing factors data.

[0202] In one possible implementation, the prediction module 54 may include:

[0203] The standard mutual information value normalization submodule is used to normalize the standard mutual information value to obtain a normalized standard mutual information value.

[0204] The weighted summation submodule is used to perform weighted summation on the feature vector of the external influencing factors data, using the normalized standard mutual information value as the weight, to obtain the fusion vector;

[0205] The feature vector splicing submodule is used to splice the feature vector of the historical power load data with the fused vector to obtain the spliced ​​vector;

[0206] The power load forecasting submodule is used to use the support vector regression model to predict the spliced ​​vector and obtain the industry power load forecast value at the current moment.

[0207] This application provides a power load forecasting system based on the correlation of external influencing factors. The system includes: an acquisition module for acquiring historical power load data and data of external influencing factors affecting power load forecasting within a historical time period; a quantitative analysis module for performing quantitative analysis on the nonlinear correlation between the historical power load data and the external influencing factor data, based on a correlation analysis method combining standard mutual information with a k-means clustering model optimized by the average silhouette coefficient, and obtaining the correlation analysis result; a feature extraction module for using a convolutional neural network to extract features to obtain feature vectors of historical power load data and feature vectors of external influencing factor data; and a forecasting module for inputting the feature vectors of historical power load data, the feature vectors of external influencing factor data, and the correlation analysis result into a pre-trained industry power load forecasting model to obtain the industry power load forecast value at the current moment. This embodiment of the application uses a correlation analysis method that combines standard mutual information with a k-means clustering model optimized by the average silhouette coefficient to accurately quantify the nonlinear correlation between historical power load data and external influencing factor data. The correlation analysis results are applied to an industry short-term power load forecasting model based on convolutional neural networks, which effectively improves the accuracy and stability of short-term power load forecasting within the industry.

[0208] It should be noted that the various embodiments in this specification are described in a progressive manner, with each embodiment focusing on its differences from the others. For identical or similar parts, the embodiments may be referred to one another. Because the apparatus embodiments are basically similar to the method embodiments, they are described relatively briefly; for relevant parts, refer to the descriptions of the method embodiments.

[0209] Finally, it should be noted that in this document, relational terms such as "first" and "second" are used only to distinguish one entity or operation from another, and do not necessarily require or imply any such actual relationship or order between these entities or operations. Furthermore, the terms "comprising," "including," or any other variations thereof are intended to cover non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements includes not only those elements but also other elements not expressly listed, or elements inherent to such a process, method, article, or apparatus. Without further limitations, an element defined by the phrase "comprising one..." does not exclude the presence of other identical elements in the process, method, article, or apparatus that includes said element.

[0210] For ease of description, the above apparatus is described by dividing it into various functional units. Of course, in implementing this invention, the functions of each unit can be implemented in one or more software and / or hardware components.

[0211] As can be seen from the above description of the embodiments, those skilled in the art can clearly understand that the present invention can be implemented by means of software plus necessary general-purpose hardware platforms. Based on this understanding, the technical solution of the present invention, or the part that contributes to the prior art, can be embodied in the form of a software product. This computer software product can be stored in a storage medium, such as ROM / RAM, magnetic disk, optical disk, etc., and includes several instructions to cause a computer device (which may be a personal computer, server, or network device, etc.) to execute the methods described in various embodiments or some parts of the embodiments of the present invention.

[0212] The above provides a detailed description of the power load forecasting method and system based on the correlation of external influencing factors provided by the present invention. Specific examples have been used to illustrate the principles and implementations of the present invention, and the description of the above embodiments is intended only to help in understanding the method and core ideas of the present invention. Those skilled in the art may, following the ideas of the present invention, vary the specific implementations and the scope of application. Therefore, the content of this specification should not be construed as limiting the present invention.

Claims

1. A power load prediction method based on the correlation of external influencing factors, characterized by comprising:
obtaining raw data, the raw data including historical power load data within a historical time period and external influencing factor data affecting power load forecasting;
quantitatively analyzing the nonlinear correlation between the historical power load data and the external influencing factor data using a correlation analysis method that combines an optimized k-means clustering model with standard mutual information, to obtain correlation analysis results, wherein the optimized k-means clustering model is a k-means clustering model whose initial k value is optimized using the average silhouette coefficient;
extracting features from the historical power load data and the external influencing factor data using a convolutional neural network, to obtain a historical power load data feature vector and an external influencing factor data feature vector;
inputting the historical power load data feature vector, the external influencing factor data feature vector, and the correlation analysis results into a pre-trained industry power load prediction model, to obtain the industry power load prediction value at the current moment;
wherein quantitatively analyzing the nonlinear correlation between the historical power load data and the external influencing factor data to obtain the correlation analysis results comprises:
clustering the historical power load data and the external influencing factor data respectively based on the optimized k-means clustering model, to obtain the cluster center values corresponding to multiple clusters of the historical power load data and the cluster center values corresponding to multiple clusters of the external influencing factor data;
replacing the sampled values in the multiple clusters of the historical power load data with the corresponding cluster center values, to obtain replaced historical power load data;
replacing the sampled values in the multiple clusters of the external influencing factor data with the corresponding cluster center values, to obtain replaced external influencing factor data;
calculating the standard mutual information from the replaced historical power load data and the replaced external influencing factor data, to obtain the correlation analysis results, which include the calculated standard mutual information value.
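
The replacement and mutual-information steps of claim 1 can be illustrated with a minimal sketch. It assumes one-dimensional samples, assumes that "standard mutual information" means the normalized mutual information 2·I(X;Y) / (H(X) + H(Y)), and uses hypothetical function names, sample values, and cluster centers; none of these details are fixed by the claim.

```python
import math
from collections import Counter


def replace_with_centers(samples, centers):
    """Replace each 1-D sample with its nearest cluster center.

    For a converged k-means result, the nearest center is the center of the
    cluster the sample belongs to.
    """
    return [min(centers, key=lambda c: abs(s - c)) for s in samples]


def entropy(values):
    """Shannon entropy (in nats) of a discrete sequence."""
    n = len(values)
    return -sum((c / n) * math.log(c / n) for c in Counter(values).values())


def standard_mutual_information(x, y):
    """Assumed normalization: 2 * I(X;Y) / (H(X) + H(Y))."""
    n = len(x)
    count_x, count_y, count_xy = Counter(x), Counter(y), Counter(zip(x, y))
    mutual_info = sum(
        (c / n) * math.log(c * n / (count_x[a] * count_y[b]))
        for (a, b), c in count_xy.items()
    )
    denominator = entropy(x) + entropy(y)
    return 2 * mutual_info / denominator if denominator > 0 else 0.0


# Hypothetical samples and cluster centers, for illustration only.
load = [1.0, 1.2, 5.0, 5.1]
temperature = [10.0, 11.0, 30.0, 31.0]
load_replaced = replace_with_centers(load, [1.1, 5.05])
temp_replaced = replace_with_centers(temperature, [10.5, 30.5])
nmi = standard_mutual_information(load_replaced, temp_replaced)
```

Replacing samples with cluster centers turns each continuous series into a discrete one, which is what allows the mutual information to be estimated from simple co-occurrence counts.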

2. The method according to claim 1, characterized in that clustering the historical power load data and the external influencing factor data respectively based on the optimized k-means clustering model, to obtain the cluster center values corresponding to multiple clusters of the historical power load data and the cluster center values corresponding to multiple clusters of the external influencing factor data, comprises:
inputting the sampled values of the historical power load data and of the external influencing factor data within the historical time period;
performing one-dimensional clustering on the sampled values to generate the k clusters corresponding to the current k value in the k-means clustering model;
calculating the silhouette coefficient of the sampled values;
calculating the average silhouette coefficient corresponding to the current k value from the silhouette coefficients;
incrementing the current k value by 1 to obtain an updated k value;
if the updated k value is greater than the upper limit k_max, taking the k value corresponding to the maximum of all average silhouette coefficients as the optimal number of clusters k0, and taking the clustering result obtained with k0 clusters as the final clustering result, so as to obtain the cluster center values corresponding to multiple clusters of the historical power load data and the cluster center values corresponding to multiple clusters of the external influencing factor data;
if the updated k value is not greater than the upper limit k_max, returning to and re-executing the step of performing one-dimensional clustering on the sampled values.
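
The k-selection loop of claim 2 might be sketched as follows. The starting value k = 2, the deterministic center initialization, the convention that singleton clusters score 0, and all function names are assumptions made for illustration; the claim only specifies the loop over k up to k_max and the choice of the k with the largest average silhouette coefficient.

```python
def kmeans_1d(values, k, max_iter=100):
    """Lloyd's k-means on scalars. Initial centers are evenly spaced order
    statistics, a deterministic choice assumed here for reproducibility."""
    ordered = sorted(values)
    centers = [ordered[(2 * i + 1) * len(ordered) // (2 * k)] for i in range(k)]

    def assign(cs):
        return [min(range(k), key=lambda j: abs(v - cs[j])) for v in values]

    for _ in range(max_iter):
        labels = assign(centers)
        updated = []
        for j in range(k):
            members = [v for v, lab in zip(values, labels) if lab == j]
            updated.append(sum(members) / len(members) if members else centers[j])
        if updated == centers:
            break
        centers = updated
    return assign(centers), centers


def mean_silhouette(values, labels):
    """Average silhouette coefficient; singleton clusters score 0."""
    n = len(values)
    total = 0.0
    for i in range(n):
        by_cluster = {}
        for j in range(n):
            if j != i:
                by_cluster.setdefault(labels[j], []).append(abs(values[i] - values[j]))
        own = by_cluster.pop(labels[i], None)
        if not own or not by_cluster:
            continue  # singleton cluster or a single cluster: contributes 0
        a = sum(own) / len(own)  # mean distance within the sample's own cluster
        b = min(sum(d) / len(d) for d in by_cluster.values())  # nearest other cluster
        if max(a, b) > 0:
            total += (b - a) / max(a, b)
    return total / n


def select_cluster_count(values, k_max, k_start=2):
    """Try k = k_start..k_max and keep the k with the largest average silhouette."""
    best = None
    k = k_start
    while k <= k_max:
        labels, centers = kmeans_1d(values, k)
        score = mean_silhouette(values, labels)
        if best is None or score > best[0]:
            best = (score, k, centers, labels)
        k += 1
    _, k0, centers, labels = best
    return k0, centers, labels


# Hypothetical samples forming three well-separated groups.
samples = [1.0, 1.1, 0.9, 5.0, 5.2, 4.8, 9.0, 9.1, 8.9]
k0, centers, labels = select_cluster_count(samples, k_max=5)
```

For these samples the loop settles on three clusters, because splitting or merging the groups lowers the average silhouette coefficient.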

3. The method according to claim 1, characterized in that extracting features from the historical power load data and the external influencing factor data using a convolutional neural network, to obtain the historical power load data feature vector and the external influencing factor data feature vector, comprises:
normalizing the historical power load data to obtain normalized historical power load data;
obtaining a t1-dimensional input vector of the convolutional neural network from the normalized historical power load data, wherein the t1-dimensional input vector is the normalized historical power load data corresponding to the previous t1 time steps;
performing convolution operations and max pooling on the t1-dimensional input vector using multiple convolution windows, to obtain multiple feature values;
concatenating the multiple feature values to obtain the historical power load data feature vector.
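
The convolution and max-pooling steps of claim 3 can be sketched in a few lines. Min-max normalization and the fixed kernels below are assumptions for illustration only: the claim does not name the normalization, and in a trained convolutional neural network the kernel weights would be learned rather than hand-picked.

```python
def min_max_normalize(series):
    """Scale a series to [0, 1]; a constant series maps to all zeros."""
    lo, hi = min(series), max(series)
    span = hi - lo
    return [(v - lo) / span if span else 0.0 for v in series]


def conv1d_valid(x, kernel):
    """1-D convolution (cross-correlation form) without padding."""
    width = len(kernel)
    return [
        sum(x[i + j] * kernel[j] for j in range(width))
        for i in range(len(x) - width + 1)
    ]


def extract_load_features(history, t1, kernels):
    """Take the last t1 normalized samples, apply each convolution window,
    max-pool each result to one value, and concatenate the pooled values."""
    x = min_max_normalize(history)[-t1:]
    return [max(conv1d_valid(x, kernel)) for kernel in kernels]


# Hypothetical load history and convolution windows of widths 1 and 2.
history = [10.0, 20.0, 30.0, 40.0, 50.0]
kernels = [[1.0], [0.5, 0.5], [-1.0, 1.0]]
features = extract_load_features(history, t1=4, kernels=kernels)
```

Each convolution window contributes exactly one pooled value, so the length of the feature vector equals the number of windows regardless of t1.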

4. The method of claim 1, wherein extracting features from the historical power load data and the external influencing factor data using a convolutional neural network, to obtain the historical power load data feature vector and the external influencing factor data feature vector, further comprises:
normalizing the external influencing factor data to obtain normalized external influencing factor data;
obtaining a t2-dimensional input vector of the convolutional neural network from the normalized external influencing factor data, wherein the t2-dimensional input vector is the normalized external influencing factor data corresponding to the t2 time steps preceding the current time step;
using the external influencing factor data of the t3 time steps following the current time step as a t3-dimensional input vector of the convolutional neural network;
combining the t2-dimensional input vector and the t3-dimensional input vector to form a (t2+t3)-dimensional input vector;
performing convolution operations on the (t2+t3)-dimensional input vector using multiple convolution windows, to generate the external influencing factor data feature vector.
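
The construction of the (t2+t3)-dimensional input vector in claim 4 can be sketched as a simple windowing step. The indexing convention is an assumption: index `now` is taken to be the first future step, so the past window is `[now - t2, now)` and the future window is `[now, now + t3)`. The sketch also assumes that the future values (for example, forecast weather) have been normalized in the same way as the past values; the function name is hypothetical.

```python
def build_factor_input(normalized_factor, now, t2, t3):
    """Concatenate the t2 past values and the t3 future values around `now`."""
    if now - t2 < 0 or now + t3 > len(normalized_factor):
        raise ValueError("window extends beyond the available data")
    past = normalized_factor[now - t2:now]
    future = normalized_factor[now:now + t3]
    return past + future


# Hypothetical normalized temperature series; steps 3 and 4 are forecasts.
temperature = [0.1, 0.2, 0.3, 0.4, 0.5, 0.6]
factor_input = build_factor_input(temperature, now=3, t2=2, t3=2)
```

Including the future window lets the network see the external conditions expected during the period being forecast, not only those already observed.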

5. The method of claim 1, wherein the pre-trained industry power load prediction model includes a support vector regression model, and inputting the historical power load data feature vector, the external influencing factor data feature vector, and the correlation analysis results into the pre-trained industry power load prediction model, to obtain the industry power load prediction value at the current moment, comprises:
normalizing the standard mutual information value to obtain a normalized standard mutual information value;
using the normalized standard mutual information value as the weight of the external influencing factor data feature vector, performing a weighted summation of the external influencing factor data feature vectors to obtain a fused vector;
concatenating the historical power load data feature vector with the fused vector to obtain a concatenated vector;
performing prediction on the concatenated vector using the support vector regression model, to obtain the industry power load prediction value at the current moment.
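
The weighting and concatenation steps of claim 5 can be sketched as follows. The sketch assumes that there is one feature vector and one standard mutual information value per external factor, that all factor feature vectors have the same length, and that "normalized" means the values are scaled to sum to 1. The function name is hypothetical and the regression step itself is not shown.

```python
def fuse_and_concatenate(load_features, factor_features, nmi_values):
    """Weight each factor's feature vector by its normalized mutual
    information value, sum the weighted vectors, and append the fused
    vector to the load feature vector."""
    total = sum(nmi_values)
    if total > 0:
        weights = [v / total for v in nmi_values]
    else:
        weights = [1.0 / len(nmi_values)] * len(nmi_values)  # fallback: equal weights
    dim = len(factor_features[0])
    fused = [
        sum(w * vec[d] for w, vec in zip(weights, factor_features))
        for d in range(dim)
    ]
    return load_features + fused


# Hypothetical feature vectors for the load and for two external factors.
load_features = [1.0, 2.0]
factor_features = [[1.0, 0.0], [0.0, 1.0]]
nmi_values = [0.75, 0.25]
model_input = fuse_and_concatenate(load_features, factor_features, nmi_values)
```

The resulting vector is what would be passed to the support vector regression model, for example an instance of scikit-learn's `sklearn.svm.SVR` trained on vectors built the same way. Factors that share more information with the load receive proportionally larger weights in the fused vector.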

6. A power load forecasting system based on the correlation of external influencing factors, characterized by comprising:
an acquisition module, configured to acquire raw data, the raw data including historical power load data within a historical time period and external influencing factor data affecting power load forecasting;
a quantitative analysis module, configured to quantitatively analyze the nonlinear correlation between the historical power load data and the external influencing factor data using a correlation analysis method that combines an optimized k-means clustering model with standard mutual information, to obtain correlation analysis results, wherein the optimized k-means clustering model is a k-means clustering model whose initial k value is optimized using the average silhouette coefficient;
a feature extraction module, configured to extract features from the historical power load data and the external influencing factor data using a convolutional neural network, to obtain a historical power load data feature vector and an external influencing factor data feature vector;
a prediction module, configured to input the historical power load data feature vector, the external influencing factor data feature vector, and the correlation analysis results into a pre-trained industry power load prediction model, to obtain the industry power load prediction value at the current moment;
wherein the quantitative analysis module includes:
a clustering submodule, configured to cluster the historical power load data and the external influencing factor data respectively based on the optimized k-means clustering model, to obtain the cluster center values corresponding to multiple clusters of the historical power load data and the cluster center values corresponding to multiple clusters of the external influencing factor data;
a first replacement submodule, configured to replace the sampled values in the multiple clusters of the historical power load data with the corresponding cluster center values, to obtain replaced historical power load data;
a second replacement submodule, configured to replace the sampled values in the multiple clusters of the external influencing factor data with the corresponding cluster center values, to obtain replaced external influencing factor data;
a calculation submodule, configured to calculate the standard mutual information from the replaced historical power load data and the replaced external influencing factor data, to obtain the correlation analysis results, which include the calculated standard mutual information value.
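
The data flow between the four modules of claim 6 can be sketched as a small composition of callables. The class name, field names, and type signatures are hypothetical, and the stub modules in the example merely stand in for the real acquisition, analysis, feature extraction, and prediction logic.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple

Vector = List[float]


@dataclass
class LoadForecastSystem:
    # Acquisition module: returns (load history, one series per external factor).
    acquire: Callable[[], Tuple[Vector, List[Vector]]]
    # Quantitative analysis module: returns one correlation value per factor.
    analyze: Callable[[Vector, List[Vector]], Vector]
    # Feature extraction module: returns (load features, per-factor features).
    extract: Callable[[Vector, List[Vector]], Tuple[Vector, List[Vector]]]
    # Prediction module: maps features and correlation values to a forecast.
    predict: Callable[[Vector, List[Vector], Vector], float]

    def run(self) -> float:
        load, factors = self.acquire()
        correlation = self.analyze(load, factors)
        load_vec, factor_vecs = self.extract(load, factors)
        return self.predict(load_vec, factor_vecs, correlation)


# Stub modules, for illustration only.
system = LoadForecastSystem(
    acquire=lambda: ([1.0, 2.0], [[3.0, 4.0]]),
    analyze=lambda load, factors: [0.5 for _ in factors],
    extract=lambda load, factors: (load, factors),
    predict=lambda load_vec, factor_vecs, corr: sum(load_vec) + corr[0] * sum(factor_vecs[0]),
)
forecast = system.run()
```

Note that the analysis and feature extraction modules both consume the raw data independently, and their outputs meet only in the prediction module.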

7. The system of claim 6, wherein the clustering submodule includes:
an input unit, configured to input the sampled values of the historical power load data and of the external influencing factor data within the historical time period;
a clustering unit, configured to perform one-dimensional clustering on the sampled values to generate the k clusters corresponding to the current k value in the k-means clustering model;
a first calculation unit, configured to calculate the silhouette coefficient of the sampled values;
a second calculation unit, configured to calculate the average silhouette coefficient corresponding to the current k value from the silhouette coefficients;
an incrementing unit, configured to increment the current k value by 1 to obtain an updated k value;
a first judgment unit, configured to, if the updated k value is greater than the upper limit k_max, take the k value corresponding to the maximum of all average silhouette coefficients as the optimal number of clusters k0, and take the clustering result obtained with k0 clusters as the final clustering result, so as to obtain the cluster center values corresponding to multiple clusters of the historical power load data and the cluster center values corresponding to multiple clusters of the external influencing factor data;
a second judgment unit, configured to, if the updated k value is not greater than the upper limit k_max, return to and re-execute the step of performing one-dimensional clustering on the sampled values.

8. The system of claim 6, wherein the pre-trained industry power load prediction model includes a support vector regression model, and the prediction module includes:
a standard mutual information value normalization submodule, configured to normalize the standard mutual information value to obtain a normalized standard mutual information value;
a weighted summation submodule, configured to perform a weighted summation of the external influencing factor data feature vectors, using the normalized standard mutual information value as the weight, to obtain a fused vector;
a feature vector concatenation submodule, configured to concatenate the historical power load data feature vector with the fused vector to obtain a concatenated vector;
a power load prediction submodule, configured to perform prediction on the concatenated vector using the support vector regression model, to obtain the industry power load prediction value at the current moment.