Industrial enterprise power load prediction method and device, electronic equipment and storage medium
By decomposing the power load data of industrial enterprises, constructing features from it, and predicting with a gradient boosting model whose hyperparameters are tuned by Bayesian optimization, the method addresses the low accuracy of power load prediction in existing technologies and achieves more accurate and faster predictions.
Patent Information
- Authority / Receiving Office
- CN · China
- Patent Type
- Patents(China)
- Current Assignee / Owner
- CHANGSHA UNIVERSITY OF SCIENCE AND TECHNOLOGY
- Filing Date
- 2021-12-13
- Publication Date
- 2026-06-23
AI Technical Summary
Existing technologies struggle to effectively capture the temporal correlation of power load data from industrial enterprises, resulting in low accuracy in power load forecasting.
The measured power load dataset is decomposed using empirical mode decomposition and variational mode decomposition to generate multiple fluctuation subsequences. An input feature set is constructed for each subsequence by combining it with load influencing factor data, and a gradient boosting (XGBoost) model is used for prediction, with its hyperparameters tuned by Bayesian optimization to improve prediction accuracy.
By deeply mining the time series patterns of load data, the accuracy and speed of power load forecasting for industrial enterprises have been significantly improved.
Figure CN115374988B_ABST
Abstract
Description
Technical Field
[0001] This application relates to the field of power technology, and in particular to a method, apparatus, electronic device, and readable storage medium for predicting power load in industrial enterprises.
Background Technology
[0002] With the rapid development of China's industry, the share of electricity consumed by industrial enterprises in China's power grid has increased significantly. As a result, the electricity consumption behavior of industrial enterprises has a significant impact on the supply and demand balance of the power system.
[0003] To alleviate the imbalance between power supply and demand in the power system, short-term power load forecasting technology for industrial enterprises has emerged. Short-term industrial load forecasting refers to predicting the power load of industrial enterprises for the next 1 to 7 days. Only by considering the relevant influencing factors of industrial enterprise load and adopting appropriate load forecasting methods can the load demand of industrial users be predicted in a targeted manner. In modern society, the relationship between power load and its exogenous factors is becoming increasingly complex, rendering traditional forecasting methods ineffective. Related technologies employ artificial intelligence forecasting methods for load forecasting. Compared to the entire power grid, industrial enterprises have smaller power loads and more random electricity consumption behaviors, making it more difficult for existing forecasting methods to effectively capture the temporal correlation of industrial user load data, thus affecting the accuracy of power load forecasting.
Summary of the Invention
[0004] This application provides a method, apparatus, electronic device, and readable storage medium for predicting the power load of industrial enterprises, thereby improving the accuracy of power load prediction for industrial enterprises.
[0005] To address the aforementioned technical problems, the embodiments of the present invention provide the following technical solutions:
[0006] One embodiment of the present invention provides a method for predicting the power load of industrial enterprises, comprising:
[0007] The measured data set of power load of industrial enterprises to be predicted is decomposed to obtain multiple fluctuation subsequences;
[0008] Based on the load influencing factors data and the load data feature set corresponding to each fluctuation subsequence, a corresponding input feature set is generated for each fluctuation subsequence.
[0009] For each fluctuation subsequence, the constructed prediction sub-model is used to predict the target features in the corresponding input feature set;
[0010] The predicted load value of the industrial enterprise to be predicted is determined based on the prediction results of each prediction sub-model.
[0011] Optionally, the measured data set of the power load of the industrial enterprise to be predicted is decomposed to obtain multiple fluctuation subsequences, including:
[0012] The number of decomposition levels of the measured power load dataset is calculated using the empirical mode decomposition method.
[0013] The number of decomposition levels is assigned to the variational mode decomposition method, and the variational mode decomposition method is used to decompose the measured power load dataset to obtain multiple fluctuation subsequences.
[0014] Optionally, before decomposing the measured data set of the power load of the industrial enterprise to be predicted, the method further includes:
[0015] Collect active power measurement data of the industrial enterprise to be predicted within a preset time period according to the preset sampling frequency;
[0016] If there are missing data in the active power measurement data, obtain the historical active power measurement data at the same time in the previous cycle corresponding to the missing data.
[0017] The missing data is filled in using the historical active power measurement data to generate the measured power load dataset.
[0018] Optionally, the step of generating a corresponding input feature set for each fluctuation subsequence based on the load influencing factor data and the load data feature set corresponding to each fluctuation subsequence includes:
[0019] Obtain load influencing factor data, and construct load influencing factor characteristics according to the data type of the load influencing factor data;
[0020] The sliding window method is used to construct a load data feature set for each fluctuation subsequence;
[0021] For each fluctuation subsequence, the load influencing factor features are added to the load data feature set of the current fluctuation subsequence to generate the corresponding input feature set.
[0022] Optionally, constructing load influencing factor characteristics based on the data type of the load influencing factor data includes:
[0023] The load influencing factors data include production equipment start-up and shutdown time data, calendar rule data, and temperature data;
[0024] The one-hot encoding method is used to process the start-up and stop time period data of the production equipment and the calendar rule data respectively to obtain the start-up and stop characteristics and time characteristics of the production equipment.
[0025] The temperature data were processed using cubic spline interpolation to obtain temperature characteristics.
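The cubic spline step in [0025] can be illustrated with a minimal pure-Python sketch. The text does not state which boundary condition is used, so the natural boundary condition (zero second derivative at both ends) is an assumption here, and the function names `natural_cubic_spline` and `upsample_hourly` are hypothetical.

```python
from bisect import bisect_right


def natural_cubic_spline(xs, ys):
    """Return a function evaluating the natural cubic spline through (xs, ys).

    Assumption: natural boundary condition (second derivative is 0 at both ends).
    xs must be strictly increasing and contain at least three knots.
    """
    n = len(xs)
    if n < 3:
        raise ValueError("need at least three knots")
    h = [xs[i + 1] - xs[i] for i in range(n - 1)]

    # Solve the tridiagonal system for the second derivatives m[1..n-2]
    # with the Thomas algorithm; m[0] and m[n-1] stay 0.
    m = [0.0] * n
    c_prime = [0.0] * n
    d_prime = [0.0] * n
    for i in range(1, n - 1):
        a = h[i - 1]
        b = 2.0 * (h[i - 1] + h[i])
        c = h[i]
        d = 6.0 * ((ys[i + 1] - ys[i]) / h[i] - (ys[i] - ys[i - 1]) / h[i - 1])
        denom = b - a * c_prime[i - 1]
        c_prime[i] = c / denom
        d_prime[i] = (d - a * d_prime[i - 1]) / denom
    for i in range(n - 2, 0, -1):
        m[i] = d_prime[i] - c_prime[i] * m[i + 1]

    def evaluate(x):
        # Locate the interval [xs[i], xs[i+1]] containing x (clamped at the ends).
        i = min(max(bisect_right(xs, x) - 1, 0), n - 2)
        left, right = xs[i + 1] - x, x - xs[i]
        return (
            m[i] * left ** 3 / (6.0 * h[i])
            + m[i + 1] * right ** 3 / (6.0 * h[i])
            + (ys[i] / h[i] - m[i] * h[i] / 6.0) * left
            + (ys[i + 1] / h[i] - m[i + 1] * h[i] / 6.0) * right
        )

    return evaluate


def upsample_hourly(temps, steps_per_hour=4):
    """Interpolate hourly temperatures to steps_per_hour points per hour."""
    hours = [float(i) for i in range(len(temps))]
    spline = natural_cubic_spline(hours, temps)
    n_out = (len(temps) - 1) * steps_per_hour + 1
    return [spline(j / steps_per_hour) for j in range(n_out)]
```

With `steps_per_hour=4`, hourly readings become 15-minute values, matching the 15-minute load sampling interval described later in the document. The interpolated curve passes through every original hourly reading.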
[0026] Optionally, the step of predicting the target features in the corresponding input feature set using the constructed prediction sub-model for each fluctuation sub-sequence includes:
[0027] According to the preset feature removal rules, feature removal processing is performed on each input feature set to obtain the target feature set;
[0028] For each fluctuation subsequence, the corresponding target feature set is split into training sample data, test sample data, and validation sample data; a Bayesian optimized gradient boosting model is trained using the training sample data, and validated using the validation sample data to obtain the target model parameters of the gradient boosting model, so as to determine the prediction sub-model of the corresponding fluctuation subsequence based on the target model parameters.
[0029] For each fluctuation subsequence, the corresponding test sample data is input into the corresponding prediction sub-model for short-term load forecasting.
[0030] Optionally, the step of performing feature removal processing on each input feature set according to a preset feature removal rule to obtain a target feature set includes:
[0031] Calculate the contribution of each input feature in each input feature set separately;
[0032] For each input feature set, the contribution of each input feature is accumulated in descending order of contribution until the sum of the contributions is greater than the preset accumulated contribution threshold. Input features that have not been accumulated are deleted to obtain the corresponding initial target feature set.
[0033] For each initial target feature set, the correlation coefficient of each feature group consisting of two input features is calculated, and feature groups with correlation coefficients greater than a preset correlation coefficient threshold are removed to obtain the target feature set.
[0034] Another embodiment of the present invention provides an industrial enterprise power load forecasting device, comprising:
[0035] The data decomposition module is used to decompose the measured data set of the power load of the industrial enterprise to be predicted into multiple fluctuation subsequences.
[0036] The input feature construction module is used to generate a corresponding input feature set for each fluctuation subsequence based on the load influencing factor data and the load data feature set corresponding to each fluctuation subsequence.
[0037] The subsequence prediction module is used to predict the target features in the corresponding input feature set for each fluctuation subsequence using the constructed prediction sub-model;
[0038] The short-term load forecasting module is used to determine the load forecast value of the industrial enterprise under test based on the forecast results of each forecasting sub-model.
[0039] This invention also provides an electronic device, including a processor, which executes a computer program stored in a memory to implement the steps of the industrial enterprise power load forecasting method as described in any of the preceding claims.
[0040] Finally, this embodiment of the invention also provides a readable storage medium storing a computer program that, when executed by a processor, implements the steps of the industrial enterprise power load forecasting method as described in any of the preceding embodiments.
[0041] The advantage of the technical solution provided in this application lies in its ability to decompose nonlinear, time-varying, and non-stationary industrial enterprise load data into multiple fluctuating sub-sequences. This effectively captures the temporal correlation of industrial enterprise user load data and allows for deeper analysis of the regularity of the load data time series. Constructing a corresponding prediction model for each fluctuating sub-sequence and accumulating the prediction results of each model to obtain the final power load prediction result can significantly improve the accuracy and speed of power load prediction.
[0042] Furthermore, embodiments of the present invention also provide corresponding implementation devices, electronic devices, and readable storage media for the power load forecasting method for industrial enterprises, further making the method more practical. The devices, electronic devices, and readable storage media have corresponding advantages.
[0043] It should be understood that the above general description and the following detailed description are merely exemplary and do not limit this disclosure.
Attached Figure Description
[0044] To more clearly illustrate the technical solutions of the embodiments of the present invention or related technologies, the drawings used in the description of the embodiments or related technologies will be briefly introduced below. Obviously, the drawings described below are only some embodiments of the present invention. For those skilled in the art, other drawings can be obtained based on these drawings without creative effort.
[0045] Figure 1 is a framework diagram of an industrial enterprise power load forecasting method provided in an embodiment of the present invention;
[0046] Figure 2 is a flowchart illustrating an industrial enterprise power load forecasting method provided in an embodiment of the present invention;
[0047] Figure 3 is a schematic diagram illustrating a short-term power load forecast comparison in an illustrative scenario provided by an embodiment of the present invention;
[0048] Figure 4 is a structural diagram of a specific embodiment of the industrial enterprise power load forecasting device provided in this invention;
[0049] Figure 5 is a structural diagram of a specific embodiment of the electronic device provided in this invention.
Detailed Implementation
[0050] To enable those skilled in the art to better understand the present invention, the invention will be further described in detail below with reference to the accompanying drawings and specific embodiments. Obviously, the described embodiments are merely some embodiments of the present invention, and not all embodiments. Based on the embodiments of the present invention, all other embodiments obtained by those skilled in the art without creative effort are within the scope of protection of the present invention.
[0051] The terms "first," "second," "third," "fourth," etc., in the specification, claims, and accompanying drawings of this application are used to distinguish different objects, not to describe a specific order. Furthermore, the terms "comprising" and "having," and any variations thereof, are intended to cover non-exclusive inclusion. For example, a process, method, system, product, or apparatus that includes a series of steps or units is not limited to the listed steps or units, but may include steps or units not listed.
[0052] After introducing the technical solutions of the embodiments of the present invention, the various non-limiting embodiments of this application will be described in detail below.
[0053] First, see Figure 1, which is a framework diagram of an industrial enterprise power load forecasting method provided by an embodiment of the present invention. The embodiment of the present invention may include the following:
[0054] S101: Decompose the measured data set of the power load of the industrial enterprise to be predicted to obtain multiple fluctuation subsequences.
[0055] In this embodiment, the measured power load dataset consists of power load data from the industrial enterprise to be predicted at certain times within a certain period. The specific time period and times of the industrial enterprise to be predicted for power load data collection can be flexibly selected according to actual conditions, and this application does not impose any limitations on this. For example, this step can collect active power measurement data of the industrial enterprise to be predicted within a preset time period, such as six months, at a preset sampling frequency, such as every 15 minutes per day, and construct the measured power load dataset from these active power measurement data. This step can employ any signal decomposition method, such as empirical mode decomposition or variational mode decomposition, to decompose the data in the measured power load dataset that conforms to the time series, obtaining multiple fluctuating subsequences.
[0056] S102: Based on the load influencing factor data and the load data feature set corresponding to each fluctuation subsequence, generate a corresponding input feature set for each fluctuation subsequence.
[0057] The load influencing factor data in this embodiment refers to some external factors affecting the power load of industrial enterprises, such as, but not limited to, production equipment start-up and shutdown time data, calendar rule data, and temperature data, where the temperature data is the daily air temperature value. The load data feature set is composed of the load data corresponding to the fluctuation subsequence, that is, the load data feature set is generated by extracting the load measured data features of the fluctuation subsequence. The input feature set of each fluctuation subsequence is composed of the load influencing factor data and the load data feature set.
[0058] S103: For each fluctuation subsequence, use the constructed prediction sub-model to predict the target features in the corresponding input feature set.
[0059] In this step, to adapt to different data types, the prediction sub-model needs to be trained before each prediction. Each fluctuation subsequence corresponds to a prediction sub-model, which is trained from features in the input feature set of the corresponding fluctuation subsequence. The target features in this step refer to features selected from the input feature set. These features are used simultaneously for training the model, validating the model, and for testing data. The test data refers to the test sample data used to evaluate the performance of the power load prediction method for industrial enterprises. The data used for training the model refers to the training sample data used during the training process of the prediction sub-model, and the data used for validating the model refers to the validation sample data used during the validation process of the prediction sub-model.
[0060] S104: Determine the load forecast value of the industrial enterprise to be predicted based on the prediction results of each prediction sub-model.
[0061] In this step, the sum of the prediction results output by the prediction sub-models in the previous step can be used as the final predicted value of the power load of the industrial enterprise to be predicted.
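The summation in S104 can be sketched as follows. This is a minimal illustration that assumes each sub-model returns a forecast of the same length; the function name `combine_predictions` is hypothetical.

```python
def combine_predictions(sub_predictions):
    """Element-wise sum of the forecasts produced for each fluctuation subsequence.

    sub_predictions: list of equal-length lists, one per prediction sub-model.
    """
    lengths = {len(p) for p in sub_predictions}
    if len(lengths) != 1:
        raise ValueError("all sub-model forecasts must have the same length")
    # zip(*...) groups the values that belong to the same forecast time step.
    return [sum(values) for values in zip(*sub_predictions)]
```

For example, three sub-models forecasting two time steps each produce a two-step load forecast.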
[0062] In the technical solution provided by this invention, the nonlinear, time-varying, and non-stationary industrial enterprise load data is decomposed to obtain multiple fluctuating sub-sequences. This effectively captures the temporal correlation of industrial enterprise user load data and allows for deeper exploration of the regularity of the load data time series. A corresponding prediction model is constructed for each fluctuating sub-sequence, and the prediction results of each prediction model are accumulated to obtain the final power load prediction result, which can greatly improve the prediction accuracy and speed of power load.
[0063] As an optional implementation, since decomposing the measured power load dataset with the variational mode decomposition method requires the number of decomposition levels to be set in advance, this embodiment also provides an implementation that determines this value automatically, eliminating the need to predefine the number of modes and thereby improving overall efficiency and user experience. As shown in Figure 2, the adaptive variational mode decomposition process may include: first, calculating the number of decomposition levels of the measured power load dataset using the empirical mode decomposition method; then assigning this number to the variational mode decomposition method; and finally using the variational mode decomposition method to decompose the measured power load dataset into multiple fluctuation subsequences. For example, if the empirical mode decomposition method gives 16 decomposition levels for the measured power load dataset, assigning 16 to the variational mode decomposition method achieves adaptive decomposition of the load data and outputs 16 amplitude-modulated, frequency-modulated subsequences.
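The control flow of this adaptive decomposition can be sketched with the two decomposition routines passed in as callables. This is only an outline: `emd` and `vmd` are hypothetical stand-ins for whatever EMD and VMD implementations are used, and taking the number of components returned by EMD as the level count K is an assumption (the text does not say whether the residue is counted).

```python
from typing import Callable, List, Sequence

Series = Sequence[float]


def adaptive_decompose(
    load: Series,
    emd: Callable[[Series], List[List[float]]],
    vmd: Callable[[Series, int], List[List[float]]],
) -> List[List[float]]:
    """Use EMD only to choose K, then run VMD with that K."""
    k = len(emd(load))  # assumption: K = number of components EMD returns
    if k < 1:
        raise ValueError("EMD returned no components")
    modes = vmd(load, k)
    if len(modes) != k:
        raise ValueError("VMD did not return the requested number of modes")
    return modes


# Toy stand-ins so the sketch runs; they are NOT real EMD/VMD algorithms.
def fake_emd(series):
    return [list(series)] * 3


def fake_vmd(series, k):
    return [[v / k for v in series] for _ in range(k)]


modes = adaptive_decompose([3.0, 6.0, 9.0], fake_emd, fake_vmd)
```

Keeping the two algorithms behind callables makes the design choice visible: EMD contributes only the mode count, and all subsequences used downstream come from VMD.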
[0064] Equipment start-up or shutdown, malfunctions, or data acquisition errors can leave the collected power load measurement dataset with missing data at one or more moments. To improve load forecasting, these missing data points can be filled in. For example, when active power measurement data is used as the power load measurement dataset and missing data exists, the historical active power measurement data for the same moment in the previous cycle is obtained and used to fill in the missing data, generating the power load measurement dataset. For instance, for measured load data from an industrial enterprise in Changsha, China, from November 1, 2017 to June 31, 2019, with a 15-minute sampling interval, missing data are identified and filled with the load values from the same moment in the previous week.
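A minimal sketch of this filling rule is shown below. It assumes that missing samples are represented as `None` and that the series is evenly sampled, so "the same moment in the previous cycle" is a fixed index offset; the names are hypothetical.

```python
POINTS_PER_WEEK = 7 * 24 * 4  # 15-minute sampling -> 672 points per week


def fill_missing_from_previous_cycle(values, period=POINTS_PER_WEEK):
    """Replace None entries with the value one period earlier, when available."""
    filled = list(values)
    for i, value in enumerate(filled):
        if value is None:
            j = i - period
            # The earlier value may itself have been filled on a previous pass
            # of this loop, so consecutive missing cycles are also covered.
            if j >= 0 and filled[j] is not None:
                filled[i] = filled[j]
    return filled
```

Gaps in the first cycle have no earlier reference and are left as `None`; how the original method handles that case is not stated in the text.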
[0065] In the above embodiments, there is no limitation on how to perform step S102. One implementation method in this embodiment may include the following steps:
[0066] Before executing step S102, load influencing factor data can be collected according to the actual application scenario. To facilitate subsequent model training and identification, load influencing factor features can be constructed according to the data type of the collected data. For example, as shown in Figure 2, when the load influencing factor data include production equipment start-up and shutdown time period data, calendar rule data, and temperature data, the two discrete factors (start-up and shutdown time periods and calendar rules) can each be processed with one-hot encoding to obtain the start-up and shutdown features and the time features. Temperature data can be processed with cubic spline interpolation to obtain temperature features. For example, if 24-hour daily temperature data with a sampling interval of 1 hour is obtained from the Resource and Environmental Science Data Center of the Chinese Academy of Sciences, cubic spline interpolation can be used to fill in the intermediate values, yielding data with a sampling interval of 15 minutes. The construction method for the load influencing factor features is given in Table 1: the production equipment start-up feature O is 1 during start-up periods and 0 otherwise; the production equipment shutdown feature S is 1 during shutdown periods and 0 otherwise; the weekend feature W is 1 on weekends and 0 otherwise; the holiday feature H is 1 on holidays and 0 otherwise; and the temperature feature T is formed by filling the temperature data with cubic spline interpolation. After each load influencing factor has been processed according to Table 1, the load influencing factor features can be expressed as I = [T, W, H, O, S], where T, W, H, O, and S are the temperature, weekend, holiday, production equipment start-up period, and production equipment shutdown period input features, respectively.
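The 0/1 features W, H, O, and S described above can be sketched for a single timestamp as follows. The representation of the inputs is an assumption: holidays are given as a set of dates, and start-up and shutdown periods as daily clock-time ranges that include the start and exclude the end. The function name is hypothetical.

```python
from datetime import date, datetime, time


def indicator_features(ts, holidays, startup_periods, shutdown_periods):
    """Return the (W, H, O, S) flags for timestamp ts.

    holidays: set of datetime.date objects (assumed input format).
    startup_periods / shutdown_periods: lists of (start, end) datetime.time
    pairs, start inclusive and end exclusive (assumed input format).
    """
    clock = ts.time()
    w = 1 if ts.weekday() >= 5 else 0  # Saturday = 5, Sunday = 6
    h = 1 if ts.date() in holidays else 0
    o = 1 if any(a <= clock < b for a, b in startup_periods) else 0
    s = 1 if any(a <= clock < b for a, b in shutdown_periods) else 0
    return w, h, o, s


# Hypothetical schedule used only for illustration.
startup = [(time(8, 0), time(9, 0))]
shutdown = [(time(17, 0), time(18, 0))]
flags = indicator_features(datetime(2019, 6, 22, 8, 30), set(), startup, shutdown)
```

Appending the interpolated temperature T to these four flags gives one row of I = [T, W, H, O, S].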
[0067] For each fluctuation subsequence, a load data feature set D can be constructed using the sliding window method. The corresponding input feature set is generated by adding the load influencing factor features I to the load data feature set of the current fluctuation subsequence. The load data features D and the relevant influencing factor features I of each subsequence together constitute the input features $F_k$, as shown in Table 2. In Table 2, $L_{t-a}$ denotes the load value at the a-th sampling point before the prediction time t, and $A_{t-b}$ denotes the average load over the b sampling points before the prediction time t.
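The two kinds of sliding-window features, $L_{t-a}$ and $A_{t-b}$, can be sketched as below. Which values of a and b the method uses is given in Table 2, which is not reproduced here, so the lags and window sizes in this example are placeholders; the function name is hypothetical.

```python
def window_features(series, t, lags, mean_windows):
    """Build lagged-load and mean-load features for prediction index t.

    L_t-a is the load a samples before t; A_t-b is the mean of the b samples
    immediately before t (index t itself is excluded).
    """
    longest = max(list(lags) + list(mean_windows))
    if t - longest < 0:
        raise ValueError("not enough history before index t")
    features = {}
    for a in lags:
        features[f"L_t-{a}"] = series[t - a]
    for b in mean_windows:
        window = series[t - b:t]
        features[f"A_t-{b}"] = sum(window) / b
    return features


feats = window_features([10.0, 20.0, 30.0, 40.0, 50.0], t=4, lags=[1, 2], mean_windows=[2, 4])
```

Sliding t across a subsequence yields one feature row per prediction time, to which the influencing factor features I are appended.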
[0068] Table 1. Construction Method of Load Influencing Factors Characteristics
[0069]
[0070] Table 2 Input Feature Set Table
[0071]
[0072] In this embodiment, the impact of temperature, production equipment start-up and shutdown times, weekends, and holidays on the electricity consumption behavior of industrial enterprises is considered during the load forecasting process. The relevant influencing factors are transformed into input features of the model, which can more realistically predict the actual electricity consumption and improve the accuracy of power load forecasting for industrial enterprises.
[0073] In the above embodiments, there is no limitation on how to perform step S103. One implementation method in this embodiment may include the following steps:
[0074] According to the preset feature removal rules, feature removal processing is performed on each input feature set to obtain the target feature set;
[0075] For each fluctuation subsequence, the corresponding target feature set is split into training sample data, test sample data, and validation sample data. A Bayesian-optimized gradient boosting model is trained using the training sample data and validated using the validation sample data to obtain the target model parameters. These target model parameters are then used to determine the prediction sub-model for the corresponding fluctuation subsequence. The target model parameters are the optimal parameters of the gradient boosting model. The Bayesian optimization algorithm primarily optimizes five hyperparameters of the gradient boosting model: learning rate, maximum tree depth, sample sampling rate, feature sampling rate, and minimum leaf node sample weights.
[0076] For each fluctuation subsequence, the corresponding test sample data is input into the corresponding prediction sub-model for short-term load forecasting.
[0077] In this embodiment, an XGBoost (Extreme Gradient Boosting) model is built for the selected input features and used for prediction. During this process, the hyperparameters of the XGBoost model are optimized using a Bayesian optimization algorithm. The prediction results of the models are accumulated to obtain the load prediction value. This may include the following steps:
[0078] Step 4-1: Use 30% of the selected input features as test sample data $F''_{Xk}$. The remaining 70% is divided into training sample data $F''_{Tk}$ and validation sample data $F''_{Vk}$ using five-fold cross-validation.
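The 30/70 split with five-fold cross-validation can be sketched on sample indices. The text does not say whether the split is chronological or random; this sketch assumes a chronological split (the last 30% is held out as test data) and contiguous validation folds. The function name is hypothetical.

```python
def split_samples(n_samples, test_fraction=0.3, n_folds=5):
    """Return ([(train_idx, val_idx), ...], test_idx) as lists of indices."""
    n_test = round(n_samples * test_fraction)
    n_dev = n_samples - n_test
    dev_idx = list(range(n_dev))              # used for training and validation
    test_idx = list(range(n_dev, n_samples))  # assumption: most recent samples

    folds = []
    fold_size, extra = divmod(n_dev, n_folds)
    start = 0
    for f in range(n_folds):
        # The first `extra` folds take one additional sample each.
        stop = start + fold_size + (1 if f < extra else 0)
        val_idx = dev_idx[start:stop]
        train_idx = dev_idx[:start] + dev_idx[stop:]
        folds.append((train_idx, val_idx))
        start = stop
    return folds, test_idx


folds, test_idx = split_samples(100)
```

Each fold pairs one validation slice with the remaining development samples as training data, so every non-test sample is used for validation exactly once.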
[0079] Step 4-2: Use the training sample data $F''_{Tk}$ from step 4-1 to train the Bayesian-optimized XGBoost model, and use the validation sample data $F''_{Vk}$ from step 4-1 for validation to obtain the optimal XGBoost parameters and thus the trained XGBoost model.
[0080] Set the candidate set S of hyperparameters. The learning rate ranges over [0.001, 0.2], the maximum tree depth over {5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15}, the sample sampling rate over {0.7, 0.8, 0.9}, the feature sampling rate over {0.7, 0.8, 0.9}, and the minimum leaf node sample weight sum over {1, 2, 3, 4, 5, 6}. Set the running parameters: silent mode is set to 1, the number of threads is set to -1, the random seed is set to 619, the number of training iterations is set to 20000, and earlystopping_rounds is used so that training stops early if there is no improvement after 200 consecutive iterations. The training sample data $F''_{Tk}$ from step 4-1 is input to the XGBoost model for training. In each training iteration, the hyperparameters of the XGBoost model are randomly selected from the candidate set S, and training yields the predicted values of the trained model. The validation sample data $F''_{Vk}$ from step 4-1 is then used to verify the accuracy of these predicted values, measured by the mean absolute error (MAE).
[0081] $$\mathrm{MAE} = \frac{1}{n}\sum_{i=1}^{n}\left|\hat{y}_{i} - y_{Vi}\right|$$
[0082] In the formula, n is the number of samples; $\hat{y}_{i}$ and $y_{Vi}$ are, respectively, the value predicted by the trained model and the true value from the validation samples at sample point i. The MAE is evaluated until the algorithm reaches the required number of iterations, at which point the optimization ends and the hyperparameter combination $x_i$ that minimizes the MAE is output. Finally, the optimal hyperparameter combination $x_i$ of the XGBoost model for each subsequence is obtained, as shown in Table 3, giving the trained XGBoost model for each subsequence.
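The candidate set S and the MAE-based selection can be sketched as follows. Plain random sampling is used here as a simple stand-in for the Bayesian optimization algorithm; it is not the optimizer the method describes. The mapping of the five hyperparameters to XGBoost parameter names (in particular `colsample_bytree` for the feature sampling rate) is an assumption, and `evaluate` is a hypothetical callback that would train a model with the given parameters and return its validation MAE.

```python
import random

# Candidate set S from [0080]; a tuple is a continuous range, a list is discrete.
SEARCH_SPACE = {
    "learning_rate": (0.001, 0.2),
    "max_depth": list(range(5, 16)),
    "subsample": [0.7, 0.8, 0.9],
    "colsample_bytree": [0.7, 0.8, 0.9],  # assumed name for feature sampling rate
    "min_child_weight": [1, 2, 3, 4, 5, 6],
}


def mean_absolute_error(y_true, y_pred):
    """MAE between validation targets and model predictions."""
    return sum(abs(p - t) for t, p in zip(y_true, y_pred)) / len(y_true)


def sample_params(rng):
    """Draw one hyperparameter combination from SEARCH_SPACE."""
    params = {}
    for name, space in SEARCH_SPACE.items():
        if isinstance(space, tuple):
            params[name] = rng.uniform(*space)
        else:
            params[name] = rng.choice(space)
    return params


def search(evaluate, n_trials, seed=619):
    """Keep the combination with the lowest validation MAE (random search)."""
    rng = random.Random(seed)
    best_params, best_mae = None, float("inf")
    for _ in range(n_trials):
        params = sample_params(rng)
        mae = evaluate(params)
        if mae < best_mae:
            best_params, best_mae = params, mae
    return best_params, best_mae
```

A Bayesian optimizer would replace `sample_params` with a proposal step informed by earlier trials; the bookkeeping of the best MAE stays the same.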
[0083] Table 3 Optimal Hyperparameters of the XGBoost Model
[0084]
[0085]
[0086] Step 4-3: Use the test sample data $F''_{Xk}$ from step 4-1 to test the XGBoost model trained in step 4-2, obtaining the load prediction value for the k-th subsequence. The final prediction value is obtained by summing the load prediction values of the XGBoost models.
[0087] $$\hat{y} = \sum_{k=1}^{K}\hat{y}_{k}$$
[0088] In the formula, $\hat{y}_{k}$ is the value predicted by the XGBoost model from the input features of the k-th variational mode decomposition subsequence, K is the total number of fluctuation subsequences, and k is the index of the fluctuation subsequence.
[0089] As shown above, this embodiment uses the XGBoost model as the prediction sub-model and employs a Bayesian optimization algorithm to select five of its hyperparameters (learning rate, maximum tree depth, sample sampling rate, feature sampling rate, and minimum leaf node sample weight), thus avoiding fitting problems such as overfitting. XGBoost has a tree model structure based on residual learning, which adapts well to time series modeling and better captures the temporal correlations of industrial enterprise user load data. Furthermore, XGBoost supports column sampling, making it more efficient for processing large amounts of load data. This addresses the overfitting, training difficulty, noise sensitivity, and weak large-scale data handling of existing prediction models.
[0090] To further improve the accuracy of model training and enhance the accuracy and efficiency of short-term power load forecasting for industrial enterprises, this application may also perform an elimination operation on each input feature set, which may include:
[0091] Calculate the contribution of each input feature in each input feature set separately;
[0092] For each input feature set, the contribution of each input feature is accumulated in descending order of contribution until the sum of the contributions is greater than the preset accumulated contribution threshold. Input features that have not been accumulated are deleted to obtain the corresponding initial target feature set.
[0093] For each initial target feature set, the correlation coefficient of each feature group consisting of two input features is calculated, and feature groups with correlation coefficients greater than a preset correlation coefficient threshold are removed to obtain the target feature set.
[0094] In this embodiment, the selection or removal of input features is divided into two steps. First, the contribution of each input feature can be calculated using an algorithm such as Random Forest, and the features are sorted in descending order of contribution. The contribution values are then accumulated in that order, and an accumulated contribution threshold is set to remove features with low contribution. For example, with a preset accumulated contribution threshold of 0.99, the Random Forest algorithm is used to calculate the contribution of the input features $F_k$, as shown in Table 4. The features are sorted in descending order of contribution and the contributions are accumulated until the 0.99 threshold is reached; the remaining input features $A_{t-18}$, $A_{t-16}$, and H are removed. The processed input features after removing low-contribution features are:
[0095]
[0096] Table 4 Contribution of input features $F_k$
[0097]
[0098]
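The accumulated-contribution rule can be sketched as below. The contribution values in the example are made up for illustration and are not the values of Table 4; in the described method they would come from an algorithm such as Random Forest. The function name is hypothetical.

```python
def select_by_cumulative_contribution(contributions, threshold=0.99):
    """Keep features, in descending contribution order, until the running sum
    first exceeds the threshold; features not yet accumulated are dropped."""
    ranked = sorted(contributions.items(), key=lambda item: item[1], reverse=True)
    kept, total = [], 0.0
    for name, value in ranked:
        kept.append(name)
        total += value
        if total > threshold:
            break
    return kept


# Illustrative contributions only (not taken from Table 4).
example = {"L_t-1": 0.5, "A_t-4": 0.25, "T": 0.245, "H": 0.004, "A_t-16": 0.001}
kept = select_by_cumulative_contribution(example)
```

Here the first three features already account for more than 0.99 of the total contribution, so the two smallest ones are removed.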
[0099] Second, Pearson correlation coefficient analysis is used to examine the pairwise correlation between input features, and a correlation coefficient threshold is set to eliminate redundant features. For example, the Feature Selector tool in Python can be used to perform Pearson correlation analysis on $F_k$, forming feature groups from pairs of features. With a correlation coefficient threshold of 0.99, the lower-contribution feature in each feature group exceeding the threshold is removed, as shown in Table 5. The processed input features $F_k'$ after removing redundant features are:
[0100]
[0101] Table 5 shows the features removed from feature groups with correlation coefficients exceeding 0.99.
[0102]
[0103]
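The redundancy screening of paragraph [0099] can be sketched without any external tool. This is an illustration under stated assumptions rather than the Feature Selector implementation: the helper names `pearson` and `drop_redundant` are hypothetical, the absolute value of the coefficient is compared with the threshold (an assumption), and every feature column is assumed to be non-constant so that the coefficient is defined.

```python
import math
from itertools import combinations


def pearson(x, y):
    """Pearson correlation coefficient of two equally long, non-constant sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def drop_redundant(columns, contributions, threshold=0.99):
    """For each highly correlated pair, remove the feature with the lower contribution."""
    removed = set()
    for a, b in combinations(columns, 2):
        if a in removed or b in removed:
            continue  # one member of the pair is already gone
        if abs(pearson(columns[a], columns[b])) > threshold:
            removed.add(a if contributions[a] < contributions[b] else b)
    return [name for name in columns if name not in removed]


# Hypothetical feature columns and contribution values for illustration only.
columns = {
    "f1": [1.0, 2.0, 3.0, 4.0, 5.0],
    "f2": [2.0, 4.0, 6.0, 8.0, 10.1],  # almost a multiple of f1
    "f3": [5.0, 1.0, 4.0, 2.0, 3.0],
}
contributions = {"f1": 0.5, "f2": 0.3, "f3": 0.2}
kept = drop_redundant(columns, contributions, threshold=0.99)
print(kept)
```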
[0104] To enable those skilled in the art to more clearly understand the technical solution of this application, an illustrative example is also provided, which may include the following:
[0105] A1: Collect load data from industrial enterprises and fill in the missing data to form dataset A.
[0106] A2: Collect influencing factor data, perform feature construction processing on the influencing factor data, and construct influencing factor feature I.
[0107] A3: Adaptive variational mode decomposition is used to decompose dataset A, resulting in K oscillation subsequences. Input features are constructed for each subsequence, and the input features are selected to remove features with low contribution and redundant features.
[0108] A4: Build XGBoost models for the selected input features and make predictions. During the prediction process, use the Bayesian optimization algorithm to optimize the hyperparameters of the XGBoost models. Accumulate the prediction results of each model to obtain the load prediction value.
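The final accumulation in step A4 amounts to a point-by-point sum of the K sub-model forecasts. A minimal sketch follows; the function name `combine_predictions` and the numbers are hypothetical, and the sub-model forecasts are assumed to have been produced already for the same time steps.

```python
def combine_predictions(sub_predictions):
    """Sum the forecasts of the K sub-models point by point."""
    lengths = {len(p) for p in sub_predictions}
    if len(lengths) != 1:
        raise ValueError("all sub-model forecasts must cover the same time steps")
    return [sum(values) for values in zip(*sub_predictions)]


# Hypothetical forecasts of K = 3 sub-models over three time steps.
sub_predictions = [
    [100.0, 102.0, 101.0],  # low-frequency (trend-like) subsequence
    [5.0, -3.0, 2.0],       # mid-frequency subsequence
    [0.5, 0.25, -0.5],      # high-frequency subsequence
]
load_forecast = combine_predictions(sub_predictions)
print(load_forecast)
```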
[0109] This application uses measured load data from an industrial enterprise in Changsha, China, from November 1, 2017 to June 31, 2019, sampled at 15-minute intervals, to form dataset A in step A1. Based on the method of this embodiment, a power load forecast is performed for June 25, 2019. The comparison between the forecast values and the actual values is shown in Figure 3, which verifies the effectiveness of the technical solution of this application.
[0110] As can be seen from the above, this embodiment applies signal decomposition technology to time series forecasting. It uses variational mode decomposition to decompose the nonlinear, time-varying, and non-stationary load data of industrial enterprises and, combined with XGBoost, constructs multiple predictors to further mine the regularity of the load time series, improving both prediction accuracy and speed. Furthermore, it uses empirical mode decomposition to automatically determine the number of decomposition modes K, which removes the need to predefine K a priori.
[0111] It should be noted that there is no strict order of execution for the steps in this application. As long as they conform to a logical order, these steps can be executed simultaneously or in a certain preset order. Figure 1 and Figure 2 are merely illustrative examples and do not mean that theirs is the only possible execution order.
[0112] This invention also provides a corresponding apparatus for the industrial enterprise power load forecasting method, further enhancing the practicality of the method. The apparatus can be described from both a functional module perspective and a hardware perspective. The industrial enterprise power load forecasting apparatus provided in this invention is described below, and the description of the industrial enterprise power load forecasting apparatus below corresponds to the industrial enterprise power load forecasting method described above.
[0113] From the perspective of functional modules, see Figure 4, which is a structural diagram of an industrial enterprise power load forecasting device provided in an embodiment of the present invention in one specific implementation. The device may include:
[0114] The data decomposition module 401 is used to decompose the measured data set of the power load of the industrial enterprise to be predicted, and obtain multiple fluctuation subsequences.
[0115] The input feature construction module 402 is used to generate a corresponding input feature set for each fluctuation subsequence based on the load influencing factor data and the load data feature set corresponding to each fluctuation subsequence.
[0116] The subsequence prediction module 403 is used to predict the target features in the corresponding input feature set for each fluctuation subsequence using the constructed prediction sub-model.
[0117] The short-term load forecasting module 404 is used to determine the load forecast value of the industrial enterprise to be measured based on the forecast results of each forecasting sub-model.
[0118] Optionally, in some embodiments of this example, the data decomposition module 401 can be used to: calculate the number of decomposition layers of the measured power load dataset using the empirical mode decomposition method; assign the number of decomposition layers to the variational mode decomposition method; and decompose the measured power load dataset using the variational mode decomposition method to obtain multiple fluctuation subsequences.
[0119] Optionally, in some other embodiments of this example, the input feature construction module 402 may be further used to: acquire load influencing factor data, construct load influencing factor features according to the data type of the load influencing factor data; construct a load data feature set for each fluctuation subsequence using the sliding window method; and for each fluctuation subsequence, put the load influencing factor features into the load data feature set of the current fluctuation subsequence to generate a corresponding input feature set.
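The sliding window construction described above can be sketched as follows. The window length of 4, the function name `build_input_features`, and the factor feature values are assumptions chosen for illustration. Each sample combines the previous `window` values of one fluctuation subsequence with the load influencing factor features of the step being predicted.

```python
def build_input_features(subsequence, factor_features, window=4):
    """Return (X, y) for one fluctuation subsequence.

    Each row of X holds the `window` preceding values of the subsequence followed
    by the load influencing factor features of the target time step. y holds the
    target values.
    """
    if len(subsequence) != len(factor_features):
        raise ValueError("one factor feature vector is required per time step")
    X, y = [], []
    for t in range(window, len(subsequence)):
        lags = list(subsequence[t - window:t])
        X.append(lags + list(factor_features[t]))
        y.append(subsequence[t])
    return X, y


# Hypothetical subsequence and constant factor features for illustration only.
sub = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
factors = [[0, 1]] * len(sub)
X, y = build_input_features(sub, factors, window=4)
print(X, y)
```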
[0120] As an optional implementation of the above embodiments, the load influencing factor data includes production equipment start-up and shutdown time period data, calendar rule data and temperature data, and the input feature construction module 402 may further be used for: processing the production equipment start-up and shutdown time period data and the calendar rule data respectively using the one-hot encoding method to obtain production equipment start-up and shutdown features and time features; and processing the temperature data using the cubic spline interpolation method to obtain temperature features.
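The one-hot encoding step can be illustrated with a short sketch. The category lists are assumptions: the day of the week stands in for the calendar rule data and a two-state "off"/"on" label stands in for the equipment start-up and shutdown periods. The function names are hypothetical. Cubic spline interpolation of the temperature data is not shown here.

```python
from datetime import datetime

WEEKDAYS = list(range(7))         # 0 = Monday ... 6 = Sunday, as returned by datetime.weekday()
EQUIPMENT_STATES = ["off", "on"]  # hypothetical labels for the start-up/shutdown state


def one_hot(value, categories):
    """Return a 0/1 list with a single 1 at the position of `value` in `categories`."""
    if value not in categories:
        raise ValueError(f"unknown category: {value!r}")
    return [1 if value == c else 0 for c in categories]


def factor_features(timestamp, equipment_state):
    """Concatenate the weekday and equipment-state encodings for one sampling instant."""
    return one_hot(timestamp.weekday(), WEEKDAYS) + one_hot(equipment_state, EQUIPMENT_STATES)


# June 25, 2019 was a Tuesday, so weekday() returns 1.
features = factor_features(datetime(2019, 6, 25, 8, 15), "on")
print(features)
```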
[0121] As another optional implementation of the above embodiments, the device may further include a raw data acquisition and preprocessing module, which is used to acquire active power measurement data of the industrial enterprise to be predicted within a preset time period according to a preset sampling frequency; if there are missing data in the active power measurement data, the historical active power measurement data at the same time in the previous cycle corresponding to the missing data is obtained; and the missing data is filled in using the historical active power measurement data to generate a power load measured dataset.
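The gap-filling rule can be sketched as below. The sketch assumes that missing samples are represented by `None` and that one cycle is one day, which is 96 samples at a 15-minute sampling interval. Both are assumptions, as is the function name `fill_missing`. The short example uses a cycle of 4 samples for readability.

```python
def fill_missing(values, period=96):
    """Fill each missing sample with the value observed one cycle earlier.

    Samples in the first cycle have no earlier counterpart and are left as None.
    Because the list is processed in order, a value filled earlier can itself
    serve as the source for a later gap.
    """
    filled = list(values)
    for i, v in enumerate(filled):
        if v is None and i >= period and filled[i - period] is not None:
            filled[i] = filled[i - period]
    return filled


# Hypothetical active power readings with two gaps, using a cycle of 4 samples.
raw = [10.0, 11.0, 12.0, 13.0, None, 15.0, 16.0, 17.0, None, 19.0]
dataset = fill_missing(raw, period=4)
print(dataset)
```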
[0122] Optionally, in other embodiments of this example, the subsequence prediction module 403 may further be used to: perform feature removal processing on each input feature set according to a preset feature removal rule to obtain a target feature set; for each fluctuating subsequence, split the corresponding target feature set into training sample data, test sample data, and validation sample data; train a Bayesian-optimized gradient boosting model using the training sample data, and validate it using the validation sample data to obtain the target model parameters of the gradient boosting model, so as to determine the prediction sub-model of the corresponding fluctuating subsequence based on the target model parameters; and input the corresponding test sample data into the corresponding prediction sub-model for short-term load prediction for each fluctuating subsequence.
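The split into training, validation, and test sample data can be sketched as follows. The source does not state the split proportions or ordering. The 70/15/15 percentages, the chronological ordering (chosen so that no future samples are used for training), and the function name `chronological_split` are all assumptions. The Bayesian optimization of the gradient boosting hyperparameters is not shown.

```python
def chronological_split(samples, train_pct=70, val_pct=15):
    """Split time-ordered samples into training, validation, and test parts.

    Percentages are integers so that the split points are computed exactly.
    Whatever remains after the training and validation parts becomes the test part.
    """
    if train_pct + val_pct >= 100:
        raise ValueError("training and validation percentages must leave room for a test part")
    n = len(samples)
    train_end = n * train_pct // 100
    val_end = train_end + n * val_pct // 100
    return samples[:train_end], samples[train_end:val_end], samples[val_end:]


# Hypothetical sample indices standing in for rows of a target feature set.
samples = list(range(20))
train, val, test = chronological_split(samples)
print(len(train), len(val), len(test))
```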
[0123] As an optional implementation of this embodiment, the subsequence prediction module 403 may further include a data elimination unit. This unit is used to calculate the contribution of each input feature in each input feature set; for each input feature set, the contribution of each input feature is accumulated in descending order of contribution until the sum of the contributions is greater than a preset accumulated contribution threshold, and the input features that have not been accumulated are deleted to obtain the corresponding initial target feature set; for each initial target feature set, the correlation coefficient of the feature group formed by each pair of input features is calculated, and the feature group with a correlation coefficient greater than a preset correlation coefficient threshold is eliminated to obtain the target feature set.
[0124] The functions of each functional module of the industrial enterprise power load forecasting device described in this embodiment of the invention can be specifically implemented according to the methods in the above method embodiments. The specific implementation process can be referred to the relevant descriptions in the above method embodiments, which will not be repeated here.
[0125] As can be seen from the above, the embodiments of the present invention can effectively improve the accuracy of power load prediction for industrial enterprises.
[0126] The industrial enterprise power load forecasting device mentioned above is described from the perspective of functional modules. Furthermore, this application also provides an electronic device, which is described from the perspective of hardware. Figure 5 is a schematic diagram of the structure of the electronic device provided in one embodiment of this application. As shown in Figure 5, the electronic device includes a memory 50 for storing a computer program; and a processor 51 for executing the computer program to implement the steps of the industrial enterprise power load forecasting method mentioned in any of the above embodiments.
[0127] The processor 51 may include one or more processing cores, such as a quad-core processor or an octa-core processor. The processor 51 may also be a controller, microcontroller, microprocessor, or other data processing chip. The processor 51 may be implemented using at least one hardware form selected from DSP (Digital Signal Processing), FPGA (Field-Programmable Gate Array), and PLA (Programmable Logic Array). The processor 51 may also include a main processor and a coprocessor. The main processor, also known as a CPU (Central Processing Unit), is used to process data in the wake-up state; the coprocessor is a low-power processor used to process data in the standby state. In some embodiments, the processor 51 may integrate a GPU (Graphics Processing Unit), which is responsible for rendering and drawing the content to be displayed on the screen. In some embodiments, the processor 51 may also include an AI (Artificial Intelligence) processor, which is used to handle computational operations related to machine learning.
[0128] The memory 50 may include one or more computer-readable storage media, which may be non-transitory. The memory 50 may also include high-speed random access memory and non-volatile memory, such as one or more disk storage devices or flash memory devices. In some embodiments, the memory 50 may be an internal storage unit of an electronic device, such as a server hard drive. In other embodiments, the memory 50 may be an external storage device of an electronic device, such as a plug-in hard drive on a server, a Smart Media Card (SMC), a Secure Digital (SD) card, or a Flash Card. Furthermore, the memory 50 may include both internal and external storage units of the electronic device. The memory 50 can be used not only to store application software and various types of data installed on the electronic device, such as the code of a program that executes the industrial enterprise power load forecasting method, but also to temporarily store data that has been output or will be output. In this embodiment, the memory 50 is used to store at least the following computer program 501, which, after being loaded and executed by the processor 51, is capable of implementing the relevant steps of the industrial enterprise power load forecasting method disclosed in any of the foregoing embodiments. In addition, the resources stored in the memory 50 may also include an operating system 502 and data 503, and the storage method may be temporary storage or permanent storage. The operating system 502 may include Windows, Unix, Linux, etc. The data 503 may include, but is not limited to, data corresponding to the power load forecast results of industrial enterprises.
[0129] In some embodiments, the aforementioned electronic device may further include a display screen 52, an input/output interface 53, a communication interface 54 (or network interface), a power supply 55, and a communication bus 56. The display screen 52 and the input/output interface 53, such as a keyboard, are user interfaces; optional user interfaces may also include standard wired interfaces, wireless interfaces, etc. Optionally, in some embodiments, the display may be an LED display, a liquid crystal display, a touch-sensitive liquid crystal display, or an OLED (Organic Light-Emitting Diode) touchscreen, etc. The display may also be appropriately referred to as a display screen or display unit, used to display information processed in the electronic device and to display a visual user interface. The communication interface 54 may optionally include a wired interface and/or a wireless interface, such as a Wi-Fi interface, a Bluetooth interface, etc., typically used to establish communication connections between the electronic device and other electronic devices. The communication bus 56 may be a peripheral component interconnect (PCI) bus or an extended industry standard architecture (EISA) bus, etc. This bus can be divided into an address bus, a data bus, a control bus, etc. For ease of representation, the bus is represented in Figure 5 by a single thick line, but this does not mean that there is only one bus or one type of bus.
[0130] Those skilled in the art will understand that the structure shown in Figure 5 does not constitute a limitation on the electronic device, which may include more or fewer components than shown, such as sensors 57 that perform various functions.
[0131] The functions of each functional module of the electronic device described in the embodiments of the present invention can be specifically implemented according to the methods in the above method embodiments. The specific implementation process can be referred to the relevant descriptions in the above method embodiments, which will not be repeated here.
[0132] As can be seen from the above, the embodiments of the present invention can effectively improve the accuracy of power load prediction for industrial enterprises.
[0133] It is understood that if the industrial enterprise power load forecasting method in the above embodiments is implemented as a software functional unit and sold or used as an independent product, it can be stored in a computer-readable storage medium. Based on this understanding, the technical solution of this application, in essence, or the part that contributes to the prior art, or all or part of the technical solution, can be embodied in the form of a software product. This computer software product is stored in a storage medium and executes all or part of the steps of the methods in the various embodiments of this application. The aforementioned storage medium includes: USB flash drive, mobile hard drive, read-only memory (ROM), random access memory (RAM), electrically erasable programmable ROM, register, hard disk, multimedia card, card-type memory (e.g., SD or DX memory), magnetic memory, removable disk, CD-ROM, magnetic disk or optical disk, and other media capable of storing program code.
[0134] Based on this, embodiments of the present invention also provide a readable storage medium storing a computer program, wherein the computer program, when executed by a processor, performs the steps of the industrial enterprise power load forecasting method described in any of the above embodiments.
[0135] The various embodiments in this specification are described in a progressive manner, with each embodiment focusing on its differences from the other embodiments. For similar or identical parts, the embodiments may be referred to one another. Since the hardware disclosed in the embodiments, including the device and the electronic equipment, corresponds to the methods disclosed in the embodiments, its description is relatively brief; for relevant details, refer to the method section.
[0136] Those skilled in the art will further recognize that the units and algorithm steps of the various examples described in conjunction with the embodiments disclosed herein can be implemented in electronic hardware, computer software, or a combination of both. To clearly illustrate the interchangeability of hardware and software, the components and steps of the various examples have been generally described in terms of functionality in the foregoing description. Whether these functions are implemented in hardware or software depends on the specific application and design constraints of the technical solution. Those skilled in the art can use different methods to implement the described functions for each specific application, but such implementations should not be considered beyond the scope of this invention.
[0137] The foregoing has provided a detailed description of an industrial enterprise power load forecasting method, apparatus, electronic device, and readable storage medium provided in this application. Specific examples have been used to illustrate the principles and implementation methods of the invention. The descriptions of the embodiments above are merely for the purpose of helping to understand the method and core ideas of the invention. It should be noted that those skilled in the art can make various improvements and modifications to this application without departing from the principles of the invention, and these improvements and modifications also fall within the protection scope of the claims of this application.
Claims
1. An industrial enterprise power load forecasting method, characterized in that the method comprises the following steps: decomposing a measured data set of power load of an industrial enterprise to be predicted to obtain a plurality of fluctuation subsequences; obtaining production equipment start-stop period data, calendar rule data and temperature data, and constructing load influencing factor features according to the data types to which the production equipment start-stop period data, the calendar rule data and the temperature data belong; constructing a load data feature set for each fluctuation subsequence by using a sliding window method, and, for each fluctuation subsequence, putting the load influencing factor features into the load data feature set of the current fluctuation subsequence to generate a corresponding input feature set; performing feature elimination processing on each input feature set according to a preset feature elimination rule to obtain a target feature set, comprising: calculating the contribution degree of each input feature in each input feature set; for each input feature set, accumulating the contribution degrees of the input features in descending order until the cumulative sum of the contribution degrees is greater than a preset cumulative contribution degree threshold, and deleting the input features that have not been accumulated, to obtain a corresponding initial target feature set; and, for each initial target feature set, calculating the correlation coefficient of each feature group composed of two input features, and eliminating the feature groups with correlation coefficients greater than a preset correlation coefficient threshold, to obtain the target feature set; for each fluctuation subsequence, predicting the target features in the corresponding input feature set by using the constructed prediction sub-model; and determining the load prediction value of the industrial enterprise to be predicted according to the prediction results of the prediction sub-models.
2. The industrial enterprise power load forecasting method of claim 1, wherein, before decomposing the measured data set of power load of the industrial enterprise to be predicted, the method further comprises the following steps: collecting active power measurement data of the industrial enterprise to be predicted in a preset time period according to a preset sampling frequency; if there is missing data in the active power measurement data, obtaining historical active power measurement data at the same time point in the previous cycle corresponding to the missing data; and filling in the missing data by using the historical active power measurement data to generate the measured data set of power load.
3. The industrial enterprise power load forecasting method of claim 1, wherein constructing the load influencing factor features according to the data types to which the production equipment start-stop period data, the calendar rule data and the temperature data belong comprises the following steps: processing the production equipment start-stop period data and the calendar rule data by using a one-hot encoding method to obtain production equipment start-stop features and time features; and processing the temperature data by using a cubic spline interpolation method to obtain temperature features.
4. The industrial enterprise power load forecasting method according to any one of claims 1 to 3, characterized in that predicting the target features in the corresponding input feature set by using the constructed prediction sub-model for each fluctuation subsequence comprises the following steps: for each fluctuation subsequence, splitting the corresponding target feature set into training sample data, test sample data, and validation sample data; training a Bayesian-optimized gradient boosting model using the training sample data, and validating it using the validation sample data to obtain the target model parameters of the gradient boosting model, so as to determine the prediction sub-model of the corresponding fluctuation subsequence based on the target model parameters; and, for each fluctuation subsequence, inputting the corresponding test sample data into the corresponding prediction sub-model for short-term load forecasting.
5. An industrial enterprise power load forecasting device, characterized by comprising: a data decomposition module, used to decompose the measured data set of the power load of the industrial enterprise to be predicted into multiple fluctuation subsequences; an input feature construction module, used to obtain production equipment start-up and shutdown time period data, calendar rule data, and temperature data, and construct load influencing factor features according to the data types of the production equipment start-up and shutdown time period data, calendar rule data, and temperature data; to construct a load data feature set for each fluctuation subsequence using the sliding window method; and, for each fluctuation subsequence, to put the load influencing factor features into the load data feature set of the current fluctuation subsequence to generate the corresponding input feature set; a subsequence prediction module, used to perform feature removal processing on each input feature set according to preset feature removal rules to obtain a target feature set, wherein the contribution of each input feature in each input feature set is calculated; for each input feature set, the contributions of the input features are accumulated in descending order of contribution until the sum of the contributions exceeds a preset accumulated contribution threshold, and input features that have not been accumulated are deleted to obtain the corresponding initial target feature set; for each initial target feature set, the correlation coefficient of each feature group formed by two input features is calculated, and feature groups with correlation coefficients greater than a preset correlation coefficient threshold are removed to obtain the target feature set; and, for each fluctuation subsequence, the constructed prediction sub-model is used to predict the target features in the corresponding input feature set; and a short-term load forecasting module, used to determine the load forecast value of the industrial enterprise to be predicted based on the forecast results of each prediction sub-model; wherein the data decomposition module is further configured to: calculate the number of decomposition layers of the measured power load dataset using the empirical mode decomposition method, assign the number of decomposition layers to the variational mode decomposition method, and decompose the measured power load dataset using the variational mode decomposition method to obtain multiple fluctuation subsequences.
6. An electronic device, comprising a processor and a memory, the processor being used to execute a computer program stored in the memory to implement the steps of the industrial enterprise power load forecasting method as described in any one of claims 1 to 4.
7. A readable storage medium, characterized in that the readable storage medium stores a computer program that, when executed by a processor, implements the steps of the industrial enterprise power load forecasting method as described in any one of claims 1 to 4.