A method and device for predicting large-scale power consumption load and a computer device
By classifying and clustering dedicated power transformer users at each level, a power load forecasting model is constructed, which solves the problems of low accuracy and high cost in existing power load forecasting technologies and achieves more efficient load forecasting.
Patent Information
- Authority / Receiving Office
- CN · China
- Patent Type
- Patents(China)
- Current Assignee / Owner
- CHINA GRIDCOM
- Filing Date
- 2022-10-31
- Publication Date
- 2026-06-16
AI Technical Summary
In existing technologies, the accuracy of power load prediction models for dedicated power transformer users is low, and the resource and time costs of training the models are high, making it difficult to adapt to the differences in power load patterns among different users.
By classifying and clustering a large number of users step by step, users are grouped into categories with similar electricity load patterns, features strongly correlated with electricity load are constructed, and corresponding electricity load prediction models are trained.
The method improves the accuracy of electricity load forecasting, reduces the number of models that need to be trained, lowers costs, and accelerates model training.
Figure CN115712864B_ABST
Description
Technical Field
[0001] This invention relates to the field of power technology, and in particular to a method, apparatus, and computer equipment for large-scale electricity load prediction.
Background Technology
[0002] Dedicated transformers refer to a mode of power supply that uses a dedicated transformer independently. With the rise of enterprises, residential communities, and other high-power-consuming institutions, the number and scale of dedicated transformers are increasing daily. Load forecasting is one of the foundations for formulating power grid operation plans and carrying out various power grid operation and management tasks. Given the diverse load characteristics of dedicated transformer users and the temporal and spatial differences in their electricity demand, understanding the load characteristics of the distribution network through load forecasting and providing targeted guidance for power grid planning, construction, dispatching, and operation management is particularly necessary under the current development trend of distribution networks.
[0003] In related technologies, the optimal long short-term memory neural network model of the electricity user is generally constructed to predict the user's electricity consumption in the future, so as to obtain the load prediction results of users with different electricity consumption patterns.
[0004] However, since different dedicated power transformer users have different electricity load patterns, the accuracy of the prediction results of the electricity load prediction model in related technologies needs to be improved.
Summary of the Invention
[0005] This invention aims to solve, at least in part, one of the technical problems in related technologies. Therefore, the first objective of this invention is to propose a method for generating large-scale electricity load forecasting data. For a large number of dedicated power transformer users, the method progressively categorizes users into groups with similar electricity load patterns. This ensures that users with similar electricity load patterns are grouped into the same category, reducing the number of load forecasting models that must be trained; it also ensures that user data from the same category fits the load forecasting model more quickly during training, improving forecast accuracy.
[0006] The second objective of this invention is to propose a training method for a large-scale electricity load prediction model, which constructs features strongly correlated with electricity load based on users' basic electricity consumption time series data, thereby improving the accuracy of user load prediction.
[0007] The third objective of this invention is to provide a device for generating large-scale electricity load forecasting data.
[0008] The fourth objective of this invention is to provide a training device for a large-scale electricity load prediction model.
[0009] The fifth objective of this invention is to provide a computer device.
[0010] The sixth object of the present invention is to provide a computer-readable storage medium.
[0011] To achieve the above objectives, one embodiment of the present invention proposes a method for generating large-scale electricity load forecasting data. The method includes: determining a target pre-classification category of a target account and a target electricity cluster in which the target account belongs; wherein the target electricity cluster is one of several electricity clusters corresponding to the target pre-classification category; the several electricity clusters are obtained by clustering the electricity load time-series data of the electricity account corresponding to the target pre-classification category; each electricity cluster corresponds to an electricity load forecasting model; based on the target electricity cluster, searching among the several electricity load forecasting models corresponding to the target pre-classification category to obtain the target electricity load forecasting model corresponding to the target electricity cluster; and using the target electricity load forecasting model to predict the first historical electricity consumption data of the target account to generate the electricity load forecasting data; wherein the first historical electricity consumption data includes the electricity load time-series data of the target account and first reference factor data that can affect the electricity consumption of the target account.
[0012] According to one embodiment of the present invention, the method for determining the target pre-classification category of the target account includes: acquiring second electricity consumption history data of the target account; wherein the second electricity consumption history data includes electricity load time series data of the target account and second reference factor data that can affect the electricity consumption of the target account; the first reference factor data is different from the second reference factor data; inputting the second electricity consumption history data of the target account into a pre-classification time series model for category identification, and determining the target pre-classification category of the target account.
[0013] According to one embodiment of the present invention, before determining the target electricity cluster to which the target account belongs, the method further includes: determining the target electricity cluster from several electricity clusters corresponding to the target pre-classification category based on the electricity load time-series data of the target account; wherein the electricity load pattern of the electricity accounts in the target electricity cluster tends to be the same as or similar to the user load pattern of the target account; and assigning the target account to the target electricity cluster.
[0014] According to one embodiment of the present invention, the electricity load time series data corresponds to a basic feature dimension; the first reference factor data included in the first electricity historical data includes at least one of the following: solar term data in the seasonal feature dimension, holiday data in the holiday feature dimension, and festival data in the ethnic festival feature dimension.
[0015] According to one embodiment of the present invention, the second reference factor data included in the second electricity consumption history data includes at least one of: curve jitter data on the load curve feature dimension, first time-series correlation data on the first duration feature dimension, second time-series correlation data on the second duration feature dimension, and third time-series correlation data on the preset value feature dimension; wherein, the first duration and the second duration are not equal; the first time-series correlation data is obtained by correlation calculation based on the electricity load time-series data within the preset time period and its sub-time-series data within the first duration; the second time-series correlation data is obtained by correlation calculation based on the electricity load time-series data within the preset time period and its sub-time-series data within the second duration; and the third time-series correlation data is obtained by similarity calculation based on the electricity load time-series data within the preset time period and its sub-time-series data within the third duration.
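The second reference factors listed above can be illustrated with a minimal sketch. It assumes, as one possible reading, that the "time-series correlation" for a given duration is the Pearson correlation between the load series and a copy of itself shifted by that duration, and that "curve jitter" is the mean absolute step-to-step change scaled by the mean load level. The function names (`lag_correlation`, `curve_jitter`, `second_reference_factors`) and both definitions are hypothetical and are not taken from the specification; the third (similarity-based) factor is omitted.

```python
from statistics import mean


def lag_correlation(series, lag):
    """Pearson correlation between the series and itself shifted by `lag` steps.

    Assumption: this stands in for the 'time-series correlation data'
    computed for a given duration.
    """
    if lag <= 0 or lag >= len(series) - 1:
        raise ValueError("lag must be between 1 and len(series) - 2")
    a, b = series[:-lag], series[lag:]
    mean_a, mean_b = mean(a), mean(b)
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    if var_a == 0 or var_b == 0:
        return 0.0  # a flat segment carries no correlation information
    return cov / (var_a * var_b) ** 0.5


def curve_jitter(series):
    """Mean absolute step-to-step change, scaled by the mean absolute load."""
    level = mean(abs(x) for x in series)
    if level == 0:
        return 0.0
    steps = [abs(b - a) for a, b in zip(series, series[1:])]
    return mean(steps) / level


def second_reference_factors(series, first_duration, second_duration):
    """Collect the illustrative features for one electricity account."""
    if first_duration == second_duration:
        raise ValueError("the first and second durations must differ")
    return {
        "curve_jitter": curve_jitter(series),
        "first_duration_correlation": lag_correlation(series, first_duration),
        "second_duration_correlation": lag_correlation(series, second_duration),
    }
```

For hourly data, a call such as `second_reference_factors(load, 24, 24 * 7)` would compare daily against weekly regularity; those two durations are illustrative choices only.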
[0016] According to one embodiment of the present invention, the target pre-classification category is any one of the following: horizontal user category, zero-value regularity user category, annual cycle user category, daily cycle user category, and random user category; wherein, the electricity load curve of the electricity account corresponding to the horizontal user category is approximately a straight line; the proportion of zero values in the electricity load of the electricity account corresponding to the zero-value regularity user category exceeds a threshold and the non-zero values have a periodic regularity; the electricity load curve of the electricity account corresponding to the annual cycle user category has an annual periodic regularity; and the electricity load curve of the electricity account corresponding to the daily cycle user category has a daily periodic regularity.
[0017] One embodiment of the present invention proposes a method for training a large-scale electricity load prediction model. The method includes: acquiring time-series electricity load data and first reference factor data of electricity accounts in a target electricity cluster; wherein, the first reference factor data is data that can affect the electricity consumption of electricity accounts in the target electricity cluster; the target electricity cluster belongs to a target pre-classification category, and the target electricity cluster is one of several electricity clusters corresponding to the target pre-classification category; the several electricity clusters are obtained by clustering the time-series electricity load data of electricity accounts corresponding to the target pre-classification category; constructing several first electricity time-series data samples based on the electricity load time-series data and the first reference factor data; training a first initial prediction model using the several first electricity time-series data samples to obtain the electricity load prediction model; wherein, the electricity load prediction model corresponds to the target electricity cluster.
[0018] According to one embodiment of the present invention, the method for determining the target pre-classification category includes: acquiring second electricity consumption history data of the electricity account; wherein the second electricity consumption history data includes electricity load time series data of the electricity account and second reference factor data that can affect the electricity consumption of the electricity account; the first reference factor data is different from the second reference factor data; inputting the second electricity consumption history data of the electricity account into a pre-classification time series model for category identification, and determining the target pre-classification category of the electricity account.
[0019] According to one embodiment of the present invention, the generation method of the pre-classified time series model includes: acquiring electricity load time series data of a plurality of electricity accounts and second reference factor data of the plurality of electricity accounts; constructing a plurality of second electricity time series data samples based on the electricity load time series data of the plurality of electricity accounts and the second reference factor data of the plurality of electricity accounts; wherein the second electricity time series data samples have category description information; and inputting the second electricity time series data samples with category description information into a second initial prediction model for training to obtain the pre-classified time series model.
[0020] According to one embodiment of the present invention, the electricity load time series data corresponds to a basic feature dimension; the first reference factor data includes at least one of the following: solar term data in the seasonal feature dimension, holiday data in the holiday feature dimension, and festival data in the ethnic festival feature dimension.
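The construction of first electricity time-series data samples from load data and first reference factor data might look like the sliding-window sketch below. The window layout, the `build_samples` name, and the encoding of each reference factor as one numeric value per time step (for example a 0/1 holiday flag) are assumptions made for illustration; the specification does not fix a sample format.

```python
def build_samples(load, factors, window, horizon=1):
    """Build (features, target) pairs with a sliding window.

    load:    list of load values, one per time step.
    factors: list of tuples, one per time step, holding reference-factor
             values (hypothetical encoding, e.g. (solar_term, holiday, festival)).
    window:  number of past steps used as model input.
    horizon: number of future load values to predict.
    """
    if len(load) != len(factors):
        raise ValueError("load and factors must have the same length")
    if window <= 0 or horizon <= 0:
        raise ValueError("window and horizon must be positive")
    samples = []
    for start in range(len(load) - window - horizon + 1):
        end = start + window
        # Each input step pairs the load value with that step's factors.
        features = [(load[t], *factors[t]) for t in range(start, end)]
        target = load[end:end + horizon]
        samples.append((features, target))
    return samples
```

With six time steps, `window=3`, and `horizon=1`, this yields three samples, each pairing three consecutive steps of history with the load value that follows them.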
[0021] According to one embodiment of the present invention, the second reference factor data included in the second electricity consumption history data includes at least one of: curve jitter data on the load curve feature dimension, first time-series correlation data on the first duration feature dimension, second time-series correlation data on the second duration feature dimension, and third time-series correlation data on the preset value feature dimension; wherein, the first duration and the second duration are not equal; the first time-series correlation data is obtained by correlation calculation based on the electricity load time-series data within the preset time period and its sub-time-series data within the first duration; the second time-series correlation data is obtained by correlation calculation based on the electricity load time-series data within the preset time period and its sub-time-series data within the second duration; and the third time-series correlation data is obtained by similarity calculation based on the electricity load time-series data within the preset time period and its sub-time-series data within the third duration.
[0022] One embodiment of the present invention provides a device for generating large-scale electricity load forecasting data. The device includes: a category cluster determination module, used to determine a target pre-classification category of a target account and a target electricity cluster in which the target account belongs; wherein the target electricity cluster is one of several electricity clusters corresponding to the target pre-classification category; the several electricity clusters are obtained by clustering the electricity load time-series data of the electricity account corresponding to the target pre-classification category; the electricity cluster corresponds to an electricity load forecasting model; a forecasting model search module, used to search among the several electricity load forecasting models corresponding to the target pre-classification category according to the target electricity cluster to obtain the target electricity load forecasting model corresponding to the target electricity cluster; and a forecasting data generation module, used to predict the first electricity historical data of the target account through the target electricity load forecasting model to generate the electricity load forecasting data; wherein the first electricity historical data includes the electricity load time-series data of the target account and first reference factor data that can affect the electricity consumption of the target account.
[0023] One embodiment of the present invention proposes a training device for a large-scale electricity load prediction model. The device includes: a data acquisition module, used to acquire time-series data of electricity load of electricity accounts in a target electricity cluster and first reference factor data; wherein, the first reference factor data is data that can affect the electricity consumption of electricity accounts in the target electricity cluster; the target electricity cluster belongs to a target pre-classification category, and the target electricity cluster is one of several electricity clusters corresponding to the target pre-classification category; the several electricity clusters are obtained by clustering the time-series data of electricity load of electricity accounts corresponding to the target pre-classification category; a sample construction module, used to construct several first electricity time-series data samples based on the electricity load time-series data and the first reference factor data; and a model training module, used to train a first initial prediction model using the several first electricity time-series data samples to obtain the electricity load prediction model; wherein, the electricity load prediction model corresponds to the target electricity cluster.
[0024] One embodiment of the present invention provides a computer device including a memory and a processor, wherein the memory stores a computer program, and the processor executes the computer program to implement the steps of the method described in any of the above embodiments.
[0025] One embodiment of the present invention provides a computer-readable storage medium having a computer program stored thereon, which, when executed by a processor, implements the steps of the method described in any of the above embodiments.
[0026] According to multiple embodiments provided by the present invention, by first pre-classifying users based on the obvious periodicity of the electricity load curves of a large number of users, and then further clustering users based on the electricity consumption trend patterns of users in the same or similar electricity consumption cycles, a large number of users can be classified into categories with finer granularity step by step. This can effectively improve the accuracy of user classification, improve the prediction accuracy of the load prediction model, reduce the number of load prediction models that need to be trained, and improve the fitting speed of the prediction model training.
[0027] Additional aspects and advantages of the invention will be set forth in part in the description which follows, and in part will be obvious from the description, or may be learned by practice of the invention.
Attached Figure Description
[0028] Figure 1 is a flowchart illustrating a method for generating large-scale electricity load forecasting data according to one embodiment of this specification.
[0029] Figure 2 is a flowchart illustrating a method for determining a target pre-classification category according to one embodiment of this specification.
[0030] Figure 3 is a flowchart illustrating a method for training a large-scale electricity load prediction model according to one embodiment of this specification.
[0031] Figure 4 is a flowchart illustrating the generation method of a pre-classified time series model according to one embodiment of this specification.
[0032] Figure 5 is a structural block diagram of a large-scale electricity load forecasting data generation device provided according to one embodiment of this specification.
[0033] Figure 6 is a structural block diagram of a large-scale electricity load prediction model training device provided according to one embodiment of this specification.
[0034] Figure 7a is a structural block diagram of a computer device provided according to one embodiment of this specification.
[0035] Figure 7b is a structural block diagram of a computer device provided according to one embodiment of this specification.
Detailed Implementation
[0036] The embodiments described in this specification are described in detail below. Examples of these embodiments are shown in the accompanying drawings, wherein the same or similar reference numerals denote the same or similar elements or elements having the same or similar functions throughout. The embodiments described below with reference to the accompanying drawings are exemplary and intended to explain the invention, and should not be construed as limiting the invention.
[0037] Power transformer users (hereinafter referred to as "users") mainly include residential users, factory users, agricultural users, shopping mall (general) users, and enterprises and institutions with large electricity consumption. Users in different industries and of different types have electricity consumption patterns that vary considerably. In recent years, with the comprehensive coverage of smart meters and the development of information technology, the prediction of electricity load for transformer users has received increasing attention. Improving load prediction for power transformer users benefits planned electricity management, the rational arrangement of grid operation modes and unit maintenance plans, and the formulation of reasonable power source construction plans, thereby improving the economic and social benefits of the power system. In related technologies, an electricity load prediction model is usually trained for users to predict their electricity load over a future period. Training such a model mainly follows one of three approaches:
[0038] (1) A load forecasting model is trained separately on the electricity load data of each user, so that each user has its own load forecasting model. Because this approach trains one model per user, the resource and time costs of model training are high, and it is not feasible in practical applications. In addition, the historical data of a single user may be insufficient. In particular, a model cannot be trained for a new user, or a model trained for a new user may overfit.
[0039] (2) The data of all users are merged and a single load forecasting model is trained on the merged data, so that all users share the same load forecasting model. This approach does not take into account that different users have different electricity load patterns. It is difficult for a single forecasting model to learn the electricity load patterns of all users, so the accuracy of load forecasts produced by the model is low. In addition, the model is prone to underfitting during training, and the value of its loss function decreases very slowly or not at all.
[0040] (3) Users are first classified (or clustered) according to their electricity load patterns. The data of users in the same category are then merged and a model is trained, so that users with the same electricity consumption pattern share a load prediction model for that electricity load pattern category. This approach classifies (or clusters) users directly. With existing classification (or clustering) algorithms applied to users' historical electricity consumption data, the classification (or clustering) results are not ideal. As a result, the accuracy of a load prediction model trained on those results is also not ideal. Moreover, the load prediction model is prone to underfitting during training, and the loss value decreases slowly.
[0041] To improve the accuracy of electricity load forecasting for a large number of users, it is necessary to provide a method for large-scale electricity load forecasting. This method first categorizes users into different electricity load pattern classifications based on their cyclical characteristics. Then, based on the trend of users' electricity load curves between similar phases of similar periods, users in different electricity load pattern classifications are further subdivided into subcategories within the corresponding electricity load pattern classifications. This ensures that users in subcategories, with similar or identical cyclical characteristics of their electricity load curves, exhibit similar or identical electricity load trends within similar electricity consumption periods. This approach addresses the problem of poor results from directly classifying (or clustering) electricity users, effectively improving the accuracy of large-scale user classification. Then, user data within the same subcategory is merged, and user load forecasting features are constructed. These features serve as input to the electricity load forecasting model corresponding to that subcategory, training and obtaining an electricity load forecasting model for all users in that subcategory. Finally, the trained electricity load prediction model can be used to predict the electricity load of users classified into corresponding subcategories over a future period, improving the accuracy of the prediction results and increasing the fitting speed of the load prediction model training, while reducing the cost of training a load prediction model for each user. Specifically, user load prediction features represent characteristics strongly correlated with user electricity load, which can further improve the accuracy of user load prediction.
[0042] This specification provides a method for generating large-scale electricity load forecasting data. Referring to Figure 1, the method may include the following steps.
[0043] S110. Determine the target pre-classification category of the target account and the target electricity cluster in which the target account is located.
[0044] Here, the target electricity consumption cluster is one of several electricity consumption clusters corresponding to the target pre-classification category; the several electricity consumption clusters are obtained by clustering the electricity load time series data of the electricity accounts corresponding to the target pre-classification category; and each electricity consumption cluster corresponds to an electricity load prediction model.
[0045] The target account refers to the electricity account of a dedicated power transformer user whose electricity load pattern can be determined through classification and clustering, so that the electricity load in the future can be predicted based on the electricity load pattern category. The pre-classification category is the electricity load pattern category set based on the characteristics presented by the electricity load curve of the user's electricity account, which is used to initially classify the electricity account.
[0046] In some cases, dedicated power transformer users can include electricity users from different industries, types, and regions, such as residential users, factory users, and agricultural users, whose electricity consumption patterns vary greatly. For example, heating companies typically have higher electricity loads during the winter, while restaurants typically have higher loads at noon and in the evening; cold storage and preservation companies usually have relatively stable electricity loads over longer periods, and agricultural production users' electricity consumption cycles change with the crop cycle, and so on. In order to accurately and effectively predict the electricity load of users with different electricity consumption patterns in the future, it is necessary to classify users as accurately as possible according to their electricity consumption patterns.
[0047] Specifically, the pre-classification category can be an electricity load cycle category set based on the periodic characteristics presented by the electricity load curve of the electricity account. For example, the pre-classification category may include an electricity load cycle category where the electricity load remains basically unchanged over a long period, an electricity load cycle category where the electricity load is high only in a few months and not in other months, an electricity load cycle category where the annual electricity load changes according to a similar cycle, and an electricity load cycle category where the daily electricity load changes according to a similar cycle, etc. Based on the electricity load data of the target account and the corresponding periodic characteristics of the electricity load, the target account is initially classified to determine the electricity load cycle category of the target account.
[0048] An electricity consumption cluster is a finer grouping of the electricity accounts that have already been assigned to a pre-classification category. Within a cluster, user electricity accounts share similar or identical electricity load cycles and trends, while user electricity accounts in different clusters exhibit significant differences in their electricity load cycles and trends. The target electricity consumption cluster can be the cluster, within the pre-classification category, whose electricity accounts have a high degree of similarity to the target account's electricity load cycle and trend.
[0049] In some cases, electricity load curves exhibit similar or identical periodicity characteristics, but may display different trends. For example, among residential users whose annual electricity load changes in a similar cycle with seasonal variations, southern residents typically consume more electricity in summer, while northern residents typically consume more in winter. Similarly, among users whose daily electricity load changes in a similar cycle with varying work and rest schedules, restaurants typically consume more electricity at midday and in the evening, while businesses typically consume more during the day, and so on. It follows that each pre-classification category can correspond to several electricity consumption clusters. To account for both the periodicity and the trend of user electricity load patterns, it is necessary to determine the electricity load cycle category of each user's electricity account, and then to determine the cluster of electricity accounts within that category that has the highest similarity to the user's electricity account in the same or similar electricity consumption cycles, thereby improving the accuracy of user classification.
[0050] Specifically, an electricity consumption cluster can be an electricity account cluster that further classifies electricity accounts within a pre-classified category based on the similarity of their electricity load time series. Within the pre-classified category of the target account, the similarity between the target account's electricity load time series and the electricity load time series of user electricity accounts in several electricity consumption clusters is calculated to determine the target electricity consumption cluster to which the target account belongs. In some embodiments, the electricity consumption cluster containing the account with the highest similarity to the target account's electricity load time series can be considered the target electricity consumption cluster for that target account. In other embodiments, the electricity consumption cluster with the highest average similarity to the target account's electricity load time series can be considered the target electricity consumption cluster for that target account.
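The two assignment strategies just described (the cluster containing the single most similar account, or the cluster with the highest average similarity) can be sketched as follows. The sketch works with distances, so "highest similarity" becomes "smallest distance". Euclidean distance via `math.dist` is used only as a stand-in, which requires series of equal length; the `assign_cluster` name and the dictionary layout are hypothetical.

```python
from math import dist


def assign_cluster(target, clusters, strategy="nearest", distance=dist):
    """Return the id of the cluster that best matches the target series.

    clusters: dict mapping a cluster id to a list of member load series.
    strategy: "nearest" uses the closest single member of each cluster;
              "average" uses the mean distance to all members.
    distance: any function of two series; Euclidean distance is a stand-in.
    """
    if strategy not in ("nearest", "average"):
        raise ValueError("strategy must be 'nearest' or 'average'")
    best_id, best_score = None, float("inf")
    for cluster_id, members in clusters.items():
        distances = [distance(target, member) for member in members]
        if not distances:
            continue  # skip empty clusters
        if strategy == "nearest":
            score = min(distances)
        else:
            score = sum(distances) / len(distances)
        if score < best_score:
            best_id, best_score = cluster_id, score
    return best_id
```

The two strategies can disagree: a cluster containing one near-identical account and one very different account wins under "nearest" but may lose under "average".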
[0051] The electricity consumption clusters are obtained by clustering the electricity load time-series data of the electricity accounts corresponding to the target pre-classification categories. The electricity load time-series data can be historical electricity load time-series data of user electricity accounts classified into the same pre-classification category during the same or similar periods. In some embodiments, the electricity load time-series data can be an electricity load time series.
[0052] Specifically, the similarity of historical electricity load time series of electricity accounts within the same pre-classification category in the same or similar periods can be calculated, and clustering can be performed. Electricity accounts corresponding to historical electricity load time series with high similarity can be grouped into one cluster, thus obtaining several electricity clusters under that pre-classification category. In some embodiments, the time series similarity between the historical electricity load time series of every two electricity accounts in the pre-classification category can be calculated using the DTW (Dynamic Time Warping) algorithm, forming a similarity matrix, and clustering can be performed on the electricity accounts under that pre-classification category based on this similarity matrix. In some embodiments, the k-means clustering algorithm can be used for clustering to maximize the similarity of electricity load time series between electricity accounts within the final electricity cluster and minimize the similarity of electricity load time series between electricity accounts in different electricity clusters, ensuring that the electricity load patterns of the electricity accounts within the electricity cluster tend to be the same or similar. Through the above clustering process, several electricity clusters corresponding to all pre-classification categories can be obtained. By first pre-classifying users based on the obvious periodicity of their electricity load curves, and then further clustering them based on the electricity consumption trends observed in the same or similar electricity consumption cycles, a large number of users can be classified into more granular categories, effectively improving the accuracy of user classification.
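As a rough illustration of the pairwise DTW step, the sketch below computes a classic dynamic-programming DTW distance with absolute difference as the local cost and assembles the symmetric pairwise matrix. DTW yields a distance, so a smaller value means a higher similarity. The function names are hypothetical, no warping-window constraint is applied, and the subsequent clustering on the matrix is not shown.

```python
def dtw_distance(a, b):
    """Dynamic Time Warping distance between two numeric sequences."""
    if not a or not b:
        raise ValueError("both sequences must be non-empty")
    inf = float("inf")
    m = len(b)
    # prev[j] holds the best cumulative cost of aligning a[:i-1] with b[:j].
    prev = [0.0] + [inf] * m
    for i in range(1, len(a) + 1):
        curr = [inf] * (m + 1)
        for j in range(1, m + 1):
            cost = abs(a[i - 1] - b[j - 1])
            # Extend the cheapest of: insertion, deletion, or match.
            curr[j] = cost + min(prev[j], curr[j - 1], prev[j - 1])
        prev = curr
    return prev[m]


def pairwise_dtw(series_list):
    """Symmetric matrix of DTW distances between every two series."""
    n = len(series_list)
    matrix = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            d = dtw_distance(series_list[i], series_list[j])
            matrix[i][j] = matrix[j][i] = d
    return matrix
```

Each DTW call costs time proportional to the product of the two series lengths, and the matrix needs one call per pair of accounts, which is one practical reason to cluster within a pre-classification category rather than across all accounts at once.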
[0053] Each electricity consumption cluster corresponds to an electricity load forecasting model, which can be a time-series forecasting model trained using the electricity load time-series data of the electricity accounts within the cluster. Since the electricity load cycles and trends of user accounts within the same electricity consumption cluster are similar or identical, training an electricity load forecasting model for each cluster using the electricity load time-series data of the accounts within that cluster allows the electricity load over a future period to be predicted for any account that can be assigned to that cluster, thereby improving the accuracy of electricity load forecasting for users with different electricity load patterns.
[0054] S120. Based on the target electricity consumption cluster, search among several electricity load prediction models corresponding to the target pre-classification category to obtain the target electricity load prediction model corresponding to the target electricity consumption cluster.
[0055] The target electricity load forecasting model is a predictive model used to generate electricity load forecast data for the target account over a future period. Since each pre-classification category corresponds to several electricity consumption clusters, and each cluster corresponds to an electricity load forecasting model, it can be understood that each pre-classification category corresponds to several electricity load forecasting models. Specifically, after determining the target pre-classification category and target electricity consumption cluster to which the target account belongs, the target pre-classification category corresponds to several electricity load forecasting models. The target electricity load forecasting model can be determined based on the target electricity consumption cluster among the several electricity load forecasting models corresponding to the target pre-classification category.
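The search in step S120 can be pictured as a two-level lookup. The sketch below is only one possible arrangement, assuming the trained models are held in a nested dictionary keyed first by pre-classification category and then by electricity consumption cluster. The names `Model` and `find_target_model` and the string keys are hypothetical.

```python
from typing import Callable, Dict, List

# A model is represented here as any callable mapping history to a forecast.
Model = Callable[[List[float]], List[float]]


def find_target_model(
    models_by_category: Dict[str, Dict[str, Model]],
    category: str,
    cluster: str,
) -> Model:
    """Return the model of the given cluster within the given category."""
    try:
        return models_by_category[category][cluster]
    except KeyError as exc:
        raise LookupError(
            f"no model for category={category!r}, cluster={cluster!r}"
        ) from exc
```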
[0056] S130. The first historical electricity consumption data of the target account is predicted by the target electricity load prediction model to generate electricity load prediction data.
[0057] The first set of historical electricity consumption data includes the time-series data of the target account's electricity load, as well as the data of the first reference factors that can affect the target account's electricity consumption.
[0058] The first historical electricity consumption data is the historical electricity load time series data of the target account, which includes the corresponding user electricity load pattern characteristics, and serves as the input data of the electricity load prediction model in this specification embodiment; the electricity load prediction data is the possible electricity load time series data of the target account in the future period, which is the output data of the target electricity load prediction model, and the electricity load pattern followed by the possible electricity load time series data is the same as or similar to the electricity load pattern of the target account.
[0059] The electricity load time-series data can be the actual electricity load time series of a user's electricity account, possessing basic electricity load characteristics such as electricity consumption and usage time. The first reference factor data can be data reflecting influencing factors strongly correlated with changes in the user's electricity load trend, having a significant impact on the user's electricity load, such as weekdays and holidays. The electricity load time-series data and the first reference factor data constitute the first historical electricity load data, serving as input data for the target electricity load prediction model. In some cases, the influencing factors include date information, weather information, and holiday information that are highly correlated with the user's electricity load. For example, electricity load typically increases during summer solar terms in the 24 solar terms; the electricity load patterns during holidays such as Spring Festival and National Day usually differ from those on weekdays, and so on. Therefore, based on the actual electricity load time-series data, it is necessary to utilize data reflecting influencing factors strongly correlated with the user's electricity load as supplementary features of the user's electricity load patterns, thereby improving the accuracy of user load prediction.
[0060] Specifically, after obtaining the target electricity load prediction model, the historical electricity load time series of the target account and the data of the corresponding influencing factors contained in the historical electricity load time series are used as inputs to the target load prediction model to obtain the possible electricity load time series data of the target account in the future.
[0061] In the above embodiments, a large number of users are classified hierarchically, first by pre-classification and then by clustering. This improves the accuracy of user classification and ensures that users with the same or similar electricity load patterns fall into the same category, thus reducing the number of electricity load prediction models that need to be trained. At the same time, when an electricity load prediction model is trained for users of the same category, the model fits faster and predicts more accurately.
[0062] In some implementations, as shown in Figure 2, the method for determining the target pre-classification category of a target account may include the following steps.
[0063] S210, Obtain the second electricity usage history data of the target account.
[0064] The second set of historical electricity consumption data includes the time-series data of the target account's electricity load, as well as the data of the second reference factors that can affect the target account's electricity consumption; the data of the first reference factors is different from the data of the second reference factors.
[0065] The second historical electricity consumption data refers to the historical electricity load time-series data of the target account, which includes the corresponding historical electricity load pattern characteristics of the user. In this embodiment, it serves as input data for the pre-classification time-series model. The second reference factor data can be calculated based on the actual electricity load time-series data of the user's electricity account. This data can influence the periodicity of the electricity load pattern and has a strong correlation with the data characteristics of the pre-classification category. It can be used to identify the pre-classification category to which the user's electricity account belongs based on the electricity load pattern. The electricity load pattern of a user's electricity account is related to factors such as the user's industry, electricity consumption habits, and work / rest schedule. For example, the electricity load of users such as ice-making enterprises, heating enterprises, and agricultural production enterprises usually shows a significant seasonal variation; the electricity load of users such as residents and enterprises usually shows a significant variation with work / rest schedule, and so on. It should be noted that the second reference factor data influencing the periodicity of the user's electricity load pattern is different from the first reference factor data influencing the trend change of the user's electricity load.
[0066] S220. Input the second electricity consumption history data of the target account into the pre-classification time series model for category identification and determine the target pre-classification category of the target account.
[0067] The pre-classification time series model can be a time series classification model that determines the pre-classification category to which an electricity account belongs. The time series classification model can perform a preliminary classification of electricity accounts based on the user's actual electricity load time series data and the calculated corresponding user electricity load pattern feature data. Specifically, second historical electricity consumption data, containing the actual electricity load time series of the target account and the calculated electricity load pattern feature data, is used as input data and fed into the pre-classification time series model. The pre-classification time series model outputs the pre-classification category of the target account. For example, the pre-classification time series model can be a classification model based on supervised learning.
[0068] The above implementation method calculates feature data related to the characteristics of pre-classified category data, and uses this feature data to classify and identify user electricity accounts, thereby improving the accuracy of user pre-classification.
[0069] In some implementations, before determining the target electricity cluster in which the target account is located, the electricity load forecasting data generation method may further include: determining the target electricity cluster from among several electricity clusters corresponding to the target pre-classification category based on the electricity load time series data of the target account.
[0070] The electricity load patterns of the electricity accounts in the target electricity cluster are the same as or similar to the electricity load pattern of the target account, and the target account is assigned to the target electricity cluster.
[0071] Specifically, after determining the target pre-classification category to which the target account belongs, the similarity between the electricity load time series of the target account and the electricity load time series of each account in several electricity clusters corresponding to the target pre-classification category is calculated. Based on the similarity, a target electricity cluster is determined among the several electricity clusters corresponding to the target pre-classification category, and the target account is assigned to the target electricity cluster. For example, the correlation between the electricity load time series of the target account and the electricity load time series of each account in several electricity clusters corresponding to the target pre-classification category can be calculated using the DTW algorithm to measure the similarity between the electricity load time series of the target account and the electricity load time series of the accounts in several electricity clusters corresponding to the target pre-classification category. In some embodiments, the correlation between the electricity load time series of the target account and the electricity load time series of each account in the target pre-classification category can be calculated separately. Based on the calculated maximum correlation, the electricity cluster containing the corresponding account is determined as the target electricity cluster of the target account, and the target account is placed in that target electricity cluster.
[0072] In other embodiments, the similarity between the electricity load time series of the target account and the electricity load time series of the accounts in each electricity cluster can be calculated separately, and the average similarity between the target account and the accounts in each electricity cluster can be calculated to obtain the average similarity between the target account and each electricity cluster. Based on the calculated maximum average similarity, the corresponding electricity cluster is determined as the target electricity cluster of the target account, and the target account is placed in the target electricity cluster.
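The two assignment rules above (closest single account, or highest average similarity per cluster) can be sketched as follows. This is an illustrative sketch only: it works with a distance, so the highest similarity corresponds to the lowest score, and `mean_abs_diff` is a simple placeholder for equal-length series where the DTW algorithm could be substituted. All names are hypothetical.

```python
from typing import Callable, Dict, List, Optional, Sequence

Series = Sequence[float]


def mean_abs_diff(a: Series, b: Series) -> float:
    """Placeholder distance for equal-length series (DTW could be used instead)."""
    if len(a) != len(b):
        raise ValueError("series must have equal length")
    return sum(abs(x - y) for x, y in zip(a, b)) / len(a)


def assign_cluster(
    target: Series,
    clusters: Dict[str, List[Series]],
    distance: Callable[[Series, Series], float] = mean_abs_diff,
    use_average: bool = True,
) -> Optional[str]:
    """Pick the cluster whose accounts are closest to the target account.

    use_average=True scores a cluster by its average distance to the target;
    use_average=False scores it by its single closest account.
    """
    best_id, best_score = None, float("inf")
    for cluster_id, members in clusters.items():
        if not members:
            continue  # an empty cluster cannot be scored
        dists = [distance(target, m) for m in members]
        score = sum(dists) / len(dists) if use_average else min(dists)
        if score < best_score:
            best_id, best_score = cluster_id, score
    return best_id
```

The two rules can disagree when a cluster contains one account very close to the target and others far from it, so the choice between them affects how sensitive the assignment is to outlying accounts.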
[0073] It should be noted that the embodiments in this specification measure the similarity of electricity load time series between electricity accounts to assess the similarity of electricity load trends within the same or similar electricity consumption cycles for electricity accounts under the same electricity load pattern category, thereby ultimately measuring the similarity of electricity load patterns among electricity accounts. Since the electricity accounts in an electricity cluster have similar or identical electricity load patterns, the target account and the electricity accounts in the target electricity cluster also have similar or identical electricity load patterns.
[0074] In some implementations, the electricity load time series data corresponds to the basic feature dimension; the first reference factor data included in the first electricity consumption history data includes at least one of the following: solar term data in the seasonal feature dimension, holiday data in the holiday feature dimension, and festival data in the ethnic festival feature dimension.
[0075] The basic feature dimension is used to represent the basic electricity metering dimension corresponding to the actual electricity load time series of the electricity account. For example, the basic feature dimension may include actual electricity consumption, electricity consumption time, and electricity consumption date.
[0076] Specifically, historical electricity load data for each electricity account is extracted to construct the time-series electricity load data for that account. For example, the electricity load data of an account for August is E = [E1, E2, E3, ..., E31], where En (n represents the date, n≤31) represents the daily electricity load data for each date in the month. The electricity load data is extracted in 7-day cycles. The electricity load data from August 1st to August 7th can be extracted as one electricity load time series data line D1=[E1, E2, E3, E4, E5, E6, E7], the electricity load data from August 2nd to August 8th can be extracted as one electricity load time series data line D2=[E2, E3, E4, E5, E6, E7, E8], the electricity load data from August 3rd to August 9th can be extracted as one electricity load time series data line D3=[E3, E4, E5, E6, E7, E8, E9], and the electricity load data from August 4th to August 10th can be extracted as one electricity load time series data line D4=[E4, E5, E6, E7, E8, E9, E10], and so on, until finally 25 electricity load time series data lines for this electricity account in August are extracted, with the daily electricity load data En of each day being an element of the corresponding data line. Specifically, the extracted daily electricity load data En is itself a daily electricity load time series composed of the load data recorded at the recording points of that day. For example, the recording interval of user electricity load data can be 15 minutes, meaning there are 96 recording points per day; the extracted daily electricity load data En is then the daily electricity load time series consisting of the data recorded at these 96 recording points.
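The 7-day extraction described above is a sliding window with a step of one day. A minimal sketch follows, assuming the daily data is held in a list. The function name `sliding_windows` is hypothetical, and the string labels stand in for the daily load data E1 to E31.

```python
from typing import List, Sequence, TypeVar

T = TypeVar("T")


def sliding_windows(daily: Sequence[T], window: int = 7) -> List[List[T]]:
    """Extract every run of `window` consecutive days, advancing one day at a time."""
    if window <= 0:
        raise ValueError("window must be positive")
    return [list(daily[i:i + window]) for i in range(len(daily) - window + 1)]


# August has 31 days, so a 7-day window yields 31 - 7 + 1 = 25 data lines.
august = [f"E{n}" for n in range(1, 32)]
lines = sliding_windows(august)
```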
[0077] The seasonal feature dimension is used to represent the climatic factors that affect changes in the electricity load of the electricity account, and the solar term data can be the 24 solar term feature data corresponding to the electricity load time series data of the electricity account. The holiday feature dimension is used to represent the holiday or workday factors that affect changes in the electricity load of the electricity account, and the holiday data can be the workday, weekend, holiday, and adjusted workday feature data corresponding to the electricity load time series data of the electricity account. The ethnic festival feature dimension is used to represent the ethnic minority festival factors that affect changes in the electricity load of electricity accounts in areas where ethnic minorities are concentrated, and the festival data can be the ethnic minority festival feature data corresponding to the electricity load time series data of electricity accounts in those areas.
[0078] The first reference factor data includes at least one of the following: solar term data in the seasonal feature dimension, holiday data in the holiday feature dimension, and festival data in the ethnic festival feature dimension. This data is used to supplement the basic feature dimension of the actual electricity load time series of electricity accounts. In some cases, the electricity load of an account is affected by factors such as date, climate, and holidays. For example, electricity load during summer and winter is usually higher than during spring and autumn; daytime electricity load on weekends and holidays may be higher than daytime electricity load on weekdays, or it may be very low on weekends and holidays. Therefore, it is necessary to supplement the basic electricity metering dimensions of electricity accounts with feature dimensions that reflect information such as date, climate, and holidays, to describe the user's electricity load from multiple perspectives, thereby improving the accuracy of user electricity load forecasts. Since the 24 solar terms contain date and climate information that is highly correlated with electricity load, holidays contain date and festival information that is highly correlated with electricity load, and ethnic minority festivals contain information that is highly correlated with the electricity load of ethnic minority areas and users, the implementations in this specification introduce the seasonal feature dimension related to the 24 solar terms, the holiday feature dimension related to holidays, and the ethnic festival feature dimension related to ethnic minority festivals to improve the accuracy of electricity load forecasting.
[0079] Specifically, this section explains how to construct solar term data based on seasonal characteristics. For example, a list of length 24 can be used as the solar term data. Each element in the list corresponds to a solar term, and the value of the element represents the time difference between the date corresponding to the daily electricity load time series and the two preceding and following solar terms. The larger the time difference, the smaller the value. For instance, if a day falls exactly between the 4th solar term, Spring Equinox, and the 5th solar term, Qingming, then its solar term data is [0, 0, 0, 0.5, 0.5, 0, 0, ...]; if a day happens to be the 4th solar term, Spring Equinox, then its solar term data is [0, 0, 0, 1, 0, 0, 0, ...].
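One way to produce such a list is sketched below. The specification gives only the two example vectors and the rule that a larger time difference yields a smaller value, so the linear weighting used here is an assumption chosen to reproduce those examples. The function name `solar_term_vector` is hypothetical, and the solar term indices and dates are supplied by the caller.

```python
from datetime import date
from typing import List


def solar_term_vector(
    day: date,
    prev_index: int,
    prev_date: date,
    next_index: int,
    next_date: date,
    n_terms: int = 24,
) -> List[float]:
    """Length-24 list weighting the preceding and following solar terms.

    Indices are 1-based. The weights fall off linearly with the time
    difference (an assumed rule) and always sum to 1.
    """
    span = (next_date - prev_date).days
    offset = (day - prev_date).days
    if span <= 0 or not 0 <= offset <= span:
        raise ValueError("day must lie between the two solar term dates")
    vec = [0.0] * n_terms
    weight_next = offset / span
    vec[prev_index - 1] += 1.0 - weight_next
    vec[next_index - 1] += weight_next
    return vec


# Illustrative dates: Spring Equinox (4th term) and Qingming (5th term).
equinox, qingming = date(2022, 3, 20), date(2022, 4, 5)
midway = solar_term_vector(date(2022, 3, 28), 4, equinox, 5, qingming)
on_term = solar_term_vector(equinox, 4, equinox, 5, qingming)
```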
[0080] Specifically, this section explains how to construct holiday data based on holiday feature dimensions. For example, holidays can be divided into weekdays, weekends, public holidays, and adjusted workdays according to their nature, and the one-hot encoding method can be used to construct the holiday data. For the holiday feature: ["weekday", "weekend", "public holiday", "adjusted workday"], following the principle of encoding N states using an N-bit state register, the holiday feature contains 4 states, so the corresponding holiday data after encoding is [1000, 0100, 0010, 0001]. For example, if a day is a weekday, its holiday data is 1000; if a day is a weekend, its holiday data is 0100; if a day is a public holiday, its holiday data is 0010; and if a day is an adjusted workday, its holiday data is 0001.
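The one-hot encoding above can be sketched as follows, representing each code as a list of bits rather than a digit string. The names `HOLIDAY_STATES` and `one_hot` are hypothetical. The same helper could encode any other fixed list of states, such as a three-state festival list.

```python
from typing import List, Sequence

HOLIDAY_STATES = ["weekday", "weekend", "public holiday", "adjusted workday"]


def one_hot(state: str, states: Sequence[str] = HOLIDAY_STATES) -> List[int]:
    """Encode one of N states as an N-bit list with a single 1."""
    if state not in states:
        raise ValueError(f"unknown state: {state!r}")
    return [1 if s == state else 0 for s in states]
```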
[0081] Specifically, this section explains how to construct festival data based on the characteristic dimensions of ethnic festivals. For example, taking the Hui people of Ningxia, their minority festivals include Eid al-Fitr, Eid al-Adha, and Mawlid. The corresponding festival data can be constructed using one-hot encoding, resulting in the Ningxia Hui ethnic minority festival data as [100, 010, 001]. For instance, Eid al-Fitr falls on the first day of the tenth month of the Islamic calendar, so its festival data is 100; Eid al-Adha is seventy days later, so its festival data is 010; and Mawlid falls on the twelfth day of the third month of the Islamic calendar, so its festival data is 001.
[0082] By constructing a 24-solar-term feature system containing rich and accurate time and weather information, the accuracy of user load forecasting can be improved. Simultaneously, by constructing a feature system for ethnic minority festivals to supplement the traditional holiday feature system, the accuracy of user load forecasting in areas with large ethnic minority populations can be improved.
[0083] Based on the basic time series data, the above implementation method introduces relevant features according to the influencing factors that are strongly correlated with changes in user electricity load, and constructs feature data on the corresponding feature dimensions to describe user electricity load from multiple aspects, thereby improving the accuracy of user load forecasting.
[0084] In some implementations, the second reference factor data included in the second electricity consumption history data includes at least one of the following: curve jitter data on the load curve characteristic dimension, first time-series correlation data on the first duration characteristic dimension, second time-series correlation data on the second duration characteristic dimension, and third time-series correlation data on the preset value characteristic dimension.
[0085] The first duration and the second duration are not equal. The first time-series correlation data is obtained by performing a correlation calculation on the sub-time-series data obtained by dividing the electricity load time-series data within the preset time period into units of the first duration; the second time-series correlation data is obtained by performing a correlation calculation on the sub-time-series data obtained by dividing the electricity load time-series data within the preset time period into units of the second duration; and the third time-series correlation data is obtained by performing a similarity calculation on the sub-time-series data obtained by dividing the electricity load time-series data within the preset time period into units of the third duration.
[0086] The load curve feature dimension represents the amplitude and frequency of fluctuations in the user's electricity load curve over a period of time. Curve jitter data measures the amplitude and frequency of fluctuations in the user's electricity load curve and serves as the feature value of the load curve feature dimension. For example, the load curve feature can be a (near) horizontal curve feature, indicating that the user's electricity load has virtually no fluctuations or very small fluctuations over a period of time, and the corresponding electricity load curve is a horizontal straight line, or approximately a horizontal straight line.
[0087] Specifically, this section explains how to obtain curve jitter data. For example, taking a user's electricity load pattern as having a (near) horizontal curve characteristic, one month's historical electricity load time-series data for the user's electricity account can be obtained, and the amplitude and frequency of fluctuations in this data within the month can be calculated. In some embodiments, the mean of the one-month historical electricity load time-series data can be calculated, and the standard deviation (std) can be calculated based on this mean. The coefficient of variation (cv) of the historical electricity load time-series data, cv = std / mean, can then be used to measure the fluctuation of the electricity load within the month. In other embodiments, the user may not use electricity at certain times, resulting in some zero values in the electricity load time-series data of the user's electricity account. To describe the user's electricity consumption pattern more accurately and make subsequent pre-classification results more accurate, the proportion of zero values in the electricity load time-series data and the coefficient of variation can be used together to construct the curve jitter data, which serves as the feature value of the load curve feature dimension.
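A minimal sketch of the curve jitter calculation follows. It assumes the population standard deviation and returns cv = 0 when the mean is zero. Neither choice is stated in this specification, and the function name `curve_jitter` is hypothetical.

```python
from statistics import mean, pstdev
from typing import Sequence, Tuple


def curve_jitter(load: Sequence[float]) -> Tuple[float, float]:
    """Return (cv, zero_ratio) for one load series.

    cv is std / mean using the population standard deviation (an assumption);
    zero_ratio is the proportion of zero values in the series.
    """
    if not load:
        raise ValueError("load series must not be empty")
    zero_ratio = sum(1 for v in load if v == 0) / len(load)
    mu = mean(load)
    cv = pstdev(load) / mu if mu != 0 else 0.0
    return cv, zero_ratio
```

A (near) horizontal load curve gives a cv close to zero, while a series dominated by idle periods shows a high zero ratio, so the pair separates the two patterns even when their means are similar.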
[0088] The duration feature dimension represents the periodic duration characteristic corresponding to the user's electricity load variation pattern. Specifically, the duration in the duration feature dimension can include annual, quarterly, monthly, weekly, and daily cycles. It should be noted that the first duration of the first duration feature dimension is not equal to the second duration of the second duration feature dimension. For example, if a user uses electricity according to similar patterns each year—for instance, the user uses more electricity and has a higher load value in winter and summer, and less electricity and a lower load value in spring and autumn—then the duration feature corresponding to the user's electricity load can be an annual cycle feature. If a user uses electricity according to similar patterns each day—for instance, the user uses more electricity and has a higher load value in the morning and afternoon, and less electricity and a lower load value in the evening and early morning—then the duration feature corresponding to the user's electricity load can be a daily cycle feature.
[0089] The first time-series correlation data can be used to measure the similarity of the electricity load time-series data of users with a first duration characteristic changing in units of the first duration. For example, taking the first duration as an annual cycle, the first time-series correlation data can be used to measure the similarity of a user's electricity load time-series data for any given year to the electricity load time-series data for other years.
[0090] The second time-series correlation data can be used to measure the similarity of the electricity load time series of users with a second duration characteristic changing in units of the second duration. For example, taking the second duration as a daily cycle, the second time-series correlation data is used to measure the similarity of a user's electricity load time-series data on any given day to the electricity load time-series data of other days.
[0091] The preset value feature dimension is used to represent preset value features set based on the electricity load values that account for a large proportion of the user's electricity load time-series data. It should be noted that under preset value features, the non-preset values in the user's electricity load time-series data exhibit regularity. For example, the preset value feature can be a zero-value regularity feature, indicating that the time during which the user does not use electricity accounts for a large proportion, and the electricity load during the times of use exhibits periodicity. The corresponding electricity load time-series data shows a large proportion of zero values and periodicity of the non-zero values. For example, this periodicity could be annual, quarterly, monthly, weekly, or daily.
[0092] The third time-series correlation data can be used to measure the similarity of the electricity load time series of users with preset value characteristics. For example, taking the preset value characteristic as a zero-value regularity characteristic, and assuming that the non-zero values in the user's electricity load time series data exhibit annual periodicity, the third time-series correlation data is used to measure the similarity of a user's electricity load time series data for any given year to the electricity load time series data of other years.
[0093] The preset time period can be the duration for collecting historical electricity load time-series data from a user. Specifically, the preset time period can be determined based on the periodicity of the user's electricity load time-series data. In some embodiments, for users whose electricity load patterns exhibit annual cycle characteristics or zero-value regularity characteristics, the historical electricity load time-series data from the past m years (m≥2) can be collected; for users with daily cycle characteristics or (near) horizontal curve characteristics, the historical electricity load time-series data from the past several days, one month, one quarter, half a year, or one year can be collected. In other embodiments, it can be uniformly set to collect the historical electricity load time-series data from the past few years or all of the user's data. It should be noted that the preset time period can be determined based on accuracy requirements, efficiency requirements, actual data volume, and other factors.
[0094] The third duration is used to represent the periodic duration corresponding to the periodicity of non-zero values in the electricity load time series data of users whose electricity load patterns have preset value characteristics. For example, it can be an annual cycle, a quarterly cycle, a monthly cycle, a weekly cycle, or a daily cycle.
[0095] Sub-time series data refers to several time series data obtained by dividing the user's electricity load time series data into corresponding time-duration units. For example, taking the first time-duration feature as an annual cycle feature, the user's historical electricity load time series data for the past two years can be taken, and the electricity load time series data can be divided into annual units to obtain two historical electricity load sub-time series data for the user with a time length of one year.
[0096] Correlation calculation is used to determine the degree of similarity between sub-time-series data of user electricity load, serving as a feature value for the duration dimension to measure the trend changes of user electricity load between similar phases over several cycles. Specifically, taking an annual cycle characteristic of user electricity load as an example, the correlation between the user's electricity load time-series data for the second year and the first year can be calculated to reflect the differences in electricity consumption, duration, and timing between the second and first years during the same or similar electricity consumption periods in these two years. This allows for further measurement of the trend changes of the user's electricity load between similar phases in these two annual cycles.
[0097] Specifically, this section explains how to perform correlation calculations to obtain the first time-series correlation data. For example, taking a user's electricity load pattern with an annual cycle as an example, we take the user's historical electricity load time-series data for m years (m≥2), including the historical electricity load time series for each day of those m years. First, we take the average value of the electricity data in the historical electricity load time series for each day of those m years to represent the historical electricity load data for each corresponding day. For example, we record the user's electricity load data at a recording frequency of 15 minutes. The daily electricity load time series contains 96 electricity data points recorded at 96 recording points. We take the average value of these 96 electricity data points to represent the electricity load data for that day. Based on the calculated daily electricity load data, we can obtain a new historical electricity load time series for the user with a time length of m years. Dividing this newly obtained historical electricity load time series into annual units yields m sub-time series with a time length of one year. The correlation between each pair of these m sub-time series is calculated, and the average correlation is obtained by averaging them, which serves as the feature value of the user's annual cycle characteristic dimension. In some embodiments, the correlation between sub-time series can be calculated using the DTW algorithm. In some embodiments, to eliminate the influence of some large short-term fluctuations in the user's electricity load time series data on the measurement of the trend change of the user's electricity load over several periods, the newly obtained historical electricity load time series can be smoothed and filtered before correlation calculation.
[0098] Specifically, this section explains how to perform correlation calculations to obtain the second time-series correlation data. For example, taking a user's electricity load pattern exhibiting a daily cycle characteristic, we take the user's historical electricity load time series for the most recent n days (n≥2). Dividing this historical electricity load time series into daily units yields n sub-time series, each with a length of one day. We then calculate the correlation between every two sub-time series within these n sub-time series and take the average correlation as the feature value for the user's daily cycle characteristic dimension.
[0099] Similarity calculation is used to assess the degree of difference between user electricity load time-series data, serving as a feature value for a preset feature dimension to measure the trend changes between similar phases of user electricity load over several periods. Specifically, this section explains how to perform similarity calculation to obtain the third time-series correlation data. For example, consider a user's electricity load pattern exhibiting a zero-value regularity characteristic, where the non-zero values in the user's electricity load time-series data show annual periodicity. For instance, the user only uses electricity in April and October each year, and not in other months. Taking the user's historical electricity load time series for m years (m≥2) and dividing it by year yields m sub-time series, each with a length of one year. The similarity between each pair of these m sub-time series is calculated, and the average similarity is taken as the feature value of the user's zero-value regularity characteristic. In some embodiments, the Hamming distance algorithm can be used to calculate the pairwise similarity.
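A minimal sketch of the zero-value regularity feature, under stated assumptions: each yearly sub-series is first reduced to an on/off mask (1 where load is non-zero), and similarity is taken as one minus the normalised Hamming distance. Neither the binarisation step nor this normalisation is specified in the text; both are illustrative choices, as are the function names.

```python
from itertools import combinations


def zero_mask(period, eps=0.0):
    """Map each load value to 1 (in use) or 0 (no consumption)."""
    return [1 if abs(v) > eps else 0 for v in period]


def hamming_similarity(a, b):
    """1.0 when two equal-length masks agree everywhere, 0.0 when they never agree."""
    if len(a) != len(b):
        raise ValueError("masks must have the same length")
    mismatches = sum(x != y for x, y in zip(a, b))
    return 1.0 - mismatches / len(a)


def zero_pattern_feature(periods):
    """Average pairwise Hamming similarity over the on/off masks of all periods."""
    masks = [zero_mask(p) for p in periods]
    pairs = list(combinations(masks, 2))
    if not pairs:
        raise ValueError("need at least two periods (m >= 2)")
    return sum(hamming_similarity(a, b) for a, b in pairs) / len(pairs)


# Two years of monthly loads for a user active only in April and October.
year_1 = [0, 0, 0, 5, 0, 0, 0, 0, 0, 7, 0, 0]
year_2 = [0, 0, 0, 6, 0, 0, 0, 0, 0, 8, 0, 0]
feature = zero_pattern_feature([year_1, year_2])
```

Here the two years switch on in exactly the same months, so the feature takes its maximum value even though the consumed amounts differ.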
[0100] The above implementation method improves the accuracy of user pre-classification by constructing features related to the characteristics of pre-classified category data based on basic time series data.
[0101] In some implementations, the target pre-classification category is any one of the following: horizontal user category, zero-value regularity user category, annual cycle user category, daily cycle user category, and random user category; wherein, the electricity load curve of the electricity account corresponding to the horizontal user category is approximately a straight line; the proportion of zero values in the electricity load of the electricity account corresponding to the zero-value regularity user category exceeds a threshold and the non-zero values have a periodic pattern; the electricity load curve of the electricity account corresponding to the annual cycle user category has an annual cycle pattern; and the electricity load curve of the electricity account corresponding to the daily cycle user category has a daily cycle pattern.
[0102] Here, a horizontal user is one whose electricity load curve has a small fluctuation range and a slow fluctuation frequency; a zero-value regularity user is one whose electricity load curve is 0 for at least half the time and periodic for the rest of the time; an annual cycle user is one whose electricity load curve has the same or a similar trend in each annual cycle; a daily cycle user is one whose electricity load curve has the same or a similar trend in each daily cycle; and a random user is one whose electricity load curve is random, so that no pattern or periodicity of the user's electricity consumption can be derived from the curve.
[0103] Specifically, the electricity load of the user account corresponding to the horizontal user category fluctuates little and remains basically unchanged over a period of time. For example, cold storage warehouses typically require year-round low-temperature operation. In the embodiments of this specification, based on the electricity load time-series data of an electricity account with the characteristics corresponding to the horizontal user category, the curve jitter data on the load curve feature dimension of that account's electricity load time-series data can be calculated as the feature value on that feature dimension.
[0104] The electricity load of users in the zero-value regularity category has a large proportion of zero values, typically meaning they do not use electricity for at least six months of the year. For example, heating boiler rooms usually only require large amounts of electricity for heating during the winter. In the embodiments of this specification, based on the electricity load time-series data of an electricity account corresponding to the zero-value regularity user category, the third time-series correlation data on the preset value feature dimension of that account's electricity load time-series data can be calculated as the feature value on that feature dimension.
[0105] The annual electricity load of electricity accounts corresponding to the annual cycle user category typically varies in a similar way over similar time periods. For example, agricultural production users typically use electricity according to the growth cycle of crops. In the embodiments of this specification, based on the electricity load time-series data of an electricity account with the characteristics corresponding to the annual cycle user category, the first time-series correlation data on the first duration feature dimension of that account's electricity load time-series data can be calculated as the feature value on that feature dimension.
[0106] The daily electricity load of electricity accounts corresponding to the daily cycle user category typically varies in a similar way over similar time periods. For example, residents usually consume less electricity and have lower loads during the day and early morning, while consuming more electricity and having higher loads at night. In the embodiments of this specification, based on the electricity load time-series data of an electricity account with the characteristics corresponding to the daily cycle user category, the second time-series correlation data on the second duration feature dimension of that account's electricity load time-series data can be calculated as the feature value on that feature dimension.
[0107] It should be noted that, in the embodiments of this specification, electricity accounts that do not belong to any of the following categories—horizontal user category, zero-value regular user category, annual cycle user category, and daily cycle user category—can be classified into the random user category through classification algorithms.
[0108] In the above implementation, by setting five pre-classification categories based on the characteristics presented by the electricity load curves of a large number of users, users can be initially classified according to the periodic patterns of their electricity load, thereby improving the accuracy of subsequent further clustering of users.
[0109] This specification provides a method for training a large-scale electricity load prediction model. As shown in Figure 3, the method may include the following steps.
[0110] S310. Obtain the time-series data of the electricity load and the first reference factor data of the electricity account in the target electricity cluster.
[0111] Among them, the first reference factor data is the data that can affect the electricity consumption of the electricity accounts in the target electricity consumption cluster; the target electricity consumption cluster belongs to the target pre-classification category, and the target electricity consumption cluster is one of several electricity consumption clusters corresponding to the target pre-classification category; the several electricity consumption clusters are obtained by clustering the electricity load time series data of the electricity accounts corresponding to the target pre-classification category.
[0112] Specifically, historical electricity load time-series data for each electricity account in the target electricity cluster is obtained according to the sampling period, along with corresponding first reference factor data. In some embodiments, the first reference factor data may include seasonal data and holiday data.
[0113] For example, one month of historical electricity load time-series data is sampled for all electricity accounts in the target electricity cluster with a sampling period of 7 days. Taking the historical electricity load time-series data E for April as an example, the data of electricity account U1 in the target electricity cluster is sampled as follows. The electricity load data of U1 from April 1st to April 7th is extracted as one piece of electricity load time-series data $D_1^{(U1)} = [E_1^{(U1)}, E_2^{(U1)}, E_3^{(U1)}, E_4^{(U1)}, E_5^{(U1)}, E_6^{(U1)}, E_7^{(U1)}]$; the data from April 2nd to April 8th is extracted as $D_2^{(U1)} = [E_2^{(U1)}, E_3^{(U1)}, E_4^{(U1)}, E_5^{(U1)}, E_6^{(U1)}, E_7^{(U1)}, E_8^{(U1)}]$; the data from April 3rd to April 9th is extracted as $D_3^{(U1)} = [E_3^{(U1)}, E_4^{(U1)}, E_5^{(U1)}, E_6^{(U1)}, E_7^{(U1)}, E_8^{(U1)}, E_9^{(U1)}]$; and so on. Sliding the 7-day window one day at a time over the 30 days of April yields 24 pieces of electricity load time-series data for electricity account U1.
[0114] Likewise, the historical electricity load time-series data of electricity account U2 in the target electricity cluster is sampled. The electricity load data of U2 from April 1st to April 7th is extracted as one piece of electricity load time-series data $D_1^{(U2)} = [E_1^{(U2)}, E_2^{(U2)}, E_3^{(U2)}, E_4^{(U2)}, E_5^{(U2)}, E_6^{(U2)}, E_7^{(U2)}]$; the data from April 2nd to April 8th is extracted as $D_2^{(U2)} = [E_2^{(U2)}, E_3^{(U2)}, E_4^{(U2)}, E_5^{(U2)}, E_6^{(U2)}, E_7^{(U2)}, E_8^{(U2)}]$; the data from April 3rd to April 9th is extracted as $D_3^{(U2)} = [E_3^{(U2)}, E_4^{(U2)}, E_5^{(U2)}, E_6^{(U2)}, E_7^{(U2)}, E_8^{(U2)}, E_9^{(U2)}]$; and so on. In this way, 24 pieces of electricity load time-series data are likewise obtained for electricity account U2 in April.
[0115] Based on the above method, we can obtain the electricity load time series data by sampling the historical electricity load time series data of each electricity account in the target electricity cluster for one month.
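The sampling scheme in the April example amounts to a sliding window over the daily load values. A minimal sketch, assuming a one-day step between windows (the function name and the stand-in load values are illustrative only):

```python
def sliding_windows(daily_loads, window=7, step=1):
    """Return every full window of `window` consecutive days, advancing `step` days."""
    return [
        daily_loads[i:i + window]
        for i in range(0, len(daily_loads) - window + 1, step)
    ]


# Stand-in for E_1 ... E_30: the day number is used in place of a real load value.
april = list(range(1, 31))
windows = sliding_windows(april)
```

With 30 days and a 7-day window this produces 30 - 7 + 1 = 24 windows, the first covering April 1st to 7th and the last covering April 24th to 30th.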
[0116] For example, in April, the 9th, 10th, 16th, 17th, 23rd, and 30th are weekends, so the holiday data corresponding to the electricity load time-series data on those dates is 0100; April 2nd and 24th are adjusted workdays, so the holiday data on those dates is 0001; April 3rd, 4th, and 5th are the Qingming Festival holiday, so the holiday data on those dates is 0010; and the holiday data for all other dates is 1000. April 5th is the Qingming solar term and April 20th is the Guyu solar term. Therefore, the solar term data corresponding to the electricity load time-series data for April 5th is [0, 0, 0, 0, 1, 0, 0, ...], the solar term data for April 20th is [0, 0, 0, 0, 0, 1, 0, ...], and the solar term data for other dates is [0, 0, 0, 0, 0, ...].
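The holiday and solar term encodings above are one-hot vectors. The sketch below reproduces them for the April example. The ordering of the four day types follows the codes in the text; the vector length of 24 for solar terms is an assumption (the text elides the tail of the vector), and all names are illustrative.

```python
DAY_TYPES = ("workday", "weekend", "holiday", "adjusted_workday")  # 1000, 0100, 0010, 0001

# Calendar facts taken from the April example in the text.
WEEKENDS = {9, 10, 16, 17, 23, 30}
ADJUSTED_WORKDAYS = {2, 24}
HOLIDAYS = {3, 4, 5}
SOLAR_TERMS = {5: 4, 20: 5}  # day of month -> zero-based one-hot position


def one_hot(index, size):
    """Vector of `size` zeros with a single 1 at `index`."""
    return [1 if i == index else 0 for i in range(size)]


def april_day_type(day):
    """Classify a day of April using the sets above."""
    if day in HOLIDAYS:
        return "holiday"
    if day in ADJUSTED_WORKDAYS:
        return "adjusted_workday"
    if day in WEEKENDS:
        return "weekend"
    return "workday"


def holiday_code(day):
    """Four-element one-hot holiday code for a day of April."""
    return one_hot(DAY_TYPES.index(april_day_type(day)), len(DAY_TYPES))


def solar_term_code(day, n_terms=24):
    """One-hot solar term code; all zeros when no solar term falls on that day."""
    position = SOLAR_TERMS.get(day)
    return [0] * n_terms if position is None else one_hot(position, n_terms)
```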
[0117] S320. Based on the electricity load time series data and the first reference factor data, construct several first electricity consumption time series data samples.
[0118] The first electricity consumption time-series data sample contains the actual historical electricity load time-series data of each electricity account in the target electricity cluster, as well as a sample of feature data that is strongly correlated with the trend changes in electricity load, and is used to train the electricity load prediction model.
[0119] Specifically, this section explains how to construct the first electricity consumption time-series data sample. For example, take the electricity load time-series data $D_1^{(U1)} = [E_1^{(U1)}, E_2^{(U1)}, E_3^{(U1)}, E_4^{(U1)}, E_5^{(U1)}, E_6^{(U1)}, E_7^{(U1)}]$ obtained by sampling the historical electricity load time-series data of electricity account U1 in the target electricity cluster. From $D_1^{(U1)}$, the holiday data [1000, 0001, 0010, 0010, 1000, 1000] for the corresponding dates, and the solar term data {[0, 0, 0, 0, 0, ...], [0, 0, 0, 0, 0, ...], [0, 0, 0, 0, 0, ...], [0, 0, 0, 0, 0, 1, 0, 0, ...], [0, 0, 0, 0, 0, ...], [0, 0, 0, 0, 0, ...]} for the corresponding dates, the first electricity consumption time-series data sample $D_1^{(U1)\prime}$ corresponding to $D_1^{(U1)}$ can be constructed.
[0120] Based on the above method, the first electricity consumption time-series data sample corresponding to the 24 electricity load time-series data of electricity account U1 in the target electricity consumption cluster in April can be obtained. Similarly, the first electricity consumption time-series data sample corresponding to the 24 electricity load time-series data of each electricity account in the target electricity consumption cluster in April can be obtained, thus forming several first electricity consumption time-series data samples for all electricity accounts in the target electricity consumption cluster.
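One plausible way to assemble such a sample is to attach each day's feature codes to that day's load value. The text does not fix the sample layout, so the row format below (load, then holiday code, then solar term code) is an assumption, as are the function name and the requirement of exactly one code per day.

```python
def build_sample(load_window, holiday_codes, solar_term_codes):
    """Combine a load window with per-day feature codes into one row per day."""
    if not (len(load_window) == len(holiday_codes) == len(solar_term_codes)):
        raise ValueError("expected one holiday code and one solar term code per day")
    return [
        [load] + list(holiday) + list(term)
        for load, holiday, term in zip(load_window, holiday_codes, solar_term_codes)
    ]


# Hypothetical 3-day window with 4-element holiday codes and 24-element solar term codes.
loads = [10.5, 11.0, 9.8]
holidays = [[1, 0, 0, 0], [0, 0, 0, 1], [0, 0, 1, 0]]
terms = [[0] * 24 for _ in loads]
sample = build_sample(loads, holidays, terms)
```

Each row then has 1 + 4 + 24 = 29 values, which keeps the load and its reference factors aligned by date when the sample is fed to a sequence model.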
[0121] S330. The first initial prediction model is trained using several first electricity consumption time series data samples to obtain the electricity load prediction model; wherein, the electricity load prediction model corresponds to the target electricity consumption cluster.
[0122] The first initial prediction model is the electricity load time series prediction model that needs to be trained. It should be noted that, in the embodiments of this specification, electricity accounts are progressively divided into electricity clusters within pre-classified categories based on the periodicity and trend of their electricity load. Accounts within the same electricity cluster have similar or identical periodicity and trends in their electricity load, while accounts in different electricity clusters have significantly different periodicity and trends. Therefore, an electricity load prediction model can be trained for all electricity accounts in each electricity cluster, so that after a target account is assigned to a target electricity cluster, its electricity load can be accurately predicted using the corresponding electricity load prediction model for that target electricity cluster.
[0123] Specifically, this section explains how to train the electricity load prediction model. The first electricity consumption time-series data samples constructed for all electricity accounts in the target electricity cluster are merged to obtain the set of first electricity consumption time-series data samples for the target electricity cluster. For example, based on the aforementioned method for constructing first electricity consumption time-series data samples, the set of first electricity consumption time-series data samples for electricity account U1 in the target electricity cluster can be obtained:
[0124] $D^{(U1)} = \{D_1^{(U1)\prime}, D_2^{(U1)\prime}, D_3^{(U1)\prime}, \ldots, D_{24}^{(U1)\prime}\}$
[0125] For example, a first set of electricity consumption time-series data samples for electricity account Ui (where i is the number corresponding to the electricity account in the target electricity cluster) can be obtained:
[0126] $D^{(Ui)} = \{D_1^{(Ui)\prime}, D_2^{(Ui)\prime}, D_3^{(Ui)\prime}, \ldots, D_{24}^{(Ui)\prime}\}$
[0127] The first set of electricity consumption time-series data samples for the electricity accounts in the target electricity consumption cluster is merged to obtain the first set of electricity consumption time-series data samples for the target electricity consumption cluster:
[0128] $C_j = \{D^{(U1)}, D^{(U2)}, D^{(U3)}, \ldots, D^{(Ui)}\}$
[0129] Where i is the number corresponding to the electricity account in the target electricity cluster, and j is the number corresponding to the target electricity cluster.
[0130] In some cases, there is a need for long-term electricity load forecasting, where the length of the electricity load time series to be predicted is much longer than the length of the input electricity load time series, requiring prediction of the more distant future based on limited information. Furthermore, considering that the implementation method of this specification classifies and clusters electricity accounts hierarchically, the electricity load time series of accounts within the same electricity cluster typically exhibit similar trends between similar phases of similar periods; that is, similar phases of similar periods usually show similar subprocesses. Therefore, for example, the Autoformer long-term series forecasting model can be selected as the first initial forecasting model. By utilizing the deep decomposition architecture in the Autoformer long-term series forecasting model, the encoder progressively decomposes the periodic and trend components of the electricity load time series based on the moving average concept; the decoder models the periodic and trend components of the electricity load time series separately. Based on this progressive decomposition architecture, the model can progressively decompose latent variables during the forecasting process and obtain the forecast results of the periodic and trend components through autocorrelation mechanisms and accumulation, achieving alternating and mutually reinforcing sequence decomposition and forecast result optimization.
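The moving-average decomposition that underlies this architecture can be illustrated on its own. The sketch below splits a series into a trend part (a centred moving average, with the series padded by repeating its end values) and a periodic part (the remainder). It illustrates the decomposition idea only and is not the Autoformer model; the kernel size, the padding scheme, and the function names are assumptions.

```python
def moving_average(series, kernel_size=25):
    """Centred moving average; the ends are padded by repeating the edge values."""
    if kernel_size % 2 == 0:
        raise ValueError("kernel_size must be odd so the window is centred")
    if not series:
        return []
    pad = (kernel_size - 1) // 2
    padded = [series[0]] * pad + list(series) + [series[-1]] * pad
    return [
        sum(padded[i:i + kernel_size]) / kernel_size
        for i in range(len(series))
    ]


def decompose(series, kernel_size=25):
    """Split a series into (periodic, trend) with periodic = series - trend."""
    trend = moving_average(series, kernel_size)
    periodic = [x - t for x, t in zip(series, trend)]
    return periodic, trend


periodic, trend = decompose([2.0] * 10, kernel_size=5)
```

For a constant series the trend equals the series and the periodic part is zero everywhere, which is a quick sanity check on the padding.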
[0131] For example, to prevent overfitting during model training, the set of first electricity consumption time-series data samples $C_j$ of the target electricity consumption cluster can be split as follows: 60% of the data samples are used as the training data set for training and adjusting model parameters; 20% are used as the validation data set for verifying model accuracy and adjusting model hyperparameters; and the remaining 20% are used as the test data set for verifying the model's generalization ability. The data from the training data set is input into the Autoformer long-term series prediction model. The model parameters are trained and adjusted based on the loss function, accuracy, etc., resulting in multiple trained prediction models. These models are then used to predict on the validation data set, and the model accuracy is recorded. The parameters corresponding to the best-performing prediction model are selected as hyperparameters for further optimization, resulting in the optimal prediction model. Finally, the optimal prediction model is tested using the test data set to evaluate its performance and predictive ability.
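The 60/20/20 partition can be sketched as below. The text does not say whether samples are shuffled before splitting, so the seeded shuffle is an assumption, as are the function and parameter names; for strictly chronological evaluation the shuffle could be dropped.

```python
import random


def split_samples(samples, train_frac=0.6, val_frac=0.2, seed=0):
    """Partition samples into (train, validation, test); the test set takes the remainder."""
    items = list(samples)
    random.Random(seed).shuffle(items)  # seeded so the split is reproducible
    n_train = round(len(items) * train_frac)
    n_val = round(len(items) * val_frac)
    return (
        items[:n_train],
        items[n_train:n_train + n_val],
        items[n_train + n_val:],
    )


# Hypothetical cluster: 5 accounts x 24 samples each, represented here by integer ids.
cluster_samples = list(range(5 * 24))
train_set, val_set, test_set = split_samples(cluster_samples)
```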
[0132] Through the above model training process, the electricity load prediction model corresponding to the target electricity cluster can be obtained.
[0133] In some implementations, the method for determining the target pre-classification category includes: acquiring second electricity consumption history data of an electricity account; wherein the second electricity consumption history data includes electricity load time series data of the electricity account and second reference factor data that can affect the electricity consumption of the electricity account; the first reference factor data is different from the second reference factor data; inputting the second electricity consumption history data of the electricity account into a pre-classification time series model for category identification, and determining the target pre-classification category of the electricity account.
[0134] It should be noted that, for the description of the method for determining the target pre-classification category in the electricity load prediction model training method in the above embodiments, please refer to the description of the method for determining the target pre-classification category in the electricity load prediction data generation method in this specification, and will not be repeated here.
[0135] In some implementations, as shown in Figure 4, the generation method of the pre-classification time series model includes the following steps:
[0136] S410. Obtain the time-series data of electricity load for several electricity accounts and the second reference factor data for several electricity accounts.
[0137] Here, the several electricity accounts are accounts that have not yet been classified according to electricity load patterns.
[0138] It should be noted that, regarding the description of obtaining the time-series data of the electricity load of several electricity accounts in the above embodiments, please refer to the description of obtaining the time-series data of the electricity load of the electricity accounts in the target electricity cluster in this specification; and regarding the description of obtaining the second reference factor data of the electricity accounts in the above embodiments, please refer to the description of the construction of relevant data on the relevant feature dimensions included in the second reference factor data in this specification, which will not be repeated here.
[0139] For example, the second reference factor data includes horizontal curve feature data of the horizontal curve feature dimension, annual cycle feature data of the annual cycle feature dimension, daily cycle feature data of the daily cycle feature dimension, and zero value pattern feature data of the zero value pattern feature dimension.
[0140] For example, take user account A1, whose electricity load has an annual cyclical pattern. If the first duration feature dimension of the electricity load time-series data of this account is set as the annual cycle feature dimension, the annual cycle feature data of the account's electricity load can be calculated as $Y_{A1}$, and the feature values for the other electricity load patterns can typically be set to 0. The electricity load time-series data $D_1^{(A1)} = [E_1^{(A1)}, E_2^{(A1)}, E_3^{(A1)}, E_4^{(A1)}, E_5^{(A1)}, E_6^{(A1)}, E_7^{(A1)}]$ of user account A1 in April is obtained. The feature data of the annual cycle feature dimension corresponding to this electricity load time-series data is $Y_{A1}$, and the horizontal curve feature data, daily cycle feature data, and zero-value regularity feature data corresponding to it are all 0.
[0141] S420. Based on the electricity load time-series data of several electricity accounts and the second reference factor data of several electricity accounts, construct several second electricity consumption time-series data samples; wherein, the second electricity consumption time-series data samples have category description information.
[0142] The second electricity consumption time-series data sample includes actual historical electricity load time-series data of electricity accounts, as well as data related to electricity load patterns, used to train the pre-classification time-series model. Category description information can be category labels added to the second electricity consumption time-series data of the electricity accounts based on their electricity load patterns; the category description information corresponds to the pre-classification categories.
[0143] Specifically, this section explains how to construct a second electricity consumption time-series data sample. For example, for user account A1, whose electricity load has an annual cyclical pattern, a label value y can be added to the electricity load time-series data of user account A1. Based on the obtained electricity load time-series data $D_1^{(A1)}$ of electricity account A1, the calculated feature data $Y_{A1}$ of the annual cycle feature dimension corresponding to this data, the horizontal curve feature data (0), the daily cycle feature data (0), the zero-value regularity feature data (0), and the label value y, the second electricity consumption time-series data sample $D_1^{(A1)\prime}$ corresponding to $D_1^{(A1)}$ can be constructed.
[0144] Based on the above method, we can obtain the second electricity consumption time-series data sample corresponding to the 24 electricity load time-series data points for electricity account A1 in April. Similarly, we can obtain the second electricity consumption time-series data samples corresponding to the 24 electricity load time-series data points for each of the other electricity accounts in April, thus forming several second electricity consumption time-series data samples for all electricity accounts.
[0145] S430. Input the second electricity consumption time series data sample with category description information into the second initial prediction model for training to obtain the pre-classified time series model.
[0146] The second initial prediction model is a time series classification model of electricity load that needs to be trained.
[0147] For example, an LSTM time series classification model can be selected as the second initial prediction model. It should be noted that the description of the training method for the second initial prediction model in the above embodiments is the same as the description of the training method for the first initial prediction model in this specification; details will not be repeated here.
[0148] Through the above model training process, a pre-classification time series model can be obtained for preliminary classification and identification of electricity accounts based on the electricity load patterns of electricity accounts.
[0149] In some implementations, the electricity load time-series data corresponds to the basic feature dimension; the first reference factor data includes at least one of the following: solar term data in the seasonal feature dimension, holiday data in the holiday feature dimension, and festival data in the ethnic festival feature dimension.
[0150] It should be noted that for the description of the electricity load time series data and the first reference factor data in the above embodiments, please refer to the description of the electricity load time series data and the first reference factor data in one embodiment of the electricity load prediction generation method in this specification, which will not be repeated here.
[0151] In some implementations, the second reference factor data included in the second electricity consumption history data includes at least one of: curve jitter data on the load curve feature dimension, first time-series correlation data on the first duration feature dimension, second time-series correlation data on the second duration feature dimension, and third time-series correlation data on the preset value feature dimension; wherein, the first duration and the second duration are not equal; the first time-series correlation data is obtained by correlation calculation based on the electricity load time-series data within the preset time period and its sub-time-series data within the first duration; the second time-series correlation data is obtained by correlation calculation based on the electricity load time-series data within the preset time period and its sub-time-series data within the second duration; and the third time-series correlation data is obtained by similarity calculation based on the electricity load time-series data within the preset time period and its sub-time-series data within the third duration.
[0152] It should be noted that for the description of the second historical electricity consumption data in the above embodiments, please refer to the description of the second historical electricity consumption data in one embodiment of the electricity load prediction generation method in this specification, and will not be repeated here.
[0153] This specification provides an embodiment of a device for generating large-scale electricity load forecasting data. As shown in Figure 5, the electricity load forecasting data generation device 500 includes: a category cluster determination module 510, a prediction model lookup module 520, and a prediction data generation module 530.
[0154] The category cluster determination module 510 is used to determine the target pre-classification category of the target account and the target electricity consumption cluster in which the target account belongs; wherein, the target electricity consumption cluster is one of several electricity consumption clusters corresponding to the target pre-classification category; the several electricity consumption clusters are obtained by clustering the electricity load time series data of the electricity consumption account corresponding to the target pre-classification category; the electricity consumption cluster corresponds to the electricity load prediction model.
[0155] The prediction model lookup module 520 is used to search among several power load prediction models corresponding to the target pre-classification category based on the target power cluster, and obtain the target power load prediction model corresponding to the target power cluster.
[0156] The prediction data generation module 530 is used to predict the first historical electricity consumption data of the target account through the target electricity load prediction model and generate electricity load prediction data; wherein, the first historical electricity consumption data includes the time series data of the target account's electricity load and the first reference factor data that can affect the electricity consumption of the target account.
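The lookup-then-predict flow of modules 520 and 530 can be sketched as a registry keyed by pre-classification category and cluster. Everything here is hypothetical scaffolding: the class and function names, the integer cluster identifiers, and the placeholder model, which simply repeats the last observed value and ignores the first reference factor data that a trained model would also consume.

```python
from typing import Callable, Dict, List, Tuple

Forecaster = Callable[[List[float]], List[float]]


class ModelRegistry:
    """Holds one trained forecaster per (pre-classification category, cluster id)."""

    def __init__(self) -> None:
        self._models: Dict[Tuple[str, int], Forecaster] = {}

    def register(self, category: str, cluster_id: int, model: Forecaster) -> None:
        self._models[(category, cluster_id)] = model

    def lookup(self, category: str, cluster_id: int) -> Forecaster:
        key = (category, cluster_id)
        if key not in self._models:
            raise KeyError(f"no model trained for category {category!r}, cluster {cluster_id}")
        return self._models[key]


def placeholder_model(history: List[float]) -> List[float]:
    """Stand-in for a trained model: repeats the last observed load three times."""
    return [history[-1]] * 3


def generate_forecast(registry: ModelRegistry, category: str,
                      cluster_id: int, history: List[float]) -> List[float]:
    """Find the cluster's model, then run it on the target account's history."""
    return registry.lookup(category, cluster_id)(history)


registry = ModelRegistry()
registry.register("annual_cycle", 2, placeholder_model)
forecast = generate_forecast(registry, "annual_cycle", 2, [10.0, 12.0, 11.0])
```

Keying models by cluster rather than by account is what keeps the number of trained models small: every account assigned to the same cluster reuses one entry.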
[0157] Specific limitations regarding the electricity load forecasting data generation device can be found in the limitations regarding the electricity load forecasting data generation method described above, and will not be repeated here. Each module in the aforementioned electricity load forecasting data generation device can be implemented entirely or partially through software, hardware, or a combination thereof. These modules can be embedded in or independent of the processor in a computer device in hardware form, or stored in the memory of a computer device in software form, so that the processor can call and execute the corresponding operations of each module.
[0158] This specification provides a training device for a large-scale electricity load prediction model. As shown in Figure 6, the electricity load prediction model training device 600 includes: a data acquisition module 610, a sample construction module 620, and a model training module 630.
[0159] The data acquisition module 610 is used to acquire the time-series data of the electricity load of the electricity account in the target electricity cluster and the data of the first reference factor; wherein, the data of the first reference factor is the data that can affect the electricity consumption of the electricity account in the target electricity cluster; the target electricity cluster belongs to the target pre-classification category, and the target electricity cluster is one of several electricity clusters corresponding to the target pre-classification category; the several electricity clusters are obtained by clustering the time-series data of the electricity load of the electricity account corresponding to the target pre-classification category.
[0160] The sample construction module 620 is used to construct several first electricity consumption time series data samples based on electricity load time series data and first reference factor data.
[0161] The model training module 630 is used to train the first initial prediction model using several first electricity consumption time series data samples to obtain the electricity load prediction model; wherein, the electricity load prediction model corresponds to the target electricity consumption cluster.
[0162] Specific limitations regarding the electricity load forecasting model training device can be found in the limitations of the electricity load forecasting model training method described above, and will not be repeated here. Each module in the aforementioned electricity load forecasting model training device can be implemented entirely or partially through software, hardware, or a combination thereof. These modules can be embedded in or independent of the processor in a computer device in hardware form, or stored in the memory of a computer device in software form, so that the processor can call and execute the corresponding operations of each module.
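As an illustration only, the sample construction and model training steps of the training device can be sketched as follows. This is a minimal sketch under stated assumptions, not the model of this specification: the sliding-window sample layout, the single integer reference factor (for example a holiday flag), and the deliberately simple "window mean plus per-factor offset" model are all hypothetical choices made to keep the example self-contained.

```python
from collections import defaultdict
from typing import Dict, List, Sequence, Tuple

# (load window, reference factor at the target step, target load)
Sample = Tuple[List[float], int, float]


def build_samples(load: Sequence[float], factor: Sequence[int],
                  window: int) -> List[Sample]:
    """Sample construction: slide a fixed window over one account's load series."""
    if len(load) != len(factor):
        raise ValueError("load and factor series must have equal length")
    return [
        (list(load[i:i + window]), factor[i + window], load[i + window])
        for i in range(len(load) - window)
    ]


def train_cluster_model(samples: List[Sample]) -> Dict[int, float]:
    """Model training: per factor value, learn the mean offset from the window mean."""
    residuals = defaultdict(list)
    for window_vals, factor_value, target in samples:
        residuals[factor_value].append(target - sum(window_vals) / len(window_vals))
    return {k: sum(v) / len(v) for k, v in residuals.items()}


def train_for_cluster(accounts: List[Tuple[Sequence[float], Sequence[int]]],
                      window: int) -> Dict[int, float]:
    """Pool samples from every account in one cluster and train a single model."""
    samples: List[Sample] = []
    for load, factor in accounts:
        samples.extend(build_samples(load, factor, window))
    return train_cluster_model(samples)


def predict_next(model: Dict[int, float], window_vals: Sequence[float],
                 factor_value: int) -> float:
    """Forecast the next load as the window mean plus the learned offset."""
    return sum(window_vals) / len(window_vals) + model.get(factor_value, 0.0)


# One account whose load drops on flagged days (flag 1).
load = [10.0, 10.0, 10.0, 4.0, 10.0, 10.0, 10.0, 4.0]
flag = [0, 0, 0, 1, 0, 0, 0, 1]
model = train_for_cluster([(load, flag)], window=2)
```

Pooling the samples of all accounts in a cluster before training is what allows one model to serve the whole cluster, which is how the number of models to be trained is reduced.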
[0163] This specification also provides a computer device. Referring to Figure 7a, the computer device 700 includes a memory 710, a processor 720, and a large-scale electricity load forecasting data generation method program 730 stored in the memory 710 and executable on the processor 720. When the processor 720 executes the large-scale electricity load forecasting data generation method program 730, it implements the aforementioned large-scale electricity load forecasting data generation method.
[0164] This specification also provides a computer device. Referring to Figure 7b, the computer device 800 includes a memory 810, a processor 820, and a large-scale electricity load prediction model training method program 830 stored in the memory 810 and executable on the processor 820. When the processor 820 executes the large-scale electricity load prediction model training method program 830, it implements the aforementioned large-scale electricity load prediction model training method.
[0165] This specification also provides a computer-readable storage medium storing a method program for generating large-scale electricity load forecasting data, which, when executed by a processor, implements the aforementioned method for generating large-scale electricity load forecasting data.
[0166] This specification also provides a computer-readable storage medium storing a program for training a large-scale electricity load prediction model, which, when executed by a processor, implements the aforementioned large-scale electricity load prediction model training method.
[0167] It should be noted that the logic and/or steps represented in the flowcharts or otherwise described herein can, for example, be regarded as a sequenced list of executable instructions for implementing logical functions, and can be embodied in any computer-readable medium for use by, or in conjunction with, an instruction execution system, apparatus, or device (such as a computer-based system, a system including a processor, or another system that can fetch instructions from an instruction execution system, apparatus, or device and execute them). For the purposes of this specification, a "computer-readable medium" can be any means that can contain, store, communicate, propagate, or transmit programs for use by, or in conjunction with, an instruction execution system, apparatus, or device. More specific examples (a non-exhaustive list) of computer-readable media include: an electrical connection having one or more wires (electronic device), a portable computer diskette (magnetic device), random access memory (RAM), read-only memory (ROM), erasable programmable read-only memory (EPROM or flash memory), fiber optic devices, and portable compact disc read-only memory (CD-ROM). Alternatively, the computer-readable medium may be paper or another suitable medium on which the program is printed, since the program can be obtained electronically, for example by optically scanning the paper or other medium and then editing, interpreting, or otherwise processing it as necessary, and then stored in a computer memory.
[0168] It should be understood that various parts of the present invention can be implemented in hardware, software, firmware, or a combination thereof. In the above embodiments, multiple steps or methods can be implemented in software or firmware stored in memory and executed by a suitable instruction execution system. For example, if implemented in hardware, as in another embodiment, it can be implemented using any one or a combination of the following techniques known in the art: discrete logic circuits having logic gates for implementing logical functions on data signals, application-specific integrated circuits (ASICs) having suitable combinational logic gates, programmable gate arrays (PGAs), field-programmable gate arrays (FPGAs), etc.
[0169] In the description of this specification, references to terms such as "one embodiment," "some embodiments," "example," "specific example," or "some examples," etc., indicate that a specific feature, structure, material, or characteristic described in connection with that embodiment or example is included in at least one embodiment or example of the invention. In this specification, the illustrative expressions of the above terms do not necessarily refer to the same embodiment or example. Furthermore, the specific features, structures, materials, or characteristics described may be combined in any suitable manner in one or more embodiments or examples.
[0170] Furthermore, the terms "first" and "second" are used for descriptive purposes only and should not be construed as indicating or implying relative importance or implicitly specifying the number of technical features indicated. Thus, a feature defined as "first" or "second" may explicitly or implicitly include at least one of that feature. In the description of this invention, "a plurality of" means at least two, such as two, three, etc., unless otherwise explicitly specified.
[0171] In this invention, unless otherwise explicitly specified and limited, the terms "installation," "connection," "linking," and "fixing," etc., should be interpreted broadly. For example, they can refer to a fixed connection, a detachable connection, or an integral part; they can refer to a mechanical connection or an electrical connection; they can refer to a direct connection or an indirect connection through an intermediate medium; they can refer to the internal communication of two components or the interaction between two components, unless otherwise explicitly limited. Those skilled in the art can understand the specific meaning of the above terms in this invention according to the specific circumstances.
[0172] Although embodiments of the present invention have been shown and described above, it is understood that the above embodiments are exemplary and should not be construed as limiting the present invention. Those skilled in the art can make changes, modifications, substitutions and variations to the above embodiments within the scope of the present invention.
Claims
1. A method for generating large-scale electricity load forecasting data, characterized in that, The method includes: The target pre-classification category of the target account and the target electricity consumption cluster in which the target account belongs are determined; wherein, the target electricity consumption cluster is one of several electricity consumption clusters corresponding to the target pre-classification category; the several electricity consumption clusters are obtained by clustering the electricity load time series data of the electricity consumption account corresponding to the target pre-classification category; the electricity consumption cluster corresponds to an electricity load prediction model; Based on the target electricity consumption cluster, a search is conducted among several electricity load prediction models corresponding to the target pre-classification category to obtain the target electricity load prediction model corresponding to the target electricity consumption cluster; The target electricity load prediction model is used to predict the first historical electricity consumption data of the target account to generate the predicted electricity load data; wherein, the first historical electricity consumption data includes the time series data of the electricity load of the target account, and the first reference factor data that can affect the electricity consumption of the target account; The target pre-classification category is any one of the following: horizontal user category, zero-value regular user category, annual cycle user category, daily cycle user category, and random user category; The electricity load curve of the electricity account corresponding to the horizontal user category is approximately a straight line. 
For the zero-value regular user category, the proportion of zero values in the electricity load of the corresponding electricity account exceeds a threshold, and the non-zero values exhibit a periodic pattern. The electricity load curve of the electricity account corresponding to the annual cycle user category exhibits an annual periodic pattern. The electricity load curve of the electricity account corresponding to the daily cycle user category exhibits a daily periodic pattern.
2. The method according to claim 1, characterized in that, The method for determining the target pre-classification category of the target account includes: Obtain the second electricity consumption history data of the target account; wherein, the second electricity consumption history data includes the electricity load time series data of the target account, and second reference factor data that can affect the electricity consumption of the target account; the first reference factor data is different from the second reference factor data; The second electricity consumption history data of the target account is input into the pre-classification time series model for category identification to determine the target pre-classification category of the target account.
3. The method according to claim 1 or 2, characterized in that, Before determining the target electricity cluster where the target account is located, the method further includes: Based on the electricity load time-series data of the target account, the target electricity cluster is determined from several electricity clusters corresponding to the target pre-classification category; wherein, the electricity load pattern of the electricity accounts in the target electricity cluster is similar to or close to the user load pattern of the target account. The target account is assigned to the target electricity cluster.
4. The method according to claim 1, characterized in that, The time-series data of electricity load corresponds to the basic feature dimensions; The first reference factor data of the first electricity consumption history data includes at least one of the following: solar term data in the seasonal characteristics dimension, holiday data in the holiday characteristics dimension, and festival data in the ethnic festival characteristics dimension.
5. The method according to claim 2, characterized in that, The second reference factor data included in the second electricity consumption history data includes at least one of the following: curve jitter data in the load curve characteristic dimension, first time-series correlation data in the first duration characteristic dimension, second time-series correlation data in the second duration characteristic dimension, and third time-series correlation data in the preset value characteristic dimension; wherein the first duration and the second duration are not equal; The first time-series correlation data is obtained by performing correlation calculations on the electricity load time-series data within a preset time period and the sub-time-series data within the first time period; The second time-series correlation data is obtained by performing correlation calculations on the electricity load time-series data within a preset time period and the sub-time-series data within the second time period; The third time-series correlation data is obtained by calculating the similarity between the electricity load time-series data within a preset time period and the sub-time-series data within the third time period.
6. A training method for a large-scale electricity load prediction model, characterized in that, The method includes: Obtain the time-series data of electricity load for electricity accounts within the target electricity cluster and the data of a first reference factor; wherein, the first reference factor data is data that can affect the electricity consumption of the electricity accounts within the target electricity cluster; the target electricity cluster belongs to a target pre-classification category, and the target electricity cluster is one of several electricity clusters corresponding to the target pre-classification category; the several electricity clusters are obtained by clustering the time-series data of electricity load for electricity accounts corresponding to the target pre-classification category; Based on the electricity load time-series data and the first reference factor data, several first electricity consumption time-series data samples are constructed; The first initial prediction model is trained using the several first electricity consumption time-series data samples to obtain the electricity load prediction model; wherein, the electricity load prediction model corresponds to the target electricity consumption cluster; The target pre-classification category is any one of the following: horizontal user category, zero-value regular user category, annual cycle user category, daily cycle user category, and random user category; The electricity load curve of the electricity account corresponding to the horizontal user category is approximately a straight line. For the zero-value regular user category, the proportion of zero values in the electricity load of the corresponding electricity account exceeds a threshold, and the non-zero values exhibit a periodic pattern. The electricity load curve of the electricity account corresponding to the annual cycle user category exhibits an annual periodic pattern. The electricity load curve of the electricity account corresponding to the daily cycle user category exhibits a daily periodic pattern.
7. The method according to claim 6, characterized in that, The method for determining the target pre-classification category includes: Obtain the second electricity consumption history data of the electricity account; wherein, the second electricity consumption history data includes the electricity load time series data of the electricity account, and second reference factor data that can affect the electricity consumption of the electricity account; the first reference factor data is different from the second reference factor data; The second electricity consumption history data of the electricity account is input into the pre-classification time series model for category identification to determine the target pre-classification category of the electricity account.
8. The method according to claim 7, characterized in that, The generation method of the pre-classification time series model includes: Obtain time-series data of electricity load from several electricity accounts and second reference factor data for the several electricity accounts; Based on the electricity load time-series data of the several electricity accounts and the second reference factor data of the several electricity accounts, several second electricity consumption time-series data samples are constructed; wherein, the second electricity consumption time-series data samples have category description information; The second electricity consumption time-series data samples with category description information are input into the second initial prediction model for training to obtain the pre-classification time series model.
9. The method according to claim 6, characterized in that, The time-series data of electricity load corresponds to the basic feature dimensions; The first reference factor data includes at least one of the following: solar term data in the seasonal characteristics dimension, holiday data in the holiday characteristics dimension, and festival data in the ethnic festival characteristics dimension.
10. The method according to claim 7, characterized in that, The second reference factor data included in the second electricity consumption history data includes at least one of the following: curve jitter data in the load curve characteristic dimension, first time-series correlation data in the first duration characteristic dimension, second time-series correlation data in the second duration characteristic dimension, and third time-series correlation data in the preset value characteristic dimension; wherein, the first duration and the second duration are not equal; The first time-series correlation data is obtained by performing correlation calculations on the electricity load time-series data within a preset time period and the sub-time-series data within the first time period; The second time-series correlation data is obtained by performing correlation calculations on the electricity load time-series data within a preset time period and the sub-time-series data within the second time period; The third time-series correlation data is obtained by calculating the similarity between the electricity load time-series data within a preset time period and the sub-time-series data within the third time period.
11. A device for generating large-scale electricity load forecasting data, characterized in that, The device includes: The category cluster determination module is used to determine the target pre-classification category of the target account and the target electricity consumption cluster to which the target account belongs; wherein, the target electricity consumption cluster is one of several electricity consumption clusters corresponding to the target pre-classification category; the several electricity consumption clusters are obtained by clustering the electricity load time series data of the electricity consumption account corresponding to the target pre-classification category; the electricity consumption cluster corresponds to an electricity load prediction model; The prediction model search module is used to search among several electricity load prediction models corresponding to the target pre-classification category based on the target electricity consumption cluster, and obtain the target electricity load prediction model corresponding to the target electricity consumption cluster; The prediction data generation module is used to predict the first historical electricity consumption data of the target account through the target electricity load prediction model, and generate the electricity load prediction data; wherein, the first historical electricity consumption data includes the time series data of the electricity load of the target account, and the first reference factor data that can affect the electricity consumption of the target account; The target pre-classification category is any one of the following: horizontal user category, zero-value regular user category, annual cycle user category, daily cycle user category, and random user category; The electricity load curve of the electricity account corresponding to the horizontal user category is approximately a straight line. For the zero-value regular user category, the proportion of zero values in the electricity load of the corresponding electricity account exceeds a threshold, and the non-zero values exhibit a periodic pattern. The electricity load curve of the electricity account corresponding to the annual cycle user category exhibits an annual periodic pattern. The electricity load curve of the electricity account corresponding to the daily cycle user category exhibits a daily periodic pattern.
12. A training device for a large-scale electricity load prediction model, characterized in that, The device includes: The data acquisition module is used to acquire the time-series data of electricity load of electricity accounts in the target electricity cluster and the data of a first reference factor; wherein, the first reference factor data is data that can affect the electricity consumption of the electricity accounts in the target electricity cluster; the target electricity cluster belongs to a target pre-classification category, and the target electricity cluster is one of several electricity clusters corresponding to the target pre-classification category; the several electricity clusters are obtained by clustering the time-series data of electricity load of electricity accounts corresponding to the target pre-classification category; The sample construction module is used to construct several first electricity consumption time series data samples based on the electricity load time series data and the first reference factor data; The model training module is used to train the first initial prediction model using the several first electricity consumption time-series data samples to obtain the electricity load prediction model; wherein, the electricity load prediction model corresponds to the target electricity consumption cluster; The target pre-classification category is any one of the following: horizontal user category, zero-value regular user category, annual cycle user category, daily cycle user category, and random user category; The electricity load curve of the electricity account corresponding to the horizontal user category is approximately a straight line. For the zero-value regular user category, the proportion of zero values in the electricity load of the corresponding electricity account exceeds a threshold, and the non-zero values exhibit a periodic pattern. The electricity load curve of the electricity account corresponding to the annual cycle user category exhibits an annual periodic pattern. The electricity load curve of the electricity account corresponding to the daily cycle user category exhibits a daily periodic pattern.
13. A computer device comprising a memory and a processor, wherein the memory stores a computer program, characterized in that, When the processor executes the computer program, it implements the steps of the method according to any one of claims 1 to 10.
14. A computer-readable storage medium having a computer program stored thereon, characterized in that, When the computer program is executed by a processor, it implements the steps of the method according to any one of claims 1 to 10.