[0020] The automatic recommendation method of the present invention will be described in detail below.
[0021] 1. Construction of historical case database
[0022] To recommend a suitable model, a historical database must first be established. The raw data used by the present invention come from the planning data of a large number of cities in China and mainly include the historical load data of the forecast area and data on related factors, such as the proportion of the secondary industry, city type, urban administrative function, urban population development, urban load development status, forecast horizon, and urban GDP development level. To obtain accurate and reasonable conclusions, the applicability of the model is also treated as a continuous quantity. It is set as a value in the range 0 to 1, where 0 is the lowest level, indicating that the model is not applicable; 1 is the highest level, indicating that the model is fully applicable to the forecast area; and any real number between 0 and 1 represents the degree of applicability of the model.
[0023] 2. Data cluster analysis and generalization
[0024] The historical load data and the data on related factors are all continuous and must be discretized before association rules can be mined. The present invention introduces an applicability setting for the model: whether a model is applicable is also treated as data that can be discretized, so a value range is set for the applicability and its value is determined according to the specific situation. In the method, a number of cluster centers is set for each type of data, so that similar data are divided into several levels, with small differences within a level and relatively large differences between levels. The invention uses the K-Means algorithm to perform the cluster analysis.
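The discretization described above can be illustrated with a minimal one-dimensional K-Means sketch. The function name kmeans_1d, the deterministic initialization, and the sample values are assumptions made for illustration only; the source does not specify how the centers are initialized or which implementation is used.

```python
def kmeans_1d(values, k, iterations=100):
    """Cluster scalar values into k groups; return (sorted centers, level per value)."""
    ordered = sorted(values)
    # Assumed deterministic initialisation: spread the starting centers over the sorted data.
    centers = [ordered[(2 * i + 1) * len(ordered) // (2 * k)] for i in range(k)]
    for _ in range(iterations):
        groups = [[] for _ in range(k)]
        for v in values:
            nearest = min(range(k), key=lambda j: abs(v - centers[j]))
            groups[nearest].append(v)
        # An empty group keeps its previous center.
        updated = [sum(g) / len(g) if g else centers[j] for j, g in enumerate(groups)]
        if updated == centers:
            break
        centers = updated
    centers.sort()
    # The level of a value is the index of its nearest center (0 = lowest).
    levels = [min(range(k), key=lambda j: abs(v - centers[j])) for v in values]
    return centers, levels


# Hypothetical secondary-industry shares (percent) for nine cities.
shares = [21.0, 22.5, 23.0, 40.0, 41.5, 43.0, 60.0, 61.0, 62.5]
centers, levels = kmeans_1d(shares, k=3)
print(centers)
print(levels)
```

Each factor would be clustered separately in this way, so that every continuous value is replaced by a small integer level before rule mining.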
[0025] 3. Construction of the suitable-model rule base
[0026] The suitable-model rule base is constructed mainly by association rule mining; the present invention uses the FP-Growth mining algorithm.
[0027] 3.1 Basic concepts of the FP-Growth algorithm
[0028] 1) FP-Tree
[0029] The items in the transaction data table are sorted by support; the items of each transaction are then inserted, in descending order of support, into a tree whose root node is NULL, and each node records the support count with which it occurs.
[0030] 2) Conditional pattern base
[0031] The set of prefix paths in the FP-Tree that occur together with a given suffix pattern.
[0032] 3) Conditional tree
[0033] A new FP-Tree built from the conditional pattern base according to the FP-Tree construction principle.
[0034] 3.2 The main process of FP-Growth algorithm
[0035] 1) Construct the FP-Tree; the main algorithm is as follows:
[0036] Input: a transaction database DB and a minimum support threshold.
[0037] Output: its FP-Tree.
[0038] Steps:
[0039] ① Scan the database DB once. Obtain the set F of frequent items and the support of each frequent item. Sort F in descending order of support and denote the result as L.
[0040] ② Create the root node of the FP-Tree, denote it as T, and label it null. Then perform the following steps for each transaction Trans in DB.
[0041] Select the frequent items in Trans and sort them according to the order in L. Denote the sorted list of frequent items in Trans as [p|P], where p is the first element and P is the rest of the list. Call insert_tree([p|P], T).
[0042] The function insert_tree([p|P], T) runs as follows: if T has a child node N such that N.item-name = p.item-name, the count of N is increased by 1; otherwise, a new node N is created with its count set to 1, its parent link set to T, and its node_link linked to the nodes with the same item-name. If P is not empty, insert_tree(P, N) is called recursively.
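The construction steps above can be sketched in Python as follows. The class FPNode, the helper build_fp_tree, the alphabetical tie-breaking between items of equal support, and the sample transactions are illustrative assumptions; only insert_tree and the two-scan procedure come from the description.

```python
from collections import Counter


class FPNode:
    def __init__(self, item, parent):
        self.item = item
        self.count = 0
        self.parent = parent
        self.children = {}     # item name -> child FPNode
        self.node_link = None  # next node carrying the same item name


def insert_tree(items, node, header):
    """Insert the sorted item list [p|P] below node, as in insert_tree([p|P], T)."""
    if not items:
        return
    p, rest = items[0], items[1:]
    child = node.children.get(p)
    if child is None:
        child = FPNode(p, node)
        node.children[p] = child
        # Append the new node to the chain of nodes with the same item name.
        if header[p] is None:
            header[p] = child
        else:
            last = header[p]
            while last.node_link is not None:
                last = last.node_link
            last.node_link = child
    child.count += 1
    insert_tree(rest, child, header)


def build_fp_tree(transactions, min_support):
    # Scan 1: count item supports and keep the frequent items, ordered as L.
    support = Counter(item for t in transactions for item in set(t))
    order = sorted((i for i in support if support[i] >= min_support),
                   key=lambda i: (-support[i], i))
    rank = {item: r for r, item in enumerate(order)}
    # Scan 2: insert the frequent items of each transaction in the order of L.
    root = FPNode(None, None)
    header = {item: None for item in order}
    for t in transactions:
        items = sorted((i for i in set(t) if i in rank), key=rank.get)
        insert_tree(items, root, header)
    return root, header, order


# Hypothetical discretized records (factor levels plus applicability level).
transactions = [["E2", "P1", "R1", "T0"], ["E2", "P1", "R0", "T0"],
                ["E4", "P1", "R1", "T3"], ["E2", "P0", "R1", "T0"]]
root, header, order = build_fp_tree(transactions, min_support=2)
print(order)
```

Following node_link from header[item] visits every node that carries that item, which is what the mining stage needs in order to collect prefix paths.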
[0043] 2) Mine the FP-Tree; the main algorithm is as follows:
[0044] Input: the FP-Tree constructed in step 1)
[0045] Output: all frequent itemsets
[0046] Step: Call FP-Growth(Tree, null).
[0047] procedure FP-Growth(Tree, x)
[0048] {
[0049] ① if (Tree contains only a single path P) then
[0050] ② for each combination B of the nodes in path P do
[0051] ③ generate pattern B ∪ x, with support = the minimum support of the nodes in B;
[0052] ④ else for each item ai in the header table of Tree do
[0053] {
[0054] ⑤ generate pattern B = ai ∪ x, with support = ai.support;
[0055] ⑥ construct the conditional pattern base of B and the conditional FP-Tree TreeB of B;
[0056] ⑦ if TreeB ≠ ∅
[0057] ⑧ then call FP-Growth(TreeB, B)
[0058] }
[0059] }
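A compact Python sketch of the mining procedure is given below. It is a simplified illustration, not the implementation of the invention: it rebuilds a conditional FP-Tree from each conditional pattern base and always follows the general branch (steps ④ to ⑧), omitting the single-path shortcut of steps ① to ③, which yields the same frequent itemsets. The names Node, build_tree, and fp_growth and the sample transactions are assumptions.

```python
from collections import defaultdict


class Node:
    def __init__(self, item, parent):
        self.item, self.parent = item, parent
        self.count = 0
        self.children = {}


def build_tree(weighted_transactions, min_support):
    """Build an FP-Tree from (items, count) pairs; return (header table, item supports)."""
    support = defaultdict(int)
    for items, count in weighted_transactions:
        for item in set(items):
            support[item] += count
    frequent = {i: s for i, s in support.items() if s >= min_support}
    root, header = Node(None, None), defaultdict(list)
    for items, count in weighted_transactions:
        ordered = sorted((i for i in set(items) if i in frequent),
                         key=lambda i: (-frequent[i], i))
        node = root
        for item in ordered:
            if item not in node.children:
                node.children[item] = Node(item, node)
                header[item].append(node.children[item])
            node = node.children[item]
            node.count += count
    return header, frequent


def fp_growth(weighted_transactions, min_support, suffix=frozenset()):
    """Return a dict mapping each frequent itemset (frozenset) to its support count."""
    header, frequent = build_tree(weighted_transactions, min_support)
    patterns = {}
    for item, item_support in frequent.items():
        pattern = suffix | {item}
        patterns[pattern] = item_support
        # Conditional pattern base: the prefix path of every node carrying this item.
        base = []
        for node in header[item]:
            path, parent = [], node.parent
            while parent.item is not None:
                path.append(parent.item)
                parent = parent.parent
            if path:
                base.append((path, node.count))
        # Recurse on the conditional FP-Tree built from that base.
        if base:
            patterns.update(fp_growth(base, min_support, pattern))
    return patterns


# Hypothetical discretized records; each transaction starts with count 1.
transactions = [["E2", "P1", "R1", "T0"], ["E2", "P1", "R0", "T0"],
                ["E4", "P1", "R1", "T3"], ["E2", "P0", "R1", "T0"]]
result = fp_growth([(t, 1) for t in transactions], min_support=2)
print(result[frozenset({"E2", "T0"})])
```

Association rules of the form condition:conclusion can then be derived from the frequent itemsets that contain an applicability item such as T0.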
[0060] 4. Recommendation of suitable models
[0061] The related factors and other conditions of the area to be forecast are discretized to obtain a condition description of the area, and this description is matched against the conditions in the suitable-model rule base to draw a matching conclusion. When no matching condition exists in the rule base, the original method of obtaining the accuracy level of the model, combined with expert intervention and other measures, can be used to conclude whether the model is applicable.
[0062] Specifically, the related factors and other conditions of the area to be forecast are discretized to obtain a classified expression of those factors and conditions. The resulting conditions are matched against the rules in the rule base constructed in step 3, and the conclusion with the highest degree of matching is drawn. There are three possible matching outcomes:
[0063] (1) If the association rule base contains exactly one complete match, that matching conclusion is taken as the model application scheme;
[0064] (2) If the rule base contains multiple records with the same known conditions, they are sorted by support and relevance from high to low, and the first conclusion, the one with the highest support, is selected as the model applicability for the area;
[0065] (3) If the rule base contains no matching condition, the method falls back on the original approach of obtaining the accuracy level of the model, combined with expert intervention and other measures, to conclude whether the model is applicable.
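The three matching outcomes can be sketched as a small lookup function. The rule representation (a dictionary with condition, conclusion, support, and confidence fields), the reading of "relevance" as rule confidence, and all numeric values are assumptions for illustration; the source does not define the storage format of the rule base.

```python
def recommend(condition, rules):
    """Return (conclusion, status) for a discretized condition such as 'E2_P1_R1'."""
    matches = [r for r in rules if r["condition"] == condition]
    if not matches:
        # Case (3): no matching rule; defer to the accuracy-level method and expert review.
        return None, "fallback"
    if len(matches) == 1:
        # Case (1): a single complete match.
        return matches[0]["conclusion"], "unique"
    # Case (2): several rules share the condition; take the one with the highest
    # support, using confidence (assumed meaning of "relevance") to break ties.
    best = max(matches, key=lambda r: (r["support"], r["confidence"]))
    return best["conclusion"], "ranked"


# Hypothetical rule base for one forecasting model.
rules = [
    {"condition": "E2_P1_R1", "conclusion": "T0", "support": 0.12, "confidence": 0.90},
    {"condition": "E4_P0_R0", "conclusion": "T4", "support": 0.10, "confidence": 0.80},
    {"condition": "E4_P0_R0", "conclusion": "T3", "support": 0.05, "confidence": 0.95},
]
print(recommend("E2_P1_R1", rules))
print(recommend("E4_P0_R0", rules))
print(recommend("E0_P0_R0", rules))
```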
[0066] The method of obtaining the accuracy level of the model, combined with expert intervention, determines model applicability through the following main steps:
[0067] 1) Compare the forecast results of all models with the actual load data of the current year, and assign each model a fitting accuracy level according to its average relative error, where q% denotes the relative error;
[0068] When q%
[0069] 2) The user can then select suitable models by setting the minimum allowable fitting accuracy level.
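The fallback screening can be illustrated with the sketch below, which computes the average relative error q% of each model and keeps the models within a user-defined limit. Because the thresholds that map q% to accuracy levels are not available here, the sketch filters on the error itself; the function names, the 5% limit, and the load values are hypothetical.

```python
def mean_relative_error(forecast, actual):
    """Average of |forecast - actual| / |actual|, expressed in percent."""
    errors = [abs(f - a) / abs(a) for f, a in zip(forecast, actual)]
    return 100.0 * sum(errors) / len(errors)


def filter_models(forecasts, actual, max_error_percent):
    """Keep the models whose average relative error q% does not exceed the limit."""
    scored = {name: mean_relative_error(values, actual)
              for name, values in forecasts.items()}
    return {name: q for name, q in scored.items() if q <= max_error_percent}


# Hypothetical annual loads and model forecasts (same units for all series).
actual = [100.0, 110.0, 120.0]
forecasts = {
    "linear": [98.0, 112.2, 120.0],
    "constant_growth": [90.0, 99.0, 108.0],
}
kept = filter_models(forecasts, actual, max_error_percent=5.0)
print(kept)
```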
[0070] The present invention is illustrated below with an example based on the actual load-related factors of a certain area.
[0071] 1. Construction of historical case database
[0072]
[0073]
[0074] Table 1 Historical load data and related factors of a large number of cities in China
[0075] Qualifying data are selected from the planning data of a large number of domestic cities. To simplify the analysis of the calculation example, only three actual related factors of the forecast area are considered here: the proportion of the secondary industry, the population size, and the GDP development level. The applicability of the model is also treated as a related factor, with its value range set to 0-1. Determining the applicability of the model requires combining factors such as local characteristics with expert experience and preferences. This yields the summary of historical load data and related factors of a large number of cities in China shown in Table 1.
[0076] 2. Data cluster analysis and generalization
[0077] This step discretizes the continuous historical data to facilitate the subsequent mining of association rules.
[0078] According to the set number of cluster centers, the method can automatically discretize the given data. In this calculation example, 5 cluster centers are set, and each relevant factor can be divided into 5 levels.
[0079]
[0080] Table 2 Various types of historical data generalization centers
[0081]
[0082] Table 3 Partial historical data after generalization
[0083] Using the generalization centers obtained for each type of data, the data in the historical database are discretized and classified. Because the number of cluster centers is set to 5, every type of data is divided into 5 levels, numbered 0-4. The proportion of the secondary industry is denoted by E, population by P, GDP by R, and model applicability by T. The discretized data are shown in Table 3.
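The encoding of a record into level labels can be sketched as follows. The functions level_of and encode, the center values, and the sample record are hypothetical; the actual generalization centers are those listed in Table 2.

```python
def level_of(value, centers):
    """Index (0-based) of the nearest generalization center; centers sorted ascending."""
    return min(range(len(centers)), key=lambda j: abs(value - centers[j]))


def encode(record, centers_by_factor):
    """Build a label such as 'E2_P1_R1' from the raw factor values of one area."""
    return "_".join(f"{code}{level_of(record[code], centers_by_factor[code])}"
                    for code in ("E", "P", "R"))


# Hypothetical centers, five per factor (levels 0-4).
centers_by_factor = {
    "E": [20.0, 35.0, 45.0, 55.0, 70.0],        # secondary-industry share, percent
    "P": [50.0, 150.0, 300.0, 600.0, 1000.0],   # population
    "R": [100.0, 400.0, 900.0, 2000.0, 5000.0], # GDP
}
record = {"E": 46.0, "P": 140.0, "R": 380.0}
label = encode(record, centers_by_factor)
print(label)
```

The applicability value would be encoded in the same way with the prefix T, giving items such as T0 to T4.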
[0084] 3. Construction of the suitable-model rule base
[0085] Calling the association rule mining algorithm of the method yields the association rules shown in Table 4.
[0086]
[0087] Table 4 Partial rule base after association rule mining
[0088] Analysis of the resulting rules leads to the following conclusions for the polynomial model under study:
[0089] The proportion of the secondary industry plays the key role in determining whether the polynomial model is applicable, GDP comes second, and population is the least decisive. The polynomial model is suitable for cities with a low GDP, a small population, and a large proportion of the secondary industry.
[0090] 4. Recommendation of suitable models
[0091] The association rule base obtained in step 3 is used for rule matching, and the related-factor conditions of the given forecast area are discretized using the discretization standard. Suppose the generalized condition after discretization is E2_P1_R1. For the polynomial model, matching yields E2_P1_R1:T0 as a complete match; that is, the polynomial model is not applicable under the given conditions. The model recommendation conclusion can then be drawn.
[0092] For every other existing load forecasting model, such as the envelope upper-limit model, envelope lower-limit model, linear regression line model, and constant growth rate model, the above steps are followed to obtain the applicability of each model to the area to be forecast.
[0093]
[0094] Table 6 Partial recommendation conclusions for power load forecasting models based on association rules
[0095] The partial conclusions obtained in this example are shown in Table 6.
[0096] Taking into account the particularities of the power industry, the present invention applies association rules to the selection of load forecasting models and proposes the basic idea and a specific scheme for using association rules, a data mining technique, to analyze power load forecasting models. In actual forecasting work, a human forecaster can roughly judge whether to adopt a certain model. The advantage of the present method is that it does not rely on a single expert: it can synthesize the experience of a large number of experts, and, through the accumulation and mining of a large amount of forecasting data, it can discover rules that are not easy to find intuitively.