A building load forecasting method based on forward decomposition denoising and deep ensemble learning
By removing noise from building load data using the CEEMDAN algorithm and employing the DWedRVFL neural network model for dynamic ensemble learning, the problems of noise interference and fluctuation in building load forecasting are solved, thereby improving forecast accuracy and stability.
Patent Information
- Authority / Receiving Office
- CN · China
- Patent Type
- Patents(China)
- Current Assignee / Owner
- FUZHOU UNIV
- Filing Date
- 2024-12-03
- Publication Date
- 2026-06-19
Figure CN119623749B_ABST (abstract drawing)
Description
Technical Field
[0001] This invention relates to the field of load forecasting technology, and specifically to a building load forecasting method based on forward decomposition denoising and deep ensemble learning.
Background Technology
[0002] Promoting sustainable urban and social development is becoming increasingly important for addressing the energy and environmental crisis. Various methods based on Building Energy Management Systems (BEMS), such as comprehensive low-carbon energy consumption assessments, carbon trading mechanisms, and carbon footprint assessments, are gradually being applied. However, the efficient and stable operation of BEMS requires accurate prediction of building loads.
[0003] Building load data may be affected by measurement noise and external interference during the acquisition process, making it difficult to identify effective information. Furthermore, factors such as human activity and weather conditions cause significant irregular fluctuations in building load, and the time-varying nature of the load itself further exacerbates the complexity of load pattern recognition. Therefore, to improve the accuracy of building load forecasting, it is necessary to remove noise to improve data quality, and the strong volatility and time-varying nature of building load should be fully considered during the model design phase.
[0004] Chinese patent application number 202311481491.1 discloses a short-term building load forecasting method. This method extracts load data from a building over a specific time period as raw data, processes the raw data for outliers and missing values to obtain feature data, establishes a TCN-LSTM model based on a residual self-attention mechanism, trains the model using the feature data to obtain predicted load values, optimizes the hyperparameters of the RSA-TCN-LSTM model using an improved sparrow search algorithm, and uses the optimized RSA-TCN-LSTM prediction model to predict the building load. However, this method does not consider noise interference during the building load acquisition process, making it difficult to obtain high-quality load data.
[0005] Chinese patent application number 202310740982.7 discloses a method and system for commercial building load forecasting based on Informer networks. This method collects historical energy consumption metadata of different types of commercial buildings to construct load training datasets for each category of commercial building; obtains temperature datasets from meteorological bureau temperature data; trains Informer networks on the temperature datasets and the different types of load data to obtain load forecasting models and temperature forecasting models for each category of commercial building; and combines the temperature predicted by the temperature forecasting models with the load forecasting models to obtain the final load forecasting result. However, this method is not specifically designed to model the strong fluctuations and time-varying characteristics of building loads, which limits the accuracy of load forecasting.
Summary of the Invention
[0006] The purpose of this invention is to provide a building load forecasting method based on forward decomposition denoising and deep ensemble learning, which is beneficial to improving the accuracy and stability of short-term building load forecasting.
[0007] To achieve the above objective, the technical solution adopted in this invention is as follows: a building load forecasting method based on forward decomposition denoising and deep ensemble learning. For the original load data, the CEEMDAN algorithm is first used to forward decompose the load data into multiple intrinsic mode function (IMF) components. After the high-frequency noise components are removed, the remaining components are reconstructed and fused to obtain clean load data. Then, to cope with the strong volatility and time-varying nature of building load data, a DWedRVFL neural network model is constructed. The DWedRVFL neural network model uses the latest prediction accuracy to capture the dynamic changes of the load sequence and constructs a diversity index to exploit the multi-scale features of different output layers. In addition, a ranking strategy is constructed to integrate the contributions of the latest accuracy and diversity while avoiding the impact of abnormal prediction values on the combined prediction. Finally, historical load, weather conditions, and occupancy data are combined into a connection matrix and fed into the DWedRVFL neural network model for learning, thereby improving the accuracy and stability of building load forecasting.
[0008] Furthermore, the implementation method of the CEEMDAN algorithm is as follows:
[0009] Step 1: Let the collected raw load data be $S_L$. Instead of decomposing the entire signal, a forward rolling decomposition is used: CEEMDAN decomposes only the data within a forward rolling window of size $w$, which contains known load data and no future data. The load data used for decomposition are thus the $w$ known samples inside the current window.
[0010] Step 2: Add a set of Gaussian white noise sequences to the original load data $S_L$ to form new data $S_{new}$, as shown below:
[0011] $S_{new} = S_L + \gamma_0 G_i$
[0012] where $i = 1, \dots, T$ indexes the noise trials; $G_i$ is a normally distributed Gaussian white noise sequence; and $\gamma_0$ is the signal-to-noise ratio coefficient, which controls the amount of noise added to the original data;
[0013] Step 3: Decompose $S_{new}$ according to the EMD method. Extract the local maxima and local minima of $S_{new}$ and, from these extrema, compute the mean $M$ of the upper envelope $U$ and the lower envelope $L$:
[0014] $M = \frac{U + L}{2}$
[0015] Step 4: Subtract the mean $M$ from the signal $S_{new}$ to obtain the preliminary mode component $h_1$:
[0016] $h_1 = S_{new} - M$
[0017] Step 5: Check whether $h_1$ satisfies the two IMF conditions: the number of zero crossings equals the number of extrema, and amplitude symmetry is maintained. If not, treat $h_1$ as a new signal and repeat Steps 3 and 4 until the IMF conditions are met. After convergence, the first EMD component of each noisy trial is obtained. The first CEEMDAN component $\mathrm{IMF}_1$ is then obtained by averaging over the trials, as shown below:
[0018] $\mathrm{IMF}_1 = \frac{1}{T}\sum_{i=1}^{T} E_1\left(S_L + \gamma_0 G_i\right)$
[0019] where $T$ is the total number of times white noise is added and $E_k(\cdot)$ denotes the $k$-th component produced by EMD;
[0020] Step 6: Calculate the first residual term $r_1$, as shown below:
[0021] $r_1 = S_L - \mathrm{IMF}_1$
[0022] Step 7: Add noise to the first residual term $r_1$ to obtain the new sequence $r_1 + \gamma_1 E_1(G_i)$. The new sequence is decomposed according to the following formula to obtain the second component $\mathrm{IMF}_2$ and the second residual term $r_2$:
[0023] $\mathrm{IMF}_2 = \frac{1}{T}\sum_{i=1}^{T} E_1\left(r_1 + \gamma_1 E_1(G_i)\right)$
[0024] $r_2 = r_1 - \mathrm{IMF}_2$
[0025] Step 8: Using the same method as in Step 7, calculate the $k$-th residual term $r_k$ and the $(k+1)$-th component $\mathrm{IMF}_{k+1}$:
[0026] $r_k = r_{k-1} - \mathrm{IMF}_k$
[0027] $\mathrm{IMF}_{k+1} = \frac{1}{T}\sum_{i=1}^{T} E_1\left(r_k + \gamma_k E_k(G_i)\right)$
[0028] Step 9: Repeat Step 8 until the residual term can no longer be decomposed. With $K$ components obtained in total, the final residual term $R$ is calculated as follows:
[0029] $R = S_L - \sum_{k=1}^{K} \mathrm{IMF}_k$
[0030] Step 10: Treating component $\mathrm{IMF}_1$ as the high-frequency noise component to be removed, reconstruct the clean load data $X_L$ by combining the remaining components (excluding $\mathrm{IMF}_1$) and the residual term $R$:
[0031] $X_L = \sum_{k=2}^{K} \mathrm{IMF}_k + R$
[0032] Step 11: $X_L$ denotes the denoised historical load sequence; $X_n$ and $X_W$ denote the building occupancy data obtained from the attendance tracking system and the weather forecast data obtained from the BEMS, respectively. The predictive modeling objective is to establish the following mapping relationship:
[0033] $\hat{y}_{t+1} = f(X_L, X_n, X_W) + e$
[0034] where $\hat{y}_{t+1}$ is the predicted value for the next time step and $e$ is the error; the rolling window $w$ determines how much historical load data is used, so $X_L$ holds the $w$ denoised load values in the window, while $X_n$ and $X_W$ are scalars; $[X_L, X_n, X_W]$ forms the connection matrix $X$, which is the input to the prediction model, and the output is the predicted load value for the next time step.
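The envelope-mean and subtraction operations of Steps 3 and 4 can be illustrated with a minimal pure-Python sketch. It is not the patented implementation: the function names are hypothetical, and the envelopes are built by linear interpolation pinned to the window end points, whereas EMD implementations commonly use cubic splines.

```python
def local_extrema(signal):
    """Return the indices of interior local maxima and local minima."""
    maxima, minima = [], []
    for i in range(1, len(signal) - 1):
        if signal[i - 1] < signal[i] >= signal[i + 1]:
            maxima.append(i)
        elif signal[i - 1] > signal[i] <= signal[i + 1]:
            minima.append(i)
    return maxima, minima


def envelope(signal, knots):
    """Piecewise-linear envelope through the knots (assumption: linear, not spline)."""
    # Pin both ends to the signal so the envelope spans the whole window.
    xs = [0] + knots + [len(signal) - 1]
    ys = [signal[x] for x in xs]
    out, seg = [], 0
    for i in range(len(signal)):
        # Advance to the segment [xs[seg], xs[seg + 1]] that contains i.
        while seg < len(xs) - 2 and i > xs[seg + 1]:
            seg += 1
        x0, x1 = xs[seg], xs[seg + 1]
        t = (i - x0) / (x1 - x0)
        out.append(ys[seg] + t * (ys[seg + 1] - ys[seg]))
    return out


def sift_once(signal):
    """One sifting pass: M = (U + L) / 2 (Step 3), then h1 = S_new - M (Step 4)."""
    maxima, minima = local_extrema(signal)
    upper = envelope(signal, maxima)
    lower = envelope(signal, minima)
    mean = [(u + l) / 2 for u, l in zip(upper, lower)]
    h1 = [s - m for s, m in zip(signal, mean)]
    return mean, h1


if __name__ == "__main__":
    noisy_window = [0, 2, 0, -2, 0, 2, 0, -2, 0]  # toy data, not from the patent
    mean, h1 = sift_once(noisy_window)
    print(mean)
    print(h1)
```

In a full EMD, `sift_once` would be repeated on `h1` until the IMF conditions of Step 5 hold.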
[0035] Furthermore, the implementation method of the DWedRVFL neural network model is as follows:
[0036] The input data is $X \in \mathbb{R}^{N \times m}$, where $N$ and $m$ are the number of samples and the number of features in the input data; $P$ is defined as the number of neurons in each enhancement layer; the features of the first enhancement layer of DWedRVFL, $H_1 \in \mathbb{R}^{N \times P}$, are calculated by the following formula:
[0037] $H_1 = g(X B_1)$
[0038] where $g(\cdot)$ is a nonlinear activation function and $B_1 \in \mathbb{R}^{m \times P}$ contains the randomly generated connection weights of the first enhancement layer;
[0039] For a DWedRVFL model with $L$ enhancement layers, the input of the $l$-th enhancement layer consists of the initial input features $X$ and the output $H_{l-1}$ of the previous enhancement layer, so the $l$-th enhancement layer is calculated by the following formula:
[0040] $H_l = g([H_{l-1}, X] B_l)$
[0041] where $B_l$ and $H_l$ are the weights and enhancement features of the $l$-th enhancement layer. Writing the output of the $l$-th layer as $[H_l, X]\beta_l$ and the target load as $Y$, the loss function of the $l$-th enhancement layer can be expressed as:
[0042] $\min_{\beta_l} \lVert [H_l, X]\beta_l - Y \rVert^2 + \lambda \lVert \beta_l \rVert^2$
[0043] where $\beta_l$ is the output connection weight of layer $l$ and $\lambda$ is the regularization parameter; $\beta_l$ can be computed analytically using the Moore-Penrose pseudo-inverse or ridge regression, expressed as:
[0044] $\beta_l = \left([H_l, X]^{\top}[H_l, X] + \lambda I\right)^{-1}[H_l, X]^{\top} Y$
[0045] where $[H_l, X]$ is the expanded input matrix and $I$ is the identity matrix; each RVFL module in DWedRVFL corresponds to a decision subtask, and the output weights of each layer are calculated separately; all hidden-layer weights in DWedRVFL are randomly generated before training begins and remain fixed during network training.
[0046] To address the high volatility and time-varying characteristics of building loads, a dynamic ensemble module is designed. The outputs of the different enhancement layers are combined using dynamically updated weights to achieve a trade-off between the latest accuracy and diversity. The prediction outputs of a network with $L$ enhancement layers are $\hat{y}_1, \dots, \hat{y}_L$. The design goal of the ensemble module is to find a set of combination weights $w_1, \dots, w_L$ that minimizes the root mean square error (RMSE) between the combined prediction and the actual value $y$. The objective function is expressed as:
[0047] $\min_{w_1, \dots, w_L} \mathrm{RMSE}\left(\sum_{l=1}^{L} w_l \hat{y}_l,\; y\right)$
[0048] Furthermore, to enhance its adaptability to the time-varying characteristics of building loads, DWedRVFL introduces a contribution index based on the latest accuracy, defined as a function of the most recent prediction error:
[0049]
[0050] where $f_l$ is the contribution index of the $l$-th layer at time $t-1$ based on the latest accuracy. To prevent the weight of the dominant output layer from becoming too large, the contribution index values of the layers are sorted from largest to smallest, giving the following order:
[0051] $f_1 \geq f_2 \geq \dots \geq f_l \geq \dots \geq f_L$
[0052] The latest-accuracy ranking of the $l$-th layer is then assigned as follows:
[0053] $F_l = L - \mathrm{idx}(f_l) + 1$
[0054] where $\mathrm{idx}(f_l)$ is the index of layer $l$ after sorting and $F_l$ is its ranking value for the latest-accuracy contribution; if the latest-accuracy contribution of the $l$-th layer is the largest, its sort index is 1 and $F_l$ equals $L$; for each output layer, the larger the rank value $F_l$, the greater its contribution to the latest accuracy, that is, the more accurate its most recent prediction.
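The rank assignment described above (the layer with the largest contribution index gets sort index 1 and rank value $L$) can be sketched as follows. The function name and the example scores are hypothetical, and ties are broken by layer order, which the source does not specify.

```python
def rank_values(scores):
    """Map each layer's score to L - idx + 1, where idx is its 1-based
    position after sorting the scores in descending order."""
    L = len(scores)
    # sorted() is stable, so tied layers keep their original order (assumption).
    order = sorted(range(L), key=lambda l: scores[l], reverse=True)
    ranks = [0] * L
    for idx, layer in enumerate(order, start=1):
        ranks[layer] = L - idx + 1
    return ranks


if __name__ == "__main__":
    f = [0.2, 0.9, 0.5]  # hypothetical latest-accuracy contribution indices
    print(rank_values(f))
```

Using ranks instead of the raw index values bounds every layer's score between 1 and $L$, which is how the sorting step keeps a single dominant layer from receiving an excessive weight.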
[0055] Furthermore, DWedRVFL introduces diversity to ensure effective use of the multi-scale enhancement features of the different output layers, thereby addressing the volatility of building loads. DWedRVFL promotes the diversity of the overall forecast by assigning greater weight to the output layers that contribute to diversity. First, the Euclidean distance between each layer's forecast and the forecasts of the other layers is defined as the diversity contribution:
[0056] $E_l = \sum_{j=1,\, j \neq l}^{L} \lVert \hat{y}_l^{t} - \hat{y}_j^{t} \rVert_2$
[0057] where $E_l$ is the sum of the Euclidean distances between the prediction of layer $l$ and the predictions of the other layers at time $t$, and is defined as the diversity contribution index based on the prediction distribution. The indices are then sorted in descending order, giving the following order:
[0058] $E_1 \geq E_2 \geq \dots \geq E_l \geq \dots \geq E_L$
[0059] The diversity ranking of the $l$-th layer is then assigned the following value:
[0060] $D_l^{E} = L - \mathrm{idx}(E_l) + 1$
[0061] where $\mathrm{idx}(E_l)$ is the index of $E_l$ after sorting in descending order and $D_l^{E}$ is the diversity ranking based on the prediction distribution; for output layer $l$, the greater its contribution to diversity, the larger $D_l^{E}$;
[0062] To obtain a more comprehensive measure of diversity, DWedRVFL further defines diversity based on prediction performance. The diversity contribution index based on prediction performance is calculated as the difference between a single prediction and the median of all predictions. When the bias is calculated, the true label at time $t$ is unknown; therefore, the latest prediction error is used instead. The bias of the $l$-th layer at time $t-1$ is:
[0063]
[0064] The biases are sorted in descending order to obtain the ranking of the $l$-th layer:
[0065]
[0066] where the two quantities are the sort index and the diversity contribution ranking based on prediction performance, respectively. Based on the two diversity results above, the overall diversity contribution ranking of layer $l$ is defined as:
[0067]
[0068] where $d_l$ is the overall diversity contribution index of the $l$-th layer, $\mathrm{idx}(d_l)$ is its index after sorting in descending order, and $D_l$ is the ranking value of the $l$-th layer based on the overall diversity index.
[0069] Combining the ranking information of $F_l$ and $D_l$, the final contribution index $r_l$ of the $l$-th output layer is defined as:
[0070] $r_l = \alpha F_l + (1 - \alpha) D_l$
[0071] where $\alpha$ is the trade-off parameter;
[0072] DWedRVFL uses a ranking-based strategy to calculate the final ranking $R_l$ of the $l$-th layer and the combination weights:
[0073] $r_1 \geq r_2 \geq \dots \geq r_l \geq \dots \geq r_L$,
[0074]
[0075] where $\mathrm{idx}(r_l)$ is the index of $r_l$ after sorting in descending order;
[0076] The combined predicted value of DWedRVFL at time $t$ is calculated from the ranking strategy as follows:
[0077]
[0078] Compared with existing technologies, this invention has the following advantages. First, it employs the forward CEEMDAN algorithm to decompose and denoise the original load data, thereby obtaining high-quality load data that benefits the subsequent machine learning process. Second, in the prediction modeling stage, this invention fully considers the strong volatility and time-varying nature of building loads: it proposes a diversity metric to make full use of the multi-scale features of different output layers, thereby obtaining more stable predictions; it develops dynamic ensemble weights based on the latest accuracy to adapt to the time-varying nature of the load; and it designs a ranking strategy to integrate the contributions of diversity and the latest accuracy to the combined prediction while avoiding the adverse effects of outliers. Compared with patent application 202311481491.1, this invention considers noise interference during load data acquisition and obtains higher-quality load data after denoising with the forward CEEMDAN algorithm. Compared with patent application 202310740982.7, this invention specifically considers the volatility and time-varying nature of building loads in the load prediction modeling stage, designs the DWedRVFL prediction network to identify the changing patterns of building loads, and improves the accuracy and stability of short-term building load prediction.
Attached Figure Description
[0079] Figure 1 is a flowchart of the method implementation of an embodiment of the present invention;
[0080] Figure 2 is a flowchart of the implementation of the CEEMDAN algorithm in an embodiment of the present invention;
[0081] Figure 3 is a training framework diagram of the DWedRVFL neural network model in an embodiment of the present invention;
[0082] Figure 4 is a graph of the raw weekday load data collected in an embodiment of the present invention;
[0083] Figure 5 is a graph of the data in the first rolling window in an embodiment of the present invention;
[0084] Figure 6 is a graph of the data after noise is added in an embodiment of the present invention;
[0085] Figure 7 is a graph of the mean M of the upper envelope U and the lower envelope L in an embodiment of the present invention;
[0086] Figure 8 is a graph of the preliminary mode component obtained in an embodiment of the present invention;
[0087] Figure 9 is a graph of the first component IMF1 and the first residual term r1 obtained in an embodiment of the present invention;
[0088] Figure 10 is a graph of the second component IMF2 and the second residual term r2 obtained in an embodiment of the present invention;
[0089] Figure 11 is a graph of all IMFs and the final residual term R obtained in an embodiment of the present invention;
[0090] Figure 12 is a graph of the denoised load data of the first window obtained in an embodiment of the present invention;
[0091] Figure 13 is a graph of the prediction results of an embodiment of the present invention.
Detailed Implementation
[0092] The present invention will be further described below with reference to the accompanying drawings and embodiments.
[0093] It should be noted that the following detailed descriptions are exemplary and intended to provide further explanation of this application. Unless otherwise specified, all technical and scientific terms used herein have the same meaning as commonly understood by one of ordinary skill in the art to which this application pertains.
[0094] It should be noted that the terminology used herein is for the purpose of describing particular embodiments only and is not intended to limit the exemplary embodiments according to this application. As used herein, the singular form is intended to include the plural form as well, unless the context clearly indicates otherwise. Furthermore, it should be understood that when the terms "comprising" and / or "including" are used in this specification, they indicate the presence of features, steps, operations, devices, components, and / or combinations thereof.
[0095] Measurement noise during building load data acquisition, along with the strong fluctuations and time-varying nature of building loads, reduces prediction accuracy. To address these issues, this embodiment provides a building load prediction method based on forward decomposition denoising and deep ensemble learning. For the original load data, the CEEMDAN algorithm is first used to forward decompose the load data into multiple intrinsic mode function (IMF) components. After the high-frequency noise components are removed, the remaining components are reconstructed and fused to obtain clean load data. Then, to cope with the strong fluctuations and time-varying nature of building load data, a dynamic weight ensemble deep random vector functional link (DWedRVFL) neural network model is constructed. The DWedRVFL neural network model uses the latest prediction accuracy to capture the dynamic changes of the load sequence and constructs a diversity index to exploit the multi-scale features of different output layers. In addition, a ranking strategy is constructed to integrate the contributions of the latest accuracy and diversity while avoiding the impact of abnormal prediction values on the combined prediction. Finally, historical load, weather conditions, and occupancy data are combined into a connection matrix and fed into the DWedRVFL neural network model for learning, thereby improving the accuracy and stability of building load prediction.
[0096] Figure 1 is a flowchart of the method implementation of this embodiment. The method is applicable to general buildings, including residential, commercial, and educational buildings, provided that the building contains energy-consuming equipment that generates load and that the load data are recorded and stored by load metering equipment in the building. The method is not applicable to older buildings that have not undergone electrification upgrades or are not equipped with load metering equipment.
[0097] Figure 2 is a flowchart of the implementation of the CEEMDAN algorithm in this embodiment. As shown in Figure 2, the CEEMDAN algorithm is implemented as follows:
[0098] Step 1: Let the collected raw load data be $S_L$. Because CEEMDAN decomposition uses information from the entire input sequence, decomposing the whole time series $S_L$ at once would leak future information. This embodiment therefore uses a forward rolling decomposition instead of decomposing the entire signal: CEEMDAN decomposes only the data within a forward rolling window of size $w$, which contains known load data and no future data. The load data used for decomposition are thus the $w$ known samples inside the current window.
[0099] Step 2: Add a set of Gaussian white noise sequences to the original load data $S_L$ to form new data $S_{new}$, as shown below:
[0100] $S_{new} = S_L + \gamma_0 G_i$
[0101] where $i = 1, \dots, T$ indexes the noise trials; $G_i$ is a normally distributed Gaussian white noise sequence; and $\gamma_0$ is the signal-to-noise ratio coefficient, which controls the amount of noise added to the original data.
[0102] Step 3: Decompose $S_{new}$ using the Empirical Mode Decomposition (EMD) method. Extract the local maxima and local minima of $S_{new}$ and, from these extrema, compute the mean $M$ of the upper envelope $U$ and the lower envelope $L$:
[0103] $M = \frac{U + L}{2}$
[0104] Step 4: Subtract the mean $M$ from the signal $S_{new}$ to obtain the preliminary mode component $h_1$:
[0105] $h_1 = S_{new} - M$
[0106] Step 5: Check whether $h_1$ satisfies the two IMF conditions: the number of zero crossings equals the number of extrema, and amplitude symmetry is maintained. If not, treat $h_1$ as a new signal and repeat Steps 3 and 4 until the IMF conditions are met. After convergence, the first EMD component of each noisy trial is obtained. The first CEEMDAN component $\mathrm{IMF}_1$ is then obtained by averaging over the trials, as shown below:
[0107] $\mathrm{IMF}_1 = \frac{1}{T}\sum_{i=1}^{T} E_1\left(S_L + \gamma_0 G_i\right)$
[0108] where $T$ is the total number of times white noise is added and $E_k(\cdot)$ denotes the $k$-th component produced by EMD.
[0109] Step 6: Calculate the first residual term $r_1$, as shown below:
[0110] $r_1 = S_L - \mathrm{IMF}_1$
[0111] Step 7: Add noise to the first residual term $r_1$ to obtain the new sequence $r_1 + \gamma_1 E_1(G_i)$. The new sequence is decomposed according to the following formula to obtain the second component $\mathrm{IMF}_2$ and the second residual term $r_2$:
[0112] $\mathrm{IMF}_2 = \frac{1}{T}\sum_{i=1}^{T} E_1\left(r_1 + \gamma_1 E_1(G_i)\right)$
[0113] $r_2 = r_1 - \mathrm{IMF}_2$
[0114] Step 8: Using the same method as in Step 7, calculate the $k$-th residual term $r_k$ and the $(k+1)$-th component $\mathrm{IMF}_{k+1}$:
[0115] $r_k = r_{k-1} - \mathrm{IMF}_k$
[0116] $\mathrm{IMF}_{k+1} = \frac{1}{T}\sum_{i=1}^{T} E_1\left(r_k + \gamma_k E_k(G_i)\right)$
[0117] Step 9: Repeat Step 8 until the residual term can no longer be decomposed. With $K$ components obtained in total, the final residual term $R$ is calculated as follows:
[0118] $R = S_L - \sum_{k=1}^{K} \mathrm{IMF}_k$
[0119] Step 10: Treating component $\mathrm{IMF}_1$ as the high-frequency noise component to be removed, reconstruct the clean load data $X_L$ by combining the remaining components (excluding $\mathrm{IMF}_1$) and the residual term $R$:
[0120] $X_L = \sum_{k=2}^{K} \mathrm{IMF}_k + R$
[0121] Step 11: $X_L$ denotes the denoised historical load sequence; $X_n$ and $X_W$ denote the building occupancy data obtained from the attendance tracking system and the weather forecast data obtained from the BEMS, respectively. The predictive modeling objective is to establish the following mapping relationship:
[0122] $\hat{y}_{t+1} = f(X_L, X_n, X_W) + e$
[0123] where $\hat{y}_{t+1}$ is the predicted value for the next time step and $e$ is the error. The rolling window $w$ determines how much historical load data is used, so $X_L$ holds the $w$ denoised load values in the window, while $X_n$ and $X_W$ are scalars. $[X_L, X_n, X_W]$ forms the connection matrix $X$, which is the input to the prediction model, and the output is the predicted load value for the next time step.
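The data flow of Steps 1, 10, and 11 can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the decomposition itself is not implemented here (the IMFs are passed in as plain lists), and the function names are hypothetical.

```python
def forward_windows(series, w):
    """Yield (t, window) pairs, where window holds the w samples ending at
    index t inclusive, so no sample later than t is ever used (Step 1)."""
    for t in range(w - 1, len(series)):
        yield t, series[t - w + 1 : t + 1]


def reconstruct_without_first_imf(imfs, residual):
    """Sum IMF_2..IMF_K and the residual R element-wise (Step 10).
    imfs[0] is IMF_1, which is treated as high-frequency noise and dropped."""
    clean = list(residual)
    for imf in imfs[1:]:
        clean = [c + v for c, v in zip(clean, imf)]
    return clean


def build_input(x_l, x_n, x_w):
    """Concatenate the denoised window with the scalar occupancy and
    weather values into one input row [X_L, X_n, X_W] (Step 11)."""
    return list(x_l) + [x_n, x_w]


if __name__ == "__main__":
    load = [10.0, 11.0, 12.5, 12.0, 13.0, 14.5]  # toy values, not patent data
    for t, window in forward_windows(load, w=4):
        print(t, window)
```

Each window would be decomposed and denoised separately, so the denoised value used at time $t$ never depends on measurements taken after $t$.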
[0124] Figure 3 is a training framework diagram of the DWedRVFL neural network model in this embodiment. As shown in Figure 3, the DWedRVFL neural network model is implemented as follows:
[0125] The input data is $X \in \mathbb{R}^{N \times m}$, where $N$ and $m$ are the number of samples and the number of features of the input data, respectively. $P$ is defined as the number of neurons in each enhancement layer. The features of the first enhancement layer of DWedRVFL, $H_1 \in \mathbb{R}^{N \times P}$, are calculated using the following formula:
[0126] $H_1 = g(X B_1)$
[0127] where $g(\cdot)$ is a nonlinear activation function, such as sigmoid, tanh, or ReLU, and $B_1 \in \mathbb{R}^{m \times P}$ contains the randomly generated connection weights of the first enhancement layer.
[0128] For a DWedRVFL model with $L$ enhancement layers, the input of the $l$-th enhancement layer consists of the initial input features $X$ and the output $H_{l-1}$ of the previous enhancement layer, so the $l$-th enhancement layer is calculated by the following formula:
[0129] $H_l = g([H_{l-1}, X] B_l)$
[0130]
[0131] where $B_l$ and $H_l$ are the weights and enhancement features of the $l$-th enhancement layer. Writing the output of the $l$-th layer as $[H_l, X]\beta_l$ and the target load as $Y$, the loss function of the $l$-th enhancement layer can be expressed as:
[0132] $\min_{\beta_l} \lVert [H_l, X]\beta_l - Y \rVert^2 + \lambda \lVert \beta_l \rVert^2$
[0133] where $\beta_l$ is the output connection weight of the $l$-th layer and $\lambda$ is the regularization parameter. $\beta_l$ can be computed analytically using the Moore-Penrose pseudo-inverse or ridge regression, expressed as:
[0134] $\beta_l = \left([H_l, X]^{\top}[H_l, X] + \lambda I\right)^{-1}[H_l, X]^{\top} Y$
[0135] where $[H_l, X]$ is the expanded input matrix and $I$ is the identity matrix. Each RVFL module in DWedRVFL corresponds to a decision subtask, and the output weights of each layer are calculated separately. All hidden-layer weights in DWedRVFL are randomly generated before training begins and remain fixed during network training.
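The layer-wise training described above (fixed random hidden weights, one ridge-regression solve per layer) can be sketched with NumPy. This is a minimal sketch, not the patented model: the function names, the uniform weight range, the tanh activation, and the default hyperparameters are all assumptions, and the dynamic ensemble step is not included.

```python
import numpy as np


def train_layers(X, y, n_layers=3, n_hidden=16, lam=1e-2, seed=0):
    """Train stacked RVFL layers; returns a list of (B_l, beta_l) pairs."""
    rng = np.random.default_rng(seed)
    layers, H = [], None
    for _ in range(n_layers):
        # Layer 1 sees X; deeper layers see [H_{l-1}, X].
        inp = X if H is None else np.hstack([H, X])
        # Random hidden weights, generated once and never updated.
        B = rng.uniform(-1.0, 1.0, size=(inp.shape[1], n_hidden))
        H = np.tanh(inp @ B)
        # Expanded input matrix [H_l, X] with direct links to the output.
        D = np.hstack([H, X])
        # Closed-form ridge solution: (D^T D + lam * I)^{-1} D^T y.
        beta = np.linalg.solve(D.T @ D + lam * np.eye(D.shape[1]), D.T @ y)
        layers.append((B, beta))
    return layers


def predict_layers(layers, X):
    """Return an array of shape (L, N): one prediction vector per output layer."""
    preds, H = [], None
    for B, beta in layers:
        inp = X if H is None else np.hstack([H, X])
        H = np.tanh(inp @ B)
        preds.append(np.hstack([H, X]) @ beta)
    return np.stack(preds)


if __name__ == "__main__":
    rng = np.random.default_rng(1)
    X = rng.normal(size=(50, 4))               # synthetic features
    y = X @ np.array([1.0, -2.0, 0.5, 3.0])    # synthetic target
    print(predict_layers(train_layers(X, y), X).shape)
```

Because only the output weights are solved for, training each layer reduces to one linear solve, and every layer yields its own prediction that a later ensemble step can weight.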
[0136] To address the high volatility and time-varying characteristics of building loads, a dynamic ensemble module is designed. The outputs of the different enhancement layers are combined using dynamically updated weights to achieve a trade-off between the latest accuracy and diversity. The prediction outputs of a network with $L$ enhancement layers are $\hat{y}_1, \dots, \hat{y}_L$. The design goal of the ensemble module is to find a set of combination weights $w_1, \dots, w_L$ that minimizes the root mean square error (RMSE) between the combined prediction and the actual value $y$. The objective function can be expressed as:
[0137] $\min_{w_1, \dots, w_L} \mathrm{RMSE}\left(\sum_{l=1}^{L} w_l \hat{y}_l,\; y\right)$
[0138] To enhance its adaptability to the time-varying characteristics of building loads, DWedRVFL introduces a contribution index based on the latest accuracy, defined as a function of the most recent prediction error:
[0139]
[0140] where $f_l$ is the contribution index of the $l$-th layer at time $t-1$ based on the latest accuracy. To prevent the weight of the dominant output layer from becoming too large, the contribution index values of the layers are sorted from largest to smallest, giving the following order:
[0141] $f_1 \geq f_2 \geq \dots \geq f_l \geq \dots \geq f_L$
[0142] The latest-accuracy ranking of the $l$-th layer is then assigned as follows:
[0143] $F_l = L - \mathrm{idx}(f_l) + 1$
[0144] where $\mathrm{idx}(f_l)$ is the index of layer $l$ after sorting and $F_l$ is its ranking value for the latest-accuracy contribution. For example, if the latest-accuracy contribution of the $l$-th layer is the largest, its sort index is 1 and $F_l$ equals $L$. For each output layer, the larger the rank value $F_l$, the greater its contribution to the latest accuracy, that is, the more accurate its most recent prediction.
[0145] Given the nonlinear fluctuations in building loads, forecasting from a single layer may produce unstable predictions. Introducing diversity ensures effective use of the multi-scale enhancement features of the different output layers, thereby addressing building load volatility. DWedRVFL promotes the diversity of the overall forecast by assigning greater weight to output layers that contribute to diversity. First, the Euclidean distance between each layer's forecast and the forecasts of the other layers is defined as the diversity contribution:
[0146] $E_l = \sum_{j=1,\, j \neq l}^{L} \lVert \hat{y}_l^{t} - \hat{y}_j^{t} \rVert_2$
[0147] where $E_l$ is the sum of the Euclidean distances between the prediction of layer $l$ and the predictions of the other layers at time $t$, and is defined as the diversity contribution index based on the prediction distribution. The indices are then sorted in descending order, giving the following order:
[0148] $E_1 \geq E_2 \geq \dots \geq E_l \geq \dots \geq E_L$
[0149] The diversity ranking of the $l$-th layer is then assigned the following value:
[0150] $D_l^{E} = L - \mathrm{idx}(E_l) + 1$
[0151] where $\mathrm{idx}(E_l)$ is the index of $E_l$ after sorting in descending order and $D_l^{E}$ is the diversity ranking based on the prediction distribution. For output layer $l$, the greater its contribution to diversity, the larger $D_l^{E}$.
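The distance-based diversity index $E_l$ can be computed directly from the per-layer predictions. The sketch below assumes each layer's prediction at time $t$ is given as a list of numbers (a single-element list for one-step-ahead forecasting); the function name and sample values are hypothetical.

```python
import math


def diversity_contributions(preds):
    """Return E_l for every layer: the sum of Euclidean distances between
    layer l's prediction vector and every other layer's prediction vector."""
    contributions = []
    for l, p in enumerate(preds):
        total = 0.0
        for j, q in enumerate(preds):
            if j != l:
                total += math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))
        contributions.append(total)
    return contributions


if __name__ == "__main__":
    layer_preds = [[101.0], [102.0], [104.0]]  # hypothetical layer outputs
    print(diversity_contributions(layer_preds))
```

A layer whose prediction sits far from the others receives a large $E_l$, so ranking by $E_l$ rewards layers that add information the rest of the ensemble lacks.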
[0152] To obtain a more comprehensive measure of diversity, DWedRVFL further defines diversity based on prediction performance. The diversity contribution index based on prediction performance is calculated as the difference between a single prediction and the median of all predictions. When the bias is calculated, the true label at time $t$ is unknown; therefore, the latest prediction error is used instead. The bias of the $l$-th layer at time $t-1$ is:
[0153]
[0154] The biases are sorted in descending order to obtain the ranking of the $l$-th layer:
[0155]
[0156] where the two quantities are the sort index and the diversity contribution ranking based on prediction performance, respectively. Based on these two diversity results, the overall diversity contribution ranking of layer $l$ is defined as:
[0157]
[0158] where $d_l$ is the overall diversity contribution index of the $l$-th layer, $\mathrm{idx}(d_l)$ is its index after sorting in descending order, and $D_l$ is the ranking value of the $l$-th layer based on the overall diversity index.
[0159] Combining the ranking information of $F_l$ and $D_l$, the final contribution index $r_l$ of the $l$-th output layer is defined as:
[0160] $r_l = \alpha F_l + (1 - \alpha) D_l$
[0161] Here, $\alpha$ is a trade-off parameter with a value between 0.1 and 0.9, which can be adjusted for different datasets.
[0162] DWedRVFL uses a ranking-based strategy to calculate the final ranking $R_l$ of the $l$-th layer and the combination weights:
[0163] $r_1 \geq r_2 \geq \dots \geq r_l \geq \dots \geq r_L$,
[0164]
[0165] where $\mathrm{idx}(r_l)$ is the index of $r_l$ after sorting in descending order.
[0166] The combined predicted value of DWedRVFL at time $t$ is calculated from the ranking strategy as follows:
[0167]
[0168] In summary, the ranking-based dynamic ensemble strategy adopted by DWedRVFL provides a balanced combined weight allocation scheme that ensures that excellent predictions receive greater weight while mitigating the impact of outlier predictions, thereby generating more reasonable combined prediction values.
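The ranking-based combination can be sketched end to end as follows. The trade-off formula $r_l = \alpha F_l + (1-\alpha) D_l$ comes from the text, but the weight formula is missing from it: normalizing the final ranks so that they sum to one is an assumption made for this illustration, as are the function names and sample values.

```python
def rank_values(scores):
    """Rank value L - idx + 1, with idx the 1-based position after sorting
    the scores in descending order (ties keep layer order, an assumption)."""
    L = len(scores)
    order = sorted(range(L), key=lambda l: scores[l], reverse=True)
    ranks = [0] * L
    for idx, layer in enumerate(order, start=1):
        ranks[layer] = L - idx + 1
    return ranks


def combine(preds, F, D, alpha=0.5):
    """Blend accuracy ranks F and diversity ranks D, re-rank the result,
    and form a weighted sum of the per-layer predictions."""
    r = [alpha * f + (1 - alpha) * d for f, d in zip(F, D)]
    R = rank_values(r)
    total = sum(R)
    # Assumption: combination weights are the final ranks normalized to sum to 1.
    weights = [rank / total for rank in R]
    y_hat = sum(w * p for w, p in zip(weights, preds))
    return weights, y_hat


if __name__ == "__main__":
    preds = [10.0, 12.0, 20.0]       # hypothetical layer predictions at time t
    F, D = [3, 2, 1], [2, 3, 1]      # hypothetical accuracy and diversity ranks
    print(combine(preds, F, D, alpha=0.6))
```

Under this assumed weighting, the largest possible weight is $L / (1 + 2 + \dots + L)$ regardless of how extreme a layer's raw index is, which matches the stated aim of limiting the influence of outlier predictions.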
[0169] The specific implementation of this invention will be described in detail below with reference to data examples. The specific steps are as follows:
[0170] Step 1: Raw weekday load data $S_L$ were collected for a building in June 2023. The load curve and basic statistical indicators are shown in Figure 4 and Table 1. A forward rolling decomposition is used instead of decomposing the entire signal: CEEMDAN decomposes the data within a forward rolling window of size $w = 48$, which contains only known load data and no future data. The 48 load data points $S_L$ used for decomposition within the first rolling window are shown in Figure 5:
[0171] Table 1
[0172]
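The forward rolling window of Step 1 can be sketched in Python as follows. The sketch assumes the window advances one sample at a time and that the window used at index t ends just before t; the function name `forward_windows` and the sample series are hypothetical.

```python
def forward_windows(series, w=48):
    """Yield (t, window) pairs, where window = series[t-w:t].

    Only values strictly before index t are included, so no future
    data enters the decomposition for time t.
    """
    for t in range(w, len(series) + 1):
        yield t, series[t - w:t]


# Hypothetical load series with 60 points.
load = [float(i) for i in range(60)]
windows = list(forward_windows(load, w=48))
first_t, first_window = windows[0]
```

Each window would then be passed to the decomposition of Steps 2 to 10 independently.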
[0173] Step 2: Add a Gaussian white noise sequence. A set of Gaussian white noise sequences is added to the original dataset $S_L$ to form the new data $S_{new}$, calculated as follows:
[0174] $S_{new} = S_L + \gamma_0 G_i$
[0175] where i is the trial index, with the number of trials set to 3; $G_i$ is a Gaussian noise sequence following the normal distribution N(0, 0.5); and $\gamma_0$ = 0.6 is the data-to-noise ratio, which controls the noise added to the original dataset. The resulting $S_{new}$ is shown in Figure 6.
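The noise addition of Step 2 can be illustrated with a short Python sketch. It assumes that the second parameter of N(0, 0.5) is the standard deviation (the text does not say whether it is the standard deviation or the variance); the function name, the fixed seed, and the constant sample window are hypothetical.

```python
import random


def noisy_copies(s_l, gamma0=0.6, sigma=0.5, trials=3, seed=0):
    """Return `trials` noisy copies S_new = S_L + gamma0 * G_i.

    Assumption: G_i is drawn from a normal distribution with mean 0
    and standard deviation `sigma`.
    """
    rng = random.Random(seed)
    copies = []
    for _ in range(trials):
        g = [rng.gauss(0.0, sigma) for _ in s_l]
        copies.append([x + gamma0 * n for x, n in zip(s_l, g)])
    return copies


# Hypothetical window of 48 load values.
window = [1.0] * 48
copies = noisy_copies(window)
```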
[0176] Step 3: Decompose $S_{new}$ according to the EMD method. First, extract the local maxima and local minima of the signal $S_{new}$, and from these extrema calculate the mean M of the upper envelope U and the lower envelope L, as shown in Figure 7:
[0177] $M = \frac{U + L}{2}$
[0178] Step 4: Subtract the mean M from the signal $S_{new}$ to obtain the preliminary mode component $h_1$, as shown in Figure 8:
[0179] $h_1 = S_{new} - M$
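Steps 3 and 4 describe one sifting pass. The Python sketch below illustrates it under a simplifying assumption: the envelopes are piecewise linear and held constant beyond the first and last extremum, whereas EMD implementations commonly use cubic splines. All function names and the test signal are hypothetical.

```python
import math


def local_extrema(x):
    """Return the indices of interior local maxima and local minima."""
    maxima, minima = [], []
    for i in range(1, len(x) - 1):
        if x[i] > x[i - 1] and x[i] >= x[i + 1]:
            maxima.append(i)
        elif x[i] < x[i - 1] and x[i] <= x[i + 1]:
            minima.append(i)
    return maxima, minima


def envelope(x, idx):
    """Piecewise-linear envelope through (i, x[i]) for i in idx.

    Requires at least one extremum; values before the first and after
    the last extremum are held constant (a simplification).
    """
    env = []
    for t in range(len(x)):
        if t <= idx[0]:
            env.append(x[idx[0]])
        elif t >= idx[-1]:
            env.append(x[idx[-1]])
        else:
            for a, b in zip(idx, idx[1:]):
                if a <= t <= b:
                    frac = (t - a) / (b - a)
                    env.append(x[a] + frac * (x[b] - x[a]))
                    break
    return env


def sift_once(x):
    """One sifting pass: M = (U + L) / 2 and h1 = x - M."""
    maxima, minima = local_extrema(x)
    upper = envelope(x, maxima)
    lower = envelope(x, minima)
    mean = [(u + l) / 2 for u, l in zip(upper, lower)]
    h1 = [v - m for v, m in zip(x, mean)]
    return mean, h1


# Hypothetical signal: an oscillation riding on a slow trend.
signal = [math.sin(2 * math.pi * t / 8) + 0.1 * t for t in range(40)]
maxima, minima = local_extrema(signal)
mean, h1 = sift_once(signal)
```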
[0180] Step 5: Check whether $h_1$ satisfies the two IMF conditions: the number of zero crossings should equal the number of extrema, and the amplitude should be symmetric. If not, treat $h_1$ as a new signal and repeat Steps 3 and 4 until the IMF conditions are met. After convergence, the first EMD component is obtained. The first component $IMF_1$ of CEEMDAN is then obtained by averaging over the noise trials, as shown below:
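[0181] $IMF_1 = \frac{1}{T}\sum_{i=1}^{T} E_1\left(S_L + \gamma_0 G_i\right)$, where $E_1(\cdot)$ denotes the first component obtained by EMD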
[0182] Where T=3 is the total number of times white noise is added.
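The first IMF condition in Step 5 can be checked by counting zero crossings and extrema, as in the following Python sketch. The text requires the two counts to be equal, while the usual EMD convention also accepts a difference of one; the hypothetical parameter `max_diff` covers both readings. The amplitude symmetry condition is not checked here, and all names are illustrative.

```python
def count_zero_crossings(h):
    """Count sign changes between consecutive samples (exact zeros ignored)."""
    return sum(1 for a, b in zip(h, h[1:]) if a * b < 0)


def count_extrema(h):
    """Count interior points where the slope changes sign."""
    return sum(
        1
        for i in range(1, len(h) - 1)
        if (h[i] - h[i - 1]) * (h[i + 1] - h[i]) < 0
    )


def looks_like_imf(h, max_diff=1):
    """True if the two counts differ by at most `max_diff`.

    Use max_diff=0 for the strict reading of the text.
    """
    return abs(count_zero_crossings(h) - count_extrema(h)) <= max_diff


# Hypothetical candidates: one oscillating around zero, one riding above it.
oscillating = [0.5, -0.5, 0.5, -0.5, 0.5, -0.5]
riding = [1.0, 2.0, 1.0, 2.0, 1.0, 2.0, 1.0]
```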
[0183] Step 6: Calculate the first residual term $r_1$ as follows:
[0184] $r_1 = S_L - IMF_1$
[0185] The resulting first component $IMF_1$ and first residual term $r_1$ are shown in Figure 9.
[0186] Step 7: Add noise to the first residual term $r_1$ to obtain the new sequence $r_1 + \gamma_1 E_1(G_i)$. The new sequence is decomposed according to the following formula to obtain the second component $IMF_2$ and the second residual term $r_2$, as shown in Figure 10:
[0187] $IMF_2 = \frac{1}{T}\sum_{i=1}^{T} E_1\left(r_1 + \gamma_1 E_1(G_i)\right)$
[0188] $r_2 = r_1 - IMF_2$
[0189] Step 8: Using the same method as in Step 7, calculate the k-th residual term $r_k$ and the (k+1)-th component $IMF_{k+1}$:
[0190] $r_k = r_{k-1} - IMF_k$
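[0191] $IMF_{k+1} = \frac{1}{T}\sum_{i=1}^{T} E_1\left(r_k + \gamma_k E_k(G_i)\right)$, where $E_k(\cdot)$ denotes the k-th component obtained by EMD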
[0192] Step 9: Repeat step 8 until the obtained residual term can no longer be decomposed. The final residual term R is calculated as follows:
[0193] $R = S_L - \sum_{k=1}^{K} IMF_k$, where K is the total number of IMFs
[0194] The calculated IMFs and the final residual term R are shown in Figure 11.
[0195] Step 10: Treating component $IMF_1$ as the high-frequency noise component to be removed, reconstruct the clean load data from the remaining components (all IMFs except $IMF_1$) and the residual term R according to the following formula:
[0196] $X_L = \sum_{k=2}^{K} IMF_k + R$, where K is the total number of IMFs
[0197] The denoised load data for the first window are shown in Figure 12.
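The reconstruction of Step 10 amounts to summing every component except the first and adding the residual. A minimal Python sketch, assuming the IMFs are already available as equal-length lists (the function name and the numbers are hypothetical):

```python
def denoise(imfs, residual, drop=1):
    """Sum the IMFs after the first `drop` components and add the residual."""
    kept = imfs[drop:]
    return [
        sum(component[t] for component in kept) + residual[t]
        for t in range(len(residual))
    ]


# Hypothetical decomposition of a three-point window.
imfs = [
    [0.3, -0.2, 0.1],   # IMF1, treated as high-frequency noise
    [1.0, 0.5, -0.5],   # IMF2
    [0.2, 0.2, 0.2],    # IMF3
]
residual = [10.0, 10.0, 10.0]

clean = denoise(imfs, residual)          # IMF2 + IMF3 + R
raw = denoise(imfs, residual, drop=0)    # all IMFs + R, i.e. the original signal
```

The difference `raw - clean` recovers $IMF_1$, which confirms that only the first component has been discarded.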
[0198] Step 11: Construct the dataset for model training. The load data within each window are denoised by repeating Steps 1-10 until the complete denoised historical load sequence $X_L$ is obtained. Next, the number of building occupants $X_n$ is obtained from the building attendance system, and the weather forecast variable $X_w$ is obtained from the BEMS. Because the outdoor average temperature is highly correlated with the load, $X_w$ here represents the outdoor temperature. The goal of predictive modeling is to establish the following mapping relationship:
[0199]
[0200] In the formula, the output is the predicted value for the next time step and e is the error. The rolling window w = 48 determines the amount of historical load data used, so the historical load sequence $X_L$ contains 48 values, while $X_n$ and $X_w$ are scalars. The concatenated matrix $X = [X_L, X_n, X_w]$ forms the input dataset of the prediction model, with a feature dimension of 48 + 1 + 1 = 50 and a sample size of 1008.
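The construction of the 50-dimensional input described above can be sketched in Python. The sketch assumes that the occupancy and temperature values aligned with the target time step are used, which the text does not specify; the function name and the synthetic series are hypothetical.

```python
def build_samples(load, occupancy, temperature, w=48):
    """Build rows [X_L (w lagged loads), X_n, X_w] and next-step targets.

    Assumption: occupancy[t] and temperature[t] are the values for the
    time step being predicted.
    """
    X, y = [], []
    for t in range(w, len(load)):
        X.append(load[t - w:t] + [occupancy[t], temperature[t]])
        y.append(load[t])
    return X, y


# Hypothetical series long enough to give 1008 samples with w = 48.
n_points = 1008 + 48
load = [1.0 + 0.001 * t for t in range(n_points)]
occupancy = [10.0] * n_points
temperature = [25.0] * n_points
X, y = build_samples(load, occupancy, temperature, w=48)
```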
[0201] Step 12: Construct the DWedRVFL neural network model. The input data is $X \in \mathbb{R}^{N \times m}$, where N and m are the number of samples and the number of features of the input data, here 1008 and 50. The training, validation, and test sets are divided in a 7:1:2 ratio, so N and m for the training data are 705 and 50, respectively. P = 160 is defined as the number of neurons in each enhancement layer. The training parameters of DWedRVFL are shown in Table 2. The features $H_1$ of the first enhancement layer of DWedRVFL are calculated using the following formula:
[0202] $H_1 = g(XB_1)$,
[0203] where g(·) is a nonlinear activation function (the sigmoid is chosen here) and $B_1$ is the randomly generated connection weight matrix of the first enhancement layer.
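The first enhancement layer can be illustrated with NumPy as follows. The sketch assumes the training size is obtained by truncating 70% of 1008, draws the random weights from a uniform distribution on [-1, 1], and uses random placeholder inputs; none of these choices is specified in the text.

```python
import numpy as np


def sigmoid(z):
    """Elementwise logistic function used as the activation g."""
    return 1.0 / (1.0 + np.exp(-z))


rng = np.random.default_rng(0)
n_total, m, P = 1008, 50, 160
n_train = int(n_total * 0.7)                # training share of the 7:1:2 split

X = rng.standard_normal((n_train, m))       # placeholder for the real features
B1 = rng.uniform(-1.0, 1.0, size=(m, P))    # random weights, kept fixed afterwards
H1 = sigmoid(X @ B1)                        # enhancement features, shape (N, P)
```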
[0204] Table 2
[0205]
[0206]
[0207] Step 13: For a DWedRVFL model with L = 10 enhancement layers, the input of the l-th enhancement layer consists of the initial input features X and the output $H_{l-1}$ of the previous enhancement layer, as defined by the following formula:
[0208] $H_l = g([H_{l-1}, X]B_l)$,
[0209]
[0210] where $B_l$ and $H_l$ are the weights and enhancement features of the l-th enhancement layer, respectively. With the output of the l-th layer so defined, the loss function of the l-th enhancement layer can be expressed as:
[0211]
[0212] In the formula, $\beta_l$ is the connection weight of layer l and λ = 0.1 is the regularization parameter. In general, $\beta_l$ can be calculated analytically using the Moore-Penrose pseudo-inverse or ridge regression, expressed as:
[0213]
[0214] The formula uses the expanded input matrix, and I is the identity matrix.
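The stacked enhancement layers and their closed-form output weights can be sketched with NumPy. Because the loss and solution formulas are not reproduced in the text, the sketch makes two assumptions that should be checked against the original: the expanded input matrix of layer l is $[H_l, X]$ (enhancement features plus a direct link to the inputs), and the output weights follow the standard ridge solution $(D^T D + \lambda I)^{-1} D^T Y$. Function names, the weight distribution, and the synthetic data are hypothetical.

```python
import numpy as np


def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))


def fit_layers(X, y, n_layers=10, P=160, lam=0.1, seed=0):
    """Fit one ridge-regression output per enhancement layer."""
    rng = np.random.default_rng(seed)
    H_prev, layers = None, []
    for _ in range(n_layers):
        inp = X if H_prev is None else np.hstack([H_prev, X])
        B = rng.uniform(-1.0, 1.0, size=(inp.shape[1], P))  # fixed random weights
        H = sigmoid(inp @ B)                                 # H_l = g([H_{l-1}, X] B_l)
        D = np.hstack([H, X])                                # assumed expanded input
        A = D.T @ D + lam * np.eye(D.shape[1])
        beta = np.linalg.solve(A, D.T @ y)                   # ridge solution
        layers.append((B, beta))
        H_prev = H
    return layers


def predict_layers(X, layers):
    """Return an array of shape (n_layers, n_samples) of layer predictions."""
    H_prev, outputs = None, []
    for B, beta in layers:
        inp = X if H_prev is None else np.hstack([H_prev, X])
        H = sigmoid(inp @ B)
        outputs.append(np.hstack([H, X]) @ beta)
        H_prev = H
    return np.stack(outputs)


# Hypothetical training data: a noisy linear target on 50 features.
rng = np.random.default_rng(1)
X = rng.standard_normal((200, 50))
y = 0.5 * X[:, 0] + 0.01 * rng.standard_normal(200)
layers = fit_layers(X, y, n_layers=10, P=20, lam=0.1)
preds = predict_layers(X, layers)
```

Each row of `preds` is the output of one layer; these per-layer outputs are what the ensemble step then combines.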
[0215] Step 14: The prediction model executes a dynamic weight ensemble strategy after the deep network has finished learning. The DWedRVFL network with 10 enhancement layers predicts the output at time t as follows:
[0216] The design goal of the DWedRVFL ensemble module is to find a set of combined prediction weights that minimizes the root mean square error (RMSE) between the predicted and actual values. The objective function can be expressed as:
[0217]
[0218] Step 15: To enhance the adaptability of DWedRVFL to the time-varying characteristics of building loads, DWedRVFL introduces a contribution index based on the latest accuracy, which is defined as a function of the latest measurement prediction error.
[0219]
[0220] In the formula, $f_l$ is the contribution index of layer l based on the latest accuracy at time t-1. Taking layer 3 as an example (l = 3), the calculated value is $f_3$ = 0.0475. To prevent the weight of a dominant output layer from becoming too large, the contribution index values of all layers are sorted from largest to smallest, giving the following order:
[0221] $f_1 \ge f_2 \ge \dots \ge f_3 \ge \dots \ge f_L$,
[0222] Subsequently, the latest-accuracy contribution ranking of layer 3 is assigned as follows:
[0223]
[0224] where the sorted index is the index value of layer 3 after sorting, and $F_3$ = 4 is its ranking in terms of contribution to the latest accuracy.
[0225] Step 16: DWedRVFL promotes overall prediction diversity by assigning greater weight to the output layer that contributes to diversity. First, the Euclidean distance between the third-layer predictions and predictions from other layers is defined as the diversity contribution of the third layer:
[0226]
[0227] where $E_3$ = 2.9004 is the sum of the Euclidean distances between the prediction of the third layer and the predictions of the other layers at time t. Based on the prediction distribution, $E_3$ is defined as the diversity contribution index. The layers are then sorted in descending order of their contribution to diversity, giving the following order:
[0228] $E_1 \ge E_2 \ge \dots \ge E_3 \ge \dots \ge E_L$,
[0229] The diversity ranking of the third layer is then assigned as follows:
[0230]
[0231] In the formula, the sorted index is the index value of $E_3$ after sorting in descending order, and the result is the diversity ranking based on the prediction distribution.
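The diversity contribution of Step 16 can be sketched in Python. The sketch assumes each layer produces a single scalar prediction at time t, so the Euclidean distance between two predictions reduces to an absolute difference; the function name and the sample predictions are hypothetical.

```python
def diversity_contributions(preds):
    """E_l = sum over all layers j of |pred_l - pred_j| (the j = l term is 0)."""
    return [sum(abs(p - q) for q in preds) for p in preds]


# Hypothetical predictions of three layers at time t.
preds_t = [1.0, 1.2, 2.0]
E = diversity_contributions(preds_t)

# Layer indices from the largest to the smallest diversity contribution.
order = sorted(range(len(E)), key=lambda l: E[l], reverse=True)
```

Here the third layer lies farthest from the others and therefore comes first in the descending order.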
[0232] Step 17: To obtain a more comprehensive measure of diversity, DWedRVFL further defines diversity based on prediction performance. The diversity contribution index based on prediction performance is calculated as the difference between a single prediction and the median of all predictions. When calculating the bias, the true label at time t is unknown, so the latest prediction error is used instead. The bias of layer 3 at time (t-1) is:
[0233]
[0234] The biases are sorted in descending order to obtain the ranking of layer 3:
[0235]
[0236]
[0237] where the sorted index based on prediction performance and the corresponding diversity ranking are 2 and 7, respectively. The overall diversity contribution ranking of layer 3 is defined as:
[0238]
[0239] where $d_3$ = 0.0623 is the overall diversity contribution index of the third layer. The index value of $d_3$ after sorting in reverse order is 3, and $D_3$, the rank of the third layer based on the comprehensive diversity index, is 6.
[0240] Step 18: Combining the ranking information of F3 and D3, define the final contribution index r3 of the third output layer as:
[0241] $r_3 = \alpha F_3 + (1-\alpha) D_3$,
[0242] Here, α is a trade-off parameter with a value of 0.6, so the final contribution index of the third layer is $r_3 = 0.6 \times 4 + 0.4 \times 6 = 4.8$.
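The arithmetic of Step 18 can be reproduced directly from the values stated above ($F_3$ = 4, $D_3$ = 6, α = 0.6). The helper name in this Python sketch is hypothetical.

```python
def final_contribution(F, D, alpha):
    """r_l = alpha * F_l + (1 - alpha) * D_l."""
    return alpha * F + (1 - alpha) * D


r3 = final_contribution(F=4, D=6, alpha=0.6)

# Effect of the trade-off parameter over its stated range 0.1 to 0.9.
sweep = {a / 10: final_contribution(4, 6, a / 10) for a in range(1, 10)}
```

A larger α moves $r_3$ toward the accuracy rank $F_3$ = 4, and a smaller α moves it toward the diversity rank $D_3$ = 6.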
[0243] Step 19: DWedRVFL uses a ranking-based strategy to calculate the final ranking $R_3$ and the combined weight of layer 3:
[0244] $r_1 \ge r_2 \ge \dots \ge r_3 \ge \dots \ge r_L$,
[0245]
[0246] where the index value of $r_3$ after sorting in descending order is equal to 2.
[0247] Step 20: Calculate the combined predicted value at time t according to the ranking strategy. The combined weights of all layers, each calculated in the same way as the weight of the third output layer, are (0.1818, 0.0727, 0.0364, 0.0182, 0.0545, 0.0909, 0.1091, 0.1636, 0.1273, 0.1455). The final combined predicted value is therefore calculated as follows:
[0248]
[0249] The final combined predicted value is 1.1567 against a true value of 1.9200, giving an absolute error of 0.7633. The basic edRVFL, which combines the layer predictions using their median, yields a combined predicted value of 1.0969 with an absolute error of 0.8231. DWedRVFL therefore outperforms the basic edRVFL in this example: its ranking-based dynamic ensemble strategy provides a balanced allocation of combined weights, resulting in a more reasonable combined predicted value. The trained DWedRVFL was then used for prediction on the test set, and the prediction results are shown in Figure 13.
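The figures quoted in Step 20 can be checked with a few lines of Python. The sketch assumes, as the listed values suggest, that each weight equals a rank divided by 55 (the sum of the ranks 1 to 10); the variable names are illustrative only.

```python
weights = [0.1818, 0.0727, 0.0364, 0.0182, 0.0545,
           0.0909, 0.1091, 0.1636, 0.1273, 0.1455]

L = len(weights)
rank_sum = L * (L + 1) // 2                      # 1 + 2 + ... + 10
implied_ranks = [round(w * rank_sum) for w in weights]

# Each listed weight should match rank / rank_sum to four decimals.
matches = all(
    abs(w - r / rank_sum) < 5e-5 for w, r in zip(weights, implied_ranks)
)

true_value = 1.9200
err_dwedrvfl = abs(true_value - 1.1567)          # ranking-based combination
err_median = abs(true_value - 1.0969)            # median combination
```

Under this reading the third layer has rank 2, which agrees with the index value reported for $r_3$ in Step 19.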
[0250] Those skilled in the art will understand that embodiments of this application can be provided as methods, systems, or computer program products. Therefore, this application can take the form of a completely hardware embodiment, a completely software embodiment, or an embodiment combining software and hardware aspects. Furthermore, this application can take the form of a computer program product embodied on one or more computer-usable storage media (including but not limited to disk storage, CD-ROM, optical storage, etc.) containing computer-usable program code.
[0251] This application is described with reference to flowchart illustrations and/or block diagrams of methods, apparatus (systems), and computer program products according to embodiments of this application. It will be understood that each flow and/or block of the flowchart illustrations and/or block diagrams, and combinations of flows and/or blocks in the flowchart illustrations and/or block diagrams, can be implemented by computer program instructions. These computer program instructions can be provided to a processor of a general-purpose computer, special-purpose computer, embedded processor, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing apparatus, create means for implementing the functions specified in one or more flows of the flowchart and/or one or more blocks of the block diagram.
[0252] These computer program instructions may also be stored in a computer-readable storage medium that can direct a computer or other programmable data processing device to function in a particular manner, such that the instructions stored in the computer-readable storage medium produce an article of manufacture including instruction means that implement the functions specified in one or more flows of the flowchart and/or one or more blocks of the block diagram.
[0253] These computer program instructions may also be loaded onto a computer or other programmable data processing equipment to cause a series of operational steps to be performed on the computer or other programmable equipment to produce a computer-implemented process, such that the instructions that execute on the computer or other programmable equipment provide steps for implementing the functions specified in one or more flows of the flowchart and/or one or more blocks of the block diagram.
[0254] The above description is merely a preferred embodiment of the present invention and is not intended to limit the invention in any other way. Any person skilled in the art may make changes or modifications to the above-disclosed technical content to create equivalent embodiments. However, any simple modifications, equivalent changes, and modifications made to the above embodiments based on the technical essence of the present invention without departing from the scope of the present invention shall still fall within the protection scope of the present invention.
Claims
1. A building load forecasting method based on forward decomposition denoising and deep ensemble learning, characterized in that: for the raw load data, the CEEMDAN algorithm is first used to forward-decompose the load data into multiple IMF components; after the high-frequency noise components are removed, the remaining components are reconstructed and fused to obtain clean load data. Then, to cope with the strong volatility and time-varying nature of building load data, a DWedRVFL neural network model is constructed; this model uses the latest prediction accuracy to capture the dynamic changes of the load sequence and constructs a diversity index to exploit the multi-scale features of different output layers. In addition, a ranking strategy is constructed to integrate the contributions of the latest accuracy and diversity while avoiding the impact of abnormal predictions on the combined prediction. Finally, historical load, weather conditions, and occupancy data are combined into a concatenated matrix and fed into the DWedRVFL neural network model for learning, thereby improving the accuracy and stability of building load forecasting. The DWedRVFL neural network model is implemented as follows: the input data is $X \in \mathbb{R}^{N \times m}$, where N and m are the number of samples and the number of features of the input data; P is defined as the number of neurons in each enhancement layer; the features $H_1$ of the first enhancement layer of DWedRVFL are calculated by the following formula: $H_1 = g(XB_1)$, where g(·) is a nonlinear activation function and $B_1$ is the randomly generated connection weight matrix of the first enhancement layer. For a DWedRVFL model with L enhancement layers, the input of the l-th enhancement layer consists of the initial input features X and the output $H_{l-1}$ of the previous enhancement layer, so the l-th enhancement layer is calculated by the following formula: $H_l = g([H_{l-1}, X]B_l)$, where $B_l$ and $H_l$ are the weights and enhancement features of the l-th enhancement layer, respectively. With the output of the l-th layer so defined, the loss function of the l-th enhancement layer is expressed as: In the formula, $\beta_l$ is the connection weight of layer l and λ is the regularization parameter; $\beta_l$ is calculated analytically using the Moore-Penrose pseudo-inverse or ridge regression, expressed as: The formula uses the expanded input matrix, and I is the identity matrix. Each RVFL module in DWedRVFL corresponds to a decision subtask, and the output weights of each layer are calculated separately; all hidden-layer weights in DWedRVFL are randomly generated before training begins and remain fixed during training. To address the high volatility and time-varying characteristics of building loads, a dynamic ensemble module is designed: the outputs of the different enhancement layers are combined with dynamically updated weights to achieve a trade-off between the latest accuracy and diversity. For a network with L enhancement layers, each layer produces its own prediction output; the design goal of the ensemble module is to find a set of combination weights that minimizes the root mean square error (RMSE) between the predicted and actual values, and the objective function is expressed as:
2. The building load forecasting method based on forward decomposition denoising and deep ensemble learning according to claim 1, characterized in that the CEEMDAN algorithm is implemented as follows: Step 1: Let the collected raw load data be $S_L$. Instead of decomposing the entire signal, a forward rolling decomposition is used: CEEMDAN decomposes the data within a forward rolling window of size w, where the window contains only known load data and no future data; the load data used for decomposition is denoted $S_L$. Step 2: Add a Gaussian white noise sequence: a set of Gaussian white noise sequences is added to the original load data $S_L$ to form the new data $S_{new}$, as shown below: $S_{new} = S_L + \gamma_0 G_i$, where i is the trial index; $G_i$ is a normally distributed Gaussian noise sequence; and $\gamma_0$ is the data-to-noise ratio, which controls the noise added to the original dataset. Step 3: Decompose $S_{new}$ according to the EMD method: extract the local maxima and local minima of the signal $S_{new}$, and from these extrema calculate the mean M of the upper envelope U and the lower envelope L: $M = (U + L)/2$. Step 4: Subtract the mean M from the signal $S_{new}$ to obtain the preliminary mode component $h_1$: $h_1 = S_{new} - M$. Step 5: Check whether $h_1$ satisfies the two IMF conditions, namely that the number of zero crossings equals the number of extrema and that the amplitude is symmetric. If not, treat $h_1$ as a new signal and repeat Steps 3 and 4 until the IMF conditions are met. After convergence, the first EMD component is obtained. The first component $IMF_1$ of CEEMDAN is then obtained by averaging over the noise trials, as shown below: where T is the total number of times white noise is added. Step 6: Calculate the first residual term $r_1$, as shown below: $r_1 = S_L - IMF_1$. Step 7: Add noise to the first residual term $r_1$ to obtain the new sequence $r_1 + \gamma_1 E_1(G_i)$; the new sequence is decomposed to obtain the second component $IMF_2$ and the second residual term $r_2$ according to the following equation: $r_2 = r_1 - IMF_2$. Step 8: Using the same method as in Step 7, calculate the k-th residual term $r_k$ and the (k+1)-th component $IMF_{k+1}$: $r_k = r_{k-1} - IMF_k$. Step 9: Repeat Step 8 until the residual term can no longer be decomposed; the final residual term R is calculated as follows: Step 10: Treating component $IMF_1$ as the high-frequency noise component to be removed, reconstruct the clean load data by combining the remaining components (all IMFs except $IMF_1$) and the residual term R. Step 11: $X_L$ represents the denoised historical load sequence, and $X_n$ and $X_w$ represent the building occupancy data obtained from the attendance system and the weather forecast data obtained from the BEMS, respectively; the goal of predictive modeling is to establish the following mapping relationship: In the formula, the output is the predicted value for the next time step and e is the error; the rolling window w determines the amount of historical load data used, so $X_L$ contains the w most recent load values, while $X_n$ and $X_w$ are scalars; $[X_L, X_n, X_w]$ forms the concatenated matrix used as input to the prediction model, and the output is the predicted load value for the next time step.
3. The building load forecasting method based on forward decomposition denoising and deep ensemble learning according to claim 1, characterized in that, to enhance the adaptability of DWedRVFL to the time-varying characteristics of building loads, DWedRVFL introduces a contribution index based on the latest accuracy, defined as a function of the latest measured prediction error: In the formula, $f_l$ represents the contribution index of the l-th layer at time t-1 based on the latest accuracy. To prevent the weight of a dominant output layer from becoming too large, the contribution index values of all layers are sorted from largest to smallest, giving the following order: $f_1 \ge f_2 \ge \dots \ge f_l \ge \dots \ge f_L$. Then the latest-accuracy contribution ranking of the l-th layer is assigned as follows: where the sorted index is the index value of the l-th layer after sorting, and $F_l$ is its ranking in terms of contribution to the latest accuracy; if the latest-accuracy contribution of the l-th layer is the largest, its sorted index is 1 and $F_l$ equals L; for each output layer, the higher the rank value $F_l$, the greater its contribution to the latest accuracy, indicating a higher level of accuracy.
4. The building load forecasting method based on forward decomposition denoising and deep ensemble learning according to claim 3, characterized in that DWedRVFL introduces diversity to ensure the effective use of multi-scale enhancement features from different output layers, thereby addressing the volatility of building loads. DWedRVFL promotes overall prediction diversity by assigning greater weight to output layers that contribute more to diversity. First, the sum of the Euclidean distances between each layer's prediction and the predictions of the other layers is defined as its diversity contribution: where $E_l$ is the sum of the Euclidean distances between the prediction of layer l and those of the other layers at time t; based on the prediction distribution, $E_l$ is defined as the diversity contribution index. The layers are then sorted in descending order of this index, giving the following order: $E_1 \ge E_2 \ge \dots \ge E_l \ge \dots \ge E_L$. The diversity ranking of the l-th layer is then assigned as follows: In the formula, the sorted index is the index value of $E_l$ after sorting in descending order, and the result is the diversity ranking based on the prediction distribution; for output layer l, the greater its contribution to diversity, the larger this ranking. To obtain a more comprehensive measure of diversity, DWedRVFL further defines diversity based on prediction performance. The diversity contribution index based on prediction performance is calculated as the difference between a single prediction and the median of all predictions. When calculating the bias, the true label at time t is unknown, so the latest prediction error is used instead. The bias of the l-th layer at time (t-1) is: The biases are sorted in descending order to obtain the ranking of the l-th layer: where the two quantities are, respectively, the sorted index and the diversity contribution ranking based on prediction performance. Based on the above two diversity results, the overall diversity contribution ranking of layer l is defined as: where $d_l$ is the overall diversity contribution index of the l-th layer, the sorted index is the index value of $d_l$ after sorting in reverse order, and $D_l$ is the ranking value of the l-th layer based on the comprehensive diversity index. Combining the ranking information of $F_l$ and $D_l$, the final contribution index of the l-th output layer is defined as: $r_l = \alpha F_l + (1-\alpha) D_l$, where α is a trade-off parameter. DWedRVFL uses a ranking-based strategy to calculate the final ranking $R_l$ of the l-th layer and its combined weight: $r_1 \ge r_2 \ge \dots \ge r_l \ge \dots \ge r_L$, where the sorted index is the index value of $r_l$ after sorting in descending order. The combined predicted value of DWedRVFL at time t is calculated based on the ranking strategy as follows: