A privacy protection method for power meter power consumption data based on federated learning

The method protects electricity meter data in the power system by combining federated learning with an encryption service, using homomorphic encryption and perturbation data. It addresses the leakage of sensitive data in power system electricity consumption forecasting while keeping the forecasts accurate.

CN115795501B · Active · Publication Date: 2026-06-16 · STATE GRID INFORMATION & TELECOMM BRANCH

Patent Information

Authority / Receiving Office: CN · China
Patent Type: Patents (China)
Current Assignee / Owner: STATE GRID INFORMATION & TELECOMM BRANCH
Filing Date: 2022-11-11
Publication Date: 2026-06-16

AI Technical Summary

Technical Problem

Existing techniques for predicting electricity consumption in power systems risk leaking the sensitive data of the participating parties, especially centralized linear regression models in which the data is processed on cloud servers.

Method used

A federated learning-based approach is adopted, in which the data owners of electricity consumption data from various meters in the power system collaboratively build an electricity consumption prediction model. The encryption service provider provides encryption and decryption services, and homomorphic encryption and perturbation data are used to protect data privacy. The server calculates the parameters of the electricity consumption prediction model, ensuring that the data is preprocessed, encrypted and aggregated locally, and perturbation data is added to prevent the leakage of real parameters.

Benefits of technology

In a federated scenario, this approach protects both the data owners' electricity consumption data and the parameters of the electricity consumption prediction model. It prevents sensitive information from leaking on the cloud server or at the encryption service provider, while still allowing data sharing and accurate electricity consumption prediction.

✦ Generated by Eureka AI based on patent content.

Smart Images

  • Figure CN115795501B_ABST

Abstract

The application belongs to the technical field of data privacy protection, and in particular relates to a power meter power consumption data privacy protection method based on federated learning. In a federated scenario, a power consumption prediction model is established, data is provided by a data owner, encryption and decryption services are provided by an encryption service provider, and parameters of the power consumption prediction model are calculated by a server. In the process of calculating the parameters of the power consumption prediction model, the data owner directly preprocesses the data locally and performs homomorphic encryption and aggregation. The server only obtains a set of encrypted and aggregated intermediate quantities, and cannot infer any sensitive information about the local power meter power consumption data from the set of encrypted and aggregated intermediate quantities. The server adds first perturbation data, and the encryption service provider adds second perturbation data to the third intermediate quantity data, so that neither the encryption service provider nor the server can obtain the real parameters of the power consumption prediction model, and the power meter power consumption data of the data owner and the parameters of the power consumption prediction model are well protected.

Description

Technical Field

[0001] This invention belongs to the field of data privacy protection technology, specifically relating to a privacy protection method for electricity meter usage data based on federated learning.

Background Technology

[0002] Power companies' production plans are often based on empirical data. Therefore, the forecast of electricity consumption for the following year is crucial. If the forecast of electricity consumption for the following year is inaccurate, two situations may occur: one is that the power plant generates too much electricity, resulting in unnecessary energy waste; the other is that the power generation is insufficient, failing to meet the electricity needs of various industries and people's lives.

[0003] The advancement and widespread use of linear regression have greatly facilitated electricity consumption forecasting for power companies. Linear regression is a statistical analysis method that uses regression analysis in mathematical statistics to determine the quantitative relationship of interdependence between two or more variables, and it is widely used.

[0004] For example, electricity consumption in the power system can be regarded as the dependent variable, while factors that may affect electricity consumption, such as economic indicators, population, and climate, can be regarded as independent variables. Studying the relationship between a single independent variable and the dependent variable is called univariate linear regression analysis, while studying the relationship between two or more independent variables and the dependent variable is called multiple linear regression.
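The multiple linear regression relationship described above can be illustrated with a small sketch. The feature names and coefficient values below are hypothetical, and prepending a constant 1 so that the first coefficient acts as an intercept is an assumption made for this example, not something stated in the text.

```python
def predict(w, x):
    """Return the fitted value w . x for one sample.

    Assumption for this sketch: a constant 1 is prepended to the feature
    vector so that w[0] plays the role of an intercept.
    """
    features = [1.0] + list(x)
    if len(w) != len(features):
        raise ValueError("w must have one more entry than x")
    return sum(w_j * x_j for w_j, x_j in zip(w, features))


# Hypothetical coefficients: intercept, time, active power, average voltage.
w = [0.5, 0.0, 2.0, 0.01]
# Hypothetical sample: hour of day, active power per minute, average voltage.
x = [12.0, 1.5, 230.0]
y_hat = predict(w, x)
```

With a single independent variable the same function reduces to univariate linear regression.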

[0005] Currently, linear regression models in power systems are typically executed in a centralized manner, meaning that the private data of enterprises at all administrative levels are collected and processed on cloud servers. Although centralized linear regression models are convenient and efficient, they may also pose a privacy risk of leaking sensitive data held by various parties.

[0006] In the prior art, Chinese patent document CN115081540A discloses a data privacy-protected classification and grading method and system based on ensemble decision learning. This method involves scrambling the user-side device data category attribute parameters required for training a decision tree model and then sending them to a cloud server for data integration via symmetric encryption. The cloud server integrates the category attributes uploaded by all user-side devices without decryption, and then distributes the ciphertext after integration calculation to each user-side device. The user-side devices use the decrypted data to train the ensemble decision learning decision tree model. The trained decision tree model is then used for classification and grading of local data on the user-side devices. However, this method carries the risk of data leakage during encryption and on the cloud server.

[0007] Chinese patent document CN115174115A discloses a method for managing electricity demand response data. This method includes the following steps: receiving data, which includes data information generated by electricity demand response services and large file-type data; uploading the data to a blockchain according to the information type: when the information is public information, sending the plaintext of the information to the blockchain; when the information is private information, sending the encrypted information to the blockchain; and sending the large file-type data to the blockchain. This method uses blockchain technology to solve the privacy problem of electricity data, requires the establishment of a blockchain network, has high technical requirements for data processing, and increases the cost of electricity data processing.

[0008] Chinese patent document CN115098883A discloses a data privacy protection method and system based on secure multi-party computation. The method includes: a requesting party sending a survey request to an administrator, wherein the survey request includes set standard conditions; the administrator selecting target objects that meet the standard conditions from data holders based on the survey request; the administrator and data holders performing secure multi-party computation based on the target data and target analysis model of the target objects to obtain corresponding analysis results, and transmitting the analysis results to the requesting party. In this method, the data is obtained by the administrator, who sets the filtering conditions. On the one hand, data privacy can only be guaranteed if the administrator has high credibility; on the other hand, some useful but private data may be filtered out, which is detrimental to the accuracy of the analysis results.

Summary of the Invention

[0009] This invention aims to provide a privacy protection method for electricity meter usage data based on federated learning, which solves the technical problem of privacy leakage of sensitive data owned by various parties when using statistical data from the power system to predict electricity consumption in the prior art.

[0010] To solve the above-mentioned technical problems, the present invention adopts the following technical solution:

[0011] The present invention provides a privacy protection method for electricity meter usage data based on federated learning. In a federated scenario, the data owners of electricity meter usage data in the power system collaboratively establish an electricity consumption prediction model, and the model is used to predict future electricity consumption from past meter usage data. The data owners provide the data, an encryption service provider provides encryption / decryption services, and a server calculates the parameters of the electricity consumption prediction model. Specifically, the method includes:

[0012] The encryption service provider distributes the public key to both the data owner and the server.

[0013] The data owner of the electricity meter data preprocesses the local data into first intermediate raw data. The first intermediate raw data is encrypted and aggregated through homomorphic encryption to obtain second intermediate data. The data owner uses a public key to encrypt the second intermediate data and then uploads it to the server.

[0014] The server obtains the second intermediate data, decrypts it using a public key, and then adds a first perturbation to the second intermediate data to obtain the third intermediate data. The server encrypts the third intermediate data using a public key and sends it to the encryption service provider. The encryption service provider decrypts the third intermediate data using a public key to obtain the third intermediate data.

[0015] The encryption service provider adds a second perturbation data to the third intermediate data to obtain a fourth intermediate data. The encryption service provider encrypts the fourth intermediate data using a public key and sends it to the server. The server uses a public key to decrypt the fourth intermediate data to obtain inaccurate model parameters for training the electricity consumption prediction model.

[0016] The data owner obtains the inaccurate model parameters and the first perturbation data. The data owner then removes the perturbation from the inaccurate model parameters based on the first perturbation data to obtain the accurate model parameters of the electricity consumption prediction model.

[0017] Preferably, the data owner of the electricity meter usage data preprocesses the local data into first intermediate raw data, including:

[0018] (1) The data owner provides a training dataset m consisting of M types of feature data, wherein the training dataset m records the electricity consumption values corresponding to different features of historical electricity consumption data;

[0019] The training dataset m is $D = \{(x_1, y_1), (x_2, y_2), \ldots, (x_m, y_m)\}$, where $x_i = (x_{i1}, x_{i2}, \ldots, x_{in})$; each element of $x_i$ records a feature value of the historical electricity consumption data, and $y_i$ is the corresponding electricity consumption value;

[0020] (2) Establish a multiple linear regression model $h_w(x_i) = w \cdot x_i$. This multiple linear regression model is the electricity consumption prediction model mentioned above, and $h_w(x_i)$ is the electricity consumption predicted by the multiple linear regression model, where $w = (w_0, w_1, \ldots, w_n)$. The parameters in the parameter set w are the regression coefficients of the multiple linear regression model. The parameter set is learned by training on the training dataset m, and it is the parameter set w that best fits the samples in D.

[0021] (3) When the parameter set w is obtained by training from the training dataset m, the optimal parameter set w is solved based on the following cost function f(w) and the coordinate descent method expressed by formulas (1), (2) to (n), where the cost function f(w) is the sum of squared errors between the actual electricity consumption value and the fitted electricity consumption value:

[0022]

[0023]

[0024]

[0025]

[0026] Here, t denotes the current iteration number. The objective function argmin is solved by taking the partial derivative of the cost function f(w) with respect to $w_k$, where $w_k$ denotes the k-th parameter in the parameter set w currently being evaluated.

[0027]

[0028] The intermediate quantities $P_k$ and $Z_k$, which contain private data information, are extracted from the partial derivative of the cost function f(w) with respect to $w_k$:

[0029]

[0030] Here, $x_{ij}$ denotes the j-th feature value of the i-th sample in the training dataset m, and $w_j$ denotes the j-th regression coefficient of the multiple linear regression model.

[0031] The intermediate quantities $P_k$ and $Z_k$ are the first intermediate raw data.
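The formulas for this step are not reproduced in the text above, so the following is only a sketch of one standard least-squares derivation that is consistent with the description (sum of squared errors, a partial derivative with respect to $w_k$, and two extracted quantities). The assignment of the symbols $P_k$ and $Z_k$ to these particular sums is an assumption, as is the convention that the sum over $j$ covers every coefficient in w.

```latex
\begin{align}
f(w) &= \sum_{i=1}^{m} \Bigl( y_i - \sum_{j} w_j x_{ij} \Bigr)^2 \\
\frac{\partial f(w)}{\partial w_k}
  &= -2 \sum_{i=1}^{m} x_{ik} \Bigl( y_i - \sum_{j \neq k} w_j x_{ij} \Bigr)
     + 2\, w_k \sum_{i=1}^{m} x_{ik}^2
   = -2 P_k + 2\, w_k Z_k \\
P_k &= \sum_{i=1}^{m} x_{ik} \Bigl( y_i - \sum_{j \neq k} w_j x_{ij} \Bigr),
\qquad
Z_k = \sum_{i=1}^{m} x_{ik}^2
\end{align}
```

Under this reading, both quantities are sums over samples, which is why they depend on the private meter data and why they can be accumulated across data owners.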

[0032] Preferably, the inaccurate model parameters of the electricity consumption prediction model obtained through training include:

[0033] The server sets the partial derivative of the cost function f(w) with respect to $w_k$ to zero and obtains the solution:

[0034]

[0035] For each $w_k$, the optimal regression coefficient is derived by iterating the coordinate descent method with the above solution.
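The iteration described above can be sketched in plain Python. This is a minimal illustration, not the patented procedure: it assumes the closed-form coordinate update $w_k = P_k / Z_k$ with $P_k = \sum_i x_{ik}(y_i - \sum_{j \neq k} w_j x_{ij})$ and $Z_k = \sum_i x_{ik}^2$, which is the standard least-squares result. The function name, the toy data, and the leading constant-1 feature are all hypothetical.

```python
def coordinate_descent(rows, ys, sweeps=2000):
    """Fit w by cycling through coordinates with the update w_k = P_k / Z_k.

    rows: list of feature tuples (a leading 1.0 is included by assumption
          so that w[0] acts as an intercept).
    ys:   list of observed consumption values.
    """
    n = len(rows[0])
    w = [0.0] * n
    for _ in range(sweeps):
        for k in range(n):
            p_k = 0.0
            z_k = 0.0
            for x, y in zip(rows, ys):
                # Prediction from every coordinate except k.
                partial = sum(w[j] * x[j] for j in range(n) if j != k)
                p_k += x[k] * (y - partial)
                z_k += x[k] * x[k]
            w[k] = p_k / z_k
    return w


# Toy data generated from y = 1 + 2*x1 + 3*x2 (hypothetical values).
rows = [(1.0, 0.0, 0.0), (1.0, 1.0, 0.0), (1.0, 0.0, 1.0),
        (1.0, 1.0, 1.0), (1.0, 2.0, 1.0), (1.0, 1.0, 2.0)]
ys = [1.0, 3.0, 4.0, 6.0, 8.0, 9.0]
w = coordinate_descent(rows, ys)
```

Each update only needs the two sums for the current coordinate, so a party that sees only those sums can still run the iteration.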

[0036] Preferably, the characteristic values of the historical electricity consumption data include time, active power per minute, and average voltage per minute.

[0037] Preferably, the data owners are the power companies at the administrative region level, and the electricity consumption data of each party are private data that must be kept private; the server is the State Grid's dedicated server "State Grid Cloud", where the data of each data owner is uploaded for model training; the encryption service provider is an encryption service provider program composed of hardware and software with general encryption functions, used to provide keys for the server and the data owners.

[0038] Compared with existing technologies, the beneficial effects of this invention are as follows: In a federated learning-based method for protecting the privacy of electricity meter data, the data owners of each electricity meter in the power system collaboratively establish an electricity consumption prediction model. The data owners provide the data, the encryption service provider provides encryption and decryption services, and the server calculates the parameters of the electricity consumption prediction model. During the calculation of the electricity consumption prediction model parameters, both the data owner's electricity meter data and the parameters of the electricity consumption prediction model are well protected. Specifically, this is reflected in the following aspects:

1. The data owner preprocesses the data locally into first intermediate raw data and performs homomorphic encryption and aggregation. Due to the homomorphic encryption and aggregation processing, the cloud server only receives a set of encrypted and aggregated intermediate data, and cannot infer any sensitive information about the local electricity meter data from it.

2. The server adds first perturbation data, preventing the encryption service provider from obtaining the true second intermediate data. The encryption service provider therefore cannot infer the true parameters of the electricity consumption prediction model, which prevents those parameters from leaking at the encryption service provider.

3. Because the encryption service provider adds second perturbation data to the third intermediate data, the server can only derive inaccurate parameters of the electricity consumption prediction model, which prevents the true parameters from leaking on the server side.

The design of the first and second perturbation data allows the server to derive only inaccurate parameters, while each data owner can obtain the true parameters based on the inaccurate parameters and the first and second perturbation data.

Attached Figure Description

[0039] The accompanying drawings are provided to further illustrate the invention and form part of the specification. They are used in conjunction with embodiments of the invention to explain the invention and do not constitute a limitation thereof. In the drawings:

[0040] Figure 1 is a flowchart of an embodiment of the privacy protection method for electricity meter usage data based on federated learning according to the present invention.

Detailed Implementation

[0041] The technical solutions of the embodiments of the present invention will be clearly and completely described below with reference to the accompanying drawings. Obviously, the described embodiments are only some embodiments of the present invention, and not all embodiments. Based on the embodiments of the present invention, all other embodiments obtained by those skilled in the art without creative effort are within the scope of protection of the present invention.

[0042] In one embodiment, a privacy protection method for electricity meter usage data based on federated learning is provided. Here, federated learning is a distributed machine learning technology that trains a global model across multiple data sources with local data. Without exchanging local individual or sample data, the global model is constructed by exchanging model parameters or intermediate results, thereby achieving a balance between data privacy protection and data sharing computation.

[0043] In this federated learning-based method for protecting the privacy of electricity meter usage data, in a federated scenario, the data owners of the electricity meter usage data in the power system collaboratively build an electricity consumption prediction model, which then predicts future electricity consumption from past meter data. The data owners provide the data, an encryption service provider provides encryption / decryption services, and the server calculates the parameters of the electricity consumption prediction model. As shown in Figure 1, the method specifically includes:

[0044] Step S1: The encryption service provider distributes the public key to the data owner and the server.

[0045] The encryption service provider is an encryption service provider consisting of hardware and software with general encryption functions, used to provide keys to the server and data owners. The encryption service provider, data owners, and server are three independent parties. The data owners are the power companies at the administrative region level; the electricity meter data of each party is private data and must be kept confidential. The server is the State Grid's dedicated server, "State Grid Cloud," where data from all data owners is uploaded for model training.

[0046] Step S2: The data owner of the electricity meter data preprocesses the local data into the first intermediate raw data. The first intermediate raw data is encrypted and aggregated through homomorphic encryption to obtain the second intermediate data. The data owner uses the public key to encrypt the second intermediate data and then uploads it to the server.

[0047] If data is collected directly from each data owner and the first intermediate raw data is calculated on a cloud server, it would lead to the leakage of the data owner's privacy data. Therefore, the data owner preprocesses the data into the first intermediate raw data locally. The first intermediate raw data is associated with the data owner's electricity meter data.

[0048] Meanwhile, to enhance the security of the first intermediate raw data, it is also homomorphically encrypted and aggregated. The resulting second intermediate data is derived from the first intermediate raw data. Because of the homomorphic encryption and aggregation, the cloud server obtains only one set of encrypted and aggregated intermediate data, namely the second intermediate data, and cannot infer from it any sensitive information about the local electricity meter consumption data.
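To make the idea of aggregating under homomorphic encryption concrete, here is a toy sketch. The text does not name a scheme, so the use of the Paillier cryptosystem is an assumption, as are the tiny primes (insecure, for illustration only), all function names, and the choice to keep the private key with the encryption service provider and decrypt with it.

```python
import math
import secrets


def keygen(p, q):
    """Toy Paillier key generation with generator g = n + 1."""
    n = p * q
    lam = (p - 1) * (q - 1) // math.gcd(p - 1, q - 1)  # lcm(p-1, q-1)
    mu = pow(lam, -1, n)  # modular inverse, valid because g = n + 1
    return (n, n + 1), (lam, mu)


def encrypt(pub, m):
    """Encrypt an integer 0 <= m < n under the public key."""
    n, g = pub
    n2 = n * n
    while True:
        r = secrets.randbelow(n - 1) + 1
        if math.gcd(r, n) == 1:
            break
    return (pow(g, m, n2) * pow(r, n, n2)) % n2


def add_ciphertexts(pub, c1, c2):
    """Multiplying ciphertexts adds the underlying plaintexts."""
    n, _ = pub
    return (c1 * c2) % (n * n)


def decrypt(pub, priv, c):
    """Decrypt with the private key (held by one party in this sketch)."""
    n, _ = pub
    lam, mu = priv
    u = pow(c, lam, n * n)
    return ((u - 1) // n * mu) % n


# Hypothetical integer-encoded local intermediates from three data owners.
pub, priv = keygen(1009, 1013)
local_values = [120, 340, 75]
ciphertexts = [encrypt(pub, v) for v in local_values]

# The aggregator combines ciphertexts without seeing any single value.
aggregated = ciphertexts[0]
for c in ciphertexts[1:]:
    aggregated = add_ciphertexts(pub, aggregated, c)

total = decrypt(pub, priv, aggregated)
```

Only the sum is ever decrypted here; the individual contributions stay encrypted throughout. Real deployments would use keys of cryptographic size and a fixed-point encoding for non-integer values.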

[0049] Step S3: The server obtains the second intermediate data, decrypts it using the public key, adds the first perturbation data to the second intermediate data to obtain the third intermediate data, encrypts the third intermediate data using the public key and sends it to the encryption service provider, who then decrypts the third intermediate data using the public key to obtain the third intermediate data.

[0050] Step S4: The encryption service provider adds second perturbation data to the third intermediate data to obtain the fourth intermediate data. The encryption service provider uses the public key to encrypt the fourth intermediate data and sends it to the server. The server uses the public key to decrypt the fourth intermediate data to obtain the inaccurate model parameters of the electricity consumption prediction model.

[0051] Although the local private data of the data owner is further protected by homomorphic encryption and aggregation, it is also crucial to protect the true parameters of the electricity consumption prediction model from being leaked to the server and encryption service provider. Therefore, two perturbation data are added in steps S3 and S4.

[0052] If the encryption service provider directly obtains the second intermediate data, it can also infer the true parameters of the electricity consumption prediction model. Therefore, to prevent the leakage of the true parameters of the electricity consumption prediction model on the encryption service provider, the server adds the first perturbation data. The encryption service provider cannot obtain the true second intermediate data, and thus it cannot infer the true parameters of the electricity consumption prediction model. Similarly, by adding the second perturbation data to the third intermediate data through the encryption service provider, the server can only derive inaccurate parameters of the electricity consumption prediction model. Due to the existence of these two perturbation data, neither the encryption service provider nor the server can obtain the true parameters of the electricity consumption prediction model, preventing the leakage of the true parameters of the electricity consumption prediction model on either side.

[0053] Step S5: The data owner obtains the inaccurate model parameters and the perturbation data. The data owner removes the perturbation from the inaccurate model parameters based on the perturbation data to obtain the accurate model parameters of the electricity consumption prediction model.

[0054] The design of the first and second perturbation data above must ensure that the server can only derive inaccurate parameters, but each data owner can obtain the real parameters based on the inaccurate parameters and the first and second perturbation data.
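The text does not specify how the two perturbations are constructed. As a minimal sketch, assume both are additive integer masks: the server adds a random value, the encryption service provider adds another, and a data owner that knows both masks subtracts them. All names and values below are hypothetical, and the encryption steps between the parties are omitted to keep the focus on the masking.

```python
import secrets


def server_mask(second_intermediate, r1):
    """Server side: add the first perturbation (assumed additive)."""
    return second_intermediate + r1


def provider_mask(third_intermediate, r2):
    """Encryption service provider side: add the second perturbation."""
    return third_intermediate + r2


def owner_unmask(fourth_intermediate, r1, r2):
    """Data owner side: remove both perturbations to recover the true value."""
    return fourth_intermediate - r1 - r2


second = 535                          # hypothetical aggregated intermediate
r1 = secrets.randbelow(10**6) + 1     # first perturbation, chosen by the server
r2 = secrets.randbelow(10**6) + 1     # second perturbation, chosen by the provider

third = server_mask(second, r1)       # what the provider sees
fourth = provider_mask(third, r2)     # what the server sees afterwards
recovered = owner_unmask(fourth, r1, r2)
```

In this sketch the provider sees only `third` and the server's view after the round trip is `fourth`; neither equals the true aggregate, while the data owner recovers it exactly. Whether the actual design is additive is not stated in the source.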

[0055] This approach introduces an encryption service provider to offer encryption and decryption services, enabling the server to process the acquired data for privacy purposes. The server also ensures the protection of its computing power and algorithms, and integrates the data from various data owners to achieve the goal of establishing an electricity consumption prediction model.

[0056] In one embodiment, the electricity consumption prediction model established in the federated learning-based method for protecting electricity meter consumption data is a multiple linear regression model. Thus, the data owner of the electricity meter consumption data preprocesses the local data into first intermediate raw data, including:

[0057] (1) The data owner provides a training dataset m consisting of M types of feature data. The training dataset m records the electricity consumption values corresponding to different features of historical electricity consumption data.

[0058] The training dataset m is $D = \{(x_1, y_1), (x_2, y_2), \ldots, (x_m, y_m)\}$, where $x_i = (x_{i1}, x_{i2}, \ldots, x_{in})$; each element of $x_i$ records a feature value of the historical electricity consumption data, and $y_i$ is the corresponding electricity consumption value.

[0059] The historical electricity consumption data here includes the following key characteristics: time, active power per minute, and average voltage per minute.

[0060] (2) Establish a multiple linear regression model $h_w(x_i) = w \cdot x_i$. This multiple linear regression model is the electricity consumption prediction model, and $h_w(x_i)$ is the electricity consumption predicted by the multiple linear regression model, where $w = (w_0, w_1, \ldots, w_n)$. The parameters in the parameter set w are the regression coefficients of the multiple linear regression model. The parameter set is learned by training on the training dataset m, and it is the parameter set w that best fits the samples in D.

[0061] (3) When the parameter set w is learned from the training dataset m, the optimal parameter set w is solved based on the following cost function f(w) and the coordinate descent method expressed by formulas (1), (2) to (n), where the cost function f(w) is the sum of squared errors between the actual electricity consumption value and the fitted electricity consumption value:

[0062]

[0063] Here, t denotes the current iteration number. The objective function argmin is solved by taking the partial derivative of the cost function f(w) with respect to $w_k$, where $w_k$ denotes the k-th parameter in the parameter set w currently being evaluated.

[0064]

[0065] The intermediate quantities $P_k$ and $Z_k$, which contain private data information, are extracted from the partial derivative of the cost function f(w) with respect to $w_k$:

[0066]

[0067] Here, $x_{ij}$ denotes the j-th feature value of the i-th sample in the training dataset m, and $w_j$ denotes the j-th regression coefficient in the multiple linear regression model;

[0068] The intermediate quantities $P_k$ and $Z_k$ here are the first intermediate raw data.
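One way to see why locally computed intermediates can be aggregated is that, under the usual least-squares reading, $P_k$ and $Z_k$ are sums over samples, so per-owner sums simply add up. The sketch below assumes $P_k = \sum_i x_{ik}(y_i - \sum_{j \neq k} w_j x_{ij})$ and $Z_k = \sum_i x_{ik}^2$; those formulas, the function name, and the two-owner toy split are assumptions for illustration.

```python
def local_intermediates(rows, ys, w, k):
    """Compute one owner's contribution to P_k and Z_k from its own rows."""
    p_k = 0
    z_k = 0
    for x, y in zip(rows, ys):
        # Residual of the sample with coordinate k left out.
        residual = y - sum(w[j] * x[j] for j in range(len(w)) if j != k)
        p_k += x[k] * residual
        z_k += x[k] * x[k]
    return p_k, z_k


# Hypothetical split of six samples between two data owners.
rows_a, ys_a = [(1, 0, 0), (1, 1, 0), (1, 0, 1)], [1, 3, 4]
rows_b, ys_b = [(1, 1, 1), (1, 2, 1), (1, 1, 2)], [6, 8, 9]

w = [1, 2, 3]   # current coefficient estimate (hypothetical)
k = 1           # coordinate being updated

p_a, z_a = local_intermediates(rows_a, ys_a, w, k)
p_b, z_b = local_intermediates(rows_b, ys_b, w, k)

# Aggregating the local sums matches a computation on the pooled data.
p_total, z_total = p_a + p_b, z_a + z_b
p_pooled, z_pooled = local_intermediates(rows_a + rows_b, ys_a + ys_b, w, k)
```

Because only the per-owner sums leave each site, the raw rows never need to be pooled; in the method described here those sums would additionally be encrypted before upload.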

[0069] Then, the inaccurate model parameters of the electricity consumption prediction model trained in this federated learning-based method for protecting the privacy of electricity meter data include:

[0070] The server sets the partial derivative of the cost function f(w) with respect to $w_k$ to zero and obtains the solution:

[0071]

[0072] For each $w_k$, the optimal regression coefficient is derived by the server through iterative execution of the coordinate descent method using the above solution. Here, the intermediate quantities $P_k$ and $Z_k$ are the fourth intermediate data obtained after the series of processing steps described above.

[0073] Although embodiments of the invention have been shown and described, it will be understood by those skilled in the art that various changes, modifications, substitutions and alterations can be made to these embodiments without departing from the principles and spirit of the invention, the scope of which is defined by the appended claims and their equivalents.

Claims

1. A privacy protection method for electricity meter usage data based on federated learning, characterized in that, in a federated scenario, the data owners of the electricity meter usage data in the power system collaboratively establish an electricity consumption prediction model, and the model is used to predict future electricity consumption from past meter usage data; the data owners provide the data, an encryption service provider provides encryption / decryption services, and a server calculates the parameters of the electricity consumption prediction model; specifically, the method includes:
The encryption service provider distributes the public key to both the data owner and the server.
The data owner of the electricity meter data preprocesses the local data into first intermediate raw data. The first intermediate raw data is encrypted and aggregated through homomorphic encryption to obtain second intermediate data. The data owner uses a public key to encrypt the second intermediate data and then uploads it to the server.
The server obtains the second intermediate data, decrypts it using a public key, and then adds first perturbation data to the second intermediate data to obtain the third intermediate data. The server encrypts the third intermediate data using a public key and sends it to the encryption service provider. The encryption service provider decrypts the third intermediate data using a public key to obtain the third intermediate data.
The encryption service provider adds second perturbation data to the third intermediate data to obtain fourth intermediate data. The encryption service provider encrypts the fourth intermediate data using a public key and sends it to the server. The server uses a public key to decrypt the fourth intermediate data to obtain inaccurate model parameters for training the electricity consumption prediction model.
The data owner obtains the inaccurate model parameters and the first perturbation data. The data owner then removes the perturbation from the inaccurate model parameters based on the first perturbation data to obtain the accurate model parameters of the electricity consumption prediction model.

2. The privacy protection method for electricity meter usage data based on federated learning according to claim 1, characterized in that the data owner of the electricity meter usage data preprocessing the local data into the first intermediate raw data includes:
(1) The data owner provides a training dataset m consisting of M types of feature data, wherein the training dataset m records the electricity consumption values corresponding to different features of historical electricity consumption data; the training dataset m is $D = \{(x_1, y_1), (x_2, y_2), \ldots, (x_m, y_m)\}$, where $x_i = (x_{i1}, x_{i2}, \ldots, x_{in})$; each element of $x_i$ records a feature value of the historical electricity consumption data, and $y_i$ is the corresponding electricity consumption value;
(2) Establish a multiple linear regression model $h_w(x_i) = w \cdot x_i$; this multiple linear regression model is the electricity consumption prediction model, and $h_w(x_i)$ is the electricity consumption value predicted by the multiple linear regression model, where $w = (w_0, w_1, \ldots, w_n)$; each parameter in the parameter set w is a regression coefficient of the multiple linear regression model; the parameter set is learned by training on the training dataset m, and it is the parameter set w that best fits the samples in D;
(3) When the parameter set w is obtained by training from the training dataset m, the optimal parameter set w is solved based on the following cost function f(w) and the coordinate descent method expressed by formulas (1), (2) to (n), where the cost function f(w) is the sum of squared errors between the actual electricity consumption value and the fitted electricity consumption value: ; , (1); , (2); ......, (n);
Here, t denotes the current iteration number. The objective function argmin is solved by taking the partial derivative of the cost function f(w) with respect to $w_k$, where $w_k$ denotes the k-th parameter in the parameter set w currently being evaluated: ;
The intermediate quantities $P_k$ and $Z_k$, which contain private data information, are extracted from the partial derivative of the cost function f(w) with respect to $w_k$: ;
Here, $x_{ij}$ denotes the j-th feature value of the i-th sample in the training dataset m, and $w_j$ denotes the j-th regression coefficient of the multiple linear regression model. The intermediate quantities $P_k$ and $Z_k$ are the first intermediate raw data.

3. The privacy protection method for electricity meter usage data based on federated learning according to claim 2, characterized in that obtaining the inaccurate model parameters of the electricity consumption prediction model through training includes: the server sets the partial derivative of the cost function f(w) with respect to $w_k$ to zero and obtains the solution: ; for each $w_k$, the optimal regression coefficient is derived by iterating the coordinate descent method with the above solution.

4. The privacy protection method for electricity meter usage data based on federated learning according to claim 2, characterized in that the characteristic values of the historical electricity consumption data include time, active power per minute, and average voltage per minute.

5. The privacy protection method for electricity meter usage data based on federated learning according to claim 1, characterized in that, The data owners are the power companies at the administrative region level, and the electricity consumption data of each party is private data and must be kept private; the server is the State Grid's dedicated server "State Grid Cloud", where the data of each data owner is uploaded for model training; the encryption service provider is an encryption service provider program composed of hardware and software with encryption functions, used to provide keys for the server and data owners.