Smart meter data collection method and system based on personalized local differential privacy
This personalized local differential privacy-based smart meter data collection method lets users customize their privacy protection level and receive a corresponding privacy budget. By combining a sampling mechanism with K-random response, it balances privacy protection against data availability in smart meter data collection, improving both the security and the accuracy of data collection.
Patent Information
- Authority / Receiving Office
- CN · China
- Patent Type
- Patents(China)
- Current Assignee / Owner
- CHONGQING UNIV
- Filing Date
- 2025-08-27
- Publication Date
- 2026-06-19
AI Technical Summary
In the existing smart meter data collection process, how can we improve individual privacy protection while maintaining the availability and effectiveness of overall data, especially in large-scale smart meter networks where key management is difficult and computing power is insufficient?
A smart meter data collection method based on personalized local differential privacy is adopted. By allowing users to customize the privacy protection level and allocate a corresponding privacy budget, and by combining a sampling mechanism and K-random response technology, the data collection frequency and perturbation method are dynamically adjusted to achieve a balance between personalized privacy protection and data utility.
It enables dynamic adjustment of privacy protection strength based on user needs, improves data availability and accuracy, reduces the demand for computing power, avoids dependence on rechargeable batteries, and enhances the security and flexibility of data collection.
Figure CN120951385B_ABST
Abstract
Description
Technical Field
[0001] This invention relates to the field of data privacy protection technology, and in particular to a method and system for collecting smart meter data based on personalized local differential privacy.
Background Technology
[0002] The smart grid has emerged as the core carrier of the next-generation power system. It deeply integrates advanced sensing, communication, information, and control technologies, aiming to achieve a high degree of integration and two-way interaction between power flow, information flow, and business flow.
[0003] Smart meters are the key entry point for realizing this vision. They not only replace traditional mechanical meters, but are also intelligent terminal devices that integrate electricity metering, data acquisition, information storage, two-way communication and remote control functions.
[0004] Their fundamental function is to enable refined energy management: smart meters automatically collect detailed electricity consumption data from users at high frequency (usually every 15 minutes or every hour) and transmit this data to the power grid company's data center through a secure communication network. This refined data flow has completely changed the traditional power grid operation model.
[0005] However, the high-precision, high-frequency electricity consumption data collected by smart meters is a double-edged sword, posing a serious challenge to user privacy protection. This data (especially when collected at a sufficiently high frequency) can extremely accurately "depict" household activity patterns and lifestyles—when users are home, how long they are away, and even predict the usage time and frequency of specific appliances (such as air conditioners, ovens, and washing machines). This could indirectly expose users' daily routines, health conditions (such as electricity consumption by medical devices), household composition, and even abnormal sleep patterns. Unauthorized or improper use of this data could lead to user profiling, behavior prediction, targeted marketing harassment, and even more serious discriminatory pricing or security risks, such as using electricity consumption patterns to determine if a home is empty for theft.
[0006] Therefore, while vigorously developing smart grids and leveraging the data dividend, it is crucial to build and strictly enforce a robust privacy protection framework. Current privacy protection measures in smart meter data collection include:
[0007] Early privacy protection methods such as anonymization and data desensitization have become ineffective in protecting user privacy due to the increasing computing power of attackers and various background knowledge-based attacks. As for encryption-based methods, smart meters have limited computing power, insufficient to support complex encryption algorithms, and key management in large-scale smart meters is also extremely difficult.
[0008] Current solutions for protecting privacy during smart meter data collection fall primarily into two categories: encryption-based and differential privacy-based. Encryption-based solutions mainly employ symmetric encryption (AES), asymmetric encryption (RSA, ECC), or hybrid encryption (such as the TLS / SSL protocols) to secure data transmission, combined with digital signatures to ensure data integrity. However, these solutions have several drawbacks. Large-scale smart meter networks require efficient mechanisms for key distribution, rotation, and storage; otherwise, key management can easily become a single point of failure. Running encryption algorithms in embedded meter systems may increase power consumption and shorten device lifespan, and high computational demands can cause response delays. Moreover, robust encryption algorithms require significant computing power, which smart meters generally lack.
[0009] Differential privacy is a privacy-preserving technical framework with rigorous mathematical definitions and robust privacy measures. It aims to provide strong privacy protections when publishing or analyzing personal data, while maintaining data availability and validity.
[0010] Differential privacy protects sensitive personal information by introducing noise or perturbation that hides individual contributions. This randomness makes it difficult for attackers to determine the sensitive information of a specific individual. The key idea is to protect privacy through a degree of uncertainty while still providing statistical data or analytical results. Differential privacy focuses on protecting the privacy of each individual, not just the dataset as a whole: it protects individuals without directly disclosing their sensitive information when data is published or analyzed. It also provides a quantifiable measure of privacy protection called the privacy budget. Simply put, if an algorithm satisfies differential privacy, an attacker cannot derive information about the input from differences in the algorithm's output.
[0011] Differential privacy has been adopted by companies such as Microsoft, Google, and Apple to collect data while protecting user privacy. Local differential privacy is an extension of centralized differential privacy; the key difference is that it does not require a trusted third party. In the local differential privacy model, each user perturbs the original data on their own device before sending the perturbed data to the server.
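For reference, the standard formal statement of ε-local differential privacy is given below. The symbols M (the randomized perturbation run on the user's device), x and x' (any two possible inputs), and y (any possible output) are notation chosen here for illustration, not taken from this document.

```latex
\[
\Pr[\mathcal{M}(x) = y] \;\le\; e^{\varepsilon}\,\Pr[\mathcal{M}(x') = y]
\qquad \text{for all inputs } x, x' \text{ and all outputs } y .
\]
```

A smaller privacy budget ε forces the two output distributions closer together, which means stronger protection and more noise.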
[0012] Existing differential privacy-based solutions primarily include data perturbation schemes based on rechargeable batteries. These schemes use the battery as a "physical noise source": charging and discharging smooths the actual power consumption curve, masks the start-stop characteristics of appliances, and reduces the identifiability of power consumption trajectories. However, battery hardware is expensive, and its limited capacity cannot fully cover peak power demand, potentially leading to insufficient privacy protection and a decrease in overall data usability.
[0013] Therefore, how to improve individual privacy protection while maintaining the overall availability and effectiveness of data has become an urgent problem to be solved in the data collection process.
Summary of the Invention
[0014] This application aims to address the shortcomings of existing technologies by proposing a smart meter data collection method and system based on personalized local differential privacy. Based on the user's personalized settings, this application categorizes the user's privacy protection needs into different levels. Users can select their own privacy protection needs and allocate corresponding privacy budgets, thereby achieving personalized privacy protection while improving data availability.
[0015] To achieve the objectives of this application, in a first aspect, this application provides a method for collecting smart meter data based on personalized local differential privacy, comprising:
[0016] Each smart meter obtains a privacy budget based on its privacy protection level;
[0017] The aggregator calculates a privacy budget threshold based on the privacy budgets;
[0018] Each smart meter calculates its probability of being sampled based on the privacy budget threshold and its privacy budget,
[0019] and decides whether to participate in the current round of electricity consumption data collection based on the probability of being sampled;
[0020] If the smart meter does not participate, it waits for the next round of electricity consumption data collection;
[0021] If the smart meter participates, it acquires electricity consumption data and perturbs it to obtain perturbed electricity consumption data;
[0022] The aggregator estimates the total electricity consumption for the current round based on the perturbed electricity consumption data.
[0023] Furthermore, the privacy protection levels include a high privacy protection level, a medium privacy protection level, and a low privacy protection level, with privacy budgets of 0.1, 0.5, and 1 corresponding to the high privacy protection level, the medium privacy protection level, and the low privacy protection level, respectively.
[0024] It allows users to modify the privacy protection level (high / medium / low) at runtime, with the corresponding privacy budget ε dynamically adjusted to 0.1 / 0.5 / 1. This enables a real-time balance between privacy protection and data utility, allowing users to flexibly adjust the settings based on usage scenarios (such as high privacy requirements at night).
[0025] Furthermore, the formula for calculating the privacy budget threshold is as follows:
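[0026] $$\delta = \operatorname{avg}(E) = \frac{1}{N}\sum_{i=1}^{N}\varepsilon_i$$
[0027] Where E denotes the privacy budgets collected by the aggregator from all smart meters, whose sum is $\sum_{i=1}^{N}\varepsilon_i$; $\varepsilon_i$ is the privacy budget of the i-th smart meter; N is the total number of smart meters; $\delta$ is the privacy budget threshold; and avg(·) indicates that the mean value is calculated.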
[0028] By using the average of the total privacy budget as a threshold, a baseline based on the average level of the overall privacy budget is provided for privacy protection, which plays a role in balancing and regulating privacy protection operations and ensuring that privacy protection is carried out within a reasonable and stable range.
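The threshold computation can be sketched in a few lines of Python. The level-to-budget mapping follows the values stated above (0.1, 0.5, and 1); the names `PRIVACY_BUDGETS` and `privacy_budget_threshold` are illustrative and not taken from this application.

```python
from typing import Iterable

# Privacy budgets for the three levels named in the text.
PRIVACY_BUDGETS = {"high": 0.1, "medium": 0.5, "low": 1.0}


def privacy_budget_threshold(levels: Iterable[str]) -> float:
    """Return the mean privacy budget over all reporting smart meters."""
    budgets = [PRIVACY_BUDGETS[level] for level in levels]
    if not budgets:
        raise ValueError("at least one smart meter is required")
    return sum(budgets) / len(budgets)


# One meter per level: threshold = (0.1 + 0.5 + 1.0) / 3
delta = privacy_budget_threshold(["high", "medium", "low"])
```

With one meter at each level, the threshold is 1.6 / 3, slightly above the medium budget.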
[0029] Furthermore, the step of determining whether to participate in this round of electricity consumption data collection based on the probability of being sampled includes:
[0030] The smart meter randomly generates a comparison value and compares it with the probability of being sampled to obtain a comparison result. Based on the comparison result, it decides whether to participate in this round of electricity consumption data collection.
[0031] Comparing a value randomly generated by the smart meter with the sampling probability to decide on participation increases the randomness of data collection and improves the security of the collection process. It prevents attackers from acquiring pattern information by predicting which devices participate, thus protecting the privacy of users' electricity data.
[0032] Furthermore, the probability of being sampled is calculated using the following formula:
[0033]
[0034] Where i is the smart meter index; the formula gives the probability that the i-th smart meter is sampled; ε_i is the privacy budget of the i-th smart meter; and δ is the privacy budget threshold.
[0035] By dynamically adjusting the data collection frequency, the privacy amplification effect of the sampling mechanism can be used to significantly improve the level of privacy protection.
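The sampling formula is not shown above. One candidate, assumed here rather than confirmed by this document, is the sampling rule of the "sample mechanism" from the personalized differential privacy literature. The symbol π_i for the sampling probability is chosen here for illustration.

```latex
\[
\pi_i =
\begin{cases}
\dfrac{e^{\varepsilon_i} - 1}{e^{\delta} - 1}, & \varepsilon_i < \delta,\\[6pt]
1, & \varepsilon_i \ge \delta .
\end{cases}
\]
% Standard amplification-by-sampling bound (stated for the central model):
% sampling with probability \pi_i and then applying a perturbation with
% privacy budget \delta gives an effective budget of
\[
\ln\!\bigl(1 + \pi_i\,(e^{\delta} - 1)\bigr) = \varepsilon_i
\qquad \text{when } \varepsilon_i < \delta .
\]
```

Under this assumption, a meter with ε_i = 0.1 and threshold δ = 0.5 would be sampled with a probability of about 0.16, which is consistent with the statement that a lower budget relative to the threshold means a lower sampling probability.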
[0036] Furthermore, the step of determining whether to participate in this round of electricity consumption data collection based on the probability of being sampled also includes:
[0037] The smart meter randomly generates a random value. If the random value is greater than the probability of the smart meter being sampled, the smart meter will not participate in this round of electricity consumption data collection; if the random value is less than or equal to the probability of the smart meter being sampled, the smart meter will participate in this round of electricity consumption data collection.
[0038] This lightweight implementation (requiring only one random number generation) greatly reduces the computing power required. The outcome of the probabilistic decision is untraceable, which prevents user privacy preferences from being inferred from participation records and reduces system complexity.
[0039] Furthermore, perturbing the electricity consumption data includes:
[0040] After recording each round of electricity consumption data, the smart meter determines the electricity consumption interval to which the data belongs. The interval has a lower limit and an upper limit. The electricity consumption data is discretized to the lower limit with a probability that depends on how close it is to the lower limit, and to the upper limit with a probability that depends on how close it is to the upper limit, yielding a discretization result. The discretization result is then perturbed using K-random response to obtain the perturbed electricity consumption data.
[0041] The dual protection mechanism enhances the security of data privacy protection.
[0042] Furthermore, the aggregator's estimation of the total electricity consumption for this round based on the perturbed electricity consumption data includes:
[0043] The aggregator obtains the maximum power consumption and the total number of smart meters, and determines the number of intervals based on the privacy budget threshold. The aggregator calculates the interval length and boundary value based on the number of intervals and the maximum power consumption.
[0044] Based on the perturbed electricity consumption data, the aggregator counts how often each interval boundary value occurs and how many smart meters were sampled. It then uses the occurrence frequency, the privacy budget threshold, and the number of sampled smart meters to calculate the true frequency of each boundary value. Based on the total number of meters, the boundary values of each interval, and the true frequencies, it estimates the total electricity consumption for this round.
[0045] Eliminating sampling bias significantly improves the accuracy and usability of both the electricity consumption data and the estimated total electricity consumption for this round.
[0046] In a second aspect, this application provides a smart meter data collection system based on local differential privacy to implement the above method. The system includes multiple smart meters and an aggregator communicatively connected to the multiple smart meters.
[0047] The beneficial effects of this application are:
[0048] This application perturbs users' data according to their privacy budgets: a smaller budget gives stronger privacy protection to users with high privacy needs, and a larger budget gives weaker protection to users with low privacy needs. At the same time, it leverages the privacy amplification effect of a sampling mechanism to guarantee the strength of privacy protection uniformly. This allows users to customize their own privacy protection level, providing greater personalization and flexibility, and making the method suitable for more situations.
[0049] Because users customize their own privacy protection levels, the small privacy budget needed by users with high privacy requirements does not have to be applied to all users, which reduces noise injection and improves data availability and accuracy.
[0050] Meanwhile, for a single round of data collection, smart meters only need to perform a limited number of calculations to complete the task, even with low computing power. The system calculates whether sampling is required based on a privacy budget and its threshold, leveraging the privacy amplification effect of the sampling mechanism to enhance privacy protection. Furthermore, personalized differential privacy and k-random response techniques are used to jointly protect privacy, balancing the strength of privacy protection with usability.
Attached Figure Description
[0051] Figure 1 is a flowchart of the smart meter data collection method based on personalized local differential privacy of the present invention;
[0052] Figure 2 is a flowchart of the smart meter workflow in the smart meter data collection system based on personalized local differential privacy of the present invention;
[0053] Figure 3 is a flowchart of the aggregator workflow in the smart meter data collection system based on personalized local differential privacy of the present invention.
Detailed Implementation
[0054] Embodiments of the present invention are described in detail below. Examples of these embodiments are shown in the accompanying drawings, wherein the same or similar reference numerals denote the same or similar elements or elements having the same or similar functions throughout. The embodiments described below with reference to the accompanying drawings are exemplary and are only used to explain the present invention, and should not be construed as limiting the present invention.
[0055] In the description of this invention, it should be understood that the terms "longitudinal", "lateral", "up", "down", "front", "rear", "left", "right", "vertical", "horizontal", "top", "bottom", "inner", "outer", etc., indicate the orientation or positional relationship based on the orientation or positional relationship shown in the accompanying drawings. They are only for the convenience of describing this invention and simplifying the description, and do not indicate or imply that the device or element referred to must have a specific orientation, or be constructed and operated in a specific orientation. Therefore, they should not be construed as limitations on this invention.
[0056] In the description of this invention, unless otherwise specified and limited, it should be noted that the terms "installation", "connection" and "linking" should be interpreted broadly. For example, they can refer to mechanical or electrical connections, or internal connections between two components. They can be direct connections or indirect connections through an intermediate medium. Those skilled in the art can understand the specific meaning of the above terms according to the specific circumstances.
[0057] In one embodiment, a smart meter data collection method based on personalized local differential privacy includes:
[0058] The smart meter obtains a privacy budget based on the privacy protection level, which is set according to different needs for the degree of privacy protection.
[0059] The privacy protection level and privacy budget can be dynamically adjusted according to multi-dimensional needs and constraints in actual application scenarios. These include, but are not limited to, users' privacy protection needs, data users' (such as power companies) requirements for data accuracy and availability, technical implementation conditions (such as equipment computing power and communication resources), and specific scenarios for data collection and application (such as real-time monitoring and statistical analysis). By comprehensively considering the above factors, a dynamic balance between privacy protection and data value utilization is achieved, thereby covering a wider range of actual situations and potential needs.
[0060] In this embodiment, the privacy protection levels include high privacy protection level, medium privacy protection level and low privacy protection level, and the privacy budgets corresponding to the high privacy protection level, medium privacy protection level and low privacy protection level are 0.1, 0.5 and 1 respectively.
[0061] The aggregator calculates a privacy budget threshold based on the privacy budget, collects the privacy budget of each meter to obtain the total number of meters and the total privacy budget, and takes the average of the total privacy budget as the privacy budget threshold:
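[0062] $$\delta = \operatorname{avg}(E) = \frac{1}{N}\sum_{i=1}^{N}\varepsilon_i$$
[0063] Where E denotes the privacy budgets collected by the aggregator from all smart meters, whose sum is $\sum_{i=1}^{N}\varepsilon_i$; $\varepsilon_i$ is the privacy budget of the i-th smart meter; N is the total number of smart meters; $\delta$ is the privacy budget threshold; and avg(·) indicates that the mean value is calculated.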
[0064] The smart meter calculates the probability of being sampled based on a privacy budget threshold and a privacy budget; the formula for calculating the probability of being sampled is as follows:
[0065]
[0066] Where i is the smart meter index; the formula gives the probability that the i-th smart meter is sampled; ε_i is the privacy budget of the i-th smart meter; and δ is the privacy budget threshold.
[0067] Whether a smart meter participates in the current round of electricity consumption data collection is determined based on the probability of being sampled. Specifically, the smart meter randomly generates a random value in [0,1]. If the random value is greater than the probability of the smart meter being sampled, the smart meter does not participate in the current round of electricity consumption data collection and waits for the start of the next round. If the random value is less than or equal to the probability of the smart meter being sampled, the smart meter participates in the current round and perturbs its electricity consumption data to obtain perturbed electricity consumption data.
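The participation decision can be sketched as follows. The comparison rule (participate when the random value is less than or equal to the sampling probability) follows the paragraph above. The sampling probability formula is an assumption, since it is not reproduced in this text: the sketch uses the rule common in the personalized differential privacy literature, where a meter whose budget is at or above the threshold is always sampled. The function names are hypothetical.

```python
import math
import random


def sampling_probability(eps_i: float, delta: float) -> float:
    """Assumed rule: 1 if eps_i >= delta, else (e^eps_i - 1) / (e^delta - 1)."""
    if eps_i <= 0 or delta <= 0:
        raise ValueError("privacy budgets must be positive")
    if eps_i >= delta:
        return 1.0
    return math.expm1(eps_i) / math.expm1(delta)


def participates(eps_i: float, delta: float, rng: random.Random) -> bool:
    """Draw one random value in [0, 1) and compare it with the sampling probability."""
    # Greater than the probability: skip this round; otherwise take part.
    return rng.random() <= sampling_probability(eps_i, delta)
```

Only one random draw and one exponential ratio are needed per round, which matches the emphasis on low computing power on the meter side.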
[0068] Specifically, after recording each round of electricity consumption data, the smart meter determines the electricity consumption interval to which the data belongs. The interval has a lower limit value and an upper limit value. The electricity consumption data is discretized to the lower limit value with a probability that depends on how close it is to the lower limit value, and to the upper limit value with a probability that depends on how close it is to the upper limit value, yielding a discretization result. The discretization result is perturbed using K-random response to obtain perturbed electricity consumption data.
[0069] The aggregator calculates the estimated total electricity consumption for this round based on the perturbed electricity consumption data.
[0070] Specifically, the aggregator obtains the maximum power consumption, the total number of smart meters, and determines the number of intervals based on the privacy budget threshold. The aggregator then calculates the interval length and boundary values based on the number of intervals and the maximum power consumption.
[0071] Based on the perturbed electricity consumption data, the aggregator counts how often each interval boundary value occurs and how many smart meters were sampled. It then uses the occurrence frequency, the privacy budget threshold, and the number of sampled smart meters to calculate the true frequency of each boundary value. Based on the total number of meters, the boundary values of each interval, and the true frequencies, it estimates the total electricity consumption for this round.
[0072] Current mainstream differential privacy sets the same level of privacy protection for all users. This introduces significant noise for users with lower privacy needs, reducing data usability. Personalized differential privacy, on the other hand, allows users to dynamically adjust their privacy budget based on their privacy preferences or sensitivities, rather than using a uniform standard value. Through personalized design, the algorithm can assign appropriate noise levels to different users or data points (e.g., using a lower ε for highly sensitive data to enhance protection, and a higher ε for general data to improve accuracy), thereby optimizing the practicality of data analysis while achieving strong privacy protection.
[0073] In particular, data perturbation schemes based on rechargeable batteries use the battery as a "physical noise source": charging and discharging smooths the actual power consumption curve, masks the start-up and shutdown characteristics of appliances, and reduces the identifiability of power consumption trajectories. However, battery hardware costs are high, and the limited capacity cannot fully cover peak power demand, which may result in insufficient privacy protection.
[0074] This invention, based on user-personalized settings, categorizes user privacy protection needs into different levels. Users can select their own privacy protection requirements and allocate corresponding privacy budgets, achieving personalized privacy protection and improving data availability. Specifically, through three levels of privacy budget allocation—high (ε=0.1), medium (ε=0.5), and low (ε=1)—highly sensitive users are subjected to strong noise interference, while ordinary users maintain high data availability. Compared to traditional uniform strong noise interference schemes, overall data availability is significantly improved.
[0075] Based on personalized privacy budget allocation, the aggregator collects and aggregates the users' privacy budgets and determines a threshold. Each user calculates their own probability of being sampled based on this threshold: the lower a user's privacy budget is relative to the threshold, the lower their probability of being sampled, and the user uploads perturbed electricity consumption data with that probability. The privacy amplification effect of sampling strengthens the privacy protection for users with high privacy needs. Furthermore, the sampled electricity meters perturb their data using the uniform threshold, achieving a balance between privacy and data utility. This solution avoids the need for powerful computing capabilities and reliance on rechargeable batteries.
[0076] In one embodiment, a smart meter data collection system based on local differential privacy is provided to implement the above method. The system includes a plurality of smart meters and an aggregator communicatively connected to the plurality of smart meters.
[0077] Example 1: A smart meter data collection system based on local differential privacy operates as follows:
[0078] Users can set their privacy protection level according to their individual needs, choosing from three levels: high, medium, and low. After completing the settings, the smart meter will obtain its own privacy budget based on the privacy protection level. The privacy budgets for the three levels are {High Level: 0.1, Medium Level: 0.5, Low Level: 1}.
[0079] The aggregator collects the privacy budgets of each smart meter and calculates the privacy budget threshold:
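[0080] $$\delta = \operatorname{avg}(E) = \frac{1}{N}\sum_{i=1}^{N}\varepsilon_i$$
[0081] Where E denotes the privacy budgets collected by the aggregator from all smart meters, whose sum is $\sum_{i=1}^{N}\varepsilon_i$; $\varepsilon_i$ is the privacy budget of the i-th smart meter; N is the total number of smart meters; $\delta$ is the privacy budget threshold; and avg(·) indicates that the mean value is calculated.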
[0082] The aggregator calculates the number of intervals d, requiring that:
[0083] ,
[0084] The interval length is:
[0085] s = m / d (rounded down)
[0086] Where s is the interval length, m is the maximum power consumption (m is a known, typical value; a user's power consumption over a period of time will not exceed m), and the set of interval boundary values is Y = {0, s, 2s, 3s, ..., (d-1)s, m}. The aggregator's gateway broadcasts the privacy budget threshold, maximum power consumption, and number of intervals.
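The construction of the boundary set Y can be sketched as below, taking the number of intervals d as given (how d is derived from the threshold is not shown here). The function name `interval_boundaries` is illustrative.

```python
import math
from typing import List


def interval_boundaries(m: float, d: int) -> List[float]:
    """Build Y = {0, s, 2s, ..., (d-1)s, m} with s = floor(m / d)."""
    if d < 1:
        raise ValueError("the number of intervals d must be at least 1")
    s = math.floor(m / d)  # interval length, rounded down
    if s < 1:
        raise ValueError("m is too small for d intervals of integer length")
    return [j * s for j in range(d)] + [m]


Y = interval_boundaries(10, 4)  # s = 2
```

Because s is rounded down, the last interval, from (d-1)s to m, can be longer than the others; with m = 10 and d = 4 it spans 6 to 10.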
[0087] Each smart meter calculates its own probability of being sampled based on the privacy budget threshold δ and its own privacy budget:
[0088]
[0089] Where i is the smart meter index; the formula gives the probability that the i-th smart meter is sampled; ε_i is the privacy budget of the i-th smart meter; and δ is the privacy budget threshold.
[0090] Specifically, the i-th smart meter decides whether to participate in this round of electricity consumption data collection according to its probability of being sampled. The smart meter randomly generates a value in the range [0,1] and compares it with the probability of being sampled. If the random value is greater than the probability of being sampled, the smart meter does not participate in this round of electricity consumption data collection; if the random value is less than or equal to the probability of being sampled, the smart meter participates in this round. A participating smart meter collects its electricity consumption data and perturbs it to obtain perturbed electricity consumption data.
[0091] The perturbation process for the electricity consumption data is as follows:
[0092] The electricity consumption data recorded by the smart meter for this round necessarily falls within one electricity consumption interval; let this interval be (u, v), where u and v belong to Y. A discretization function discretizes the electricity consumption data to u or v, and the result is recorded as the discretized value.
[0093]
[0094] That is, the electricity consumption data is discretized to u with one probability and to v with the complementary probability, where the formula gives, for known electricity consumption data, the probability that the discretized value takes u or v.
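The exact discretization probabilities are not shown here. A minimal sketch, assuming the common unbiased randomized-rounding rule: a reading x in the interval (u, v) is mapped to v with probability (x - u) / (v - u) and to u otherwise, so that a reading closer to a boundary is more likely to be rounded to it. The function name `discretize` and the example boundary list are illustrative.

```python
import bisect
import random
from typing import Sequence


def discretize(x: float, boundaries: Sequence[float], rng: random.Random) -> float:
    """Randomly round x to one of the two boundaries of its interval."""
    if not boundaries[0] <= x <= boundaries[-1]:
        raise ValueError("x must lie within the boundary range")
    if x == boundaries[-1]:
        return boundaries[-1]
    idx = bisect.bisect_right(boundaries, x)  # first boundary strictly above x
    u, v = boundaries[idx - 1], boundaries[idx]
    p_upper = (x - u) / (v - u)  # closer to v -> more likely to become v
    return v if rng.random() < p_upper else u
```

Under this assumed rule the expected value of the discretized reading equals x, so the rounding step by itself does not bias the aggregate.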
[0095] The discretized value is then perturbed using the K-random response:
[0096]
[0097] That is, with one probability the perturbed electricity consumption data equals the discretized value, and with another probability it equals r, where r is a value randomly selected from set Y; the formula gives the probability distribution function of the perturbed electricity consumption data.
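The perturbation probabilities are not reproduced in this text. A minimal sketch, assuming the standard K-random response (generalized randomized response) form: the discretized value is kept with probability e^ε / (e^ε + k - 1), where k is the size of Y, and otherwise replaced by one of the other k - 1 values chosen uniformly. Passing the threshold δ as the budget, as the description suggests for sampled meters, is likewise an assumption; `k_random_response` is a hypothetical name.

```python
import math
import random
from typing import Sequence


def k_random_response(value: float, domain: Sequence[float],
                      epsilon: float, rng: random.Random) -> float:
    """Report `value` truthfully with probability e^eps / (e^eps + k - 1)."""
    if value not in domain:
        raise ValueError("value must be an element of the domain")
    k = len(domain)
    p_keep = math.exp(epsilon) / (math.exp(epsilon) + k - 1)
    if rng.random() < p_keep:
        return value
    # Otherwise pick uniformly among the remaining k - 1 boundary values.
    others = [y for y in domain if y != value]
    return rng.choice(others)
```

The output always lies in Y, so the aggregator only ever sees boundary values, never a raw reading.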
[0098] The aggregator aggregates and analyzes the perturbed electricity consumption data. Since each perturbed value belongs to set Y, the aggregator first counts the frequency of each element of Y in the collected data, and then estimates the true frequency of each element:
[0099]
[0100] Where the formula gives the true frequency of each interval boundary value, indexed by its position in the set of interval boundary values, and n is the number of sampled smart meters.
[0101] Estimate the total electricity consumption for this round:
[0102]
[0103] Where T represents the total electricity consumption in this round, and N represents the total number of smart meters managed by the current aggregator.
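The estimation formulas are not reproduced in this text. The sketch below assumes the standard unbiased estimator for K-random response: with p = e^δ / (e^δ + k - 1) and q = 1 / (e^δ + k - 1), the true relative frequency of a boundary value with observed count c among n reports is estimated as (c / n - q) / (p - q), and the total is taken as N times the frequency-weighted sum of boundary values. Any additional correction for sampling bias is not modeled. `estimate_total` is a hypothetical name.

```python
import math
from collections import Counter
from typing import Dict, Sequence, Tuple


def estimate_total(reports: Sequence[float], domain: Sequence[float],
                   delta: float, total_meters: int) -> Tuple[Dict[float, float], float]:
    """Debias K-random-response reports and scale up to all N meters."""
    n = len(reports)
    if n == 0:
        raise ValueError("no reports were collected in this round")
    k = len(domain)
    denom = math.exp(delta) + k - 1
    p, q = math.exp(delta) / denom, 1.0 / denom
    counts = Counter(reports)
    # Estimated true relative frequency of each boundary value.
    freqs = {y: (counts[y] / n - q) / (p - q) for y in domain}
    # Estimated total consumption over all N meters.
    total = total_meters * sum(y * f for y, f in freqs.items())
    return freqs, total
```

The estimated frequencies always sum to 1, although individual estimates can be negative when few reports are available.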
[0104] The aggregator sends the set of estimated true frequencies and the total electricity consumption T for this round to the power grid company.
[0105] This invention uses δ as the privacy budget for perturbation. For users whose privacy budgets are higher than δ, perturbing with the smaller δ provides enhanced privacy protection; for users whose privacy budgets are below δ, sampling is applied, leveraging the privacy amplification effect of the sampling mechanism to guarantee the level of privacy protection uniformly. This allows users to customize their own privacy protection level, making the method more personalized, flexible, and applicable to a wider range of situations.
[0106] Because users customize their own privacy protection levels, the small privacy budget needed by users with high privacy requirements does not have to be applied to all users, which reduces noise injection and improves data availability and accuracy.
[0107] Meanwhile, for a single round of data collection, a smart meter only needs to perform a limited number of calculations to complete the task, and can still do so efficiently even with low computing power. The system calculates whether sampling is required based on a privacy budget and a privacy budget threshold, leveraging the privacy amplification effect of the sampling mechanism to enhance privacy protection.
[0108] Furthermore, privacy is protected through personalized differential privacy and k-random response technology, balancing the strength of privacy protection with usability.
[0109] In the description of this specification, references to terms such as "an embodiment," "some embodiments," "example," "specific example," "an implementation," "a preferred implementation," or "some examples," etc., indicate that a specific feature, structure, material, or characteristic described in connection with that embodiment or example is included in at least one embodiment or example of the present invention. In this specification, such uses of the above terms do not necessarily refer to the same embodiment or example. Furthermore, the specific features, structures, materials, or characteristics described may be combined in any suitable manner in one or more embodiments or examples.
[0110] Although embodiments of the invention have been shown and described, those skilled in the art will understand that various changes, modifications, substitutions and alterations can be made to these embodiments without departing from the principles and spirit of the invention, the scope of which is defined by the claims and their equivalents.
Claims
1. A method for collecting smart meter data based on personalized local differential privacy, characterized in that it includes:
The smart meter obtains a privacy budget based on its privacy protection level;
The aggregator calculates a privacy budget threshold based on the privacy budgets;
The smart meter calculates its probability of being sampled based on the privacy budget threshold and its privacy budget, and decides whether to participate in the current round of electricity consumption data collection based on the probability of being sampled. To make this decision, the smart meter generates a random value: if the random value is greater than the probability of being sampled, the smart meter does not participate in the current round of electricity consumption data collection; if the random value is less than or equal to the probability of being sampled, the smart meter participates in the current round of electricity consumption data collection;
If the smart meter does not participate, it waits for the next round of electricity consumption data collection;
If the smart meter participates, it acquires its electricity consumption data and perturbs it to obtain perturbed electricity consumption data. The formula for calculating the perturbed electricity consumption data uses the following terms: the probability distribution function of the perturbed electricity consumption data; the perturbed electricity consumption data; the discretized electricity consumption data; the probability that the perturbed electricity consumption data equals the discretized electricity consumption data; the probability that the perturbed electricity consumption data equals the random value; the random value, which is randomly selected from the set of electricity consumption intervals; the privacy budget threshold; and the number of intervals;
The aggregator estimates the total electricity consumption in this round based on the perturbed electricity consumption data. The aggregator obtains the maximum electricity consumption and the total number of smart meters, and determines the number of intervals based on the privacy budget threshold. The aggregator calculates the interval length and the interval boundary values based on the number of intervals and the maximum electricity consumption. Based on the perturbed electricity consumption data, the aggregator counts the frequency of occurrence of each interval boundary value and the number of sampled smart meters. It then uses the frequency of occurrence, the privacy budget threshold, and the number of sampled smart meters to calculate the true frequency of each interval boundary value. Based on the total number of smart meters, the interval boundary values, and the true frequencies, it estimates the total electricity consumption for this round;
The formula for calculating the true frequency of the interval boundary values uses the following terms: the true frequency of each interval boundary value; the frequency of each interval boundary value in the collected data; the interval boundary value at a given position in the set of interval boundary values; and n, the number of sampled smart meters;
In the formula for calculating the total electricity consumption, T represents the total electricity consumption for this round, and N represents the total number of smart meters currently managed by the aggregator.
2. The smart meter data collection method based on personalized local differential privacy according to claim 1, characterized in that, The privacy protection levels include high privacy protection level, medium privacy protection level and low privacy protection level, and the privacy budgets corresponding to the high privacy protection level, medium privacy protection level and low privacy protection level are 0.1, 0.5 and 1 respectively.
3. The smart meter data collection method based on personalized local differential privacy according to claim 1, characterized in that, The privacy budget threshold is calculated by a formula in which E is the sum of all privacy budgets collected by the aggregator, and the privacy budget threshold is obtained by calculating the mean value.
4. The smart meter data collection method based on personalized local differential privacy according to claim 1, characterized in that, The process of determining whether to participate in this round of electricity consumption data collection based on the probability of being sampled includes: The smart meter randomly generates a comparison value and compares it with the probability of being sampled to obtain a comparison result. Based on the comparison result, it decides whether to participate in this round of electricity consumption data collection.
5. The smart meter data collection method based on personalized local differential privacy according to claim 4, characterized in that, The probability of being sampled is calculated by a formula in which i is the smart meter index, and whose terms are the probability that the i-th smart meter is sampled, the privacy budget of the i-th smart meter, and the privacy budget threshold.
6. The smart meter data collection method based on personalized local differential privacy according to claim 1, characterized in that, The perturbation of the electricity consumption data includes: After recording each round of electricity consumption data, the smart meter determines the electricity consumption interval to which the data belongs, the interval having a lower limit and an upper limit. The electricity consumption data is discretized to the lower limit with a probability based on how close it is to the lower limit, and to the upper limit with a probability based on how close it is to the upper limit, thus obtaining a discretization result. The discretization result is then perturbed using K-random response to obtain the perturbed electricity consumption data.
7. A smart meter data collection system based on local differential privacy, used to implement the method described in any one of claims 1-6, characterized in that, The system includes multiple smart meters and an aggregator that is communicatively connected to the multiple smart meters.