Differential privacy protection method and device for federated learning

By employing adaptive noise addition and hierarchical pruning, the problems of high computational overhead and increased noise in federated learning are solved, achieving efficient differential privacy protection and improved model accuracy, and enhancing user-level privacy protection.

CN115481441B · Active · Publication Date: 2026-06-23 · 广东省农村信用社联合社 +1

Patent Information

Authority / Receiving Office
CN · China
Patent Type
Patents(China)
Current Assignee / Owner
广东省农村信用社联合社
Filing Date
2022-09-23
Publication Date
2026-06-23

AI Technical Summary

Technical Problem

Existing differential privacy technologies in federated learning have huge computational overhead, requiring more computing power for training. Furthermore, as the number of communications between the client and the server increases, the noise in the shared model increases, leading to reduced data availability. Existing solutions have failed to effectively improve this situation.

Method used

An adaptive noise-adding strategy is adopted, which gradually reduces the noise scale as the learning rounds increase. Combined with a hierarchical pruning method, the model is pruned and noise-adding is performed based on the differences in model weights uploaded by the client, thereby reducing the privacy budget and improving model accuracy.

Benefits of technology

It effectively reduces the privacy budget in differential privacy protection, improves model accuracy, reduces communication overhead and attack risks, and enhances user-level privacy protection.

✦ Generated by Eureka AI based on patent content.


Abstract

The application provides a differential privacy protection method and device for federated learning, which comprises the following steps: obtaining model weight differences uploaded by each client participating in current round learning; performing clipping operation on the model weight differences uploaded by each client according to the clipping parameter corresponding to the current round learning; aggregating each model weight difference after the clipping operation, and performing noise adding processing on the aggregated model weight difference according to the Gaussian noise distribution corresponding to the current round learning to complete model updating of the current round learning; wherein the Gaussian noise distribution corresponding to the current round learning is determined according to the noise scale corresponding to the current round learning and the clipping parameter corresponding to the current round learning, and the noise scale corresponding to each round of learning gradually decreases with the increase of the learning round. The added noise can be fitted to the characteristics of the model weight information uploaded by the current client, thereby obtaining higher model precision and effectively reducing the privacy budget in differential privacy protection.

Description

Technical Field

[0001] This invention relates to the field of federated learning technology, and in particular to a differential privacy protection method and apparatus for federated learning.

Background Technology

[0002] Federated learning, as a distributed framework, can be used to solve the data silo problem, allowing individuals or organizations to upload only the gradients needed for model training, instead of uploading aggregated, previously private data. Federated learning offers insights for existing deep learning applications such as autonomous driving, natural language processing, and recommendation systems, and has become a new trend in the development of artificial intelligence.

[0003] Ideally, in federated learning, each role is only allowed to obtain the information it needs. However, the participation of each client and every interaction between the client and the server can lead to privacy breaches.

[0004] To achieve privacy protection, differential privacy techniques can be applied in federated learning, such as centralized differential privacy, local differential privacy, distributed differential privacy, and hybrid differential privacy. However, current differential privacy techniques have several unresolved issues: they incur significant computational overhead, requiring more computing power to train noisy models, and the differential algorithms need optimization; as the number of communications between the client and server increases, the overall noise required for the shared model increases, which reduces the usability of the noisy data.

Summary of the Invention

[0005] To address the problems existing in the prior art, this invention provides a differential privacy protection method and apparatus for federated learning.

[0006] In a first aspect, the present invention provides a differential privacy protection method for federated learning, comprising:

[0007] Obtain the model weight differences uploaded by each client participating in the current round of learning;

[0008] Based on the pruning parameters corresponding to the current round of learning, pruning operations are performed on the model weight differences uploaded by each client respectively;

[0009] The weight differences of each model after the pruning operation are aggregated, and noise is added to the aggregated model weight differences according to the Gaussian noise distribution corresponding to the current round of learning to complete the model update of the current round of learning;

[0010] The Gaussian noise distribution corresponding to the current learning round is determined based on the noise scale and the pruning parameter corresponding to the current learning round. The noise scale corresponding to each learning round gradually decreases as the number of learning rounds increases.

[0011] Optionally, the step of performing a pruning operation on the model weight differences uploaded by each client according to the pruning parameters corresponding to the current round of learning includes:

[0012] Norm processing is performed on the model weight differences uploaded by each client to obtain the norm value of each model weight difference;

[0013] Based on the median norm of each model weight difference, the model weight differences are divided into multiple sets of model weight differences;

[0014] Based on the median norm value corresponding to each set of model weight differences, determine the pruning parameters for the corresponding set of model weight differences;

[0015] Based on the pruning parameters of each set of model weight differences, a pruning operation is performed on the model weight differences in the corresponding set of model weight differences.

[0016] Optionally, the Gaussian noise distribution corresponding to the current round of learning is determined based on the maximum value of the noise scale and the pruning parameter corresponding to the current round of learning.

[0017] Optionally, the Gaussian noise distribution corresponding to the current round of learning is $N(0, z^2 \cdot S_{\max}^2)$, where $z$ represents the noise scale corresponding to the current round of learning and $S_{\max}$ represents the maximum value among the pruning parameters of the multiple model weight difference sets.

[0018] Optionally, the noise scale corresponding to the current round of learning is determined according to the following formula:

[0019]

[0020] Where z represents the noise scale corresponding to the current learning round, a represents the initial noise level, b represents the degree of change in the noise level added in each round as the learning rounds increase, c represents the rate of decrease in the added noise, and x represents the current learning round.

[0021] Optionally, the method further includes:

[0022] For any target client among the clients, the current contribution of the target client is determined based on the difference in model weights after the pruning operation and the number of times the target client currently participates in federated learning.

[0023] In a second aspect, the present invention also provides a differential privacy protection device for federated learning, comprising:

[0024] The acquisition module is used to obtain the model weight differences uploaded by each client participating in the current round of learning;

[0025] The pruning module is used to perform pruning operations on the model weight differences uploaded by each client according to the pruning parameters corresponding to the current round of learning.

[0026] The aggregation and noise-adding module is used to aggregate the differences in the weights of each model after the pruning operation, and add noise to the aggregated model weight differences according to the Gaussian noise distribution corresponding to the current round of learning, so as to complete the model update of the current round of learning.

[0027] The Gaussian noise distribution corresponding to the current learning round is determined based on the noise scale and the pruning parameter corresponding to the current learning round. The noise scale corresponding to each learning round gradually decreases as the number of learning rounds increases.

[0028] In a third aspect, the present invention also provides an electronic device, including a memory, a processor, and a computer program stored in the memory and executable on the processor, wherein the processor executes the program to implement the differential privacy protection method for federated learning as described in the first aspect above.

[0029] In a fourth aspect, the present invention also provides a non-transitory computer-readable storage medium having a computer program stored thereon, which, when executed by a processor, implements the differential privacy protection method for federated learning as described in the first aspect above.

[0030] In a fifth aspect, the present invention also provides a computer program product, including a computer program that, when executed by a processor, implements the differential privacy protection method for federated learning as described in the first aspect above.

[0031] The differential privacy protection method and apparatus for federated learning provided by this invention adaptively adds noise by gradually reducing the noise scale as the learning rounds increase. This fully utilizes the characteristics of the model weight difference changes in each communication between the server and the client, ensuring that the added noise in each communication between the server and the client matches the characteristics of the model weight information uploaded by the client at that time. This results in higher model accuracy and effectively reduces the privacy budget in differential privacy protection.

Attached Figure Description

[0032] To more clearly illustrate the technical solutions in this invention or the prior art, the drawings used in the description of the embodiments or the prior art will be briefly introduced below. Obviously, the drawings described below are some embodiments of this invention. For those skilled in the art, other drawings can be obtained from these drawings without creative effort.

[0033] Figure 1 is an architecture diagram of federated learning provided by the present invention;

[0034] Figure 2 is a flowchart of the differential privacy protection method for federated learning provided by the present invention;

[0035] Figure 3 is a flowchart of an implementation of the differential privacy protection method for federated learning provided by the present invention;

[0036] Figure 4 is a schematic diagram of the differential privacy protection device for federated learning provided by the present invention;

[0037] Figure 5 is a schematic diagram of the structure of the electronic device provided by the present invention.

Detailed Implementation

[0038] To make the objectives, technical solutions, and advantages of this invention clearer, the technical solutions of this invention will be clearly and completely described below with reference to the accompanying drawings. Obviously, the described embodiments are only some, not all, of the embodiments of this invention. All other embodiments obtained by those skilled in the art based on the embodiments of this invention without creative effort are within the scope of protection of this invention.

[0039] To facilitate a clearer understanding of the technical solution of this invention, some related technical content will first be introduced.

[0040] (1) Federated learning.

[0041] Figure 1 is the architecture diagram of federated learning provided by this invention. As shown in Figure 1, the federated architecture mainly consists of a parameter server and clients, which are regarded as data holders. Data holders have autonomous control over their terminal devices; the nodes, such as a user's smartphone, are not stable and their loads are uneven. The parameter server initializes the global model parameters. In each iteration, it randomly selects a batch of clients. The selected clients download the initial parameters and train on their local data. The clients generate local models and upload the training results, such as the model and weights, to the parameter server, also known as the central server. The parameter server maintains the global federated model by aggregating the local updates from the various clients, or by sharing the updates of the model. Finally, analysts use the federated model for data analysis.
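The selection and aggregation steps described above can be sketched in a few lines. This is a minimal illustration rather than the patent's implementation: the function names `select_clients` and `weighted_average` are hypothetical, models are flat lists of floats, and each client is weighted by its share of the total weight, as paragraph [0042] describes.

```python
import random

def select_clients(client_ids, batch_size, rng):
    """Randomly pick the clients that take part in this round."""
    return rng.sample(client_ids, batch_size)

def weighted_average(client_weights, client_sizes):
    """Aggregate local models, weighting client k by n_k / n (assumed rule)."""
    n = sum(client_sizes)
    dim = len(client_weights[0])
    return [
        sum(w[i] * n_k / n for w, n_k in zip(client_weights, client_sizes))
        for i in range(dim)
    ]

rng = random.Random(0)  # fixed seed only so the sketch is repeatable
chosen = select_clients(["a", "b", "c", "d"], 2, rng)

# Two local models with weight values 1 and 3: the second client dominates.
global_w = weighted_average([[1.0, 2.0], [3.0, 4.0]], [1, 3])
print(chosen, global_w)
```

Local training is left out on purpose; only the server-side bookkeeping is shown.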

[0042] In stochastic gradient descent for deep learning, a loss function is used to iterate the model weights. Here $w_t$ represents the global model weights after the t-th iteration, $w_{t+1}$ represents the global model weights after the (t+1)-th iteration, η represents the learning step size, and the gradient term represents the model gradient value calculated after t iterations. In the algorithm, every client sets the same learning step size, and client k uploads the gradient it computes in iteration t. In the client-side local iteration, the model weight of client k after the t-th iteration is updated to the model weight of client k after the (t+1)-th iteration. The weight of client k under federated learning is defined in terms of $n_k$, the weight value of client k, and $n$, the total weight. The server performs an aggregated update over the parameters uploaded by a total of k clients; that is, we get:

[0043]

[0044] Here the weight-difference term is the difference in model weights that the server collects from the clients. The gradients aggregated on the server are used to maintain the global model on the server, which explains why federated learning can keep datasets local while still building on deep learning. In summary, federated learning, with the cooperation of a central server, embodies the security principles of focused data collection and data minimization, reducing the privacy risks and costs of traditional machine learning methods. It also allows clients with smaller local datasets to participate in federated learning and benefit from the global model.
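The formulas referred to in paragraphs [0042] to [0044] can be written out as follows, on the assumption that they follow standard federated averaging. This is a reconstruction from the surviving symbol definitions, not a copy of the patent's equations: the gradient symbols $g_t$ and $g_k$, the client count $K$, and the notation $\Delta w_{t+1}^{k}$ are assumptions, as is the premise that every client starts the round from the global weights $w_t$ and that $n = \sum_k n_k$.

```latex
\begin{aligned}
w_{t+1} &= w_t - \eta\, g_t
  && \text{(centralized SGD step)} \\
w_{t+1}^{k} &= w_t^{k} - \eta\, g_k
  && \text{(local step on client } k\text{)} \\
w_{t+1} &= \sum_{k=1}^{K} \frac{n_k}{n}\, w_{t+1}^{k}
         = w_t + \sum_{k=1}^{K} \frac{n_k}{n}\, \Delta w_{t+1}^{k},
  && \Delta w_{t+1}^{k} = w_{t+1}^{k} - w_t
\end{aligned}
```

The last equality holds because the client weights $n_k / n$ sum to one, so aggregating local models and aggregating weight differences give the same global update.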

[0045] (2) Privacy protection methods for federated learning.

[0046] Ideally, each role in federated learning should only be allowed access to the information it needs. However, the participation of each client and every interaction between the client and the server can lead to privacy breaches. Based on the knowledge available to each role in the federated architecture, threat models are generally categorized into three types: 1) The first type is untrusted clients. Malicious attackers can gain root access to one or more clients, accessing local data and further obtaining information during intermediate iterations of the model. Some scholars have demonstrated that model theft can be achieved by poisoning the model on one or more clients, causing the text predictor to complete a target sentence attack using words chosen by the attacker. 2) The second type is untrusted servers. Under this threat model, attackers gaining root access to the server can directly access the global model and obtain all update information, thereby tampering with the training process. Alternatively, they can use the server's selection control to choose less trustworthy clients to participate in federated computation, thus disrupting the entire federated learning model training process. The server can even forge and generate a large number of clients to attack the target client, i.e., a Sybil attack. To protect their data, clients often resort to secure multi-party computation and other technologies, or trust a third party that will not collude with the server. 3) The third type is malicious model engineers or analysts who access model iteration sequences from different hyperparameter outputs of multiple systems in an attempt to obtain system design information and other sensitive data.

[0047] The relevant data protection technologies can be broadly categorized into encryption technologies and noise-adding technologies, such as differential privacy. In encryption research, given the vulnerability of the star topology in federated architectures, each client encrypts its data and sends it to the server for homomorphic computation. This typically relies on an external party holding the key and capable of decrypting the computation results to prevent the server from decrypting the contributions of individual clients. However, due to ciphertext attacks, most homomorphic encryption schemes require frequent key updates. The participation of a trusted, non-colluding party is not the only solution; another approach is to rely on distributed encryption schemes where the key is distributed among the parties.

[0048] Regarding noise protection techniques, differential privacy, due to its lightweight nature, is often combined with encryption techniques in solutions related to deep learning and federated learning. In the process of protecting data privacy, to resist differential attacks, differential privacy protection schemes have been proposed. Through rigorous mathematical proof and the application of the idea of random perturbation, these schemes prevent third parties from judging the modification or addition of a single data record based on changes in the output. This is considered one of the most secure methods among perturbation-based privacy protection methods currently available.

[0049] Both the objective perturbation method (adding noise to the objective function) and the output perturbation method (adding noise to the final trained model) require an upper bound on sensitivity, but for complex algorithms such as deep learning and federated learning no method of calculating that bound is currently known. The only option is to set clipping parameters that approach the sensitivity upper bound as closely as possible, which is why the gradient perturbation method (adding noise to the gradient) is often used. Prior work introduced the concept of differential privacy into distributed deep learning and proposed a distributed training technique based on selective stochastic gradient descent. Each client does not need to share its original input dataset. Instead, it adds Laplace or Gaussian noise to its locally calculated gradients so as to satisfy the differential privacy mechanism. Each client can therefore help maintain a shared neural network model by uploading its local gradients to the server without sharing the input dataset. The server, as a trusted implementer of the differential privacy mechanism, ensures privacy in the output. This forms a distributed model training architecture that protects local data. Clients can apply the shared model to their local inputs without revealing the input and output.

[0050] In federated learning, a specific type of distributed computing, differential privacy is applied in three main ways: 1) Centralized differential privacy: the server acts as a trusted implementer of the differential privacy mechanism to ensure that the output is private. Users' devices keep data locally, analyze the raw data, and learn the model; only model updates are sent to trusted nodes or the server. The server performs operations such as clipping and noise addition on the information uploaded by the clients and aggregates it to update the shared model. 2) Local differential privacy: the user's privacy is assumed to come entirely from the randomness that the user adds. Each client adds noise to the gradient during local model iteration, and its privacy guarantee is independent of the additional randomness contributed by all other users. Because each client applies a private transformation to its report before sending it to the central server, the need for a trusted central server is minimized. However, since the magnitude of the introduced random noise must be comparable to the size of the perturbed quantity in the data, it may be necessary to merge the results among the clients: achieving utility comparable to central differential privacy under local differential privacy requires a relatively large user base to reduce the privacy budget, which can reach 1 billion users in real-world application scenarios. 3) Distributed differential privacy: the client sends data to a secure computation function whose output is available to the server and meets the privacy requirements. Secure computation functions come in various forms, such as multi-party computation protocols or a standard computation in a trusted execution environment.
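A rough way to see why the local setting needs a much larger user base is to compare how much noise survives in the averaged update. The sketch below is a simplification under stated assumptions: both settings use Gaussian noise with the same per-draw standard deviation `sigma`, which does not amount to a matched privacy guarantee, and the function names are hypothetical.

```python
import math

def central_noise_std(sigma, n):
    """Std of the averaged update when the server adds one N(0, sigma^2) draw to the sum of n updates."""
    return sigma / n

def local_noise_std(sigma, n):
    """Std of the averaged update when each of n clients adds its own independent N(0, sigma^2) draw."""
    return sigma / math.sqrt(n)

for n in (100, 10_000):
    ratio = local_noise_std(1.0, n) / central_noise_std(1.0, n)
    print(n, central_noise_std(1.0, n), local_noise_std(1.0, n), ratio)
```

Under these assumptions the local setting leaves sqrt(n) times more noise in the average, so the gap widens as more clients take part.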

[0051] In federated learning, hybrid differential privacy can be further developed from the three types of differential privacy mentioned above. Hybrid models allow multiple differential privacy models to coexist; for example, most users contribute data under a local differential privacy mechanism, while a small number of users participate in a central differential privacy mechanism. Depending on how neighboring datasets are defined in differential privacy, differential privacy can be divided into transaction-level (sample-level) and user-level. Transaction-level privacy protects a single record, but the records held by a single client are highly likely to be related. To limit or eliminate the possibility of learning personal information from the iterations or the final model, some scholars have proposed using user-level differential privacy during iterative training. Under user-level differential privacy, neighboring datasets differ in all the data held by a single user, which makes it a stronger notion than the transaction-level definition.

[0052] Currently, the following problems with differential privacy technology have not been effectively improved: differential privacy technology has huge computational overhead, training the noisy model requires more computing power, and the differential algorithm needs to be optimized; as the number of communications between the client and the server increases, the overall noise required for the shared model increases, which will lead to a decrease in the availability of the data after adding noise.

[0053] To address the aforementioned issues, this invention provides a solution that protects user-level privacy, that is, the data of an entire client, based on central differential privacy, thereby reducing the client's exposure risk under differential inference attacks. On this basis, user-level privacy protection is achieved through a hierarchical pruning method using two medians and an adaptive noise-adding strategy. Furthermore, a contribution evaluation method based on the two-median pruning method avoids the communication overhead between clients caused by local differential privacy methods in previous solutions and reduces the privacy budget in differential privacy. It also addresses, to some extent, the user-level privacy protection issues faced by local clients in small-scale federated learning scenarios and the problem that insufficient client-held data leaves local model accuracy too low to be usable.

[0054] Figure 2 is a flowchart of the differential privacy protection method for federated learning provided by the present invention. As shown in Figure 2, the method includes the following steps:

[0055] Step 200: Obtain the model weight difference uploaded by each client participating in the current round of learning.

[0056] Step 201: Based on the pruning parameters corresponding to the current round of learning, perform pruning operations on the model weight differences uploaded by each client.

[0057] Step 202: Aggregate the differences in the weights of each model after the pruning operation, and add noise to the aggregated model weight differences according to the Gaussian noise distribution corresponding to the current learning round to complete the model update for the current learning round; wherein, the Gaussian noise distribution corresponding to the current learning round is determined according to the noise scale and the pruning parameters corresponding to the current learning round, and the noise scale corresponding to each learning round gradually decreases as the learning round increases.
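Steps 200 to 202 can be sketched as a single server-side function. This is a minimal sketch under several assumptions that the text does not fix: one clipping parameter `s` for all clients (the layered variant is described later), clipping by scaling each update to a 2-norm of at most `s`, and noise added to the sum before dividing by the number of clients. The names `clip` and `dp_round` are hypothetical.

```python
import math
import random

def clip(delta, s):
    """Scale delta so its 2-norm is at most s (assumed clipping rule)."""
    norm = math.sqrt(sum(x * x for x in delta))
    factor = max(1.0, norm / s)
    return [x / factor for x in delta]

def dp_round(global_w, client_deltas, s, z, rng):
    """One round: clip each update, sum, add N(0, (z*s)^2) noise, average, apply."""
    clipped = [clip(d, s) for d in client_deltas]        # step 201: clip
    total = [sum(col) for col in zip(*clipped)]          # step 202: aggregate
    noisy = [t + rng.gauss(0.0, z * s) for t in total]   # step 202: add noise
    m = len(client_deltas)
    return [w + n / m for w, n in zip(global_w, noisy)]  # model update

# random.Random is for illustration only; it is not a cryptographically secure source.
rng = random.Random(0)
new_w = dp_round([0.0, 0.0], [[3.0, 4.0], [0.3, 0.4]], s=1.0, z=1.1, rng=rng)
print(new_w)
```

The first update has norm 5 and is scaled down to norm 1, while the second has norm 0.5 and passes through unchanged, so a single large update cannot dominate the aggregate.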

[0058] Specifically, the entity executing this method can be a parameter server in a federated architecture, hereinafter referred to as the server.

[0059] The most significant challenge for existing differential privacy protection methods in federated learning is minimizing the privacy budget while keeping the shared model usable. Many existing solutions primarily protect individual records on the client side. However, in federated learning scenarios the datasets held by each client are often highly correlated, and in practice the collected data are not independently and identically distributed, so protecting only individual records or queries is impractical. Furthermore, local differential privacy techniques that protect individual records or queries on the client side may result in inconsistent pruning and noise addition across clients, leading to wasted privacy budget. Communication between clients also increases communication overhead and attack risks. To better reflect real-world applications of federated models, this invention protects user-level privacy based on central differential privacy. During federated learning, in each learning round, the server uses the differential privacy protection method provided in this invention to iteratively update the shared model.

[0060] Taking a certain round of learning (i.e., a certain round of model iteration update) as an example, the server initializes the global model parameters, selects a batch of clients for the current round of learning, and the selected clients will download the initial parameters to train on local data. The clients generate local models and upload the training results to the server. Thus, the server can obtain the model weight difference uploaded by each client participating in the current round of learning, that is, the model gradient.

[0061] Then, the server can perform operations such as pruning, aggregation, and adding noise on the model weight differences uploaded by each client according to the pruning parameters corresponding to the current round of learning, so as to complete the model update for the current round of learning.

[0062] To reduce the privacy budget in user-level privacy protection, this invention proposes an adaptive noise allocation strategy. Generally, as the uploaded parameter differences (model weight differences) shrink, the noise scale that the model can actually accommodate also shrinks, so a larger privacy budget is required. Applying the same noise scale in every model iteration therefore adds a great deal of unnecessary noise, which is why adaptive noise allocation is preferable to fixed noise allocation. The adaptive budget allocation that results from adaptive noise allocation reflects the relationship between privacy budget and model accuracy more accurately. If the noise scale used by the server in each iteration stays close to the amount of noise that the model weights can accommodate, the unnecessary privacy budget consumed by excessive noise is effectively reduced.
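The contrast between a fixed and a decreasing noise scale can be illustrated with a toy schedule. The geometric decay below is purely hypothetical and is not the formula of paragraph [0019], which is not reproduced in this text; `z0`, `decay`, and `z_floor` are made-up parameters chosen only so that the scale shrinks as the round index grows.

```python
def noise_scale(round_index, z0=1.5, decay=0.95, z_floor=0.5):
    """Hypothetical schedule: geometric decay from z0, never below z_floor."""
    return max(z_floor, z0 * decay ** round_index)

# Adaptive strategy: the scale shrinks from round to round.
schedule = [noise_scale(x) for x in range(5)]

# Fixed strategy, for comparison: the same scale in every round.
fixed = [1.5] * 5

print(schedule)
print(fixed)
```

Any other non-increasing function of the round index would serve the same illustrative purpose.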

[0063] The differential privacy protection method for federated learning provided by this invention adaptively adds noise by gradually reducing the noise scale as the learning rounds increase. This fully utilizes the characteristics of the model weight difference changes in each communication between the server and the client, ensuring that the added noise in each communication between the server and the client matches the characteristics of the model weight information uploaded by the client at that time. This results in higher model accuracy and effectively reduces the privacy budget in differential privacy protection.

[0064] Optionally, based on the pruning parameters corresponding to the current round of learning, pruning operations are performed on the model weight differences uploaded by each client, including:

[0065] Norm processing is performed on the model weight differences uploaded by each client to obtain the norm value of each model weight difference;

[0066] Based on the median norm of the weight differences of each model, the weight differences of each model are divided into multiple sets of model weight differences;

[0067] The pruning parameters for the corresponding model weight difference set are determined based on the median norm value of each model weight difference set.

[0068] Based on the pruning parameters of each model weight difference set, perform a pruning operation on the model weight differences in the corresponding model weight difference set.

[0069] Specifically, in this embodiment of the invention, the server can adopt a layered pruning approach when performing the pruning operation. The following explanation uses a layered pruning with two medians (i.e., dividing the model weight differences into two sets based on the median norm of the model weight differences) as an example. Layered pruning with multiple medians follows the same principle and will not be elaborated further.

[0070] First, the server can perform norm processing on the model weight differences of each client, such as Euclidean norm (also called 2-norm), to obtain the norm value of the weight differences of each model.

[0071] Then, the server can determine the median of the norm values of the differences between the model weights. For example, if there are 5 clients and the norm values of the differences between the model weights of the 5 clients are 1, 2, 3, 4 and 5 respectively, then the median is 3.

[0072] Then, the server can divide the model weight differences into two sets based on the median of the norm values of the determined model weight differences. Taking the five clients as an example again, if the median of the norm values of the model weight differences for the five clients is 3, then the three model weight differences with a norm value less than or equal to 3 can be grouped into one set, and the two model weight differences with a norm value greater than 3 can be grouped into another set. Of course, this is just an example; other methods can be used for division, and the specific situation is not limited.

[0073] For the two sets of model weight differences, the clipping parameters of the corresponding sets can be determined based on the median norm of each set, and the clipping operation can be performed using the corresponding clipping parameters.

[0074] Taking the five clients mentioned above as examples, the norms of the first set of model weight differences are 1, 2, and 3, with a median of 2. Therefore, the median of 2 can be used as the clipping parameter for each model weight difference in this set. The norms of the second set of model weight differences are 4 and 5, with a median of 4.5. Therefore, the median of 4.5 can be used as the clipping parameter for each model weight difference in this set. Thus, in this example, there are two clipping parameters corresponding to the current learning round.
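The five-client example can be reproduced with the standard library. The helper name `layered_clip_params` is hypothetical, and the split rule (norms less than or equal to the overall median go to the first set) follows the example above rather than any fixed requirement of the method.

```python
from statistics import median

def layered_clip_params(norms):
    """Split norms at their median and return (low_set, high_set, s_low, s_high).

    Assumes at least one norm lies strictly above the median; otherwise the
    second set is empty and median() raises StatisticsError.
    """
    m = median(norms)
    low = [v for v in norms if v <= m]
    high = [v for v in norms if v > m]
    return low, high, median(low), median(high)

low, high, s_low, s_high = layered_clip_params([1, 2, 3, 4, 5])
print(low, high, s_low, s_high)  # [1, 2, 3] [4, 5] 2 4.5
```

The two returned medians, 2 and 4.5, are the two clipping parameters for the current learning round.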

[0075] Optionally, the clipping function used to perform the clipping operation can be... where Δw represents the model weight difference uploaded by the client, S represents the clipping parameter, and ||Δw|| represents the norm of the model weight difference. In this embodiment of the invention, under the clipping method using two medians, the model weight differences in the first set are clipped with the clipping parameter S of the first set, and the model weight differences in the second set are clipped with the clipping parameter S of the second set, thereby achieving layered clipping of the model weight differences uploaded by the clients. Since the clients participating in each iteration are not all the same, and the model weight difference uploaded by each client changes with the number of times it has participated in federated computation, the layered clipping method can adaptively select the clipping parameter according to the clients' model weight differences in the current round. This preserves, to the greatest extent, the characteristics and contributions of each client holding a non-independent and identically distributed (non-IID) dataset, while also enhancing fairness among clients.
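As a hedged illustration of per-set clipping, the sketch below assumes the norm-clipping form commonly used in differentially private federated averaging, Δw / max(1, ||Δw|| / S). The clipping function of this embodiment is not reproduced in the text above, so this form is an assumption, as are the helper names `l2_norm`, `clip_update`, and `layered_clip`.

```python
import math

def l2_norm(delta):
    """Euclidean norm of a flat list of weight differences."""
    return math.sqrt(sum(v * v for v in delta))

def clip_update(delta, s):
    """Scale delta so that its L2 norm is at most s.

    Assumed form: delta / max(1, ||delta|| / s), with s > 0.
    """
    factor = max(1.0, l2_norm(delta) / s)
    return [v / factor for v in delta]

def layered_clip(deltas, median_norm, s1, s2):
    """Clip each update with the parameter of the set its norm falls into."""
    return [
        clip_update(d, s1 if l2_norm(d) <= median_norm else s2)
        for d in deltas
    ]

clipped = clip_update([3.0, 4.0], 2.5)    # norm 5 is scaled down to norm 2.5
unchanged = clip_update([0.3, 0.4], 2.5)  # norm 0.5 is already below 2.5
```

Under this assumed form, an update whose norm is below its set's clipping parameter passes through unchanged, while a larger update is rescaled so that its norm equals the parameter.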

[0076] Optionally, the Gaussian noise distribution corresponding to the current round of learning is determined based on the noise scale corresponding to the current round of learning and the maximum value among the clipping parameters corresponding to the current round of learning.

[0077] Specifically, in this embodiment of the invention, when there are multiple clipping parameters corresponding to the current round of learning based on the layered clipping method, the server can determine the Gaussian noise distribution corresponding to the current round of learning based on the noise scale corresponding to the current round of learning and the maximum value among the clipping parameters corresponding to the current round of learning.

[0078] Optionally, the Gaussian noise distribution corresponding to the current round of learning can be represented as $N(0, z^2 \cdot S_{\max}^2)$, where $z$ represents the noise scale corresponding to the current round of learning and $S_{\max}$ represents the maximum value among the clipping parameters of the multiple model weight difference sets.
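A minimal sketch of this noise-adding step follows. It assumes that independent noise is drawn for every coordinate of the aggregated model weight difference, which the text does not state explicitly; the function name `add_gaussian_noise` and the example values are hypothetical.

```python
import random

def add_gaussian_noise(aggregate, z, clip_params, rng=None):
    """Add zero-mean Gaussian noise with variance z^2 * S_max^2 to each coordinate."""
    rng = rng or random.Random()
    s_max = max(clip_params)  # largest clipping parameter of the current round
    sigma = z * s_max         # standard deviation, so the variance is z^2 * S_max^2
    return [v + rng.gauss(0.0, sigma) for v in aggregate]

noisy = add_gaussian_noise(
    [0.5, -0.2, 0.1], z=1.0, clip_params=[2.0, 4.5], rng=random.Random(0)
)
```

Passing a seeded `random.Random` makes the sketch reproducible; an actual deployment would need a cryptographically suitable noise source.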

[0079] Optionally, the noise scale corresponding to the current learning round can be determined according to the following formula:

[0080]

[0081] Where z represents the noise scale corresponding to the current learning round, a represents the initial noise level, b represents the degree of change in the noise level added in each round as the learning rounds increase, c represents the rate of decrease in noise addition, and x represents the current learning round. x is the independent variable, z is the dependent variable, and a, b, and c are parameters. The values of a, b, and c can be flexibly set according to the actual model training situation, and no specific restrictions are imposed here.
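The formula itself is not reproduced in the text above, so the following sketch does not claim to implement it. It only illustrates one possible schedule that is consistent with the described roles of the parameters: an assumed exponential decay that starts at a, changes by at most b, and decays at rate c. The functional form, the name `noise_scale`, and the default values are all hypothetical.

```python
import math

def noise_scale(x, a=1.2, b=0.6, c=0.1):
    """Hypothetical decreasing schedule: equals a at x = 0 and decays toward a - b.

    This is an assumed form for illustration, not the formula of the embodiment.
    """
    if not (a > b >= 0 and c > 0):
        raise ValueError("expected a > b >= 0 and c > 0")
    return a - b * (1.0 - math.exp(-c * x))

# Noise scale at learning rounds 0, 10, 20, 30, 40.
schedule = [noise_scale(x) for x in range(0, 50, 10)]
```

Any function of the round number x that decreases monotonically would satisfy the stated requirement that the noise scale gradually decreases as the learning rounds increase.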

[0082] The following examples illustrate specific application scenarios.

[0083] This embodiment employs a federated architecture with a highly trusted central server responsible for sharing the model, and multiple clients participating in federated learning. Each client acts as an independent data holder and is not connected to the other clients, which eliminates communication overhead between clients. The server initializes the model parameters, determines the noise parameters and other hyperparameters, selects clients to participate in the current round of collaborative training with a certain probability, and distributes the model parameters and corresponding hyperparameters to the selected clients. Subsequently, the server collects the uploaded parameters, performs layered clipping based on two medians, aggregates the clipping results, and adds appropriate Gaussian noise to start a new round of shared model iteration.

[0084] Selected clients use stochastic gradient descent to train a noiseless model, reducing the computational burden on the terminal device. Multiple training epochs are called a training cycle. Each client randomly selects training samples as a training batch and calculates the gradient using the loss function, thereby calculating the corresponding model weight difference and uploading it directly to the server. After each global model iteration by the server, clients are selected again. Only the selected clients participating in this round of communication can download the model parameters. In each round of communication, the selected clients upload the results after one or more training cycles, avoiding the communication overhead and risks caused by multiple transmissions.

[0085] Figure 3 is a flowchart illustrating the implementation of the differential privacy protection method for federated learning provided by this invention. As shown in Figure 3, the method mainly consists of 6 steps, as detailed below.

[0086] ① The server initializes the shared model and determines the basic parameters of the model.

[0087] The server initializes the parameters of the shared model and determines the noise scale for each iteration of the shared model by defining the parameters of the noise-adding function f(x). In this function, a, b, and c are parameters: a represents the initial noise level, b represents the change in noise level in each round as the number of communications increases, and c represents the rate at which the noise is reduced. The independent variable x is the current round number, and the dependent variable f(x) is the current noise scale, which decreases as x increases. As the differences in uploaded parameters become smaller, the noise scale that the model can actually accommodate also decreases, thus requiring a larger privacy budget. If the noise scale were the same in every round of model iteration, a lot of unnecessary noise would be added, which is why adaptive noise addition is preferable to fixed noise addition. The adaptive budget allocation brought about by adaptive noise addition measures the relationship between privacy budget and model accuracy more accurately. If the noise scale applied by the server in each round is close to the amount of noise that the model weights can accommodate, the unnecessary privacy budget caused by excessive noise addition can be effectively reduced.

[0088] ② The server selects clients to participate in this round of collaborative learning based on a certain probability.

[0089] Participating clients have a model structure consistent with the shared model.

[0090] ③ The selected client downloads the model parameters, trains the model locally, and obtains the model weight results.

[0091] ④ After the client completes multiple batches and rounds of training on the local model, it uploads the calculation results in a unified manner.

[0092] ⑤ The server updates the clipping value based on the clipping method of two medians for the results uploaded by the client, and performs layered clipping.

[0093] Following the client model structure, the server sequentially applies Euclidean norm (2-norm) processing to the weight differences of each layer in the results uploaded by each client, and then takes the median of the resulting norm values. Since clients in federated learning are selected at random, the median obtained differs from round to round. As the number of communication rounds between the clients and the server increases, the weight differences in the model iterations become smaller, and, given a uniform learning step size, the gradient also decreases with the number of communication rounds. Adaptive clipping can therefore be performed based on these characteristics. After the server completes client selection in each round, it divides the clients into two sets based on the median of the norms of the parameter differences uploaded by the clients in the current round. Within each set, the median of the corresponding norms is selected, giving the two clipping parameters for this round, such as s1 and s2, which are used to perform layered clipping of the values uploaded by the clients. Since the clients participating in each iteration are not all the same, and the parameter differences uploaded by each client change depending on the number of times it has participated in federated computation, the median clipping method can adaptively select clipping parameters based on the clients' parameter differences in the current round, thereby preserving, to the greatest extent, the characteristics and contributions of each client holding a non-independent and identically distributed (non-IID) dataset, while also enhancing fairness among clients.

[0094] Based on the definition of differential privacy, the overall sensitivity of the differential privacy mechanism with respect to the clients participating in federated computation is always the largest of the medians. Therefore, the Gaussian noise distribution is $N(0, z^2 \cdot \max(s1, s2)^2)$, where z represents the custom noise scale in the model parameters. The larger the median, the larger the sensitivity clipping parameter. Since sensitivity is positively correlated with the variance of the noise distribution, an excessively large sensitivity clipping threshold will seriously affect model performance.

[0095] ⑥ The server aggregates the clipped model weight differences from each client and adds noise, completing this round's update of the shared model, and then starts from ① again.
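Steps ② to ⑥ can be sketched as a single server round. This is a simplified illustration rather than the embodiment's implementation. It makes several assumptions that the text does not fix: the model is a flat list of weights, clipping uses the common form Δw / max(1, ||Δw|| / S), aggregation is a plain average, and local training is replaced by the hypothetical stand-in `toy_update`.

```python
import math
import random
from statistics import median

def l2_norm(delta):
    return math.sqrt(sum(v * v for v in delta))

def clip(delta, s):
    # Assumed clipping form; an s of zero forces the update to zero.
    if s <= 0:
        return [0.0] * len(delta)
    factor = max(1.0, l2_norm(delta) / s)
    return [v / factor for v in delta]

def server_round(weights, client_ids, local_update, z, sample_prob, rng):
    """Run one round and return (new_weights, selected_clients)."""
    # Step 2: select each client independently with probability sample_prob.
    selected = [c for c in client_ids if rng.random() < sample_prob]
    if not selected:
        return list(weights), selected
    # Steps 3 and 4: selected clients train locally and upload weight differences.
    deltas = [local_update(c, weights) for c in selected]
    # Step 5: layered clipping with two medians.
    norms = [l2_norm(d) for d in deltas]
    overall = median(norms)
    lower = [n for n in norms if n <= overall]
    upper = [n for n in norms if n > overall]
    s1 = median(lower)
    s2 = median(upper) if upper else s1  # assumption for an empty upper set
    clipped = [clip(d, s1 if n <= overall else s2) for d, n in zip(deltas, norms)]
    # Step 6: aggregate (mean assumed), add noise with std z * max(s1, s2), update.
    sigma = z * max(s1, s2)
    dim = len(weights)
    aggregate = [sum(d[i] for d in clipped) / len(clipped) for i in range(dim)]
    noisy = [v + rng.gauss(0.0, sigma) for v in aggregate]
    return [w + v for w, v in zip(weights, noisy)], selected

def toy_update(client_id, weights):
    # Hypothetical stand-in for local SGD: client k returns a delta of norm k + 1.
    return [float(client_id + 1), 0.0]

new_weights, selected = server_round(
    [0.0, 0.0], range(5), toy_update, z=0.5, sample_prob=0.8, rng=random.Random(0)
)
```

In a full training loop, the noise scale z passed to each round would come from the decreasing noise-adding function, and the round would be repeated with a fresh client selection each time.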

[0096] Optionally, the method further includes:

[0097] For any target client among all clients, the current contribution of the target client is determined based on the model weight difference after the clipping operation and the number of times the target client has participated in federated learning so far.

[0098] Specifically, differential privacy in current federated learning relies on a large number of participating clients. At small scale, the privacy performance of federated learning algorithms falls short of real-world expectations. This is because the planned number of clients is not always available when federated learning is widely applied; in reality, the number of clients participating in federated learning is relatively small. Consequently, more noise is required to achieve differential privacy, and the resulting privacy performance is less than ideal.

[0099] In federated learning, users own their terminal devices and can decide whether to participate in federated computation. The more clients participate in federated learning, the better the impact of noise accumulation can be offset. To promote client participation and sustainability within the computational environment, and to obtain a more accurate shared model from the clients' respective data and computing power, this embodiment of the invention relies on the high reliability of the server assumed in the model definition to calculate the update scale of each client and thereby evaluate its contribution. At the same time, by recording each client's contribution, participating clients can be compensated, for example by being given priority in using the federated model, or by receiving a higher reputation that leads to a higher ranking and to inclusion in other federated learning applications, which enhances the motivation and sustainability of participating clients. Furthermore, this contribution evaluation method can be applied in different scenarios. For example, having a third party record contributions synchronously helps verify the server's credibility: if the third party's contribution records are abnormal, it can be inferred that the server might be controlling or forging a large number of clients to infer the model, for instance by manipulating the normally random client selection phase. On the server, this embodiment of the invention treats the model weight difference uploaded by a participating client, after clipping, as the update scale contributed by that client, and also records the number of times the client has been selected by the server for local computation.

[0100] Let the model weight difference uploaded by client $k$, after clipping, be denoted $\Delta w_k$. For $\Delta w_k \in \mathbb{R}^{p \times q}$, the contribution of client $k$ is defined as follows:

[0101]

[0102] Here, $C_k$ denotes the contribution of client $k$, the indexed quantity in the formula denotes the element in the $i$-th row and $j$-th column of the second-order tensor $\Delta w_k$, and $p$ and $q$ denote the number of rows and columns of the second-order tensor $\Delta w_k$, respectively.

[0103] The total contribution from all clients is defined as follows:

[0104]

[0105] The contribution of client $k$ to the shared model is denoted $CL_k$. In its definition, $h_k$ denotes the number of local computations for which client $k$ was selected by the server to participate in shared model iterations.
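The contribution formulas are not reproduced in the text above, so the following sketch does not claim to implement them. It only illustrates the bookkeeping described: recording an update scale for each clipped upload and counting how often each client is selected. The class `ContributionLedger`, the use of the Frobenius norm as the update scale, and the `share` ratio are all assumptions made for illustration.

```python
import math
from collections import defaultdict

class ContributionLedger:
    """Hypothetical record of per-client update scale and participation count."""

    def __init__(self):
        self.total_scale = defaultdict(float)  # accumulated update scale per client
        self.rounds = defaultdict(int)         # h_k: times client k was selected

    def record(self, client_id, clipped_delta):
        # Assumption: the update scale of a p x q matrix is its Frobenius norm.
        scale = math.sqrt(sum(v * v for row in clipped_delta for v in row))
        self.total_scale[client_id] += scale
        self.rounds[client_id] += 1
        return scale

    def share(self, client_id):
        """Client's fraction of the total recorded update scale (illustrative only)."""
        total = sum(self.total_scale.values())
        return self.total_scale[client_id] / total if total else 0.0

ledger = ContributionLedger()
ledger.record("client_a", [[3.0, 4.0]])
ledger.record("client_b", [[0.0, 5.0], [0.0, 0.0]])
```

A third party keeping the same ledger in parallel could compare its records with the server's, which is the verification scenario described earlier.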

[0106] In one implementation, the server or a third party can record client model parameter information using the clipping method based on two medians, establish a corresponding contribution mechanism, and, on the basis of server trust, rule out communication and connection among multiple clients. This realizes a federated learning incentive mechanism and method that satisfies differential privacy protection, thereby enhancing fairness and enthusiasm among the clients jointly building a shared model in federated learning, encouraging more clients to participate in federated learning, and enhancing the privacy protection effect of federated learning to a certain extent.

[0107] The differential privacy protection device for federated learning provided by the present invention will be described below. The differential privacy protection device for federated learning described below can be referred to in correspondence with the differential privacy protection method for federated learning described above.

[0108] Figure 4 is a schematic diagram of the differential privacy protection device for federated learning provided by the present invention. As shown in Figure 4, the device includes:

[0109] The acquisition module 400 is used to obtain the model weight differences uploaded by each client participating in the current round of learning;

[0110] The clipping module 410 is used to perform the clipping operation on the model weight differences uploaded by each client according to the clipping parameters corresponding to the current round of learning;

[0111] The aggregation and noise-adding module 420 is used to aggregate the model weight differences after the clipping operation, and to add noise to the aggregated model weight difference according to the Gaussian noise distribution corresponding to the current round of learning, so as to complete the model update of the current round of learning.

[0112] The Gaussian noise distribution corresponding to the current learning round is determined based on the noise scale corresponding to the current learning round and the clipping parameters corresponding to the current learning round. The noise scale corresponding to each learning round gradually decreases as the number of learning rounds increases.

[0113] Optionally, performing the clipping operation on the model weight differences uploaded by each client according to the clipping parameters corresponding to the current round of learning includes:

[0114] Performing norm processing on the model weight differences uploaded by each client to obtain the norm value of each model weight difference;

[0115] Dividing the model weight differences into multiple sets of model weight differences based on the median of the norm values of the model weight differences;

[0116] Determining the clipping parameter of each set of model weight differences based on the median of the norm values corresponding to that set;

[0117] Performing, based on the clipping parameter of each set of model weight differences, the clipping operation on the model weight differences in the corresponding set.

[0118] Optionally, the Gaussian noise distribution corresponding to the current round of learning is determined based on the noise scale corresponding to the current round of learning and the maximum value among the clipping parameters corresponding to the current round of learning.

[0119] Optionally, the Gaussian noise distribution corresponding to the current round of learning is $N(0, z^2 \cdot S_{\max}^2)$, where $z$ represents the noise scale corresponding to the current round of learning and $S_{\max}$ represents the maximum value among the clipping parameters of the multiple model weight difference sets.

[0120] Optionally, the noise scale corresponding to the current learning round is determined according to the following formula:

[0121]

[0122] Where z represents the noise scale corresponding to the current learning round, a represents the initial noise level, b represents the degree of change in the noise level added in each round as the learning rounds increase, c represents the rate of decrease in the added noise, and x represents the current learning round.

[0123] Optionally, the device further includes:

[0124] The contribution determination module is used to determine the current contribution of any target client among all clients, based on the model weight difference after the clipping operation and the number of times the target client has participated in federated learning so far.

[0125] It should be noted that the device provided by the present invention can implement all the method steps implemented in the above method embodiments and can achieve the same technical effect. Therefore, the parts and beneficial effects that are the same as those in the method embodiments will not be described in detail here.

[0126] Figure 5 is a schematic diagram of the structure of the electronic device provided by the present invention. As shown in Figure 5, the electronic device may include a processor 510, a communication interface 520, a memory 530, and a communication bus 540, wherein the processor 510, the communication interface 520, and the memory 530 communicate with each other through the communication bus 540. The processor 510 can call the logical instructions in the memory 530 to execute any of the differential privacy protection methods for federated learning provided in the above embodiments, such as: obtaining the model weight differences uploaded by each client participating in the current round of learning; performing the clipping operation on the model weight differences uploaded by each client according to the clipping parameters corresponding to the current round of learning; aggregating the model weight differences after the clipping operation, and adding noise to the aggregated model weight difference according to the Gaussian noise distribution corresponding to the current round of learning to complete the model update for the current round of learning; wherein the Gaussian noise distribution corresponding to the current round of learning is determined according to the noise scale corresponding to the current round of learning and the clipping parameters corresponding to the current round of learning, and the noise scale corresponding to each round of learning gradually decreases as the number of learning rounds increases.

[0127] Furthermore, the logical instructions in the aforementioned memory 530 can be implemented as software functional units and, when sold or used as independent products, can be stored in a computer-readable storage medium. Based on this understanding, the technical solution of the present invention, essentially, or the part that contributes to the prior art, or a part of the technical solution, can be embodied in the form of a software product. This computer software product is stored in a storage medium and includes several instructions to cause a computer device (which may be a personal computer, server, or network device, etc.) to execute all or part of the steps of the methods described in the various embodiments of the present invention. The aforementioned storage medium includes various media capable of storing program code, such as USB flash drives, portable hard drives, read-only memory (ROM), random access memory (RAM), magnetic disks, or optical disks.

[0128] On the other hand, the present invention also provides a computer program product, the computer program product including a computer program, the computer program being stored on a non-transitory computer-readable storage medium, and when the computer program is executed by a processor, the computer is able to execute any of the differential privacy protection methods for federated learning provided in the above embodiments.

[0129] It should be noted that the computer program product provided by the present invention can implement all the method steps implemented in the above method embodiments and can achieve the same technical effect. Therefore, the parts and beneficial effects that are the same as those in the method embodiments will not be described in detail here.

[0130] In another aspect, the present invention also provides a non-transitory computer-readable storage medium having a computer program stored thereon, which, when executed by a processor, is implemented to perform any of the differential privacy protection methods for federated learning provided in the above embodiments.

[0131] It should be noted that the non-transitory computer-readable storage medium provided by the present invention can implement all the method steps implemented in the above method embodiments and can achieve the same technical effect. Here, the parts that are the same as those in the method embodiments and the beneficial effects will not be described in detail.

[0132] The device embodiments described above are merely illustrative. The units described as separate components may or may not be physically separate. The components shown as units may or may not be physical units; that is, they may be located in one place or distributed across multiple network units. Some or all of the modules can be selected to achieve the purpose of this embodiment according to actual needs. Those skilled in the art can understand and implement this without any creative effort.

[0133] Through the above description of the embodiments, those skilled in the art can clearly understand that each embodiment can be implemented by means of software plus necessary general-purpose hardware platforms, and of course, it can also be implemented by hardware. Based on this understanding, the above technical solutions, in essence or the part that contributes to the prior art, can be embodied in the form of a software product. This computer software product can be stored in a computer-readable storage medium, such as ROM / RAM, magnetic disk, optical disk, etc., and includes several instructions to cause a computer device (which may be a personal computer, server, or network device, etc.) to execute the methods described in the various embodiments or some parts of the embodiments.

[0134] Finally, it should be noted that the above embodiments are only used to illustrate the technical solutions of the present invention, and not to limit them; although the present invention has been described in detail with reference to the foregoing embodiments, those skilled in the art should understand that modifications can still be made to the technical solutions described in the foregoing embodiments, or equivalent substitutions can be made to some of the technical features; and these modifications or substitutions do not cause the essence of the corresponding technical solutions to deviate from the spirit and scope of the technical solutions of the embodiments of the present invention.

Claims

1. A differential privacy protection method for federated learning, characterized in that it comprises: obtaining the model weight differences uploaded by each client participating in the current round of learning; performing a clipping operation on the model weight differences uploaded by each client according to the clipping parameters corresponding to the current round of learning; and aggregating the model weight differences after the clipping operation, and adding noise to the aggregated model weight difference according to the Gaussian noise distribution corresponding to the current round of learning, to complete the model update of the current round of learning; wherein the Gaussian noise distribution corresponding to the current round of learning is determined based on the noise scale corresponding to the current round of learning and the clipping parameters corresponding to the current round of learning, and the noise scale corresponding to each round of learning gradually decreases as the number of learning rounds increases; and wherein performing the clipping operation on the model weight differences uploaded by each client according to the clipping parameters corresponding to the current round of learning comprises: performing norm processing on the model weight differences uploaded by each client to obtain the norm value of each model weight difference; dividing the model weight differences into multiple sets of model weight differences based on the median of the norm values of the model weight differences; determining the clipping parameter of each set of model weight differences based on the median of the norm values corresponding to that set; and performing, based on the clipping parameter of each set of model weight differences, the clipping operation on the model weight differences in the corresponding set.

2. The differential privacy protection method for federated learning according to claim 1, characterized in that the Gaussian noise distribution corresponding to the current round of learning is determined based on the noise scale corresponding to the current round of learning and the maximum value among the clipping parameters corresponding to the current round of learning.

3. The differential privacy protection method for federated learning according to claim 2, characterized in that the Gaussian noise distribution corresponding to the current round of learning is $N(0, z^2 \cdot S_{\max}^2)$, where $z$ represents the noise scale corresponding to the current round of learning and $S_{\max}$ represents the maximum value among the clipping parameters of the multiple model weight difference sets.

4. The differential privacy protection method for federated learning according to any one of claims 1 to 3, characterized in that, The noise scale corresponding to the current round of learning is determined according to the following formula: ; Where z represents the noise scale corresponding to the current learning round, a represents the initial noise level, b represents the degree of change in the noise level added in each round as the learning rounds increase, c represents the rate of decrease in the added noise, and x represents the current learning round.

5. The differential privacy protection method for federated learning according to claim 1, characterized in that the method further comprises: for any target client among the clients, determining the current contribution of the target client based on the model weight difference after the clipping operation and the number of times the target client has participated in federated learning so far.

6. A differential privacy protection device for federated learning, characterized in that it comprises: an acquisition module, used to obtain the model weight differences uploaded by each client participating in the current round of learning; a clipping module, used to perform a clipping operation on the model weight differences uploaded by each client according to the clipping parameters corresponding to the current round of learning; and an aggregation and noise-adding module, used to aggregate the model weight differences after the clipping operation, and to add noise to the aggregated model weight difference according to the Gaussian noise distribution corresponding to the current round of learning, so as to complete the model update of the current round of learning; wherein the Gaussian noise distribution corresponding to the current round of learning is determined based on the noise scale corresponding to the current round of learning and the clipping parameters corresponding to the current round of learning, and the noise scale corresponding to each round of learning gradually decreases as the number of learning rounds increases; and wherein performing the clipping operation on the model weight differences uploaded by each client according to the clipping parameters corresponding to the current round of learning comprises: performing norm processing on the model weight differences uploaded by each client to obtain the norm value of each model weight difference; dividing the model weight differences into multiple sets of model weight differences based on the median of the norm values of the model weight differences; determining the clipping parameter of each set of model weight differences based on the median of the norm values corresponding to that set; and performing, based on the clipping parameter of each set of model weight differences, the clipping operation on the model weight differences in the corresponding set.

7. An electronic device comprising a memory, a processor, and a computer program stored in the memory and executable on the processor, characterized in that, when the processor executes the program, it implements the differential privacy protection method for federated learning as described in any one of claims 1 to 5.

8. A non-transitory computer-readable storage medium having a computer program stored thereon, characterized in that, when the computer program is executed by a processor, it implements the differential privacy protection method for federated learning as described in any one of claims 1 to 5.

9. A computer program product, comprising a computer program, characterized in that, when the computer program is executed by a processor, it implements the differential privacy protection method for federated learning as described in any one of claims 1 to 5.