A publicly auditable Byzantine robust federated learning system and method

By introducing a third-party audit server and using random sampling to assess client trustworthiness, this approach addresses the issues of poor defense against malicious clients and high computational resource consumption in existing federated learning methods, achieving efficient Byzantine robustness and public auditing.

CN116151369B · Active · Publication Date: 2026-06-19 · WUHAN UNIV

Patent Information

Authority / Receiving Office: CN · China
Patent Type: Patents(China)
Current Assignee / Owner: WUHAN UNIV
Filing Date: 2022-11-23
Publication Date: 2026-06-19

AI Technical Summary

Technical Problem

Existing Byzantine robust federated learning methods have limited defensive effectiveness against most malicious clients and require high computational resources from aggregation servers, making them unsuitable for public auditing.

Method used

A third-party auditing server is introduced to audit the federated learning training process. The client's credibility evaluation value and normalization factor are updated by random sampling, reducing the burden on the aggregation server. The global model parameters are calculated by weighted average.

Benefits of technology

It can quickly train a high-precision global model when most clients are malicious, reducing the computational overhead of Byzantine attack detection tasks on the aggregation server and supporting public auditing.

✦ Generated by Eureka AI based on patent content.

Abstract

This invention proposes a publicly auditable Byzantine robust federated learning system and method. The invention introduces a third-party audit server to audit the federated learning training process. First, the aggregation server sends the initial model parameters to each federated learning client and to the audit server. Second, each federated learning client trains its model on its local dataset and uploads its local model update to the aggregation server. Then, the third-party audit server trains its own model to obtain the audit server's model update and uses random sampling combined with cosine similarity to calculate the similarity between each client's update and the audit server's update. Finally, the aggregation server uses the similarity as the weight of each client and obtains the global model update through weighted averaging. This process is repeated until the maximum number of training iterations is reached. The invention can train an accurate model even when Byzantine clients participate.

Description

Technical Field

[0001] This invention belongs to the field of artificial intelligence, and in particular relates to a Byzantine robust federated learning system and method for public auditing.

Background Technology

[0002] With the development of network technology, the internet is generating an ever-increasing amount of data. How to make better use of this data through data mining has become an urgent research topic. Traditionally, the data is first collected and a model is then trained on it centrally on a server. However, this method has significant drawbacks. Data owners may be unwilling to share their data, or the data may not be collectable in a timely and effective manner because of network bandwidth limitations. For example, in smart healthcare, due to the sensitivity of patient data, hospitals cannot directly share data with third parties.

[0003] To alleviate these problems, Google proposed the concept of federated learning. Its main idea is that, with the help of an aggregation server, data owners collaboratively train a model locally without sharing the original data. Specifically, in federated learning, there are multiple clients—the data owners—and one aggregation server—the service provider. The clients possess their local training datasets, and the service provider allows the clients to jointly train a single model, the global model. Due to its potential applications, many companies use it to develop real-world applications. For example, Google proposed a federated learning method for Android keyboard word prediction.

[0004] However, recent research indicates that federated learning is vulnerable to Byzantine attacks. For example, a malicious client can compromise the global model by poisoning its local training dataset or by sending fake model updates. A compromised global model can make incorrect predictions, or even predict a target label chosen by the adversary. Alternatively, a client might only want to obtain the global model through federated learning training, without being willing or able to contribute its local model. Because of these issues, many Byzantine robust methods have been proposed. They fall mainly into two categories. The first category uses statistical knowledge to compare and analyze the model updates uploaded by each client and excludes anomalous updates before updating the global model. These solutions only work under specific threat assumptions; when the adversary does not satisfy these assumptions, the defense may fail. The second category assumes that the server maintains a small clean dataset and uses it as the basis for identifying anomalous updates and excluding poorly performing ones. These solutions require the aggregation server to maintain a clean dataset and train a model on it.

[0005] Although Byzantine robust federated learning has been extensively studied, it still faces the following challenges. First, most methods achieve Byzantine robustness by comparing model updates and removing anomalies. This approach has limited defensive effectiveness and fails to work effectively when most clients are malicious. Second, most existing methods do not support public auditing and typically rely on aggregation rules from an aggregation server to achieve Byzantine robustness. These solutions place a significant computational burden on the aggregation server and are unsuitable for situations where the aggregation server's computational resources are limited. Furthermore, many aggregation servers do not wish to focus excessively on defending against malicious clients but rather on training an accurate global model. Providing a third-party Byzantine robustness service would be extremely valuable. This is the focus of this invention.

Summary of the Invention

[0006] To address the aforementioned technical problems, this invention proposes a Byzantine robust federated learning system and method for public auditing.

[0007] The technical solution of this invention is a publicly auditable Byzantine robust federated learning system, comprising:

[0008] Multiple federated learning clients, a third-party audit server, and an aggregation server;

[0009] The aggregation server is connected in turn to each of the multiple federated learning clients; the aggregation server is also connected to the third-party audit server.

[0010] In each iteration, an initial local federated learning model is constructed for each federated learning client and for the third-party audit server. Each federated learning client trains its model using the federated learning gradient algorithm to obtain the parameter update vector of its trained model, which is uploaded to the aggregation server and forwarded to the third-party audit server. The third-party audit server likewise trains the federated learning model using the federated learning gradient algorithm to obtain the parameter update vector of its own trained model. The credibility evaluation value and normalization factor of each client are updated using random sampling and transmitted to the aggregation server. The aggregation server calculates the global federated learning model parameter vector using a weighted average and uses it as the initial parameter vector for the next iteration. Iterative optimization yields the iteratively optimized federated learning model parameter vectors for each client, the third-party audit server, and the aggregation server. Finally, the corresponding iteratively optimized federated learning model is constructed based on these parameter vectors.

[0011] The technical solution of this invention is a Byzantine robust federated learning method for public auditing, and the specific steps are as follows:

[0012] Step 1: The aggregation server sequentially sends the initialization parameter vector of the federated learning model in this iteration to each federated learning client and the third-party audit server. Each federated learning client constructs its own initial local federated learning model in this iteration based on the initialization parameter vector of the federated learning model in this iteration. The third-party audit server constructs its own initial local federated learning model in this iteration based on the initialization parameter vector of the federated learning model in this iteration.

[0013] Step 2: Each federated learning client uses the local dataset as the training set and combines the federated learning gradient algorithm to train the federated learning model, obtaining the parameter update vector of the federated learning model after training for each federated learning client in this iteration optimization, i.e., the gradient, and uploads it to the aggregation server. The aggregation server transmits the parameter update vector of the federated learning model after training for each federated learning client in this iteration optimization to the third-party audit server.

[0014] Step 3: The third-party audit server constructs a training set using multiple uncontaminated samples, and trains the federated learning model using the federated learning gradient algorithm to obtain the parameter update vector of the federated learning model after training in this iteration. The credibility evaluation value and normalization factor of each federated learning client in this iteration are updated according to the random sampling method and transmitted to the aggregation server.

[0015] Step 4: The aggregation server combines the parameter update vector of the trained federated learning model, the credibility evaluation value, and the normalization factor of each federated learning client in this iteration to calculate the global model parameter vector of the federated learning in this iteration through a weighted average. The aggregation server then uses the global model parameter vector of the federated learning as the initialization parameter vector of the federated learning model in the next iteration.

[0016] Step 5: Iterative optimization. Execute steps 1 to 4 until the maximum number of iterations is reached, and obtain the iteratively optimized federated learning model parameter vector for each federated learning client, the iteratively optimized federated learning model parameter vector for the third-party audit server, and the iteratively optimized global federated learning model parameter vector for the aggregation server.

[0017] Step 6: Each federated learning client constructs the corresponding iteratively optimized federated learning model based on the parameter vector of the iteratively optimized federated learning model.
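The control flow of Steps 1 to 6 can be sketched as a single loop. This is a minimal illustration rather than the patented implementation: the function name `run_audited_training` and the four callables (`client_train`, `audit_train`, `audit_evaluate`, `aggregate`) are hypothetical placeholders, and the toy `demo_aggregate` assumes the weighted update is simply added to the current parameters, which the text does not specify.

```python
from typing import Callable, List, Tuple

Vector = List[float]


def run_audited_training(
    w_init: Vector,
    n_clients: int,
    max_rounds: int,
    client_train: Callable[[int, Vector], Vector],
    audit_train: Callable[[Vector], Vector],
    audit_evaluate: Callable[[List[Vector], Vector, int], Tuple[List[float], List[float]]],
    aggregate: Callable[[Vector, List[Vector], List[float], List[float]], Vector],
) -> Vector:
    """Run the round loop; every callable is a hypothetical stand-in."""
    w = list(w_init)
    for r in range(1, max_rounds + 1):
        # Step 1: the aggregation server sends the current parameters to
        # every client and to the audit server (modelled here as copies).
        client_models = [list(w) for _ in range(n_clients)]
        audit_model = list(w)
        # Step 2: each client trains locally and uploads its update, which
        # the aggregation server forwards to the audit server.
        updates = [client_train(i, client_models[i]) for i in range(n_clients)]
        # Step 3: the audit server trains on clean samples, then returns
        # credibility values and normalization factors for the clients.
        g0 = audit_train(audit_model)
        scores, factors = audit_evaluate(updates, g0, r)
        # Step 4: the aggregation server forms the next global parameters.
        w = aggregate(w, updates, scores, factors)
    # Steps 5-6: after the last round, w is what each party builds its model from.
    return w


# Toy stand-ins, only to make the sketch executable.
def demo_client_train(i: int, w: Vector) -> Vector:
    return [1.0 for _ in w]


def demo_audit_train(w: Vector) -> Vector:
    return [1.0 for _ in w]


def demo_evaluate(updates: List[Vector], g0: Vector, r: int):
    return [1.0] * len(updates), [1.0] * len(updates)


def demo_aggregate(w, updates, scores, factors) -> Vector:
    # Assumption: add the score-weighted average update to the parameters.
    total = sum(scores)
    step = [
        sum(s * a * g[k] for g, s, a in zip(updates, scores, factors)) / total
        for k in range(len(w))
    ]
    return [wk + sk for wk, sk in zip(w, step)]


final_w = run_audited_training(
    [0.0, 0.0], 4, 3, demo_client_train, demo_audit_train, demo_evaluate, demo_aggregate
)
print(final_w)
```

Passing the four roles in as callables keeps the loop independent of the model type and of how the audit server scores the clients.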

[0018] Preferably, the local dataset in step 2 consists of multiple contaminated samples and multiple uncontaminated samples;

[0019] Preferably, step 3 updates the credibility evaluation value and normalization factor of each federated learning client in this iteration of optimization using a random sampling method, as follows:

[0020] If the current iteration of optimization is not selected by random sampling, the credibility evaluation value and normalization factor of each federated learning client in the current iteration of optimization are updated by using the credibility evaluation value and normalization factor of each federated learning client in the previous iteration of optimization.

[0021] If the current iteration of optimization is selected by random sampling, the credibility evaluation value and normalization factor of each client are calculated by combining the post-training model parameter update vector of the third-party audit server with the post-training federated learning model parameter update vector of each federated learning client.

[0022] The credibility evaluation value and normalization factor of each federated learning client in this iteration optimization are calculated using the cosine similarity calculation model.
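The sampling rule above (recompute in sampled rounds, reuse the previous values otherwise) can be sketched as a small cache. The class name `SampledAuditor`, the per-round sampling probability `sample_prob`, and the starting values of 1.0 before the first audit are all assumptions made for illustration; the text does not state how rounds are selected or how the values are initialized.

```python
import random
from typing import Callable, List, Optional, Sequence, Tuple

Vector = Sequence[float]
AuditResult = Tuple[List[float], List[float]]  # (credibility values, normalization factors)


class SampledAuditor:
    """Audit only in randomly selected rounds and reuse cached values otherwise."""

    def __init__(
        self,
        n_clients: int,
        sample_prob: float,
        evaluate: Callable[[Sequence[Vector], Vector], AuditResult],
        seed: Optional[int] = None,
    ) -> None:
        self.sample_prob = sample_prob
        self.evaluate = evaluate
        self.rng = random.Random(seed)
        # Assumption: before the first audited round every client is fully trusted.
        self.cached: AuditResult = ([1.0] * n_clients, [1.0] * n_clients)
        self.audited_rounds = 0

    def audit(self, client_updates: Sequence[Vector], audit_update: Vector) -> AuditResult:
        # random() returns a value in [0.0, 1.0), so sample_prob=0.0 never
        # audits and sample_prob=1.0 audits in every round.
        if self.rng.random() < self.sample_prob:
            self.cached = self.evaluate(client_updates, audit_update)
            self.audited_rounds += 1
        return self.cached


def toy_evaluate(client_updates: Sequence[Vector], audit_update: Vector) -> AuditResult:
    # Placeholder scoring; a real evaluation would use the cosine similarity model.
    n = len(client_updates)
    return [0.5] * n, [2.0] * n


auditor = SampledAuditor(n_clients=2, sample_prob=0.3, evaluate=toy_evaluate, seed=0)
for _ in range(10):
    scores, factors = auditor.audit([[1.0], [2.0]], [1.0])
print(auditor.audited_rounds, scores, factors)
```

Because unsampled rounds skip the evaluation entirely, the audit server's scoring cost scales with the number of sampled rounds rather than with every round.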

[0023] The parameter vector weights of each federated learning client in this iteration of optimization are calculated using a cosine similarity calculation model, as follows:

[0024] $c_i^r = \dfrac{\langle g_i^r,\, g_0^r \rangle}{\lVert g_i^r \rVert_2 \, \lVert g_0^r \rVert_2}$

[0025] $i \in [1, n],\ r \in [1, R]$

[0026] where $c_i^r$ denotes the cosine similarity of the $i$-th federated learning client in the $r$-th round of iteration; $g_i^r$ denotes the local model update, i.e., the model parameter update vector, of the $i$-th federated learning client in the $r$-th round of iteration; $g_0^r$ denotes the model parameter update of the third-party audit server in the $r$-th round of iteration; $\lVert \cdot \rVert_2$ denotes the $l_2$-norm computation; $R$ represents the maximum number of iterations; and $n$ is the number of federated learning clients;
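As an illustration of the cosine similarity calculation model, the sketch below computes the cosine between one client's update vector and the audit server's update vector using the standard definition (inner product divided by the product of the l2-norms). The function name `cosine_similarity` and the handling of zero-length updates are assumptions, not details taken from the patent text.

```python
import math
from typing import Sequence


def cosine_similarity(g_i: Sequence[float], g_0: Sequence[float]) -> float:
    """Cosine of the angle between a client update g_i and the audit update g_0."""
    if len(g_i) != len(g_0):
        raise ValueError("update vectors must have the same length")
    dot = sum(a * b for a, b in zip(g_i, g_0))
    norm_i = math.sqrt(sum(a * a for a in g_i))
    norm_0 = math.sqrt(sum(b * b for b in g_0))
    if norm_i == 0.0 or norm_0 == 0.0:
        # Assumption: a zero update has no direction, so treat it as dissimilar.
        return 0.0
    return dot / (norm_i * norm_0)


audit_update = [1.0, 2.0, 2.0]
same_direction = [2.0, 4.0, 4.0]        # larger magnitude, same direction
opposite_direction = [-1.0, -2.0, -2.0]  # e.g. a sign-flipped update

print(cosine_similarity(same_direction, audit_update))
print(cosine_similarity(opposite_direction, audit_update))
```

The cosine depends only on direction, so a client cannot raise its similarity by scaling its update; magnitude is handled separately by the normalization factor.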

[0027] The credibility assessment value for each federated learning client is calculated as follows:

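[0028] $s_i^r = \mathrm{ReLU}(c_i^r)$

[0029] $\mathrm{ReLU}(x) = \max(0, x)$

[0030] where $\mathrm{ReLU}(\cdot)$ represents the clipping calculation, $x$ represents the variable used in the clipping calculation, $c_i^r$ denotes the cosine similarity of the $i$-th federated learning client in the $r$-th round of iteration, and $s_i^r$ denotes the credibility evaluation value of the $i$-th federated learning client in the $r$-th round of iteration;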
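The clipping step can be illustrated in a few lines, assuming the usual definition ReLU(x) = max(0, x) applied to each client's cosine similarity. The helper name `credibility_scores` is hypothetical.

```python
from typing import List, Sequence


def relu(x: float) -> float:
    """Clip negative values to zero."""
    return max(0.0, x)


def credibility_scores(cosines: Sequence[float]) -> List[float]:
    """Map each client's cosine similarity to a non-negative credibility value."""
    return [relu(c) for c in cosines]


# One aligned client, one orthogonal client, one client pointing the opposite way.
scores = credibility_scores([0.9, 0.0, -0.7])
print(scores)
```

A client whose update points away from the audit server's update receives a credibility value of zero and therefore contributes nothing to a weighted average that uses these values as weights.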

[0031] The normalization factor for each federated learning client is calculated as follows:

[0032] $\alpha_i^r = \dfrac{\lVert g_0^r \rVert_2}{\lVert g_i^r \rVert_2}$

[0033] where $g_i^r$ denotes the local model update, i.e., the model parameter update vector, of the $i$-th federated learning client in the $r$-th round of iteration; $g_0^r$ denotes the model parameter update of the third-party audit server in the $r$-th round of iteration; and $\alpha_i^r$ denotes the normalization factor of the $i$-th federated learning client in the $r$-th round of iteration;

[0034] The credibility evaluation value $s_i^r$ and the normalization factor $\alpha_i^r$ are sent to the aggregation server;
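A normalization factor defined from the client update and the audit server update can be sketched as follows. The sketch assumes the factor is the ratio of the audit update's l2-norm to the client update's l2-norm, so that the rescaled client update has the same magnitude as the audit update; this ratio is an assumption inferred from the variables listed above, and the function names are hypothetical.

```python
import math
from typing import List, Sequence


def l2_norm(v: Sequence[float]) -> float:
    return math.sqrt(sum(x * x for x in v))


def normalization_factor(g_i: Sequence[float], g_0: Sequence[float]) -> float:
    """Assumed form: ||g_0|| / ||g_i||; returns 0.0 for a zero client update."""
    norm_i = l2_norm(g_i)
    if norm_i == 0.0:
        return 0.0
    return l2_norm(g_0) / norm_i


def rescale(g_i: Sequence[float], g_0: Sequence[float]) -> List[float]:
    """Scale the client update so its magnitude matches the audit update."""
    alpha = normalization_factor(g_i, g_0)
    return [alpha * x for x in g_i]


audit_update = [3.0, 4.0]        # l2-norm 5
inflated_update = [30.0, 40.0]   # same direction, ten times larger
print(normalization_factor(inflated_update, audit_update))
print(rescale(inflated_update, audit_update))
```

Under this assumption, a client cannot dominate the aggregate simply by submitting a very large update, because its magnitude is pulled back to that of the audit server's update.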

[0035] Preferably, step 4 involves calculating the global model parameter vector of federated learning in this iteration of optimization using a weighted average, as detailed below:

[0036] $w^r = \dfrac{1}{\sum_{j=1}^{n} s_j^r} \sum_{i=1}^{n} s_i^r \, \alpha_i^r \, g_i^r$

[0037] where $n$ is the number of federated learning clients; $s_i^r$ denotes the credibility evaluation value of the $i$-th federated learning client in the $r$-th round of iteration; $\alpha_i^r$ denotes the normalization factor of the $i$-th federated learning client in the $r$-th round of iteration; $g_i^r$ denotes the local model update, i.e., the model parameter update vector, of the $i$-th federated learning client in the $r$-th round of iteration; and $w^r$ is the global model parameter vector of the federated learning in the $r$-th iteration of optimization, which is used as the initialization parameter vector of the federated learning model in the next iteration of optimization, i.e., the $(r+1)$-th iteration;
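The weighted average in Step 4 can be sketched as below. The sketch assumes each client update is weighted by its credibility value times its normalization factor and that the sum is divided by the total credibility; both the exact weighting and the function name `weighted_aggregate` are assumptions for illustration, as is the choice to return a zero vector when every client has zero credibility.

```python
from typing import List, Sequence


def weighted_aggregate(
    updates: Sequence[Sequence[float]],
    scores: Sequence[float],
    factors: Sequence[float],
) -> List[float]:
    """Credibility-weighted average of the normalized client updates."""
    if not (len(updates) == len(scores) == len(factors)):
        raise ValueError("one score and one factor are required per client")
    dim = len(updates[0])
    total = sum(scores)
    if total == 0.0:
        # Assumption: if no client is trusted, contribute nothing this round.
        return [0.0] * dim
    result = [0.0] * dim
    for g, s, a in zip(updates, scores, factors):
        weight = s * a / total
        for k in range(dim):
            result[k] += weight * g[k]
    return result


client_updates = [[1.0, 0.0], [0.0, 1.0], [-5.0, -5.0]]
client_scores = [1.0, 1.0, 0.0]   # the third client was clipped to zero
client_factors = [1.0, 1.0, 1.0]
print(weighted_aggregate(client_updates, client_scores, client_factors))
```

In this example the third client's large opposing update has no effect on the result because its credibility value is zero.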

[0038] Compared with the prior art, the present invention has the following advantages and beneficial effects:

[0039] Unlike other federated learning methods, this invention reduces the burden on the aggregation server for Byzantine attack detection tasks by introducing a third-party auditing server to audit the federated learning training process.

[0040] The efficient Byzantine robust federated learning method proposed in this invention can quickly train a high-precision global model even when most clients are malicious.

[0041] Unlike other federated learning methods, the sampling auditing method proposed in this invention can greatly alleviate the problem of excessive overhead in existing Byzantine robust methods.

Attached Figure Description

[0042] Figure 1: A schematic diagram of the method flow of an embodiment of the present invention.

Detailed Implementation

[0043] The technical solutions of the embodiments of the present invention will be clearly and completely described below with reference to the accompanying drawings. Obviously, the described embodiments are only some embodiments of the present invention, and not all embodiments. Based on the embodiments of the present invention, all other embodiments obtained by those skilled in the art without creative effort are within the scope of protection of the present invention.

[0044] In specific implementation, the method proposed in the technical solution of this invention can be implemented by those skilled in the art as an automatically executed process using computer software technology. System devices that implement the method, such as a computer-readable storage medium storing the corresponding computer program of the technical solution of this invention and computer equipment that includes and runs that computer program, should also fall within the protection scope of this invention.

[0045] The technical solution of the system in this embodiment of the invention is a publicly auditable Byzantine robust federated learning system, comprising:

[0046] Multiple federated learning clients, a third-party audit server, and an aggregation server;

[0047] The aggregation server is connected in turn to each of the multiple federated learning clients; the aggregation server is also connected to the third-party audit server.

[0048] All of the federated learning clients selected are computer terminals;

[0049] Both the third-party audit server and the aggregation server selected are IBM X3850 X5 servers.

[0050] The number of federated learning clients is 20.

[0051] With reference to Figure 1, the Byzantine robust federated learning method for public auditing provided in an embodiment of the present invention is described as follows:

[0052] The application scenario of this invention is that each federated learning client receives an access request, and the access request is classified by a BP neural network to determine whether there is an attack in the access request.

[0053] Step 1: The aggregation server sequentially sends the initialization parameter vector of the federated learning model in this iteration to each federated learning client and the third-party audit server. Each federated learning client constructs its own initial local federated learning model in this iteration based on the initialization parameter vector of the federated learning model in this iteration. The third-party audit server constructs its own initial local federated learning model in this iteration based on the initialization parameter vector of the federated learning model in this iteration.

[0054] Step 2: Each federated learning client uses the local dataset as the training set and combines the federated learning gradient algorithm to train the federated learning model, obtaining the parameter update vector of the federated learning model after training for each federated learning client in this iteration optimization, i.e., the gradient, and uploads it to the aggregation server. The aggregation server transmits the parameter update vector of the federated learning model after training for each federated learning client in this iteration optimization to the third-party audit server.

[0055] The local dataset described in step 2 consists of multiple contaminated samples and multiple uncontaminated samples;

[0056] The contaminated samples consist of: 50 requests containing attacks, each labeled as not containing an attack; and 50 requests not containing attacks, each labeled as containing an attack.

[0057] The uncontaminated samples consist of: 500 requests containing attacks, each labeled as containing an attack; and 500 requests not containing attacks, each labeled as not containing an attack.
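To make the composition of this local dataset concrete, the sketch below assembles 100 label-flipped samples and 1000 correctly labeled samples from two pools of requests. The label encoding (`ATTACK = 1`, `BENIGN = 0`), the function name `build_local_dataset`, and the placeholder request strings are hypothetical; only the counts come from the embodiment.

```python
from typing import List, Sequence, Tuple

ATTACK, BENIGN = 1, 0          # assumed label encoding
Sample = Tuple[str, int]       # (request, label)


def build_local_dataset(
    attack_requests: Sequence[str],
    benign_requests: Sequence[str],
    n_flipped_each: int = 50,
    n_clean_each: int = 500,
) -> List[Sample]:
    """Combine label-flipped (contaminated) and correctly labeled samples."""
    need = n_flipped_each + n_clean_each
    if len(attack_requests) < need or len(benign_requests) < need:
        raise ValueError("not enough requests to build the dataset")
    data: List[Sample] = []
    # Contaminated: attacks labeled benign, benign requests labeled attack.
    data += [(req, BENIGN) for req in attack_requests[:n_flipped_each]]
    data += [(req, ATTACK) for req in benign_requests[:n_flipped_each]]
    # Uncontaminated: every request keeps its true label.
    data += [(req, ATTACK) for req in attack_requests[n_flipped_each:need]]
    data += [(req, BENIGN) for req in benign_requests[n_flipped_each:need]]
    return data


# Placeholder request identifiers; real requests would be feature vectors or raw payloads.
attacks = [f"attack-{k}" for k in range(550)]
benign = [f"benign-{k}" for k in range(550)]
dataset = build_local_dataset(attacks, benign)
print(len(dataset))
```

With these counts, 100 of the 1100 local samples carry a flipped label.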

[0058] Step 3: The third-party audit server constructs a training set using multiple uncontaminated samples, and trains the federated learning model using the federated learning gradient algorithm to obtain the parameter update vector of the federated learning model after training in this iteration. The credibility evaluation value and normalization factor of each federated learning client in this iteration are updated according to the random sampling method and transmitted to the aggregation server.

[0059] Step 3 involves updating the parameter vector weights of each federated learning client in this iteration of optimization using a random sampling method, as detailed below:

[0060] If the current iteration of optimization is not selected by random sampling, the credibility evaluation value and normalization factor of each federated learning client in the current iteration of optimization are updated by using the credibility evaluation value and normalization factor of each federated learning client in the previous iteration of optimization.

[0061] If the current iteration of optimization is selected by random sampling, the credibility evaluation value and normalization factor of each client are calculated by combining the updated parameter vector of the trained model of the third-party audit server with the updated parameter vector of the trained federated learning model of each client. That is, the credibility evaluation value and normalization factor of each federated learning client in this iteration of optimization are calculated by the cosine similarity calculation model.

[0062] The parameter vector weights of each federated learning client in this iteration of optimization are calculated using a cosine similarity calculation model, as follows:

[0063] $c_i^r = \dfrac{\langle g_i^r,\, g_0^r \rangle}{\lVert g_i^r \rVert_2 \, \lVert g_0^r \rVert_2}$

[0064] $i \in [1, n],\ r \in [1, R]$

[0065] where $c_i^r$ denotes the cosine similarity of the $i$-th federated learning client in the $r$-th round of iteration; $g_i^r$ denotes the local model update, i.e., the model parameter update vector, of the $i$-th federated learning client in the $r$-th round of iteration; $g_0^r$ denotes the model parameter update of the third-party audit server in the $r$-th round of iteration; $\lVert \cdot \rVert_2$ denotes the $l_2$-norm computation; $R$ represents the maximum number of iterations; and $n = 20$ is the number of federated learning clients;

[0066] The credibility assessment value for each federated learning client is calculated as follows:

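[0067] $s_i^r = \mathrm{ReLU}(c_i^r)$

[0068] $\mathrm{ReLU}(x) = \max(0, x)$

[0069] where $\mathrm{ReLU}(\cdot)$ represents the clipping calculation, $x$ represents the variable used in the clipping calculation, $c_i^r$ denotes the cosine similarity of the $i$-th federated learning client in the $r$-th round of iteration, and $s_i^r$ denotes the credibility evaluation value of the $i$-th federated learning client in the $r$-th round of iteration;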

[0070] The normalization factor for each federated learning client is calculated as follows:

[0071] $\alpha_i^r = \dfrac{\lVert g_0^r \rVert_2}{\lVert g_i^r \rVert_2}$

[0072] where $g_i^r$ denotes the local model update, i.e., the model parameter update vector, of the $i$-th federated learning client in the $r$-th round of iteration; $g_0^r$ denotes the model parameter update of the third-party audit server in the $r$-th round of iteration; and $\alpha_i^r$ denotes the normalization factor of the $i$-th federated learning client in the $r$-th round of iteration;

[0073] The credibility evaluation value $s_i^r$ and the normalization factor $\alpha_i^r$ are sent to the aggregation server;

[0074] Step 4: The aggregation server combines the parameter update vector of the trained federated learning model, the credibility evaluation value, and the normalization factor of each federated learning client in this iteration to calculate the global model parameter vector of the federated learning in this iteration through a weighted average. The aggregation server then uses the global model parameter vector of the federated learning as the initialization parameter vector of the federated learning model in the next iteration.

[0075] Step 4 describes the calculation of the global model parameter vector for federated learning in this iteration of optimization using a weighted average, as follows:

[0076] $w^r = \dfrac{1}{\sum_{j=1}^{n} s_j^r} \sum_{i=1}^{n} s_i^r \, \alpha_i^r \, g_i^r$

[0077] where $n$ is the number of federated learning clients; $s_i^r$ denotes the credibility evaluation value of the $i$-th federated learning client in the $r$-th round of iteration; $\alpha_i^r$ denotes the normalization factor of the $i$-th federated learning client in the $r$-th round of iteration; $g_i^r$ denotes the local model update, i.e., the model parameter update vector, of the $i$-th federated learning client in the $r$-th round of iteration; and $w^r$ is the global model parameter vector of the federated learning in the $r$-th iteration of optimization, which is used as the initialization parameter vector of the federated learning model in the next iteration of optimization, i.e., the $(r+1)$-th iteration;

[0078] Step 5: Iterative optimization. Execute steps 1 to 4 until the maximum number of iterations is reached, and obtain the iteratively optimized federated learning model parameter vector for each federated learning client, the iteratively optimized federated learning model parameter vector for the third-party audit server, and the iteratively optimized global federated learning model parameter vector for the aggregation server.

[0079] Step 6: Each federated learning client constructs the corresponding iteratively optimized federated learning model, i.e., the optimized BP neural network, based on the parameter vector of the iteratively optimized federated learning model.

[0080] Step 7: The federated learning client receives the access request, predicts the type of the access request using the optimized BP neural network, and allows access if the access request does not contain an attack; otherwise, it denies access.
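The access decision in Step 7 amounts to a gate around the trained classifier. The sketch below is only an illustration: `allow_access` and `toy_predict_attack` are hypothetical names, and the toy classifier is a placeholder threshold rule standing in for the optimized BP neural network.

```python
from typing import Callable, Sequence


def allow_access(
    request_features: Sequence[float],
    predict_attack: Callable[[Sequence[float]], bool],
) -> bool:
    """Allow the request only if the classifier predicts that it contains no attack."""
    return not predict_attack(request_features)


def toy_predict_attack(features: Sequence[float]) -> bool:
    # Placeholder for the optimized BP neural network; flags large feature sums.
    return sum(features) > 1.0


print(allow_access([0.1, 0.2], toy_predict_attack))
print(allow_access([0.9, 0.8], toy_predict_attack))
```

Keeping the classifier behind a callable lets the same gate be reused after each retraining, since only the model passed in changes.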

[0081] It should be understood that any parts not described in detail in this specification belong to the prior art.

[0082] Although this specification frequently uses terms such as federated learning client, third-party audit server, and aggregation server, the possibility of using other terms is not excluded. These terms are used merely for the convenience of describing the essence of this invention, and interpreting them as any additional limitation would contradict the spirit of this invention.

[0083] It should be understood that the above description of the preferred embodiments is quite detailed, but it should not be considered as a limitation on the scope of protection of this invention. Those skilled in the art, under the guidance of this invention, can make substitutions or modifications without departing from the scope of protection of the claims of this invention, and all such substitutions or modifications fall within the scope of protection of this invention. The scope of protection of this invention should be determined by the appended claims.

Claims

1. A public auditing Byzantine robust federated learning system, characterized in that it comprises: multiple federated learning clients, a third-party audit server, and an aggregation server; the aggregation server is connected in turn to each of the multiple federated learning clients; the aggregation server is connected to the third-party audit server; in each iteration, an initial local federated learning model is constructed for each federated learning client and for the third-party audit server. Each federated learning client trains its model using the federated learning gradient algorithm to obtain the parameter update vector of its trained model, which is uploaded to the aggregation server and forwarded to the third-party audit server. The third-party audit server likewise trains the federated learning model using the federated learning gradient algorithm to obtain the parameter update vector of its own trained model. The credibility evaluation value and normalization factor of each client are updated using random sampling and transmitted to the aggregation server. The aggregation server calculates the global federated learning model parameter vector using a weighted average and uses it as the initial parameter vector for the next iteration. Iterative optimization yields the iteratively optimized federated learning model parameter vectors for each client, the third-party audit server, and the aggregation server. Finally, the corresponding iteratively optimized federated learning model is constructed based on these parameter vectors.

2. A Byzantine robust federated learning method for public auditing using the Byzantine robust federated learning system for public auditing as described in claim 1, characterized in that it includes the following steps: Step 1: Construct the initial local federated learning model for each federated learning client in this iteration optimization, and the initial local federated learning model for the third-party audit server in this iteration optimization; Step 2: Each federated learning client uses the local dataset as the training set and combines the federated learning gradient algorithm to train the federated learning model, obtaining the parameter update vector of the federated learning model after training for each federated learning client in this iteration optimization, i.e., the gradient, and uploads it to the aggregation server. The aggregation server transmits the parameter update vector of the federated learning model after training for each federated learning client in this iteration optimization to the third-party audit server. Step 3: The third-party audit server constructs a training set using multiple uncontaminated samples, and trains the federated learning model using the federated learning gradient algorithm to obtain the parameter update vector of the federated learning model after training in this iteration. The credibility evaluation value and normalization factor of each federated learning client in this iteration are updated according to the random sampling method and transmitted to the aggregation server. Step 4: The aggregation server combines the parameter update vector of the trained federated learning model, the credibility evaluation value, and the normalization factor of each federated learning client in this iteration to calculate the global model parameter vector of the federated learning in this iteration through a weighted average. The aggregation server then uses the global model parameter vector of the federated learning as the initialization parameter vector of the federated learning model in the next iteration. Step 5: Iterative optimization. Execute steps 1 to 4 until the maximum number of iterations is reached, and obtain the iteratively optimized federated learning model parameter vector for each federated learning client, the iteratively optimized federated learning model parameter vector for the third-party audit server, and the iteratively optimized global federated learning model parameter vector for the aggregation server. Step 6: Each federated learning client constructs the corresponding iteratively optimized federated learning model based on the parameter vector of the iteratively optimized federated learning model.

3. The Byzantine robust federated learning method for public auditing according to claim 2, characterized in that, Step 1 describes the construction of the initial local federated learning model for each federated learning client in this iterative optimization, as follows: The aggregation server distributes the initialization parameter vector of the federated learning model in this iteration to each federated learning client. Each federated learning client constructs its own initial local federated learning model in this iteration based on the initialization parameter vector of the federated learning model in this iteration. Step 1 describes the construction of the initial local federated learning model for the third-party audit server in this iteration of optimization, as follows: The aggregation server sequentially sends the initialization parameter vector of the federated learning model in this iteration to the third-party audit server. The third-party audit server then constructs the initialization local federated learning model of the third-party audit server in this iteration based on the initialization parameter vector of the federated learning model in this iteration.

4. The Byzantine robust federated learning method for public auditing according to claim 3, characterized in that the local dataset described in Step 2 consists of multiple contaminated samples and multiple uncontaminated samples.

5. The Byzantine robust federated learning method for public auditing according to claim 4, characterized in that, in Step 3, the credibility evaluation value and normalization factor of each federated learning client in the current iteration are updated using a random sampling method, as follows: If the current iteration is not selected by random sampling, the credibility evaluation value and normalization factor of each federated learning client in the current iteration are updated using the credibility evaluation value and normalization factor of that client from the previous iteration. If the current iteration is selected by random sampling, the credibility evaluation value and normalization factor of each federated learning client are calculated by combining the post-training model parameter update vector of the third-party audit server with the post-training federated learning model parameter update vector of each federated learning client; that is, the credibility evaluation value of each federated learning client in the current iteration is calculated using the cosine similarity calculation model, and the normalization factor of each federated learning client is calculated in combination with the credibility evaluation value of that client.
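The sampling-gated update in claim 5 can be illustrated with a minimal sketch. Two points are assumptions rather than statements from the claims: the round is selected with a fixed probability `sample_prob` (the claims do not specify how the random sampling is performed), and a round that is not selected simply reuses the previous values unchanged. The names `update_trust` and `recompute` are hypothetical.

```python
import random
from typing import Callable, List, Tuple

Values = Tuple[List[float], List[float]]  # (credibility values, normalization factors)


def update_trust(
    prev_scores: List[float],
    prev_factors: List[float],
    recompute: Callable[[], Values],
    sample_prob: float,
    rng: random.Random,
) -> Values:
    """Return the credibility values and normalization factors for this round.

    ASSUMPTION: the round is selected with probability sample_prob.
    recompute is a hypothetical callable that performs the audit comparison
    between the audit server's update and each client's update.
    """
    if rng.random() < sample_prob:
        # Round selected: the audit server recomputes both quantities.
        return recompute()
    # Round not selected: carry the previous round's values forward.
    return list(prev_scores), list(prev_factors)
```

Under these assumptions, the audit comparison runs only in the sampled rounds, so a smaller `sample_prob` means fewer similarity computations overall.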

6. The Byzantine robust federated learning method for public auditing according to claim 5, characterized in that the credibility evaluation value of each federated learning client in the current iteration is calculated using the cosine similarity calculation model, as follows:

$$c_i^r = \frac{\langle g_i^r,\, g_0^r \rangle}{\lVert g_i^r \rVert \, \lVert g_0^r \rVert}, \qquad i \in [1, n],\; r \in [1, R]$$

where $c_i^r$ denotes the cosine similarity of the $i$-th federated learning client in the $r$-th round of iteration; $g_i^r$ denotes the local model update, i.e. the model parameter update vector, of the $i$-th federated learning client in the $r$-th round of iteration; $g_0^r$ denotes the model parameter update of the third-party audit server in the $r$-th round of iteration; $\lVert \cdot \rVert$ denotes the $l_2$-norm; $R$ is the maximum number of iterations; and $n$ is the number of federated learning clients.

The credibility evaluation value of each federated learning client is calculated as follows:

$$TS_i^r = \mathrm{ReLU}(c_i^r), \qquad \mathrm{ReLU}(x) = \max(0, x)$$

where $\mathrm{ReLU}(\cdot)$ denotes the clipping calculation, $x$ denotes the variable to which the clipping calculation is applied, and $TS_i^r$ denotes the credibility evaluation value of the $i$-th federated learning client in the $r$-th round of iteration.
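The calculation in claim 6 can be sketched in a few lines: the cosine similarity between a client's update vector and the audit server's update vector, followed by ReLU clipping so that negative similarities yield a credibility of zero. The function names are illustrative, and returning 0.0 for a zero-length vector is an assumption, since the claims do not address that case.

```python
import math
from typing import Sequence


def cosine_similarity(u: Sequence[float], v: Sequence[float]) -> float:
    """Cosine of the angle between two update vectors, using l2-norms."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        # ASSUMPTION: a zero update carries no directional information.
        return 0.0
    return dot / (norm_u * norm_v)


def credibility(client_update: Sequence[float], audit_update: Sequence[float]) -> float:
    """ReLU-clipped cosine similarity: negative similarities are clipped to 0."""
    return max(0.0, cosine_similarity(client_update, audit_update))
```

A client whose update points in the opposite direction to the audit server's update therefore receives a credibility of zero, while one pointing in the same direction receives a credibility of one.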

7. The Byzantine robust federated learning method for public auditing according to claim 6, characterized in that the normalization factor of each federated learning client is calculated in combination with the credibility evaluation value of each client, as follows:

[formula not captured in this text]

where the quantities involved are: the local model update, i.e. the model parameter update vector, of the $i$-th federated learning client in the $r$-th round of iteration; the model parameter update of the third-party audit server in the $r$-th round of iteration; and the normalization factor of the $i$-th federated learning client in the $r$-th round of iteration. The computed values are sent to the aggregation server.

8. The Byzantine robust federated learning method for public auditing according to claim 7, characterized in that, in Step 4, the global model parameter vector of the federated learning in the current iteration is calculated by weighted average, as follows:

[formula not captured in this text]

where the quantities involved are: $n$, the number of federated learning training clients; the credibility evaluation value of the $i$-th federated learning client in the $r$-th round of iteration; the normalization factor of the $i$-th federated learning client in the $r$-th round of iteration; the local model update, i.e. the model parameter update vector, of the $i$-th federated learning client in the $r$-th round of iteration; and the global model parameter vector of the federated learning in the $r$-th iteration, which is used as the initialization parameter vector of the federated learning model in the next iteration, i.e. the $(r+1)$-th iteration.
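The exact formulas for the normalization factor and the weighted average are not reproduced in this text, so the following is only a sketch under stated assumptions. It assumes a magnitude-rescaling normalization factor (the audit server's update norm divided by the client's update norm, as used in FLTrust-style aggregation) and a weighted average in which each rescaled client update is weighted by its credibility evaluation value and divided by the sum of those values. Both forms, and all function names, are assumptions and may differ from the patented formulas.

```python
import math
from typing import List, Sequence


def l2_norm(v: Sequence[float]) -> float:
    return math.sqrt(sum(x * x for x in v))


def normalization_factor(client_update: Sequence[float],
                         audit_update: Sequence[float]) -> float:
    """ASSUMED form: rescale the client update to the audit update's magnitude."""
    client_norm = l2_norm(client_update)
    if client_norm == 0.0:
        return 0.0  # ASSUMPTION: a zero update contributes nothing
    return l2_norm(audit_update) / client_norm


def aggregate(updates: List[Sequence[float]],
              scores: Sequence[float],
              factors: Sequence[float]) -> List[float]:
    """ASSUMED weighted average: sum(score * factor * update) / sum(score)."""
    dim = len(updates[0])
    total = sum(scores)
    if total == 0.0:
        # ASSUMPTION: if no client is trusted, return a zero update.
        return [0.0] * dim
    result = [0.0] * dim
    for update, score, factor in zip(updates, scores, factors):
        for k in range(dim):
            result[k] += score * factor * update[k]
    return [x / total for x in result]
```

Under these assumptions, a client with a credibility evaluation value of zero has no influence on the aggregate, and a large-magnitude update cannot dominate because it is first rescaled to the magnitude of the audit server's update.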