A random response entropy-based federated learning model poisoning detection method

By constructing a random response entropy index, adaptively optimizing aggregation weights, and applying cluster analysis in federated learning, the method improves the accuracy of malicious client identification in non-IID (non-independent and identically distributed) scenarios and achieves more stable and secure model updates.

CN122241768A · Pending · Publication Date: 2026-06-19 · ZHEJIANG UNIV

Patent Information

Authority / Receiving Office
CN · China
Patent Type
Applications(China)
Current Assignee / Owner
ZHEJIANG UNIV
Filing Date
2026-04-09
Publication Date
2026-06-19

AI Technical Summary

Technical Problem

Existing federated learning methods struggle to accurately identify malicious clients in non-IID (non-independent and identically distributed) scenarios, which reduces global model accuracy under poisoning attacks. Furthermore, existing defense methods rely on auxiliary verification data or statistical features, which limits their scope of application.

Method used

A random response entropy index is constructed to analyze the predictive behavior of client models on random data. Combined with adaptive aggregation weight optimization and cluster analysis, malicious clients are identified and the remaining client models are robustly aggregated, forming an integrated process of malicious client identification and model update.

Benefits of technology

It improves the accuracy of malicious client identification and the robustness of global model aggregation in non-independent and identically distributed scenarios, is applicable to various federated learning scenarios, and enhances the stability and security of the model.



Abstract

A method for detecting poisoning in a federated learning model based on random response entropy includes: acquiring local model update results uploaded by each client during federated learning training; acquiring prediction results of each client model based on a random dataset and calculating the corresponding random response entropy; initializing client aggregation weights based on the random response entropy and adaptively optimizing the aggregation weights by combining class distribution consistency and sample prediction unbiasedness; performing cluster analysis on the optimized client aggregation weights to identify potential malicious clients, and completing global model aggregation update after removing malicious clients. This invention constructs random response entropy as the basis for malicious client identification and combines adaptive aggregation weight optimization and clustering discrimination mechanisms to achieve malicious client identification and robust aggregation.

Description

Technical Field

[0001] This invention relates to the field of distributed machine learning, and specifically to a federated learning model poisoning detection method based on random response entropy.

Background Technology

[0002] Federated learning, a distributed machine learning technique, collaboratively trains a global model through multiple clients without directly sharing raw data, thus balancing model training effectiveness with data privacy protection. Because it can leverage the value of data from multiple parties while ensuring data security, it has been widely applied in scenarios with high data security and privacy requirements, such as smart terminals, healthcare, and financial risk control. However, due to its distributed participation and the inability of the server to directly supervise the local training process of clients, federated learning is vulnerable to model poisoning attacks. Malicious clients can tamper with model update results during local training or parameter upload, submitting misleading or destructive model parameters to the server, thereby interfering with the optimization direction of the global model, reducing model accuracy, or even causing the global model to fail.

[0003] To address the aforementioned issues, defense techniques against poisoning attacks on federated learning models have been developed. These techniques aim to reduce the impact of malicious updates on the global model training process by identifying malicious clients, filtering abnormal model updates, or designing robust aggregation strategies, thereby improving the security and stability of federated learning systems. Existing methods typically rely on a variety of techniques, including statistical analysis, distance metrics, anomaly detection, and auxiliary verification, to screen and weight model updates uploaded by clients. These methods can mitigate the security threats posed by malicious clients to some extent and maintain the normal training functionality and model performance of the federated learning system as much as possible.

[0004] While these techniques help federated learning systems defend against model poisoning attacks, existing methods usually rely on auxiliary verification data, or on statistical features and distance relationships between client updates, to identify anomalous models. However, in scenarios where client data are non-IID (non-independent and identically distributed), benign clients may also produce significantly deviating model updates because of large differences in local data distribution. Such updates are easily confused with malicious updates, which degrades detection accuracy and aggregation results.

Summary of the Invention

[0005] This invention aims to overcome the insufficient accuracy of existing federated learning techniques in identifying abnormal models in non-IID scenarios, and proposes a federated learning model poisoning detection method based on random response entropy.

[0006] A federated learning model poisoning detection method based on random response entropy, comprising the following steps:

[0007] Step 1: Perform federated learning training to provide local client models for subsequent malicious client identification and robust aggregation: The server initializes the global model and distributes the global model parameters to each client. After each client completes its training based on its local data, it uploads the updated local model parameters to the server.

[0008] Step 2: Obtain the random response entropy: Construct a random dataset on the server side that is unrelated to the original training task, analyze the prediction behavior of the models uploaded by each client, and obtain the random response entropy of each client by calculating the prediction category distribution entropy value of the model on the random data.

[0009] Step 3: Adaptive optimization of client aggregation weights: First, initialize the client aggregation weights based on the random response entropy. Then, combine the prediction behavior of the global model on random data to further adaptively optimize the client aggregation weights, thereby improving the robustness and accuracy of global model aggregation.

[0010] Step 4: Identify malicious clients and update the global model: By performing cluster analysis on the optimized client aggregation weights, potential malicious clients are identified. After removing malicious clients, the remaining client models are weighted and aggregated to obtain a new global model.

[0011] Unlike existing technologies that mainly rely on auxiliary verification data, parameter statistical features, or gradient distance relationships to identify abnormal clients, this invention introduces the client model's predictive response behavior on random data into the federated learning model's poisoning defense process. By analyzing the diversity of the predicted category distribution, a random response entropy index is constructed, which serves as an important basis for differentiation.

[0012] An adaptive robust aggregation method based on random response entropy: This invention not only uses random response entropy to identify anomalies in clients, but also uses random response entropy as the initialization basis for client aggregation weights. Combining the two constraint objectives of class distribution consistency and sample prediction unbiasedness, the client aggregation weights are adaptively optimized, thereby realizing the integrated design of client contribution evaluation and global model robust aggregation.

[0013] Wide applicability: The technical solution of this invention is applicable to a variety of federated learning scenarios. It can effectively identify malicious clients for different numbers of clients, different degrees of data heterogeneity, different model structures, and different model poisoning attack settings.

[0014] To address model poisoning attacks in federated learning scenarios, existing defense methods typically rely on auxiliary validation data or anomaly identification based on statistical features and distance relationships between client updates. While these methods can mitigate the interference of malicious clients on global model training to some extent, they are prone to confusing benign clients with malicious ones in non-independent and identically distributed scenarios, and their dependence on task-related auxiliary validation data limits their application scope. The method proposed in this invention analyzes the predictive response characteristics of client models on random data, constructs a random response entropy as a basis for malicious client identification, and combines adaptive aggregation weight optimization and clustering discrimination mechanisms to achieve malicious client identification and robust aggregation. This method can be applied to defend against model poisoning attacks in federated learning systems.

[0015] Compared with existing technical solutions, the technical solution of the present invention has the following obvious advantages:

[0016] The invention introduces a random response entropy discrimination mechanism. It incorporates the client model's prediction behavior on random data into the poisoning defense process of federated learning models. By constructing a random response entropy index, it quantifies the diversity of the client model's prediction category distribution. Unlike existing methods that primarily rely on auxiliary validation data, parameter statistical features, or gradient distance relationships to identify abnormal updates, this invention establishes a malicious client identification criterion at the level of model response behavior, transforming malicious client detection into a discrimination process based on the prediction distribution characteristics under random input.

[0017] This invention not only utilizes random response entropy to identify anomalies in clients, but also uses random response entropy as the initialization basis for client aggregation weights, and adaptively optimizes the aggregation weights by combining class distribution consistency and sample prediction unbiasedness. Unlike existing technologies that directly assign weights based on the number of local samples or rely solely on a single anomaly score to filter clients, this invention combines client behavior evaluation with aggregation weight learning, forming a dynamic aggregation method driven by predicted response features.

[0018] After optimizing the client aggregation weights, this invention further performs cluster analysis on the optimized client aggregation weights and determines whether malicious clients exist by combining the relationship between inter-cluster distance and intra-cluster distance. Based on this, it decides whether to retain all clients for aggregation or to remove abnormal clients before weighted aggregation. Unlike existing methods that directly remove abnormal clients based on fixed thresholds or simple sorting methods, this invention combines the clustering discrimination mechanism with the final aggregation mechanism, forming an integrated processing flow of "weight optimization, cluster discrimination, selective aggregation".

Attached Figure Description

[0019] Figure 1 is a flowchart of the method of the present invention.

[0020] Figure 2 is a schematic diagram of the adaptive optimization of the aggregation weights in this invention.

Detailed Implementation

[0021] The technical solution of the present invention will be further described below with reference to the accompanying drawings.

[0022] This embodiment relates to a federated learning model poisoning detection method based on random response entropy. As shown in Figure 1, the method has four main steps: 1. Obtain the local model update results uploaded by each client during the federated learning training process. 2. Obtain the prediction results of each client model on a random dataset and calculate the corresponding random response entropy. 3. Initialize the client aggregation weights based on the random response entropy, and adaptively optimize the aggregation weights by combining class distribution consistency and sample prediction unbiasedness. 4. Perform cluster analysis on the optimized client aggregation weights to identify potential malicious clients, and complete the global model aggregation update after removing malicious clients.

[0023] To achieve the above objectives, this invention proposes a federated learning model poisoning detection method based on random response entropy, the specific steps of which are as follows:

[0024] Step 1: Federated Learning Training and Local Model Acquisition on the Client

[0025] To obtain the local model update results of each client in the current communication round in the federated learning system, the basic training process of federated learning needs to be completed first. The server initializes the global model and distributes the global model parameters to each client. After each client completes its training based on its local data, it uploads the updated local model parameters to the server. This step is used to provide the client model input to be analyzed for subsequent malicious client identification and robust aggregation.

[0026] The steps for federated learning training and obtaining the local model on the client side are as follows:

[0027] Step 1-1: In the initial state, the server initializes the global model w_0 and broadcasts the global model parameters for the current round to all clients participating in training;

[0028] Step 1-2: Each client receives the global model parameters sent by the server and performs local model training on its own local training data to obtain the client model updated in this round;

[0029] Step 1-3: Each client uploads the updated local model parameters to the server;

[0030] Step 1-4: The server receives all model update results uploaded by clients, which serve as input for the subsequent random response analysis and robust aggregation.
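As a non-limiting illustration, one communication round of Steps 1-1 to 1-4 might be sketched in Python as follows. The softmax-regression model, the helper names local_train and run_round, the learning rate, and the synthetic client data are assumptions introduced only for this sketch; the method itself does not prescribe a model structure or a local training procedure.

```python
import numpy as np

def local_train(global_w, X, y, lr=0.1, epochs=5):
    """Hypothetical local update: a few gradient steps of softmax regression."""
    w = global_w.copy()                      # never modify the broadcast model in place
    onehot = np.eye(w.shape[1])[y]
    for _ in range(epochs):
        logits = X @ w
        logits -= logits.max(axis=1, keepdims=True)   # numerical stability
        probs = np.exp(logits)
        probs /= probs.sum(axis=1, keepdims=True)
        w -= lr * (X.T @ (probs - onehot)) / len(X)   # cross-entropy gradient step
    return w

def run_round(global_w, client_data):
    """Steps 1-1 to 1-4: broadcast, train locally, upload, collect on the server."""
    return [local_train(global_w, X, y) for X, y in client_data]

# Synthetic stand-in for four clients (illustrative data only).
rng = np.random.default_rng(0)
n_features, n_classes = 8, 3
w0 = np.zeros((n_features, n_classes))       # Step 1-1: initial global model
client_data = [
    (rng.normal(size=(20, n_features)), rng.integers(0, n_classes, size=20))
    for _ in range(4)
]
client_models = run_round(w0, client_data)   # input to Steps 2 to 4
```

The list client_models plays the role of the uploaded model update results that the server analyzes in the following steps.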

[0031] Step 2: Obtaining the random response entropy

[0032] This invention constructs a random dataset on the server side, unrelated to the original training task, analyzes the prediction behavior of models uploaded by each client, and obtains the random response entropy for each client by calculating the prediction category distribution entropy value of the model on random data. Random response entropy is used to characterize the prediction diversity of client models under random input, thus providing a basis for distinguishing between benign and malicious clients.

[0033] The steps to obtain the random response entropy are as follows:

[0034] Step 2-1: In the initial state, the server constructs a random dataset R = {r_1, r_2, ..., r_k}. The random dataset does not depend on the original training data of the federated learning task, nor does it require true class labels;

[0035] Step 2-2: Input the random dataset R into each client's uploaded model in sequence to obtain the prediction results of each client's model on the random dataset;

[0036] Step 2-3: For the nth client model, count the frequency of each category label in its predicted category output sequence on the random dataset to obtain the corresponding category distribution;

[0037] Step 2-4: Calculate the random response entropy E_n of the nth client model based on the frequency of occurrence of each category label;

[0038] Step 2-5: Iterate through all client models and output the final set of random response entropies;
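Steps 2-1 to 2-5 can be illustrated with the following Python sketch. It assumes that the "prediction category distribution entropy" is the Shannon entropy of the predicted-label frequencies, which the text does not state explicitly. The linear models, the Gaussian random inputs, and the names predict_labels and random_response_entropy are likewise assumptions made for illustration.

```python
import numpy as np

def predict_labels(w, R):
    """Step 2-2: predicted class of each random sample under a linear model."""
    return np.argmax(R @ w, axis=1)

def random_response_entropy(predicted_labels, n_classes):
    """Steps 2-3 and 2-4: label frequencies, then Shannon entropy (assumed form)."""
    counts = np.bincount(predicted_labels, minlength=n_classes)
    freqs = counts / counts.sum()
    nonzero = freqs[freqs > 0]               # treat 0 * log(1/0) as 0
    return float(np.sum(nonzero * np.log(1.0 / nonzero)))

rng = np.random.default_rng(0)
k, n_features, n_classes = 500, 8, 3
R = rng.normal(size=(k, n_features))         # Step 2-1: random inputs, no labels

# Two illustrative models: one with varied responses, one degenerate model
# whose logits always tie, so argmax returns class 0 for every sample.
varied_model = rng.normal(size=(n_features, n_classes))
degenerate_model = np.zeros((n_features, n_classes))

entropy_benign = random_response_entropy(predict_labels(varied_model, R), n_classes)
entropy_collapsed = random_response_entropy(predict_labels(degenerate_model, R), n_classes)
```

Under this assumed definition the entropy lies between 0 (all random samples mapped to one category) and log C for C categories (predictions spread evenly), so a model whose predictions collapse onto a single category receives the lowest possible value.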

[0039] Step 3: Adaptive optimization of client aggregation weights

[0040] After obtaining the random response entropy corresponding to each client model, this invention first initializes the client aggregation weights based on the random response entropy, and then further adaptively optimizes the client aggregation weights by combining the prediction behavior of the global model on random data, thereby improving the robustness and accuracy of global model aggregation.

[0041] As shown in Figure 2, the adaptive optimization steps for the client aggregation weights are as follows:

[0042] Step 3-1: Initial state, input the client model set and the corresponding random response entropy set;

[0043] Step 3-2: Calculate the initial aggregate weight set for each client model based on its random response entropy. The higher the random response entropy of a client model, the larger its initial aggregate weight.

[0044] Step 3-3: Perform weighted aggregation on the client models based on the current client aggregation weights to obtain a temporary global model;

[0045] Step 3-4: Input the random dataset into the temporary global model and calculate the class distribution consistency index and the sample prediction unbiasedness index of the global model on the random data;

[0046] Step 3-5: Construct a joint optimization objective based on the class distribution consistency index and the sample prediction unbiasedness index, and iteratively update the client aggregation weights;

[0047] Step 3-6: Repeat steps 3-3 to 3-5 until the preset number of iterations is reached or the aggregate weights converge, and output the final optimized client aggregate weight set.
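The loop of Steps 3-1 to 3-6 might be sketched as follows. The text does not give formulas for the two indices, so this sketch makes loudly hypothetical choices: initial weights proportional to entropy, class distribution consistency measured as the KL divergence of the mean predicted distribution from uniform, sample prediction unbiasedness measured as the mean per-sample KL divergence from uniform, and finite-difference gradient descent on softmax-parameterized weights. All function names are illustrative.

```python
import numpy as np

def softmax(z, axis=-1):
    z = z - np.max(z, axis=axis, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=axis, keepdims=True)

def label_entropy(w, R):
    """Random response entropy of a linear model (Shannon entropy, assumed)."""
    labels = np.argmax(R @ w, axis=1)
    p = np.bincount(labels, minlength=w.shape[1]) / len(labels)
    p = p[p > 0]
    return float(np.sum(p * np.log(1.0 / p)))

def init_weights(entropies):
    """Step 3-2 (assumed form): higher entropy gives a larger initial weight."""
    e = np.asarray(entropies, dtype=float) + 1e-12
    return e / e.sum()

def objective(alpha, models, R):
    """Hypothetical joint objective of Steps 3-3 to 3-5; lower is better."""
    w_tmp = np.tensordot(alpha, models, axes=1)          # temporary global model
    probs = softmax(R @ w_tmp)
    c = probs.shape[1]
    mean_p = probs.mean(axis=0)
    consistency = np.sum(mean_p * np.log(mean_p * c + 1e-12))
    unbiasedness = np.mean(np.sum(probs * np.log(probs * c + 1e-12), axis=1))
    return float(consistency + unbiasedness)

def optimize_weights(initial, models, R, steps=50, lr=0.5, eps=1e-4):
    """Step 3-6: iterate for a preset number of steps, keeping the best iterate."""
    theta = np.log(initial)                              # alpha = softmax(theta)
    best_alpha = softmax(theta)
    best_val = objective(best_alpha, models, R)
    for _ in range(steps):
        grad = np.zeros_like(theta)
        for i in range(len(theta)):                      # central finite differences
            d = np.zeros_like(theta)
            d[i] = eps
            grad[i] = (objective(softmax(theta + d), models, R)
                       - objective(softmax(theta - d), models, R)) / (2 * eps)
        theta = theta - lr * grad
        alpha = softmax(theta)
        val = objective(alpha, models, R)
        if val < best_val:                               # safeguard against overshooting
            best_alpha, best_val = alpha, val
    return best_alpha

rng = np.random.default_rng(0)
k, n_features, n_classes = 200, 8, 3
R = rng.normal(size=(k, n_features))                     # random dataset
client_models = [rng.normal(size=(n_features, n_classes)) for _ in range(4)]
stacked = np.stack(client_models)
initial = init_weights([label_entropy(w, R) for w in client_models])
optimized = optimize_weights(initial, stacked, R)
```

Parameterizing the weights through a softmax keeps them positive and normalized at every iteration, so each temporary global model remains a convex combination of the client models. A practical implementation would likely use automatic differentiation rather than finite differences.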

[0048] Step 4: Malicious Client Identification and Global Model Update

[0049] This invention identifies potential malicious clients by performing cluster analysis on the optimized client aggregation weights, and then performs weighted aggregation on the remaining client models after removing malicious clients to obtain a new global model. This step can prevent malicious client models from interfering with the optimization direction of the global model.

[0050] The steps for malicious client identification and global model update are as follows:

[0051] Step 4-1: Initial state, input the optimized client aggregate weight set and client model set;

[0052] Step 4-2: Perform cluster analysis on the client aggregate weight set to divide all clients into candidate benign client clusters and candidate abnormal client clusters. The cluster with the smaller overall aggregate weight is denoted as the candidate abnormal client cluster.

[0053] Step 4-3: Compare the minimum distance between clusters with the maximum distance within clusters in the clustering results. When the separation between clusters is not obvious, it is determined that there are no significantly malicious clients in the current round, and all clients are retained to participate in the aggregation. When the separation between clusters is obvious, the clients corresponding to the candidate abnormal client clusters are identified as potential malicious clients.

[0054] Step 4-4: After identifying potentially malicious clients, remove them from the aggregate set, and only perform normalized weighted aggregation on the remaining benign client models according to the optimized aggregation weights to obtain a new global model;

[0055] Step 4-5: Distribute the new global model to each client to begin the next round of federated learning training;

[0056] Step 4-6: Repeat Steps 1 to 4 until the preset number of communication rounds is reached or the model training ends, and output the final global model.
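Steps 4-1 to 4-4 can be illustrated by the sketch below. The text does not name a clustering algorithm or a numeric separation criterion, so two assumptions are made here: the one-dimensional weights are split into two clusters by the cut that minimizes the within-cluster sum of squares, and separation counts as "obvious" when the minimum inter-cluster distance exceeds margin times the maximum intra-cluster distance. The names two_cluster_split, select_and_aggregate, and margin are hypothetical.

```python
import numpy as np

def two_cluster_split(weights):
    """Step 4-2 (assumed): best 1-D two-way split by within-cluster sum of squares."""
    order = np.argsort(weights)
    w = np.asarray(weights, dtype=float)[order]
    best_cut, best_cost = 1, np.inf
    for cut in range(1, len(w)):
        low, high = w[:cut], w[cut:]
        cost = ((low - low.mean()) ** 2).sum() + ((high - high.mean()) ** 2).sum()
        if cost < best_cost:
            best_cut, best_cost = cut, cost
    # The low-weight cluster is the candidate abnormal client cluster.
    return order[:best_cut], order[best_cut:]

def select_and_aggregate(weights, models, margin=1.0):
    """Steps 4-3 and 4-4: decide whether to remove the low cluster, then aggregate."""
    weights = np.asarray(weights, dtype=float)
    keep, removed = np.arange(len(weights)), []
    if len(weights) >= 2:
        low_idx, high_idx = two_cluster_split(weights)
        low, high = weights[low_idx], weights[high_idx]
        inter = high.min() - low.max()                   # minimum inter-cluster distance
        intra = max(low.max() - low.min(), high.max() - high.min())
        if inter > margin * intra:                       # assumed separation rule
            keep = np.sort(high_idx)
            removed = sorted(int(i) for i in low_idx)
    alpha = weights[keep] / weights[keep].sum()          # renormalize kept weights
    kept_models = np.stack([models[i] for i in keep])
    return np.tensordot(alpha, kept_models, axes=1), removed

# Illustrative round: client 4 has a much smaller optimized weight.
models = [np.full((2, 2), v) for v in (1.0, 1.0, 1.0, 1.0, 100.0)]
new_global, removed = select_and_aggregate([0.24, 0.26, 0.25, 0.23, 0.02], models)

# Illustrative round without clear separation: all clients are retained.
global_all, removed_none = select_and_aggregate([0.18, 0.20, 0.22, 0.19, 0.21], models)
```

In the first call client 4 is removed and the aggregate of the remaining models stays at 1.0; in the second call the clusters overlap in scale, so no client is removed. The margin parameter is one possible way to make the notion of "obvious" separation adjustable.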

[0057] The embodiments described in this specification are merely examples of implementations of the inventive concept. The scope of protection of this invention should not be considered as limited to the specific forms stated in the embodiments. The scope of protection of this invention also extends to equivalent technical means that can be conceived by those skilled in the art based on the inventive concept.

Claims

1. A federated learning model poisoning detection method based on random response entropy, comprising the following steps: Step 1: Perform federated learning training to provide local client models for subsequent malicious client identification and robust aggregation: The server initializes the global model and distributes the global model parameters to each client. After each client completes its training based on its local data, it uploads the updated local model parameters to the server. Step 2: Obtain the random response entropy: Construct a random dataset on the server side that is unrelated to the original training task, analyze the prediction behavior of the models uploaded by each client, and obtain the random response entropy of each client by calculating the prediction category distribution entropy value of the model on the random data. Step 3: Adaptive optimization of client aggregation weights: First, initialize the client aggregation weights based on the random response entropy. Then, combine the prediction behavior of the global model on random data to further adaptively optimize the client aggregation weights, thereby improving the robustness and accuracy of global model aggregation. Step 4: Identify malicious clients and update the global model: By performing cluster analysis on the optimized client aggregation weights, potential malicious clients are identified. After removing malicious clients, the remaining client models are weighted and aggregated to obtain a new global model.

2. The method as described in claim 1, characterized in that Step 1 specifically includes: Step 1-1: In the initial state, the server initializes the global model w_0 and broadcasts the global model parameters for the current round to all clients participating in training; Step 1-2: Each client receives the global model parameters sent by the server and performs local model training on its own local training data to obtain the client model updated in this round; Step 1-3: Each client uploads the updated local model parameters to the server; Step 1-4: The server receives all model update results uploaded by clients, which serve as input for the subsequent random response analysis and robust aggregation.

3. The method as described in claim 1, characterized in that Step 2 specifically includes: Step 2-1: In the initial state, the server constructs a random dataset R = {r_1, r_2, ..., r_k}. The random dataset does not depend on the original training data of the federated learning task, nor does it require true class labels; Step 2-2: Input the random dataset R into each client's uploaded model in sequence to obtain the prediction results of each client's model on the random dataset; Step 2-3: For the nth client model, count the frequency of each category label in its predicted category output sequence on the random dataset to obtain the corresponding category distribution; Step 2-4: Calculate the random response entropy E_n of the nth client model based on the frequency of occurrence of each category label; Step 2-5: Iterate through all client models and output the final set of random response entropies.

4. The method as described in claim 1, characterized in that Step 3 specifically includes: Step 3-1: In the initial state, input the client model set and the corresponding random response entropy set; Step 3-2: Calculate the initial aggregation weight of each client model based on its random response entropy. The higher the random response entropy of a client model, the larger its initial aggregation weight. Step 3-3: Perform weighted aggregation on the client models based on the current client aggregation weights to obtain a temporary global model; Step 3-4: Input the random dataset into the temporary global model and calculate the class distribution consistency index and the sample prediction unbiasedness index of the global model on the random data; Step 3-5: Construct a joint optimization objective based on the class distribution consistency index and the sample prediction unbiasedness index, and iteratively update the client aggregation weights; Step 3-6: Repeat Steps 3-3 to 3-5 until the preset number of iterations is reached or the aggregation weights converge, and output the final optimized client aggregation weight set.

5. The method as described in claim 1, characterized in that Step 4 specifically includes: Step 4-1: In the initial state, input the optimized client aggregation weight set and the client model set; Step 4-2: Perform cluster analysis on the client aggregation weight set to divide all clients into a candidate benign client cluster and a candidate abnormal client cluster. The cluster with the smaller overall aggregation weight is denoted as the candidate abnormal client cluster. Step 4-3: Compare the minimum distance between clusters with the maximum distance within clusters in the clustering results. When the separation between clusters is not obvious, it is determined that there are no significantly malicious clients in the current round, and all clients are retained to participate in the aggregation. When the separation between clusters is obvious, the clients in the candidate abnormal client cluster are identified as potential malicious clients. Step 4-4: After identifying potentially malicious clients, remove them from the aggregation set, and perform normalized weighted aggregation only on the remaining benign client models according to the optimized aggregation weights to obtain a new global model; Step 4-5: Distribute the new global model to each client to begin the next round of federated learning training; Step 4-6: Repeat Steps 1 to 4 until the preset number of communication rounds is reached or the model training ends, and output the final global model.