A backdoor defense method and system based on clustering and weighted aggregation

By using density clustering and weighted aggregation, the input samples are divided into single-class and multi-class sample sets. The model is trained and a reliable sample set is selected to build the final model, which solves the problem of defense failure against dense point cluster backdoor attacks and improves detection accuracy and defense efficiency.

CN122241691A · Pending · Publication Date: 2026-06-19 · INFORMATION & COMMNUNICATION BRANCH STATE GRID JIANGXI ELECTRIC POWER CO


Patent Information

Authority / Receiving Office: CN · China
Patent Type: Applications(China)
Current Assignee / Owner: INFORMATION & COMMNUNICATION BRANCH STATE GRID JIANGXI ELECTRIC POWER CO
Filing Date: 2026-03-13
Publication Date: 2026-06-19


Abstract

This invention provides a backdoor defense method and system based on clustering and weighted aggregation, relating to the field of artificial intelligence security. The method includes: performing cluster analysis on input samples using a density clustering algorithm, dividing the input samples into single-class sample sets and multi-class sample sets based on the clustering results; training a first classification model based on the single-class sample sets; predicting the multi-class sample sets based on the first classification model, selecting samples whose feature vectors are similar to their sample labels, and constructing a reliable sample set; training a second classification model based on the reliable sample set; and performing weighted aggregation on the first and second classification models to obtain the final model. This invention achieves accurate identification and filtering of abnormal samples even when backdoor trigger points are densely distributed through clustering and weighted aggregation.


Description

Technical Field

[0001] This invention relates to the field of artificial intelligence security, and in particular to a backdoor defense method and system based on clustering and weighted aggregation.

Background Technology

[0002] With the rapid development of 3D perception technology, point cloud-based deep learning models have been widely applied in fields such as autonomous driving target detection and industrial parts sorting. However, point cloud data, which is characterized by being unstructured, unordered, rotation-invariant, and sparse in high dimensions, faces unique backdoor attack threats. Attackers can implant backdoors into the model during training by embedding specific geometric patterns (such as local point clusters) as triggers, thereby threatening system security.

[0003] Existing defense techniques typically employ the K-nearest neighbor algorithm to detect and filter outliers in the input samples, followed by classification of the processed samples. While this method is effective against backdoor attacks based on sporadic point insertions, it has significant limitations: such defense mechanisms are prone to failure when attackers design triggers as dense, small-scale point clusters. In real-world scenarios, dense point cluster triggers are more stealthy and threatening because they can be disguised as normally collected point cloud data from an object's surface, making them easier to activate and harder to detect. Therefore, there is an urgent need to provide a solution to address these issues.

Summary of the Invention

[0004] The purpose of this invention is to provide a backdoor defense method and system based on clustering and weighted aggregation that mitigates the failure of existing defenses against dense point-cluster triggers.

[0005] In a first aspect, the present invention provides a backdoor defense method based on clustering and weighted aggregation, comprising:

[0006] The input samples are clustered based on the density clustering algorithm, and the input samples are divided into single-class sample sets and multi-class sample sets according to the clustering results.

[0007] A first classification model is trained based on the single-class sample set; the multi-class sample set is predicted using the first classification model, and samples whose feature vectors are similar to their sample labels are selected to construct a reliable sample set; a second classification model is trained based on the reliable sample set.

[0008] The first classification model and the second classification model are weighted and aggregated to obtain the final model.

[0009] This invention provides a backdoor defense method based on clustering and weighted aggregation. It performs cluster analysis on samples using a density-based clustering algorithm, dividing the samples into single-class and multi-class sets based on the clustering results. A first classification model is trained using the single-class set, and the multi-class samples are then classified and predicted using this first model to construct a reliable sample set. A second classification model is trained based on this reliable sample set. Finally, the first and second classification models are weighted and aggregated to obtain the final model, enabling accurate identification and filtering of abnormal samples even when backdoor trigger points are densely distributed.

[0010] Optionally, the input data may be preprocessed by normalization before cluster analysis.

[0011] Optionally, when performing cluster analysis on input samples based on a density clustering algorithm, the density clustering algorithm includes the density-based spatial clustering of applications with noise (DBSCAN) algorithm, which defines core points by setting a minimum number of points in the neighborhood.

[0012] Optionally, dividing the input samples into single-class sample sets and multi-class sample sets based on the clustering results includes: classifying sample points that belong to exactly one core cluster into single-class sample sets; and classifying sample points that are identified as noise points by the density clustering algorithm, or that are boundary points belonging to two or more clusters, into multi-class sample sets.

[0013] Optionally, when predicting the multi-class sample set based on the first classification model and selecting samples whose feature vectors are similar to the sample labels, the method includes: for samples in the multi-class sample set, obtaining the feature vectors output by the first classification model; determining whether the true label of the sample is one of the category indices corresponding to the two elements with the largest values in the feature vector; if so, determining that the feature vector of the sample is similar to the sample label and recording it as a reliable sample.

[0014] Optionally, the weighted aggregation adopts the federated averaging (FedAvg) algorithm; the aggregation weights are determined dynamically from the ratio of the number of samples in the single-class sample set to the number of samples in the reliable sample set.

[0015] Optionally, after obtaining the final model, the model iterative optimization step is also included:

[0016] Step 1: Use the aforementioned reliable sample set as the initial high-confidence sample set;

[0017] Step 2: Based on the final model, predict the remaining samples that have not yet been selected into the single-class sample set or the high-confidence sample set, select the samples whose feature vectors are similar to their sample labels, and add them to the high-confidence sample set;

[0018] Step 3: Continue training the second classification model based on the updated high-confidence sample set; then perform a weighted aggregation of the first classification model and the retrained second classification model to obtain a new global model;

[0019] Step 4: Repeat steps 2 to 3, using the new global model generated in each aggregation for the next round of screening, until the number of samples newly added in consecutive iterations falls below a preset threshold or the number of iterations reaches a preset upper limit; finally, the global model generated in the last round is used as the final model.

[0020] In a second aspect, the present invention provides a backdoor defense system based on clustering and weighted aggregation, comprising:

[0021] The sample classification module is used to perform cluster analysis on the input samples based on the density clustering algorithm, and divide the input samples into single-class sample sets and multi-class sample sets according to the clustering results;

[0022] The model training module is used to train a first classification model based on the single-class sample set; predict the multi-class sample set based on the first classification model, select samples whose feature vectors are similar to the sample labels, and construct a reliable sample set; and train a second classification model based on the reliable sample set.

[0023] The model aggregation module is used to perform weighted aggregation on the first classification model and the second classification model to obtain the final model.

[0024] In a third aspect, the present invention provides a computer device, characterized in that it includes: a memory, a processor, and a bus system;

[0025] The memory is used to store programs;

[0026] The processor is used to execute the program in the memory, and the processor is used to execute the backdoor defense method based on clustering and weighted aggregation according to the instructions in the program code;

[0027] The bus system is used to connect the memory and the processor to enable communication between the memory and the processor.

[0028] In a fourth aspect, the present invention provides a storage medium, characterized in that the storage medium stores one or more programs, which, when executed by a processor, implement a backdoor defense method based on clustering and weighted aggregation as described in any one of claims 1 to 6.

[0029] Compared with the prior art, the present invention has the following advantages:

[0030] 1. Significantly improved detection accuracy for point-set backdoor triggers;

[0031] 2. A preprocessing mechanism is adopted to complete the detection before model training, eliminating the need for additional classifier training or labeled samples;

[0032] 3. Combining a weighted aggregation framework with gradient aggregation effectively reduces computational resource consumption and improves defense efficiency;

[0033] 4. It breaks through the dependence of traditional backdoor detection methods on pre-labeled datasets; it overcomes the limitation of existing backdoor sample detection technologies requiring pre-trained models.

Attached Figure Description

[0034] Figure 1 is a flowchart of a backdoor defense method based on clustering and weighted aggregation, provided in an embodiment of the present invention;

[0035] Figure 2 is a structural diagram of a backdoor defense system based on clustering and weighted aggregation, provided in an embodiment of the present invention.

Detailed Implementation

[0036] To make the objectives, technical solutions, and advantages of the embodiments of the present invention clearer, the technical solutions in the embodiments of the present invention will be clearly and completely described below. Obviously, the described embodiments are only some embodiments of the present invention, not all embodiments. Based on the embodiments of the present invention, all other embodiments obtained by those skilled in the art without creative effort are within the scope of protection of the present invention. Unless otherwise defined, the technical or scientific terms used herein should have the ordinary meaning understood by those skilled in the art. The terms "comprising" and similar expressions used herein mean that the element or object preceding the word covers the element or object listed after the word and its equivalents, but does not exclude other elements or objects.

[0037] Referring to Figure 1, this invention provides a backdoor defense method based on clustering and weighted aggregation, comprising the following steps:

[0038] S1. Perform cluster analysis on the input samples based on the density clustering algorithm, and divide the input samples into single-class sample sets and multi-class sample sets according to the clustering results;

[0039] S2. Train the first classification model based on a single-class sample set; predict the multi-class sample set based on the first classification model, select samples whose feature vectors are similar to the sample labels, and construct a reliable sample set; train the second classification model based on the reliable sample set.

[0040] S3. Perform weighted aggregation on the first classification model and the second classification model to obtain the final model.

[0041] In some embodiments, in step S1, before the input samples are clustered and divided into single-class and multi-class sample sets, they first undergo normalization preprocessing to unify the coordinate scale. Subsequently, the density-based spatial clustering of applications with noise (DBSCAN) algorithm is used to perform density clustering analysis on the normalized samples. The minimum number of neighborhood points parameter can be set to 10. With this setting, the algorithm automatically identifies outliers with fewer than 10 samples in their neighborhood as noise, thus providing the denoising function of the K-nearest neighbors algorithm during the clustering process while keeping the clustering of normally distributed samples intact. Based on the clustering results of the DBSCAN algorithm, sample points belonging to only one core cluster are assigned to the single-class sample set; points identified as noise by the algorithm, or boundary points belonging to multiple clusters, are assigned to the multi-class sample set. The input samples can be point cloud data. The core of the DBSCAN algorithm is the definition of the neighborhood of an arbitrary sample point $x_i$, which is calculated using the following formula:

[0042] $$N_\varepsilon(x_i) = \{\, x_j \in D \mid \operatorname{dist}(x_i, x_j) \le \varepsilon \,\}$$

[0043] where $D$ denotes the input sample set, $\operatorname{dist}(x_i, x_j)$ denotes the Euclidean distance between sample points $x_i$ and $x_j$, and $\varepsilon$ denotes the neighborhood radius threshold. If the number of samples contained in the neighborhood of a sample point is not less than the preset minimum neighborhood point threshold, the point is defined as a core point. A cluster is formed by core points and the points density-connected to them. Points that do not belong to the neighborhood of any core point are identified as noise points. Through this mechanism, the algorithm achieves cluster division and noise filtering based on the density characteristics of the sample distribution.

[0044] Those skilled in the art will understand that the minimum number of neighborhood points parameter can be adaptively adjusted according to the size and distribution characteristics of the dataset, as long as it achieves the technical effect of filtering noise during clustering.
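The density-based split of step S1 can be illustrated with a minimal pure-Python sketch. It rests on several assumptions that are not stated in the text: samples are coordinate tuples, normalization means centering on the centroid and scaling into the unit sphere, and a boundary point "belongs to" a cluster when at least one core point of that cluster lies within its neighborhood. The function names `normalize`, `region_query`, and `split_by_density` are hypothetical.

```python
import math
from collections import deque


def normalize(points):
    """Center points on their centroid and scale them into the unit sphere (assumed scheme)."""
    dim = len(points[0])
    centroid = [sum(p[d] for p in points) / len(points) for d in range(dim)]
    centered = [[p[d] - centroid[d] for d in range(dim)] for p in points]
    scale = max(math.sqrt(sum(c * c for c in p)) for p in centered) or 1.0
    return [[c / scale for c in p] for p in centered]


def region_query(points, i, eps):
    """Indices of all points within Euclidean distance eps of points[i], including i itself."""
    return [j for j, q in enumerate(points) if math.dist(points[i], q) <= eps]


def split_by_density(points, eps, min_pts):
    """Return (single_class, multi_class) index lists following the split described in S1."""
    n = len(points)
    neighbors = [region_query(points, i, eps) for i in range(n)]
    is_core = [len(nb) >= min_pts for nb in neighbors]

    # Group core points into clusters: core points within eps of each other share an id.
    cluster = [None] * n
    next_id = 0
    for i in range(n):
        if not is_core[i] or cluster[i] is not None:
            continue
        cluster[i] = next_id
        queue = deque([i])
        while queue:
            u = queue.popleft()
            for v in neighbors[u]:
                if is_core[v] and cluster[v] is None:
                    cluster[v] = next_id
                    queue.append(v)
        next_id += 1

    single_class, multi_class = [], []
    for i in range(n):
        if is_core[i]:
            single_class.append(i)
            continue
        # Clusters that have at least one core point inside the neighborhood of point i.
        reachable = {cluster[j] for j in neighbors[i] if is_core[j]}
        # One cluster: ordinary boundary point. Zero: noise. Two or more: ambiguous boundary.
        (single_class if len(reachable) == 1 else multi_class).append(i)
    return single_class, multi_class
```

In practice the samples would be passed through `normalize` first and `min_pts` set to a value such as the 10 mentioned above; the quadratic neighbor search is kept only for readability and would likely be replaced by a spatial index on real point clouds.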

[0045] This invention fully utilizes the physical structural characteristics of point cloud data: point clouds consist of discrete, disordered points that are compactly distributed within a local space. Density-based clustering algorithms (such as DBSCAN) can effectively identify such locally high-density, compact core point sets. In contrast, typical backdoor attacks based on insertion points usually inject backdoor trigger points that deviate from the dense distribution area of normal samples, appearing as discrete outliers in space. Therefore, the DBSCAN clustering process can naturally identify and filter out such abnormal backdoor points based on density thresholds, thus directly obtaining a high-confidence, uncontaminated clean sample set during the preprocessing stage.

[0046] In some embodiments, in step S2, when predicting the multi-class sample set based on the first classification model and selecting samples whose feature vectors are similar to the sample labels, the feature vector output by the first classification model is obtained for each sample in the multi-class sample set; it is then determined whether the true label of the sample is one of the category indices corresponding to the two largest elements of the feature vector. If so, the feature vector of the sample is determined to be similar to the sample label, and the sample is recorded as a reliable sample. The specific process is as follows:

[0047] S21. Input a sample from the multi-class sample set into the trained first classification model, and obtain the feature vector of the sample output from the layer before the softmax layer of the first classification model.

[0048] S22. Find the two elements with the largest values in the feature vector obtained in S21 and their corresponding indices;

[0049] S23. Compare the true label of the sample with the two indices found in S22. If the true label equals either of these two indices, then the feature vector of the sample is considered similar to the label, and it is a reliable sample. The formula used is as follows:

[0050] $$z_{\max} = \max_i z_i, \qquad z_{\mathrm{sec}} = \max_{i \ne \arg\max_j z_j} z_i,$$

[0051] $$x \text{ is a reliable sample} \iff z_y \in \{\, z_{\max},\ z_{\mathrm{sec}} \,\},$$

[0052] where $z$ denotes the output vector of the layer preceding the softmax layer of the model; $y$ denotes the true label of the sample; $\arg\max$ returns the index corresponding to the maximum value; $z_i$ denotes the value of the feature vector $z$ in dimension $i$; $z_{\max}$ denotes the maximum value of the feature vector $z$ among all dimensions; $z_{\mathrm{sec}}$ denotes the second largest value of the feature vector $z$ among all dimensions; $z_y$ denotes the value of the feature vector $z$ in dimension $y$; and $x$ denotes a sample in the multi-class sample set.

[0053] S24. Collect the reliable samples obtained from S23 and construct the reliable sample set.
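Steps S21 to S24 amount to a top-2 membership test on the pre-softmax output. The following is a minimal sketch, assuming the first classification model is available as a callable `predict_logits` that maps a sample to its pre-softmax vector and that labels are integer class indices; both function names are hypothetical.

```python
def top2_indices(logits):
    """Indices of the two largest entries of a pre-softmax output vector (S22)."""
    order = sorted(range(len(logits)), key=lambda i: logits[i], reverse=True)
    return order[:2]


def select_reliable(samples, labels, predict_logits):
    """Keep (sample, label) pairs whose true label is among the top-2 classes (S23, S24)."""
    reliable = []
    for x, y in zip(samples, labels):
        # predict_logits stands in for the trained first classification model (S21).
        if y in top2_indices(predict_logits(x)):
            reliable.append((x, y))
    return reliable
```

Accepting the second-ranked class as well as the first makes the filter more tolerant of a first model that was trained only on the single-class set; how ties between equal values should be broken is not specified in the text, and this sketch simply keeps the lower index.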

[0054] In some embodiments, after the first classification model and the second classification model are weighted and aggregated in step S3 to obtain the final model, an iterative optimization step can be further performed to continuously improve model performance and robustness. This iterative process starts from the reliable sample set, the first classification model, and the global model obtained in steps S1 to S3. The reliable sample set is used as the initial high-confidence sample set. Based on the final model, the remaining samples not yet selected into the single-class sample set or the high-confidence sample set are predicted, and samples whose feature vectors are similar to their sample labels are selected and added to the high-confidence sample set. The second classification model is further trained based on the updated high-confidence sample set, and the first classification model and the further trained second classification model are weighted and aggregated to obtain a new global model. These steps are repeated, with the new global model generated in each aggregation used for the next round of selection, until the number of newly added samples in consecutive iterations is lower than a preset threshold or the number of iterations reaches a preset upper limit; finally, the global model generated in the last round is used as the final model. The specific process is as follows:

[0055] S31. Iterative Initialization: Set the reliable sample set as the initial high-confidence sample set. The model obtained after the initial weighted aggregation in S3 is used as the global model for the current round;

[0056] S32. Iterative Sample Selection: Use the current best-performing global model of round $t$ (where $t$ is the iteration round index, initially $t = 0$) to make predictions for the remaining samples that have not yet been included in the single-class sample set or the current high-confidence sample set, select the samples whose feature vectors are similar to their sample labels (consistent with the selection method used in S2), and add them to the high-confidence sample set;

[0057] S33. Auxiliary Model Update: Using the updated high-confidence sample set, continue training the second classification model to obtain an updated second classification model;

[0058] S34. The first classification model and the updated second classification model are treated as two clients in federated learning, and a weighted aggregation is performed using the FedAvg algorithm to generate a new global model. The formula used is as follows:

[0059] $$w = \sum_{k \in S} \frac{n_k}{n}\, w_k,$$

[0060] where $w$ denotes the parameters of the model generated after aggregation; $n_k$ denotes the number of local training data samples owned by the $k$-th client; $n$ denotes the total number of client samples; $w_k$ denotes the model parameters after local training by the $k$-th client; and $S$ denotes the set of clients participating in this round of aggregation;

[0061] S35. Iteration Loop and Termination: Repeat steps S32 to S34. In each iteration, the improved global model can select more high-confidence samples, which in turn trains a stronger auxiliary model and yields a better next-generation global model through weighted aggregation. The iteration terminates when the number of newly added samples in consecutive iterations falls below a preset threshold, or when the total number of iterations reaches a preset upper limit. When the iteration terminates, the global model generated in the last round is output as the final model.
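The control flow of steps S31 to S35 can be sketched as a short loop. This is only an outline under stated assumptions: the models are opaque objects, and the three callables `is_reliable`, `continue_training`, and `aggregate` are hypothetical stand-ins for the top-2 selection, the further training of the second model, and the weighted aggregation. The sketch also assumes that a round adding fewer than `min_new` samples stops the loop without adding them, which the text leaves open.

```python
def iterative_refinement(remaining, reliable, first_model, second_model, global_model,
                         is_reliable, continue_training, aggregate,
                         min_new=1, max_rounds=10):
    """Grow the high-confidence set and refresh the global model (S31 to S35)."""
    high_conf = list(reliable)            # S31: reliable set seeds the high-confidence set
    remaining = list(remaining)
    for _ in range(max_rounds):           # S35: upper limit on the number of iterations
        # S32: screen the remaining samples with the current global model.
        newly_added, still_remaining = [], []
        for sample in remaining:
            target = newly_added if is_reliable(global_model, sample) else still_remaining
            target.append(sample)
        if len(newly_added) < min_new:    # S35: too few new samples, stop iterating
            break
        high_conf.extend(newly_added)
        remaining = still_remaining
        # S33: continue training the second (auxiliary) model on the enlarged set.
        second_model = continue_training(second_model, high_conf)
        # S34: aggregate the fixed first model with the updated second model.
        global_model = aggregate(first_model, second_model)
    return global_model, high_conf
```

Passing the three operations in as callables keeps the loop independent of any particular training framework; the first model is never retrained inside the loop, matching the description in which only the second model is updated.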

[0062] Furthermore, this invention employs a full client participation mechanism during the federated learning optimization process, effectively simplifying the system workflow and improving data utilization efficiency. Specifically, in the application scenario of this invention, the total number of clients participating in federated learning is limited (usually set to k≤20). Based on this scale characteristic, this method eliminates the random selection step for clients in each training iteration, instead requiring all clients to synchronously participate in model uploading and aggregation, i.e., fixing the client participation rate at 1 (100%).
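With every client participating in every round, the FedAvg aggregation reduces to a sample-count-weighted average of the client parameters. A minimal sketch follows, assuming each model is represented as a dictionary mapping a parameter name to a flat list of floats; the representation and the function name `fedavg` are illustrative assumptions rather than details from the text.

```python
def fedavg(client_params, client_sizes):
    """Sample-size-weighted average of client parameters: w = sum_k (n_k / n) * w_k."""
    total = sum(client_sizes)
    if total <= 0:
        raise ValueError("at least one client must hold training samples")
    aggregated = {}
    for name in client_params[0]:
        length = len(client_params[0][name])
        aggregated[name] = [
            sum(n / total * params[name][i]
                for params, n in zip(client_params, client_sizes))
            for i in range(length)
        ]
    return aggregated


# Hypothetical usage: the first model was trained on 300 single-class samples and
# the second model on 100 reliable samples, so their weights are 0.75 and 0.25.
first = {"w": [1.0, 2.0], "b": [0.0]}
second = {"w": [3.0, 4.0], "b": [4.0]}
global_params = fedavg([first, second], [300, 100])
```

Because the weights depend only on the sample counts, enlarging the reliable sample set in later iterations automatically shifts weight toward the second model.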

[0063] Meanwhile, since all computations occur within a single trusted client, there is no risk of cross-domain privacy leakage. This invention makes a key optimization to classic federated learning: removing the security encryption steps designed to address cross-institutional privacy risks, and retaining and utilizing only its efficient computational framework for distributed model aggregation and parameter updates. This design significantly reduces the overall computational complexity and communication latency of the system without affecting the core security objectives of this scenario, greatly improving training efficiency.

[0064] In summary, this invention uses an unsupervised density clustering algorithm as a pre-filter in the client-side local data cleaning scenario of federated learning, achieving preliminary separation of potential backdoor samples and noise from the training data source. Building upon this, it is combined with subsequent model selection and dual-model federated aggregation optimization to construct a progressive defense system from data distribution to model semantics, thereby significantly improving the robustness of federated learning models against hidden backdoor attacks.

[0065] Referring to Figure 2, this invention provides a backdoor defense system based on clustering and weighted aggregation, comprising:

[0066] The sample classification module 100 is used to perform cluster analysis on the input samples based on the density clustering algorithm, and divide the input samples into single-class sample sets and multi-class sample sets according to the clustering results;

[0067] The model training module 200 is used to train a first classification model based on a single-class sample set; predict multi-class sample sets based on the first classification model, select samples whose feature vectors are similar to the sample labels, and construct a reliable sample set; and train a second classification model based on the reliable sample set.

[0068] The model aggregation module 300 is used to perform weighted aggregation of the first classification model and the second classification model to obtain the final model.

[0069] In another aspect, the present invention provides a computer device including a processor, a memory, and a computer program stored in the memory and configured to be executed by the processor, wherein the processor, when executing the computer program, implements the backdoor defense method based on clustering and weighted aggregation as described in any of the above claims.

[0070] In another aspect, the present invention provides a readable storage medium storing a computer program that can be executed by a processor of the device in which the storage medium is located, to implement the backdoor defense method based on clustering and weighted aggregation as described in any of the above claims.

[0071] Those skilled in the art will understand that the logic and/or steps represented in the flowchart or otherwise described herein can, for example, be considered a sequenced list of executable instructions for implementing logical functions, and can be embodied in any computer-readable medium for use by, or in conjunction with, an instruction execution system, apparatus, or device (such as a computer-based system, a system including a processor, or another system that can fetch instructions from an instruction execution system, apparatus, or device and execute them). For the purposes of this specification, a "computer-readable medium" can be any means that can contain, store, communicate, propagate, or transmit programs for use by, or in conjunction with, an instruction execution system, apparatus, or device.

[0072] More specific examples of computer-readable media (a non-exhaustive list) include: electrical connections (electronic devices) having one or more wires, portable computer diskettes (magnetic devices), random access memory (RAM), read-only memory (ROM), erasable programmable read-only memory (EPROM or flash memory), fiber optic devices, and portable compact disc read-only memory (CD-ROM). Furthermore, computer-readable media can even be paper or other suitable media on which the program can be printed, because the program can be obtained electronically, for example, by optically scanning the paper or other medium, followed by editing, interpreting, or otherwise processing as necessary, and then stored in computer memory.

[0073] It should be understood that various parts of the present invention can be implemented in hardware, software, firmware, or a combination thereof. In the above embodiments, multiple steps or methods can be implemented in software or firmware stored in memory and executed by a suitable instruction execution system. For example, if implemented in hardware, as in another embodiment, it can be implemented using any one or a combination of the following techniques known in the art: discrete logic circuits having logic gates for implementing logical functions on data signals, application-specific integrated circuits (ASICs) having suitable combinational logic gates, programmable gate arrays (PGAs), field-programmable gate arrays (FPGAs), etc.

[0074] While embodiments of the present invention have been described in detail above, it will be apparent to those skilled in the art that various modifications and variations can be made to these embodiments. However, it should be understood that such modifications and variations fall within the scope and spirit of the invention as set forth in the claims. Furthermore, the invention described herein may have other embodiments and can be implemented or carried out in various ways.

Claims

1. A backdoor defense method based on clustering and weighted aggregation, characterized in that it comprises: performing cluster analysis on input samples based on a density clustering algorithm, and dividing the input samples into single-class sample sets and multi-class sample sets according to the clustering results; training a first classification model based on the single-class sample set; predicting the multi-class sample set based on the first classification model, and selecting samples whose feature vectors are similar to their sample labels to construct a reliable sample set; training a second classification model based on the reliable sample set; and performing weighted aggregation on the first classification model and the second classification model to obtain the final model.

2. The method as described in claim 1, characterized in that, before cluster analysis is performed on the input data, the input data is preprocessed by normalization.

3. The method as described in claim 1, characterized in that, when cluster analysis is performed on the input samples based on the density clustering algorithm, the density clustering algorithm includes the density-based spatial clustering of applications with noise (DBSCAN) algorithm, which defines core points by setting a minimum number of neighborhood points parameter.

4. The method as described in claim 3, characterized in that dividing the input samples into single-class sample sets and multi-class sample sets based on the clustering results includes: classifying sample points that belong to exactly one core cluster into single-class sample sets; and classifying sample points that are identified as noise points by the density clustering algorithm, or that are boundary points belonging to two or more clusters, into multi-class sample sets.

5. The method as described in claim 1, characterized in that predicting the multi-class sample set based on the first classification model and selecting samples whose feature vectors are similar to the sample labels includes: for samples in the multi-class sample set, obtaining the feature vectors output by the first classification model; determining whether the true label of the sample is one of the category indices corresponding to the two elements with the largest values in the feature vector; and if so, determining that the feature vector of the sample is similar to the sample label and recording the sample as a reliable sample.

6. The method as described in claim 1, characterized in that the weighted aggregation uses the federated averaging (FedAvg) algorithm; the aggregation weights are determined dynamically from the ratio of the number of samples in the single-class sample set to the number of samples in the reliable sample set.

7. The method as described in claim 1, characterized in that, after the final model is obtained, the method further includes model iterative optimization steps: Step 1: use the aforementioned reliable sample set as the initial high-confidence sample set; Step 2: based on the final model, predict the remaining samples that have not yet been selected into the single-class sample set or the high-confidence sample set, select the samples whose feature vectors are similar to their sample labels, and add them to the high-confidence sample set; Step 3: continue training the second classification model based on the updated high-confidence sample set, then perform a weighted aggregation of the first classification model and the retrained second classification model to obtain a new global model; Step 4: repeat steps 2 to 3, using the new global model generated in each aggregation for the next round of screening, until the number of samples newly added in consecutive iterations falls below a preset threshold or the number of iterations reaches a preset upper limit; finally, the global model generated in the last round is used as the final model.

8. A backdoor defense system based on clustering and weighted aggregation, characterized in that it comprises: a sample classification module, used to perform cluster analysis on the input samples based on a density clustering algorithm, and to divide the input samples into single-class sample sets and multi-class sample sets according to the clustering results; a model training module, used to train a first classification model based on the single-class sample set, to predict the multi-class sample set based on the first classification model, thereby selecting samples whose feature vectors are similar to their sample labels and constructing a reliable sample set, and to train a second classification model based on the reliable sample set; and a model aggregation module, used to perform weighted aggregation on the first classification model and the second classification model to obtain the final model.

9. A computer device, characterized in that it comprises: a memory, a processor, and a bus system; the memory is used to store programs; the processor is used to execute the program in the memory, and to execute the backdoor defense method based on clustering and weighted aggregation according to the instructions in the program code; the bus system is used to connect the memory and the processor to enable communication between the memory and the processor.

10. A storage medium, characterized in that the storage medium stores one or more programs that, when executed by a processor, implement a backdoor defense method based on clustering and weighted aggregation as described in any one of claims 1 to 6.