A heterogeneous federated learning system and method resistant to data corruption
By employing data augmentation and KL divergence computation in federated learning to combat data corruption, we address the model training instability caused by client-side data corruption, improve the model's robustness and stability, and reduce communication costs and the impact of data inconsistency.
Patent Information
- Authority / Receiving Office
- CN · China
- Patent Type
- Patents(China)
- Current Assignee / Owner
- WUHAN UNIV
- Filing Date
- 2023-09-05
- Publication Date
- 2026-06-23
AI Technical Summary
In federated learning, corrupted client data leads to instability and convergence issues in model training, which existing methods struggle to address effectively.
We employ a heterogeneous federated learning system and method that is resistant to data corruption. Through local update and collaborative update phases, combined with data augmentation, KL divergence and confidence calculation, we optimize the model training process and reduce the impact of data corruption.
It improves the robustness and stability of the model, reduces model training fluctuations and non-convergence, reduces communication costs and the impact of data inconsistency, and improves the overall efficiency of the federated learning system.
Figure CN117313889B_ABST
Abstract
Description
Technical Field
[0001] This invention belongs to the field of machine learning technology and relates to a federated learning system and method, specifically a heterogeneous federated learning system and method resistant to data corruption.
Background Technology
[0002] Federated learning is a distributed learning paradigm that enables multiple clients to collaboratively train a model while preserving their privacy. Existing implementations of federated learning often assume homogeneous local models, typically averaging model parameters across clients. Many methods also focus on handling model heterogeneity and knowledge distillation, but these often assume clean training data, ignoring potential data corruption on the client side.
[0003] There are many methods in machine learning to deal with data corruption. These methods can be divided into the following main categories:
[0004] (1) Data augmentation: Enhance the robustness of the model and reduce the generalization error by augmenting corrupted data.
[0005] (2) Noise injection: This prevents the model from overfitting corrupted data by adding noise to one or more parts of the training process.
[0006] (3) Pre-training: Pre-training on diverse datasets with large domain gaps can also help improve the robustness of the model.
[0007] The aforementioned methods for addressing data corruption are primarily applied in centralized training and homogeneous federated learning. However, in federated learning, especially with heterogeneous models, corrupted data on individual clients can lead to instability and convergence issues during model training.
[0008] In summary, in order to achieve federated learning with heterogeneous models, it is crucial to address and overcome the challenges posed by corrupted client data and reduce the instability of model training and convergence.
Summary of the Invention
[0009] To address the aforementioned technical problems, this invention discloses a heterogeneous federated learning system and method resistant to data corruption. This system can simultaneously handle data corruption issues both between and within clients, preventing model training instability and improving model robustness. Furthermore, it dynamically adjusts the contribution of each client in communication based on their ability to handle data corruption, thereby reducing the impact of data corruption.
[0010] The technical solution adopted by the system of this invention is: a heterogeneous federated learning system resistant to data corruption, comprising K clients and one server. Each client k has a private dataset $D_k = \{(x_i^k, y_i^k)\}_{i=1}^{N_k}$, where $x_i^k$ represents a sample image, $|x^k| = N_k$, $N_k$ indicates the size of the private dataset, and $y_i^k$ is a one-hot vector representing the true label. A client never shares $D_k$ with the server or other clients. Each client has a unique local model $f(\theta_k)$, where $\theta_k$ represents the model parameters and $f(x^k, \theta_k)$ represents the output computed for $x^k$ under $\theta_k$. Each client has a corrupted private dataset $\tilde{D}_k = \{(\tilde{x}_i^k, y_i^k)\}_{i=1}^{N_k}$, where $\tilde{x}_i^k$ represents a potentially corrupted sample image. The server has an unlabeled public dataset $D_0 = \{x_i^0\}_{i=1}^{N_0}$, where $N_0$ represents the size of the public dataset; $D_0$ is accessible to all clients and may contain corrupted images.
[0011] The technical solution adopted by the method of the present invention is: a heterogeneous federated learning method resistant to data corruption, applied to a heterogeneous federated learning system resistant to data corruption; including two stages: local update and collaborative update;
[0012] The local update is specifically implemented by the following steps:
[0013] Step A1: Perform random augmentation on the original local image data x to obtain the augmentation sequence;
[0014] Step A2: Perform a weighted average on the enhanced sequences to obtain the enhanced image data;
[0015] Step A3: Add the enhanced image data to the original local image data to obtain the mixed image data;
[0016] Step A4: Update the local model $f(\theta_k)$;
[0017] The collaborative update is specifically implemented by the following steps:
[0018] Step B1: Using the prediction results of each client's local model $f(\theta_k)$ on the public dataset, calculate the KL divergence value for every pair of clients;
[0019] Step B2: Calculate each client's loss value for the current round of the collaborative update phase from the sum of the KL divergence values between that client and all other clients;
[0020] Step B3: Based on each client's local model $f(\theta_k)$, obtain the confidence level $R_k$ of the corresponding client;
[0021] Step B4: Based on the loss value and the confidence level $R_k$, update the local model $f(\theta_k)$ of each heterogeneous client during the collaborative update phase.
[0022] Preferably, in step A1, the enhancement operations include automatic contrast, equalization, color separation, rotation, brightening, horizontal shearing, vertical shearing, horizontal translation, or vertical translation; three enhancement operations are selected for local image data x and stacked to obtain the enhancement operation sequence Seq(x).
[0023] $Seq(x) \sim \{a_1(x), a_{12}(x), a_{123}(x)\}$;
[0024] $a_{12}(x) = a_1(x) \oplus a_2(x)$, $a_{123}(x) = a_1(x) \oplus a_2(x) \oplus a_3(x)$;
[0025] Where a1(x), a2(x), and a3(x) represent three randomly sampled enhancement operations.
[0026] Preferably, in step A2, the enhancement sequences are averaged using random weights sampled from a Dirichlet distribution:
[0027] $x_{seq} = \sum_{i=1}^{S} w_i \cdot Seq_i(x)$;
[0028] Where S represents the number of enhancement sequences, $w_i$ represents the randomly sampled weights, and $Seq_i(x)$ represents the i-th enhancement sequence.
[0029] Preferably, in step A3, the enhanced image data $x_{seq}$ is added to the original local image data x to obtain the mixed image data $x_{aug}$:
[0030] $x_{aug} = \eta \cdot x + (1-\eta) \cdot x_{seq}$;
[0031] Where η represents a random number sampled from a beta distribution.
[0032] As a preferred embodiment, step A4 includes the following sub-steps:
[0033] Step A4.1: For each original local image, obtain two mixed images; using the original local image data x and the two mixed images $x_{aug1}$ and $x_{aug2}$, calculate the JS loss function $l_{JS}$, then add $l_{JS}$ to the original cross-entropy loss function $l_{ce}$ to obtain the local loss function $l_{local}$:
[0034] $l_{local} = l_{ce}(f, y) + \mu \cdot l_{JS}(f, f', f'')$;
[0035] Where f, f′, and f″ represent the predicted logits obtained by feeding x, $x_{aug1}$, and $x_{aug2}$ into the local model, y represents the original data label, and μ is a hyperparameter that controls the strength of the JS consistency constraint.
[0036] Step A4.2: Based on the local loss function $l_{local}$, perform a gradient descent update on the local model $f(\theta_k)$ of the k-th client:
[0037] $\theta_k^{t_l} \leftarrow \theta_k^{t_l-1} - \lambda \nabla_{\theta}\, l_{local}\big(f(\tilde{x}^k, \theta_k^{t_l-1}), y^k\big)$;
[0038] Where $\theta_k^{t_l}$ represents the parameters of the k-th model $f(\theta_k)$ in local round $t_l$, λ represents the local learning rate, $\tilde{x}^k$ represents a sample image of the k-th client that may be corrupted, $y^k$ represents the given label of that sample, $f(\tilde{x}^k, \theta_k^{t_l-1})$ represents the prediction obtained by feeding the sample image into the local model of the k-th client in round $t_l-1$, and $\nabla_{\theta}$ represents the update gradient of the local model in round $t_l-1$.
[0039] Preferably, in step B1, the difference in local knowledge distribution among clients is measured by the KL divergence between clients, $KL(p \,\|\, q) = \sum_{i=1}^{N_0} p_i \log \frac{p_i}{q_i}$, where $N_0$ represents the size of the public dataset, p and q represent the label class distribution and the predicted class distribution, respectively, and $p_i$ and $q_i$ represent the label class distribution and the predicted class distribution of the i-th public sample image, which may be corrupted.
[0040] Preferably, in step B2, the loss value of a client in the current round is calculated as the sum of the KL divergence values between that client and all other clients, where K represents the total number of clients, k indexes the k-th client, and $t_c$ indicates the current collaborative training round. The two arguments of each KL term are the predicted distributions on the public dataset D0 of the local models of the other client and of client k in round $t_c$.
[0041] Preferably, in step B3, the confidence level $R_k$ of each client is calculated from two quantities: the predicted distribution of the corrupted public dataset on the local model $\theta_k$, and the predicted distribution of the original public dataset D0 on the local model $\theta_k$.
[0042] Preferably, in step B4, the local model of each client is updated during the collaborative update phase, where λ represents the local learning rate and $W_k$ represents the weight obtained by normalizing the confidence level $R_k$, which yields the weighted model update.
[0043] The innovations of this invention include:
[0044] (1) The present invention uses random weights sampled by Dirichlet random distribution to combine with enhancement sequences, which can balance the intensity of enhancement operations and prevent excessive enhancement from damaging image information.
[0045] (2) The present invention adds the original data and the enhanced data to produce mixed data, which not only preserves the basic information of the original data, but also introduces the diversity of enhancement operations and enhances the generalization ability of the model.
[0046] (3) By combining the JS loss function and the cross-entropy loss function, this invention can optimize the model accuracy while ensuring the robustness of the model.
[0047] (4) The present invention updates the model parameters according to the gradient descent method, which can ensure the efficiency of model optimization and make full use of the information obtained in each iteration.
[0048] (5) By calculating the KL divergence value, this invention can intuitively evaluate the similarity between various client models and provide a reference for subsequent steps.
[0049] (6) By calculating the loss value of the client in the joint update phase, this invention can provide a dynamic adjustment basis for model optimization.
[0050] (7) By calculating the confidence level of each client model, this invention can reflect the relative quality of each client model and provide a basis for decision-making in subsequent steps.
[0051] The present invention has the following advantages:
[0052] (1) This invention effectively combats the data corruption problem by utilizing an enhanced local training strategy, which does not require an additional centralized data repair or cleaning process, saving a lot of preprocessing time and computing resources.
[0053] (2) This invention optimizes the stability of model training in federated learning environments, preventing model training fluctuations and non-convergence caused by data corruption. At the same time, by combining the JS loss function and the cross-entropy loss function, the sensitivity of the model to corrupted data is significantly reduced.
[0054] (3) To address the complexity and convergence issues of heterogeneous federated learning, this invention designs an adaptive weight allocation scheme based on KL divergence and confidence. This scheme adaptively allocates corresponding weights to each client during the collaborative learning phase, which significantly reduces the communication costs and the impact of data inconsistency during model training and improves the overall efficiency of the federated learning system.
Description of the Drawings
[0055] The technical solutions described herein are further illustrated below using examples and specific implementation methods. Additionally, accompanying drawings are used in the description of the technical solutions. Those skilled in the art can, without any creative effort, obtain other drawings and the intent of the present invention based on these drawings.
[0056] Figure 1 is a schematic diagram illustrating how corrupted client data interferes with the federated learning process in an embodiment of the present invention;
[0057] Figure 2 is a schematic diagram of the method flow of an embodiment of the present invention.
Detailed Implementation
[0058] To facilitate understanding and implementation of the present invention by those skilled in the art, the present invention will be further described in detail below with reference to the accompanying drawings and embodiments. It should be understood that the embodiments described herein are for illustration and explanation only and are not intended to limit the present invention.
[0059] Referring to Figure 1, this embodiment provides a heterogeneous federated learning system resistant to data corruption, comprising K clients and one server. Each client has its own private dataset $D_k$ containing $N_k$ samples. To protect privacy, a client never shares $D_k$ with the server or other clients. Furthermore, each client has a unique local model $f(\theta_k)$, where $\theta_k$ represents the model parameters. In this embodiment, each client has a corrupted private dataset in which each image is either clean or corrupted. The server has an unlabeled public dataset D0, accessible to all clients, which may contain corrupted images.
[0060] Referring to Figure 2, this embodiment provides a heterogeneous federated learning method resistant to data corruption, comprising two phases: local update and collaborative update. In the local update phase, each client's local model is first pre-trained on its respective private dataset, followed by an appropriate number of local learning rounds to balance local knowledge with knowledge from other clients. In the collaborative update phase, a certain number of collaborative updates are performed, involving all clients. The Adam optimizer is used for learning, with a set learning rate and batch size. Based on the calculated confidence level, the participants are reweighted to adjust the learning direction to cope with data corruption.
[0061] This embodiment selects the Cifar-10-C and Cifar-100 datasets. Cifar-10-C was obtained by introducing common visual corruptions into Cifar-10, while Cifar-100 was selected as the public dataset on the server.
[0062] For heterogeneous models, this embodiment assigns four different local models, ResNet10, ResNet12, ShuffleNet, and Mobilenetv2, to four clients respectively.
[0063] This embodiment constructs corrupted data using the method of Hendrycks et al. Cifar-10-C includes 15 corruption types (from four main categories: noise, blur, weather, and digital), each with five severity levels, resulting in 75 different corruptions. Corrupted images are generated by randomly sampling corruption types and severity levels from a uniform distribution. This embodiment randomly sets different corruption rates, corruption types, and severity levels for different clients' private datasets.
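The uniform sampling of corruption type and severity described above can be sketched as follows. This is a minimal illustration rather than the embodiment's implementation: the type names follow the public CIFAR-10-C benchmark, while `sample_corruptions` and its `corruption_rate` parameter are hypothetical names introduced here, and the sketch only plans which corruption to apply without rendering the corrupted image.

```python
import numpy as np

# The 15 corruption types of the CIFAR-10-C benchmark, grouped by category.
CORRUPTION_TYPES = [
    "gaussian_noise", "shot_noise", "impulse_noise",            # noise
    "defocus_blur", "glass_blur", "motion_blur", "zoom_blur",    # blur
    "snow", "frost", "fog", "brightness",                        # weather
    "contrast", "elastic_transform", "pixelate", "jpeg_compression",  # digital
]
SEVERITIES = [1, 2, 3, 4, 5]


def sample_corruptions(num_images, corruption_rate, rng):
    """Return one (type, severity) pair per image, or None for a clean image.

    corruption_rate is the fraction of images to corrupt (hypothetical name).
    """
    plan = []
    for _ in range(num_images):
        if rng.random() < corruption_rate:
            # Type and severity are drawn independently and uniformly.
            ctype = CORRUPTION_TYPES[rng.integers(len(CORRUPTION_TYPES))]
            severity = SEVERITIES[rng.integers(len(SEVERITIES))]
            plan.append((ctype, severity))
        else:
            plan.append(None)
    return plan


rng = np.random.default_rng(0)
plan = sample_corruptions(num_images=8, corruption_rate=0.5, rng=rng)
```

A different `corruption_rate` can be passed for each client to mimic the per-client settings described above.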
[0064] Specifically, the following steps are included:
[0065] Step 1: During the local update process, random enhancement operations are first performed on the local image data x. Enhancement operations include automatic contrast, equalization, color separation, rotation, brightening, horizontal shearing, vertical shearing, horizontal translation, and vertical translation.
[0066] In one implementation, three enhancement operations are selected from local image data x and stacked together to obtain an enhancement operation sequence Seq(x);
[0067] $Seq(x) \sim \{a_1(x), a_{12}(x), a_{123}(x)\}$;
[0068] $a_{12}(x) = a_1(x) \oplus a_2(x)$, $a_{123}(x) = a_1(x) \oplus a_2(x) \oplus a_3(x)$;
[0069] Where a1(x), a2(x), and a3(x) represent three randomly sampled enhancement operations.
[0070] Step 2: To prevent image information degradation due to over-enhancement, this embodiment averages the enhancement sequences using random weights sampled from a Dirichlet distribution:
[0071] $x_{seq} = \sum_{i=1}^{S} w_i \cdot Seq_i(x)$;
[0072] Where S represents the number of enhancement sequences, $w_i$ represents the randomly sampled weights, and $Seq_i(x)$ represents the i-th enhancement sequence.
[0073] Step 3: In this embodiment, the enhanced data is added to the original data in the following manner to obtain the mixed data:
[0074] $x_{aug} = \eta \cdot x + (1-\eta) \cdot x_{seq}$;
[0075] Where η represents a random number sampled from a beta distribution.
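Steps 1 to 3 can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the implementation of the embodiment: the three functions in `OPS` are simplified NumPy stand-ins for the listed enhancement operations, the operator ⊕ is read as applying operations one after another, the Dirichlet and beta parameters are not given in the text and are both set to a hypothetical `alpha = 1`, and images are assumed to be float arrays with values in [0, 1].

```python
import numpy as np


def autocontrast(img):
    # Stretch intensities to span [0, 1]; leave constant images unchanged.
    lo, hi = img.min(), img.max()
    return (img - lo) / (hi - lo) if hi > lo else img


def brighten(img):
    return np.clip(img * 1.2, 0.0, 1.0)


def translate_x(img):
    # Wrap-around shift along the width axis as a simple stand-in.
    return np.roll(img, shift=2, axis=1)


OPS = [autocontrast, brighten, translate_x]  # illustrative subset only


def augment_and_mix(x, rng, alpha=1.0):
    # Step 1: sample three operations and stack them into a1, a12, a123.
    a1, a2, a3 = (OPS[i] for i in rng.choice(len(OPS), size=3))
    s1 = a1(x)
    s12 = a2(s1)
    s123 = a3(s12)
    seq = np.stack([s1, s12, s123])          # shape (S, H, W, C) with S = 3
    # Step 2: weighted average with Dirichlet-sampled weights w_i.
    w = rng.dirichlet([alpha] * len(seq))
    x_seq = np.tensordot(w, seq, axes=1)     # sum_i w_i * Seq_i(x)
    # Step 3: mix with the original image using eta ~ Beta(alpha, alpha).
    eta = rng.beta(alpha, alpha)
    return eta * x + (1.0 - eta) * x_seq


x = np.random.default_rng(0).random((32, 32, 3))
x_aug = augment_and_mix(x, np.random.default_rng(1))
```

Because the Dirichlet weights sum to 1 and η lies in [0, 1], the result is a convex combination of images, so it stays in the same value range as the input.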
[0076] Step 4: Update the model based on the data obtained from the above operations.
[0077] Step 4.1: For each local image, repeat the above operation twice to obtain two mixed images; using the original local image data x and the two mixed images $x_{aug1}$ and $x_{aug2}$, calculate the JS loss function $l_{JS}$, then add $l_{JS}$ to the original cross-entropy loss function $l_{ce}$ to obtain the local loss function $l_{local}$:
[0078] $l_{local} = l_{ce}(f, y) + \mu \cdot l_{JS}(f, f', f'')$;
[0079] Where f, f′, and f″ represent the predicted logits obtained by feeding x, $x_{aug1}$, and $x_{aug2}$ into the local model, y represents the original data label, and μ is a hyperparameter that controls the strength of the JS consistency constraint.
[0080] Step 4.2: Apply this loss function to perform a gradient descent update on the local model $f(\theta_k)$ of the k-th client:
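The local loss can be illustrated with a short NumPy sketch. The text does not spell out the form of the JS term, so the sketch assumes the common three-way Jensen-Shannon consistency form, in which each of the three predicted distributions is compared with their average; the function names are hypothetical, and the default `mu=12.0` is taken from the experimental setting reported later in this description.

```python
import numpy as np

EPS = 1e-12  # numerical guard for logarithms


def softmax(logits):
    z = logits - logits.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)


def kl(p, q):
    # Batch mean of sum_c p_c * log(p_c / q_c).
    return np.mean(np.sum(p * (np.log(p + EPS) - np.log(q + EPS)), axis=-1))


def local_loss(f, f1, f2, y, mu=12.0):
    """l_local = l_ce(f, y) + mu * l_JS(f, f', f'') on a batch of logits.

    f, f1, f2: logits for x, x_aug1, x_aug2, each of shape (batch, classes).
    y: one-hot labels of shape (batch, classes).
    """
    p, p1, p2 = softmax(f), softmax(f1), softmax(f2)
    l_ce = -np.mean(np.sum(y * np.log(p + EPS), axis=-1))
    m = (p + p1 + p2) / 3.0                      # mixture distribution
    l_js = (kl(p, m) + kl(p1, m) + kl(p2, m)) / 3.0
    return l_ce + mu * l_js


f = np.array([[2.0, 0.5, -1.0]])
y = np.array([[1.0, 0.0, 0.0]])
consistent = local_loss(f, f, f, y)                         # JS term vanishes
inconsistent = local_loss(f, f + np.array([[0.0, 1.0, 0.0]]), f, y)
```

When the three predictions agree, the JS term is zero and the loss reduces to the cross-entropy; any disagreement between the clean and augmented views is penalized in proportion to μ.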
[0081] $\theta_k^{t_l} \leftarrow \theta_k^{t_l-1} - \lambda \nabla_{\theta}\, l_{local}\big(f(\tilde{x}^k, \theta_k^{t_l-1}), y^k\big)$;
[0082] Where $\theta_k^{t_l}$ represents the parameters of the k-th model $f(\theta_k)$ in local round $t_l$, λ represents the local learning rate, $\tilde{x}^k$ represents a sample image of the k-th client that may be corrupted, $y^k$ represents the given label of that sample, $f(\tilde{x}^k, \theta_k^{t_l-1})$ represents the prediction obtained by feeding the sample image into the local model of the k-th client in round $t_l-1$, and $\nabla_{\theta}$ represents the update gradient of the local model in round $t_l-1$.
[0083] Step 5: In the joint update phase, this embodiment measures the difference in local knowledge distribution among clients by calculating the KL divergence value between them. First, using the prediction results of each client's local model on the public dataset, the KL divergence value is calculated for any pair of clients:
[0084] $KL(p \,\|\, q) = \sum_{i=1}^{N_0} p_i \log \frac{p_i}{q_i}$;
[0085] Where $N_0$ represents the size of the public dataset, p and q represent the label class distribution and the predicted class distribution, respectively, and $p_i$ and $q_i$ represent the label class distribution and the predicted class distribution of the i-th public sample image, which may be corrupted.
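The pairwise comparison in Step 5 can be sketched as follows. This is an illustrative reading rather than the embodiment's code: each client's predictions on the public dataset are assumed to be an array of class probabilities with shape `(N0, C)`, the divergence is summed over classes and over the N0 public samples, and the function names are hypothetical.

```python
import numpy as np


def kl_divergence(p, q, eps=1e-12):
    """KL(p || q), summed over classes and over the N0 public samples."""
    return float(np.sum(p * (np.log(p + eps) - np.log(q + eps))))


def pairwise_kl(preds):
    """preds: list of K arrays of shape (N0, C) holding class probabilities.

    Returns a K x K matrix whose entry [a, b] is KL(preds[a] || preds[b]).
    """
    num_clients = len(preds)
    mat = np.zeros((num_clients, num_clients))
    for a in range(num_clients):
        for b in range(num_clients):
            if a != b:
                mat[a, b] = kl_divergence(preds[a], preds[b])
    return mat


p0 = np.array([[0.7, 0.3], [0.5, 0.5]])   # client 0 on two public samples
p1 = np.array([[0.6, 0.4], [0.5, 0.5]])   # client 1 on the same samples
kl_matrix = pairwise_kl([p0, p1])
```

The matrix is not symmetric in general, since KL divergence depends on the order of its arguments.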
[0086] Step 6: Based on the sum of the KL divergence values between the client and all other clients, this example can calculate the client's loss value in this round of joint update phase:
[0087]
[0088] Where K represents the total number of clients, k indexes the k-th client, and $t_c$ indicates the current collaborative training round; the two arguments of each KL term are the predicted distributions on the public dataset D0 of the local models of the other client and of client k in round $t_c$.
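Given a matrix of pairwise KL values, the per-client loss of Step 6 is a sum over all other clients. The sketch below assumes the convention that entry `[a, b]` holds the divergence from client a's predictions to client b's, so that the loss of client k is the sum of column k without the diagonal; this direction is an assumption, as is the function name.

```python
import numpy as np


def collaborative_losses(kl_matrix):
    """Return a length-K vector whose k-th entry sums KL(other || k) over others."""
    m = np.array(kl_matrix, dtype=float)   # copy so the input is not modified
    np.fill_diagonal(m, 0.0)               # a client is not compared with itself
    return m.sum(axis=0)


example = [[0.0, 1.0, 2.0],
           [3.0, 0.0, 4.0],
           [5.0, 6.0, 0.0]]
losses = collaborative_losses(example)
```

For the example matrix, client 0 receives 3 + 5, client 1 receives 1 + 6, and client 2 receives 2 + 4.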
[0089] Step 7: Simultaneously, based on the model for each client, the confidence level for that client can be obtained:
[0090]
[0091] Where the two quantities compared are the predicted distribution of the corrupted public dataset on the local model $\theta_k$ and the predicted distribution of the original public dataset D0 on the local model $\theta_k$.
[0092] Step 8: Based on the loss value and the confidence level $R_k$, update the local model $f(\theta_k)$ of each heterogeneous client during the collaborative update phase:
[0093]
[0094]
[0095] Where λ represents the local learning rate and $W_k$ represents the weight obtained by normalizing the confidence level $R_k$, which yields the weighted model update.
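The exact normalization and update formulas are not reproduced in the text above, so the following is only a sketch of one plausible reading: the confidence levels are divided by their sum to give the weights $W_k$, and $W_k$ scales the gradient step of client k. Both function names, and the choice of sum normalization, are assumptions made for illustration; the learning rate default matches the λ = 0.001 reported in the experiments.

```python
import numpy as np


def normalize_confidence(confidences):
    """Map confidence levels R_k to weights W_k that sum to 1 (assumed form)."""
    r = np.asarray(confidences, dtype=float)
    return r / r.sum()


def weighted_step(theta_k, grad_k, w_k, lr=0.001):
    """One gradient step on client k, scaled by its weight W_k (assumed form)."""
    return theta_k - lr * w_k * grad_k


weights = normalize_confidence([2.0, 1.0, 1.0])
theta_new = weighted_step(np.array([1.0, 1.0]), np.array([10.0, -10.0]), weights[0])
```

Under this reading, a client with a higher confidence level takes a proportionally larger step in the collaborative phase.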
[0096] The invention will be further illustrated by specific experiments below.
[0097] This experiment used the Cifar-10-C and Cifar-100 datasets. The Cifar-10 dataset consists of 60,000 32x32 pixel color images, including 50,000 training images and 10,000 test images, covering 10 different categories. Similarly, the Cifar-100 dataset also contains 60,000 32x32 pixel color images, but includes 100 categories, with 600 images per category, of which 500 are used as training images and 100 as test images. In the experiment, Cifar-10-C was randomly allocated to clients as private datasets, and a subset of Cifar-100 was selected as the public dataset on the server. The sizes of the private and public datasets were set to $N_k = 10{,}000$ and $N_0 = 5{,}000$, respectively.
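The data allocation can be sketched as index sampling. The text does not state whether the clients' private subsets are disjoint, so the sketch assumes each client draws its own subset independently; the function name and the pool sizes of 50,000 training images are illustrative assumptions.

```python
import numpy as np


def split_private_public(private_pool, public_pool, num_clients, n_k, n_0, rng):
    """Sample index sets for the clients' private data and the public data."""
    private_idx = [
        rng.choice(private_pool, size=n_k, replace=False)  # no repeats per client
        for _ in range(num_clients)
    ]
    public_idx = rng.choice(public_pool, size=n_0, replace=False)
    return private_idx, public_idx


rng = np.random.default_rng(0)
private_idx, public_idx = split_private_public(
    private_pool=50_000, public_pool=50_000,
    num_clients=4, n_k=10_000, n_0=5_000, rng=rng,
)
```

Each returned array can then be used to index the corresponding image tensor for that client or for the server.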
[0098] To generate the noise dataset, this experiment constructed jointly learned noise patterns based on the method of Hendrycks et al. Cifar-10-C includes 15 noise types, each with five severity levels, resulting in a total of 75 different noise patterns. This experiment randomly set different noise rates, noise types, and severity levels for different client-specific private datasets.
[0099] In this experiment, the number of collaborative learning rounds is set to $T_c = 40$. Considering the balance of dataset size, the number of local learning rounds is set to... This balances local knowledge with knowledge from other clients. Furthermore, this experiment uses the Adam optimizer with an initial learning rate of λ = 0.001 and a batch size of 256. The number of data augmentation operation sequences is set to S = 3, and the hyperparameter μ is set to 12. The noise rate ξ is set to 0, 0.5, and 1, representing that the private dataset is clean, partially corrupted, and completely corrupted, respectively.
[0100] To demonstrate the effectiveness of this invention in heterogeneous model scenarios, this experiment compares it with state-of-the-art methods, including FedMD, FedDF, RHFL, and FCCL. FedMD is a method for communication based on the average class scores output by client models on public datasets. FedDF is a distillation framework that leverages unlabeled or artificially generated data for powerful joint model fusion. RHFL handles label noise and heterogeneous model communication simultaneously within a single framework. FCCL constructs a cross-correlation matrix for collaborative learning and prevents catastrophic forgetting during local updates.
[0101] Table 1: Accuracy Comparison Table with Image Defect Rate of 0.5
[0102]
| Model | θ0 | θ1 | θ2 | θ3 | Avg | θ0 | θ1 | θ2 | θ3 | Avg |
|---|---|---|---|---|---|---|---|---|---|---|
| Baseline | 68.75 | 68.46 | 57.14 | 57.41 | 62.94 | 64.03 | 64.75 | 50.86 | 52.40 | 58.01 |
| FedMD | 64.25 | 65.30 | 55.97 | 58.58 | 61.03 | 59.66 | 61.28 | 51.66 | 54.56 | 56.79 |
| FedDF | 61.72 | 63.51 | 57.29 | 57.89 | 60.10 | 58.09 | 59.57 | 53.15 | 54.35 | 56.29 |
| RHFL | 58.05 | 59.47 | 51.13 | 55.07 | 55.93 | 62.42 | 63.46 | 56.78 | 59.81 | 60.62 |
| FCCL | 63.75 | 62.62 | 58.43 | 59.78 | 61.15 | 59.91 | 59.03 | 53.71 | 56.24 | 57.22 |
| This invention | 76.22 | 76.50 | 66.66 | 73.31 | 73.17 | 71.96 | 71.26 | 61.28 | 69.58 | 68.52 |
[0103] Table 2: Accuracy Comparison Table with Image Defect Rate of 1
[0104]
| Model | θ0 | θ1 | θ2 | θ3 | Avg | θ0 | θ1 | θ2 | θ3 | Avg |
|---|---|---|---|---|---|---|---|---|---|---|
| Baseline | 65.58 | 67.01 | 56.00 | 57.11 | 61.43 | 62.47 | 63.88 | 52.57 | 53.49 | 58.10 |
| FedMD | 62.11 | 63.74 | 56.57 | 58.12 | 60.14 | 59.11 | 60.58 | 52.51 | 55.39 | 56.90 |
| FedDF | 61.66 | 63.11 | 58.39 | 58.06 | 60.31 | 58.87 | 59.45 | 55.44 | 55.31 | 57.27 |
| RHFL | 62.55 | 64.21 | 57.57 | 58.14 | 60.62 | 58.14 | 59.79 | 54.75 | 55.77 | 57.11 |
| FCCL | 62.60 | 63.64 | 58.60 | 59.71 | 61.14 | 59.36 | 59.55 | 55.82 | 56.92 | 57.91 |
| This invention | 76.98 | 77.82 | 65.79 | 74.03 | 73.66 | 73.65 | 74.04 | 61.50 | 70.92 | 70.03 |
[0105] As shown in Tables 1 and 2, this invention outperforms existing strategies under various corruption settings. When the noise rate increases from 0.1 to 0.2, the average test accuracy of FedMD, FedDF, RHFL, and FCCL significantly decreases. This demonstrates that this invention is robust to different noise settings and capable of reducing noise impact in more complex noise scenarios. This is mainly due to two reasons: 1. This invention applies a loss function that is robust to data corruption during the local training phase of the model. 2. The client confidence reweighting scheme proposed in this invention for the collaborative training phase effectively reduces the impact of noise from other clients.
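The reported averages can be checked directly from the per-client accuracies. The short sketch below uses the first column group of Table 1 (corruption rate 0.5) for the Baseline and for this invention; the variable names are illustrative.

```python
# Per-client accuracies (theta0..theta3), first column group of Table 1.
rows = {
    "Baseline": [68.75, 68.46, 57.14, 57.41],
    "This invention": [76.22, 76.50, 66.66, 73.31],
}

# Average over the four heterogeneous client models, rounded as in the table.
avg = {name: round(sum(vals) / len(vals), 2) for name, vals in rows.items()}

# Absolute gain in average accuracy over the Baseline, in percentage points.
gain = round(avg["This invention"] - avg["Baseline"], 2)
```

The computed averages match the Avg column (62.94 and 73.17), giving a gain of 10.23 percentage points in this setting.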
[0106] It should be understood that the above description of the preferred embodiments is quite detailed, but it should not be considered as a limitation on the scope of protection of this invention. Those skilled in the art, under the guidance of this invention, can make substitutions or modifications without departing from the scope of protection of the claims of this invention, and all such substitutions or modifications fall within the scope of protection of this invention. The scope of protection of this invention should be determined by the appended claims.
Claims
1. A heterogeneous federated learning method resistant to data corruption, applied to a heterogeneous federated learning system resistant to data corruption, characterized in that: the system is a federated learning system with K clients and one server; each client k has a private dataset $D_k = \{(x_i^k, y_i^k)\}_{i=1}^{N_k}$, where $x_i^k$ represents a sample image, $|x^k| = N_k$, $N_k$ indicates the size of the private dataset, and $y_i^k$ is a one-hot vector representing the true label; a client never shares $D_k$ with the server or other clients; each client has a unique local model $f(\theta_k)$, where $\theta_k$ represents the model parameters and $f(x^k, \theta_k)$ represents the output computed for $x^k$ under $\theta_k$; each client has a corrupted private dataset $\tilde{D}_k = \{(\tilde{x}_i^k, y_i^k)\}_{i=1}^{N_k}$, where $\tilde{x}_i^k$ represents a potentially corrupted sample image; the server has an unlabeled public dataset $D_0 = \{x_i^0\}_{i=1}^{N_0}$, where $N_0$ indicates the size of the public dataset, which is accessible to all clients and may contain corrupted images. The method includes two phases: local update and collaborative update. The local update is specifically implemented by the following steps: Step A1: Perform random augmentation operations on the original local image data x to obtain augmented sequences; Step A2: Perform a weighted average on the enhanced sequences to obtain the enhanced image data; Step A3: Add the enhanced image data to the original local image data to obtain the mixed image data; Step A4: Update the local model $f(\theta_k)$. The collaborative update is specifically implemented by the following steps: Step B1: Using the prediction results of each client's local model $f(\theta_k)$ on the public dataset, calculate the KL divergence value for every pair of clients; Step B2: Calculate each client's loss value for the current round of the collaborative update phase from the sum of the KL divergence values between that client and all other clients; Step B3: Based on each client's local model $f(\theta_k)$, obtain the confidence level $R_k$ of the corresponding client; Step B4: Based on the loss value and the confidence level $R_k$, update the local model $f(\theta_k)$ of each heterogeneous client during the collaborative update phase.
2. The heterogeneous federated learning method resistant to data corruption according to claim 1, characterized in that: In step A1, the enhancement operations include automatic contrast, equalization, color separation, rotation, brightening, horizontal shearing, vertical shearing, horizontal translation, or vertical translation; three enhancement operations are selected for the local image data x and stacked to obtain the enhancement operation sequence Seq(x): $Seq(x) \sim \{a_1(x), a_{12}(x), a_{123}(x)\}$; $a_{12}(x) = a_1(x) \oplus a_2(x)$, $a_{123}(x) = a_1(x) \oplus a_2(x) \oplus a_3(x)$; where $a_1(x)$, $a_2(x)$, and $a_3(x)$ represent three randomly sampled enhancement operations.
3. The heterogeneous federated learning method resistant to data corruption according to claim 2, characterized in that: In step A2, the enhancement sequences are averaged using random weights sampled from a Dirichlet distribution: $x_{seq} = \sum_{i=1}^{S} w_i \cdot Seq_i(x)$; where S represents the number of enhancement sequences, $w_i$ represents the randomly sampled weights, and $Seq_i(x)$ represents the i-th enhancement sequence.
4. The heterogeneous federated learning method resistant to data corruption according to claim 3, characterized in that: In step A3, the enhanced image data $x_{seq}$ is added to the original local image data x to obtain the mixed image data $x_{aug}$: $x_{aug} = \eta \cdot x + (1-\eta) \cdot x_{seq}$; where η represents a random number sampled from a beta distribution.
5. The heterogeneous federated learning method resistant to data corruption according to claim 4, characterized in that the specific implementation of step A4 includes the following sub-steps: Step A4.1: For each original local image, obtain two mixed images; using the original local image data x and the two mixed images $x_{aug1}$ and $x_{aug2}$, calculate the JS loss function $l_{JS}$, then add $l_{JS}$ to the original cross-entropy loss function $l_{ce}$ to obtain the local loss function $l_{local}$: $l_{local} = l_{ce}(f, y) + \mu \cdot l_{JS}(f, f', f'')$; where f, f′, and f″ represent the predicted logits obtained by feeding x, $x_{aug1}$, and $x_{aug2}$ into the local model, y represents the original data label, and μ is a hyperparameter that controls the strength of the JS consistency constraint. Step A4.2: Based on the local loss function $l_{local}$, perform a gradient descent update on the local model $f(\theta_k)$ of the k-th client: $\theta_k^{t_l} \leftarrow \theta_k^{t_l-1} - \lambda \nabla_{\theta}\, l_{local}\big(f(\tilde{x}^k, \theta_k^{t_l-1}), y^k\big)$; where $\theta_k^{t_l}$ represents the parameters of the k-th model $f(\theta_k)$ in local round $t_l$, λ indicates the local learning rate, $\tilde{x}^k$ represents a sample image of the k-th client that may be corrupted, $y^k$ indicates the given label of that sample, $f(\tilde{x}^k, \theta_k^{t_l-1})$ indicates the prediction obtained by feeding the sample image into the local model of the k-th client in round $t_l-1$, and $\nabla_{\theta}$ indicates the update gradient of the local model in round $t_l-1$.
6. The heterogeneous federated learning method resistant to data corruption according to claim 5, characterized in that: In step B1, the difference in local knowledge distribution among clients is measured by the KL divergence between clients, $KL(p \,\|\, q) = \sum_{i=1}^{N_0} p_i \log \frac{p_i}{q_i}$; where $N_0$ indicates the size of the public dataset, p and q refer to the label class distribution and the predicted class distribution, respectively, and $p_i$ and $q_i$ represent the label class distribution and the predicted class distribution of the i-th public sample image, which may be corrupted.
7. The heterogeneous federated learning method resistant to data corruption according to claim 6, characterized in that: In step B2, the loss value of a client in the current round is calculated as the sum of the KL divergence values between that client and all other clients; where K represents the total number of clients, k indexes the k-th client, and $t_c$ indicates the current collaborative training round; the two arguments of each KL term are the predicted distributions on the public dataset $D_0$ of the local models of the other client and of client k in round $t_c$.
8. The heterogeneous federated learning method resistant to data corruption according to claim 7, characterized in that: In step B3, the confidence level $R_k$ of each client is calculated from the predicted distribution of the corrupted public dataset on the local model $\theta_k$ and the predicted distribution of the original public dataset $D_0$ on the local model $\theta_k$.
9. The heterogeneous federated learning method resistant to data corruption according to claim 8, characterized in that: In step B4, the local model of each client is updated during the collaborative update phase; where λ indicates the local learning rate and $W_k$ indicates the weight obtained by normalizing the confidence level $R_k$, which yields the weighted model update.