A privacy-preserving optimal transport federated learning method with locally personalized updating of client model parameters

By updating client-side model parameters locally and individually, this method addresses the poor model performance issues caused by uneven data distribution and heterogeneity in federated learning. It enables safe and efficient model training without data transfer, thereby improving the model's generalization ability and accuracy.

CN122242647A · Pending · Publication Date: 2026-06-19 · SHANGHAI DIANJI UNIV +1

Patent Information

Authority / Receiving Office: CN · China
Patent Type: Application (China)
Current Assignee / Owner: SHANGHAI DIANJI UNIV
Filing Date: 2024-06-21
Publication Date: 2026-06-19

AI Technical Summary

Technical Problem

In federated learning, the uneven distribution of client data, data heterogeneity, and different data holders prevent the model from achieving optimal performance, resulting in poor performance and poor generalization ability of the trained global model.

Method used

Client model parameters are updated through local personalization. Each client performs a partial personalized update based on the parameter differences between its local model and the global model, and adds differential privacy noise before uploading. The server then performs optimal transport alignment and model fusion, ensuring the security and accuracy of model training.

Benefits of technology

It improves the model's generalization ability and accuracy, ensures the safety of model training, and is suitable for effective model training in distributed scenarios with multiple clients and servers.

✦ Generated by Eureka AI based on patent content.


Abstract

This invention relates to the field of federated learning and discloses a privacy-preserving optimal transport federated learning method with locally personalized updating of client model parameters. The method runs on multiple clients and one server. First, each client receives the global model and its parameters from the server, then trains and updates its local model on a local dataset. Next, based on the parameter differences between the local and global models, the client randomly updates a proportion of the local model parameters. The client then adds differential privacy noise to the locally personalized model parameters and uploads them to the server. The server performs optimal transport alignment on the uploaded parameters to generate the global model parameters, and the accuracy of the global model is assessed. If the accuracy has not reached the optimal level or exceeded a threshold, the global model parameters are sent to the clients again for random updates after the next round of parameter training. Training and updating stop once the accuracy of the global model reaches the optimal level or exceeds the threshold. The invention primarily addresses the accuracy problems that arise in distributed federated learning when the datasets of multiple local clients are heterogeneous and not independent and identically distributed (i.i.d.). It helps improve the generalization ability and accuracy of the trained model while protecting data privacy. By keeping some client-side training parameters unchanged before the global update, it preserves more model parameters derived from local data and thus adapts effectively to privacy-preserving federated learning on heterogeneous datasets from multiple clients. The method is applicable to distributed training of most deep learning models.

Description

Technical Field

[0001] This invention relates to the field of federated learning, and more particularly to a privacy-preserving optimal transport federated learning method with locally personalized updating of client model parameters.

Background Technology

[0002] Common deep machine learning techniques primarily utilize advanced deep network models, such as Convolutional Neural Networks (CNNs), Recurrent Neural Networks (RNNs), and Generative Adversarial Networks (GANs), to learn and extract features from large amounts of data, thereby achieving various functions such as classification, recognition, and data generation. These network models can provide intelligent solutions for applications across various industries. However, in the actual training and production of these models, there are still centralized training problems such as difficulties in data transmission and storage. Federated learning, a distributed model training method, has emerged to address these challenges.

[0003] In the field of deep learning, due to limitations in computing power and memory for data communication between clients, practical model training often encounters problems such as difficulty in communicating heterogeneous data among clients and low training accuracy for individual clients. By leveraging federated learning, distributed clients can train local models on their local machines without uploading local data. They then upload the local model parameters to the server, which aggregates the local model parameters and generates the global model. The aggregated global model is then sent back to the client to participate in the next round of data training, thus iteratively finding the optimal model.

[0004] In the process of implementing federated learning, the data acquisition devices come from different locations and devices, resulting in significant data heterogeneity. This data heterogeneity leads to large parameter perturbations in the global model participating in model fusion, preventing the training process from effectively converging to the optimum. When the data is not independent and identically distributed, the local data distribution used by the client for training differs from the data distribution that the global model should follow, resulting in large differences in the parameters of the local models participating in aggregation and reducing the performance of the aggregated global model. Federated learning also faces the problem of sensitive information leakage during model parameter transmission, making it easy for data pollution from individual clients in the network to spread to the global model. Therefore, privacy protection measures need to be added during training.

[0005] Existing federated learning patents offer various model fusion strategies, such as simple averaging, weighted averaging, and feature fusion. These techniques are mostly aimed at solving basic distributed training problems. However, existing technologies do not comprehensively consider the uneven distribution of samples from complex data sources, the heterogeneity of data with different data types, and the privacy protection issues of different data holders. This leads to suboptimal global model performance, poor generalization ability, and susceptibility to local data contamination. To comprehensively address these issues, a federated learning method with better adaptability and security is urgently needed.

Summary of the Invention

[0006] The technical problem that this invention aims to solve is that federated learning cannot achieve optimal model performance when client data is unevenly distributed, data is heterogeneous, and data holders are different. The trained global model has poor performance and poor generalization ability.

[0007] To address the aforementioned technical problem, this invention provides a privacy-preserving optimal transport federated learning method with locally personalized updating of client model parameters, comprising the following steps.

[0008] Step 1: The client receives the global model parameters from the server.

[0009] Step 2: Based on the parameter differences between the local model and the global model, the client randomly updates some personalized parameters of the local model in proportion.

[0010] Step 3: The client uses the local dataset to train the local model and update its parameters.

[0011] Step 4: The client adds differential privacy noise to the locally personalized updated model parameters and then uploads them to the server.

[0012] Step 5: The server performs optimal transport alignment on the uploaded local personalized model parameters, and performs model fusion based on the activation weight parameters to obtain the global model parameters on the server.

[0013] Step 6: Determine the accuracy of the global model after updating the parameters, and evaluate the performance on the validation dataset. If the global model accuracy does not reach the optimal level or does not exceed the threshold, the global model parameters will continue to be sent to the client to participate in the random update after the next round of parameter training.

[0014] Step 7: Stop training and updating when the global model training accuracy reaches the optimal level or exceeds the threshold.
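
The round structure of Steps 1 to 7 can be sketched as a simple loop. This is a minimal illustration rather than the patented implementation: the names `run_federated_training`, `client_updates`, `fuse`, `evaluate`, and `max_rounds` are hypothetical, only the threshold criterion is modeled (not the "optimal level" criterion), and the client and server logic is passed in as plain callables.

```python
from typing import Callable, List, Sequence, Tuple

Params = List[float]


def run_federated_training(
    global_params: Params,
    client_updates: Sequence[Callable[[Params], Params]],
    fuse: Callable[[List[Params]], Params],
    evaluate: Callable[[Params], float],
    threshold: float,
    max_rounds: int = 100,  # safety cap, an assumption not found in the source
) -> Tuple[Params, float, int]:
    """Repeat Steps 1-6 until validation accuracy reaches the threshold (Step 7)."""
    accuracy = evaluate(global_params)
    rounds = 0
    while accuracy < threshold and rounds < max_rounds:
        # Steps 1-4 run inside each client callable: receive the global
        # parameters, personalize, train, add noise, and return the upload.
        uploads = [update(list(global_params)) for update in client_updates]
        # Step 5: server-side alignment and fusion.
        global_params = fuse(uploads)
        # Step 6: evaluate the fused model on the validation data.
        accuracy = evaluate(global_params)
        rounds += 1
    return global_params, accuracy, rounds
```

Passing the client and server behavior as callables keeps the loop independent of the model type, so the same skeleton would apply whether the parameters belong to VGG11 or to another network.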

[0015] Step 2, in which each client randomly updates the parameters of the local model proportionally based on the parameter differences between the local model and the global model, includes the following steps.

[0016] Step 2.1: Compute the updated sum of the inner products of the gradient and the weight difference. The accumulated value of the inner product of the current gradient and the weight difference has an initial value of 0. The weight difference is the difference between the weights of a given layer of the global model and the weights of the same layer of the local model. The reference gradient for the partial personalized parameter update is also computed in this step. The gradient of the layer in the local model is initialized to all zeros, and the gradient of the layer in the global model is likewise initialized to all zeros. The mixing parameters used to update the model have an initial value of 0.

[0017] Step 2.2: Compute the updated mixing parameters, using the learning rate set for training the local model.

[0018] Step 2.3: Compute all weights of the updated local model using the current client's normalized random selection matrix, which selects the local parameters that take part in the update and thereby realizes a partial personalized update of the model parameters.
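
The idea behind Step 2.3 can be illustrated with a short sketch. Because the formulas of Steps 2.1 to 2.3 are not reproduced in this text, everything here is an assumption: the function name `partial_personalized_update`, the scalar mixing factor `mix`, and the rule that a randomly selected share `ratio` of weights moves toward the global value while all other weights keep their local value.

```python
import random
from typing import List


def partial_personalized_update(
    local_w: List[float],
    global_w: List[float],
    ratio: float,
    mix: float,
    rng: random.Random,
) -> List[float]:
    """Move a random `ratio` share of local weights toward the global weights."""
    if len(local_w) != len(global_w):
        raise ValueError("local and global weights must have the same length")
    n_selected = round(ratio * len(local_w))
    # Random selection mask: indices that take part in the update.
    selected = set(rng.sample(range(len(local_w)), n_selected))
    # Selected entries are shifted by mix * (global - local); the remaining
    # entries keep their locally trained values unchanged.
    return [
        w + mix * (g - w) if i in selected else w
        for i, (w, g) in enumerate(zip(local_w, global_w))
    ]
```

With `mix = 1.0` the selected weights are replaced by their global counterparts, and with `ratio = 0.0` the local model is left untouched, which mirrors the stated goal of preserving parameters learned from local data.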

[0019] The client-side local dataset in step 3 can undergo some preprocessing based on the characteristics of the local data and the training algorithm used, such as contrast enhancement, saturation enhancement, random cropping, random rotation, and small sample data feature annotation.

[0020] Step 4, in which each client adds differential privacy noise to the personalized updated local model parameters and uploads them to the server, includes the following steps.

[0021] Step 4.1: Define the privacy budget, which controls the risk of privacy leakage and is restricted to a specified value range.

[0022] Step 4.2: Compute the sensitivity as the maximum absolute difference between the outputs of the function on adjacent datasets. The sensitivity measures the maximum effect that a change in a single data item can have on the output. The function is the forward function of the local model after the personalized update, and the two datasets are any pair of adjacent datasets.

[0023] Step 4.3: Compute the noise level from the privacy budget and the sensitivity. The noise level determines the amount of noise added to the output.

[0024] Step 4.4: Add the differential privacy noise to the locally updated model parameters, and upload the resulting personalized local model.
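
Steps 4.1 to 4.4 follow the general shape of the Gaussian mechanism, which can be sketched as follows. This is a hedged illustration: it uses the classical calibration sigma = sensitivity * sqrt(2 ln(1.25 / delta)) / epsilon, valid for 0 < epsilon < 1, and the constant inside the logarithm may differ from the one in the patent's own formula. The names `gaussian_sigma` and `add_dp_noise` are hypothetical.

```python
import math
import random
from typing import List


def gaussian_sigma(sensitivity: float, epsilon: float, delta: float) -> float:
    """Noise scale of the classical Gaussian mechanism (assumed calibration)."""
    if not (0.0 < epsilon < 1.0) or not (0.0 < delta < 1.0):
        raise ValueError("expected 0 < epsilon < 1 and 0 < delta < 1")
    return sensitivity * math.sqrt(2.0 * math.log(1.25 / delta)) / epsilon


def add_dp_noise(params: List[float], sigma: float, rng: random.Random) -> List[float]:
    """Add independent zero-mean Gaussian noise to every parameter before upload."""
    return [p + rng.gauss(0.0, sigma) for p in params]
```

A smaller privacy budget or a larger sensitivity both increase the noise scale, so stronger privacy is paid for with noisier uploads.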

[0025] Step 5, in which the server performs optimal transport alignment on the uploaded local personalized model parameters and performs model fusion based on the activation weight parameters to obtain the server's global model parameters, includes the following steps.

[0026] Step 5.1: Select any two local models and set up forward-propagation hooks for both. Obtain and save the activation values of each layer of the two models separately, and obtain the probability weights of neuron alignment in each layer of the local models, $\alpha_{k-1}^{l}$ and $\alpha_{k}^{l}$, where $l$ is the neuron layer to be aligned and $k$ is the client identifier.

[0027] Step 5.2: According to $\mu_{k-1}^{l} = \alpha_{k-1}^{l} X_{k-1}^{l}$ and $\mu_{k}^{l} = \alpha_{k}^{l} X_{k}^{l}$, define $\mu_{k-1}^{l}$ and $\mu_{k}^{l}$, the probability measures that characterize the similarity between the two models, where $X_{k-1}^{l}$ and $X_{k}^{l}$ are the measure supports of the two models, such as the activation function values of the neurons in the layer.

[0028] Step 5.3: Define the distance cost between neurons based on their features and activation function values. The distance cost matrix is computed from the matrix product of one measure support (transposed) with the other measure support, combined with the squared norm of each point in the two supports.

[0029] Step 5.4: Compute the optimal transport mapping between the neuron-layer measures $\mu_{k-1}^{l}$ and $\mu_{k}^{l}$, that is, the transport mapping that minimizes the cost function.

[0030] Step 5.5: Use the optimal transport mapping to align the neurons of one model with respect to the other, transforming the weights before alignment into the aligned weights.

[0031] Step 5.6: Compute the updated global model parameters from the aligned models.
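
Steps 5.3 to 5.6 can be illustrated for a single layer with the sketch below. It rests on several assumptions that are not in the source: both layers have the same number of neurons with uniform probability weights (in which case an optimal transport plan can be taken to be a permutation), the permutation is found by brute force and so only suits tiny layers, and fusion is a plain average. The names `cost_matrix`, `optimal_permutation`, and `align_and_fuse` are hypothetical. A full implementation would use an optimal transport solver and would also realign the input side of the following layer.

```python
from itertools import permutations
from typing import List, Tuple

Matrix = List[List[float]]


def cost_matrix(x: Matrix, y: Matrix) -> Matrix:
    """Squared Euclidean cost via ||x||^2 + ||y||^2 - 2 x.y for each neuron pair."""
    x_sq = [sum(v * v for v in row) for row in x]
    y_sq = [sum(v * v for v in row) for row in y]
    return [
        [x_sq[i] + y_sq[j] - 2.0 * sum(a * b for a, b in zip(x[i], y[j]))
         for j in range(len(y))]
        for i in range(len(x))
    ]


def optimal_permutation(cost: Matrix) -> Tuple[int, ...]:
    """Exhaustively find the neuron matching with the lowest total cost."""
    n = len(cost)
    return min(
        permutations(range(n)),
        key=lambda p: sum(cost[i][p[i]] for i in range(n)),
    )


def align_and_fuse(w_a: Matrix, w_b: Matrix, act_a: Matrix, act_b: Matrix) -> Matrix:
    """Align the neurons (rows) of model B to model A, then average the weights."""
    perm = optimal_permutation(cost_matrix(act_a, act_b))
    w_b_aligned = [w_b[j] for j in perm]  # row i of B now matches neuron i of A
    return [
        [(a + b) / 2.0 for a, b in zip(row_a, row_b)]
        for row_a, row_b in zip(w_a, w_b_aligned)
    ]
```

Each row of `act_a` and `act_b` holds one neuron's activations over a shared batch of inputs, so neurons that respond alike receive a low cost and are matched before their weights are averaged.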

[0032] Compared with the prior art, the present invention has the following significant advantages.

[0033] (1) When multiple client models take part in federated learning, the present invention takes into account the heterogeneity of the sample data of each local client. The server fuses the global model using optimal transport model fusion, which effectively improves the generalization ability of the model.

[0034] (2) When multiple client models take part in federated learning and the local client sample datasets are not independent and identically distributed, each client updates its local model parameters partially and randomly according to the parameter differences between the local model and the global model, which effectively improves the accuracy of the optimal model.

[0035] (3) When multiple client models take part in federated learning and the local clients belong to different holders, each client uploads its model parameters to the server only after adding differential privacy noise, which effectively ensures the security of model training.

[0036] (4) The method proposed in this invention is applicable to distributed scenarios with multiple clients and one server. The clients and the server work together to carry out federated learning, which makes full use of the local data of distributed clients and completes model training without transferring data permissions.

Attached Figure Description

[0037] Figure 1 is a schematic diagram of the structure of the privacy-preserving optimal transport federated learning method with locally personalized updating of client model parameters according to the present invention.

[0038] Figure 2 is a detailed flowchart and thread communication diagram of the privacy-preserving optimal transport federated learning method with locally personalized updating of client model parameters according to the present invention.

[0039] Figure 3 shows the network structure of the VGG11 model used in a preferred embodiment of the present invention.

[0040] Figure 4 is an iteration curve of the global model accuracy during federated learning training of the VGG11 model for detecting defects in solar panel images, as described in a preferred embodiment of the present invention.

Detailed Implementation

[0041] As a preferred embodiment, the specific implementation process is as follows.

[0042] First, as shown in the schematic diagram of Figure 1, the privacy-preserving optimal transport federated learning method with locally personalized updating of client model parameters according to the present invention involves N clients and 1 server. In a preferred embodiment of the present invention, N=10.

[0043] An untrained VGG11 model is selected as the initial model and stored on the 10 clients and the 1 server. As shown in Figure 3, the VGG11 network structure consists of 8 convolutional layers and 3 fully connected layers. The convolutional layers use consecutive 3x3 convolutional kernels, and stacking multiple convolutional layers increases the network's depth and non-linear expressive power. After each fully connected layer, a ReLU activation function is used to increase non-linearity, and a Softmax function is used for classification after the last fully connected layer.

[0044] Specifically, this includes: (1) Convolutional layers. Convolutional layers can extract various features from solar panel images, such as edges and textures. High-level features in the data are extracted by stacking convolutional layers. Convolutional layers only operate on local regions of the data, and the same convolutional kernel can be used in different locations of the data image, reducing the computational load and the risk of overfitting.

[0045] The VGG11 network used in this invention employs multiple convolutional layers to obtain deeper information from the solar panel image.

[0046] (2) Pooling layer. The pooling layer of the VGG11 model uses 2x2 max pooling with a stride of 2, which can halve the size of the feature map and reduce the amount of computation.

[0047] (3) Fully connected layers and the Softmax function. The fully connected layers of the VGG11 model use the ReLU activation function, which can improve the nonlinearity of the model.

[0048] The ReLU (Rectified Linear Unit) activation function is a simple and efficient activation function that sets all negative input values to zero and leaves positive values unchanged.

[0049] The Softmax function transforms the real-valued vector output from the penultimate layer into a normalized probability distribution, which serves as the output layer of the network in the solar panel image classification learning problem.
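
The building blocks described in (1) to (3) can be checked with a few lines of plain Python. This is an illustrative sketch: the layer list `VGG11_CFG` follows the commonly used VGG11 layout of 8 convolutional layers and 5 max-pooling layers, the convolutions are assumed to use padding 1 so that only pooling changes the spatial size, and the function names are hypothetical.

```python
import math
from typing import List

# Numbers are convolution output channels; "M" marks 2x2 max pooling, stride 2.
VGG11_CFG = [64, "M", 128, "M", 256, 256, "M", 512, 512, "M", 512, 512, "M"]


def spatial_size_after_features(size: int) -> int:
    """Side length of the feature map after all convolution and pooling layers."""
    for layer in VGG11_CFG:
        if layer == "M":
            size //= 2  # each pooling layer halves the feature map
        # 3x3 convolutions with padding 1 (assumed) leave the size unchanged
    return size


def relu(values: List[float]) -> List[float]:
    """Set negative inputs to zero and keep positive inputs unchanged."""
    return [max(0.0, v) for v in values]


def softmax(logits: List[float]) -> List[float]:
    """Turn real-valued scores into a normalized probability distribution."""
    shift = max(logits)  # subtracting the maximum avoids overflow in exp
    exps = [math.exp(v - shift) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]
```

Under these assumptions a 300x300 input is reduced to a 9x9 feature map by the five pooling layers, and a 224x224 input to 7x7.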

[0050] In a preferred embodiment of the present invention, the publicly available ELPV dataset is selected as the training data. This dataset contains 2624 grayscale image samples of 300x300 pixels. The images show solar panels with different degrees of degradation, extracted from 44 different photovoltaic inspection images, and all images have been normalized with respect to size and viewing angle. Image data preprocessing is performed based on the characteristics of the local data and the input requirements of the VGG11 algorithm. The preprocessing steps include contrast enhancement, saturation enhancement, random cropping, random rotation, and feature annotation of small sample data.

[0051] The dataset is split in a non-independent and identically distributed (non-i.i.d.) manner and then distributed randomly and unevenly to the 10 clients, so that each client holds its own local solar panel image dataset with its own imbalanced and heterogeneous characteristics. Each client can communicate bidirectionally with the server. At least three clients are randomly selected to collaborate with the server during training, updating and iterating both the local and global models.
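
One common way to produce such an uneven, non-i.i.d. split is Dirichlet label skew, sketched below. The source does not state which splitting scheme was used, so the Dirichlet approach, the concentration parameter `alpha`, and the function name `non_iid_split` are all assumptions made for illustration.

```python
import random
from collections import defaultdict
from typing import Dict, List, Sequence


def non_iid_split(
    labels: Sequence[int],
    n_clients: int,
    alpha: float,
    rng: random.Random,
) -> List[List[int]]:
    """Assign sample indices to clients with Dirichlet(alpha) label skew."""
    by_class: Dict[int, List[int]] = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)

    clients: List[List[int]] = [[] for _ in range(n_clients)]
    for indices in by_class.values():
        rng.shuffle(indices)
        # A Dirichlet sample equals independent Gamma(alpha, 1) draws, normalized.
        draws = [rng.gammavariate(alpha, 1.0) for _ in range(n_clients)]
        total = sum(draws)
        start, cumulative = 0, 0.0
        for c, draw in enumerate(draws):
            cumulative += draw
            if c == n_clients - 1:
                end = len(indices)  # last client takes whatever remains
            else:
                end = round(len(indices) * cumulative / total)
            clients[c].extend(indices[start:end])
            start = end
    return clients
```

Smaller values of `alpha` concentrate each class on fewer clients and so make the split more heterogeneous. This sketch does not guarantee that every client receives samples, so a practical split would add a minimum-size check.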

[0052] Next, the detailed process and thread communication of the privacy-preserving optimal transport federated learning method with locally personalized updating of client model parameters according to the present invention are shown in Figure 2. The client workflow in a preferred embodiment of the present invention includes the following steps.

[0053] Step 1: Each of the 10 clients receives the VGG11 global model parameters from the server.

[0054] Step 2: Each of the 10 clients randomly updates its local model parameters proportionally based on the differences between the parameters of its local model and the global model. Starting from the second round, the model parameters will change significantly.

[0055] Step 3: Each of the 10 clients uses its own uneven and heterogeneous solar panel image dataset to train and update the parameters of its local model.

[0056] Step 4: All 10 clients will add differential privacy noise to the locally personalized updated model parameters and then upload them to the server.

[0057] Step 5: The 10 clients listen to the server for any new global model transmissions. If yes, they jump to step 1; otherwise, they continue listening.

[0058] A preferred embodiment of the present invention includes the following steps in its server-side workflow.

[0059] Step 1: The server receives local model parameters from 10 clients.

[0060] Step 2: The server performs optimal transport alignment on the 10 uploaded sets of local personalized model parameters, and performs model fusion based on the activation weight parameters to obtain the global model parameters on the server.

[0061] Step 3: The server determines the accuracy of the global model after updating the parameters and evaluates its performance on the validation dataset. If the global model accuracy does not reach the optimal level or does not exceed the threshold, the global model parameters are sent to the client to participate in the next round of random updates after parameter training. Training and updates stop when the global model training accuracy reaches the optimal level or exceeds the threshold.

[0062] The iteration curve of the global model accuracy in a preferred embodiment of the present invention is shown in Figure 4. As this example shows, the privacy-preserving optimal transport federated learning method with locally personalized updating of client model parameters proposed in this invention trains effectively in solar panel defect detection applications. It achieves privacy-preserving federated learning even in complex situations with uneven data distribution, heterogeneity, and different data holders, while maintaining the accuracy and generalization ability of model training. This provides an effective and reliable distributed training scheme for model generation in photovoltaic monitoring and maintenance scenarios.

Claims

1. A privacy-preserving optimal transport federated learning method with locally personalized updating of client model parameters, characterized in that the method includes the following steps: Step 1: Each client receives global model parameters from the server. Step 2: Each client randomly updates a proportion of the parameters of its local model based on the parameter differences between the local model and the global model. Step 3: Each client uses its local dataset to train the partially updated local model and perform personalized parameter updates. Step 4: Each client adds differential privacy noise to the personalized updated local model parameters and uploads them to the server. Step 5: The server performs optimal transport alignment on the uploaded local personalized updated model parameters, and performs model fusion based on the alignment probability weight parameters to obtain the updated global model parameters on the server. Step 6: The accuracy of the updated global model is determined and its performance is evaluated on the validation dataset. If the global model accuracy does not reach the optimal level or does not exceed the threshold, the global model parameters continue to be sent to the clients to participate in the next round of parameter training and updating. Step 7: Training and updating stop when the global model training accuracy reaches the optimal level or exceeds the threshold.

2. In step 2, each client randomly updates a proportion of the parameters of its local model based on the parameter differences between the local model and the global model, which includes the following steps: Step 2.1: Compute the updated sum of the inner products of the gradient and the weight difference. The accumulated value of the inner product of the current gradient and the weight difference has an initial value of 0. The weight difference is the difference between the weights of a given layer of the global model and the weights of the same layer of the local model. The reference gradient for the partial personalized parameter update is also computed in this step. The gradient of the layer in the local model is initialized to all zeros, and the gradient of the layer in the global model is likewise initialized to all zeros. The mixing parameters used to update the model have an initial value of 0; Step 2.2: Compute the updated mixing parameters, using the learning rate set for training the local model; Step 2.3: Compute all weights of the updated local model using the current client's normalized random selection matrix, which selects the local parameters that take part in the update and thereby realizes a partial personalized update of the model parameters.

3. Step 4, in which each client adds differential privacy noise to the personalized updated local model parameters and uploads them to the server, includes the following steps: Step 4.1: Define the privacy budget $\epsilon$, which controls the risk of privacy leakage and is restricted to a specified value range; Step 4.2: Compute the sensitivity as the maximum absolute difference between the outputs of the function on adjacent datasets. The sensitivity measures the maximum effect that a change in a single data item can have on the output. The function $f'_{C_{new}}$ is the forward function of the local model after the personalized update, and $D$ and $D'$ are any pair of adjacent datasets; Step 4.3: Compute the noise level according to $\sigma = \frac{\lvert f'_{C_{new}}(D) - f'_{C_{new}}(D') \rvert}{\epsilon} \cdot \sqrt{2 \ln (2/\delta)}$, which determines the amount of noise added to the output, where $\epsilon$ is the privacy budget and the difference term is the sensitivity; Step 4.4: Add the differential privacy noise to the locally updated model parameters, and upload the resulting personalized local model.

4. In step 5, the server performs optimal transport alignment on the uploaded local personalized model parameters, and performs model fusion based on the activation weight parameters to obtain the server's global model parameters, which includes the following steps: Step 5.1: Select any two local models and set up forward-propagation hooks for both. Obtain and save the activation values of each layer of the two models separately, and obtain the probability weights of neuron alignment in each layer of the local models, $\alpha_{k-1}^{l}$ and $\alpha_{k}^{l}$, where $l$ is the neuron layer to be aligned and $k$ is the client identifier; Step 5.2: According to $\mu_{k-1}^{l} = \alpha_{k-1}^{l} X_{k-1}^{l}$ and $\mu_{k}^{l} = \alpha_{k}^{l} X_{k}^{l}$, define $\mu_{k-1}^{l}$ and $\mu_{k}^{l}$, the probability measures that characterize the similarity between the two models, where $X_{k-1}^{l}$ and $X_{k}^{l}$ are the measure supports of the two models, such as the activation function values of the neurons in the layer; Step 5.3: Define the distance cost between neurons based on their features and activation function values. The distance cost matrix is computed from the matrix product of one measure support (transposed) with the other measure support, combined with the squared norm of each point in the two supports; Step 5.4: Compute the optimal transport mapping between the neuron-layer measures $\mu_{k-1}^{l}$ and $\mu_{k}^{l}$, that is, the transport mapping that minimizes the cost function; Step 5.5: Use the optimal transport mapping to align the neurons of one model with respect to the other, transforming the weights before alignment into the aligned weights; Step 5.6: Compute the updated global model parameters from the aligned models.