Personalized federated learning method and system based on a sharing model

By employing a personalized federated learning approach with shared models, and utilizing unique weight vectors and personalized prototype regularization terms, we can effectively integrate global knowledge with local characteristics. This approach addresses the issues of model complexity and adaptability in scenarios where data is not independent and identically distributed, thereby improving the robustness and efficiency of personalized models.

CN120494124B · Active · Publication Date: 2026-06-26 · CHONGQING ACADEMY OF SCI & TECH


Patent Information

Authority / Receiving Office: CN · China
Patent Type: Patents(China)
Current Assignee / Owner: CHONGQING ACADEMY OF SCI & TECH
Filing Date: 2025-04-15
Publication Date: 2026-06-26


IPC: G06N20/00; G06F18/213; G06F18/24

AI Tagging

Technology Topics

Personalization; Local learning


Similar Technology Patents

Deep neural network based personalized privacy risk dynamic measurement method
CN115809481B: Enhance privacy awareness; Reduce the difficulty of extraction; Digital data protection; Character and pattern recognition; Personalization; Feature vector
An individualized AI training method and system based on multi-modal life data and personality anchor point constraints
CN122413332A: Personalization; Feature extraction
AI toy personalized interaction generation method and system based on multi-modal emotion recognition
CN122417023A: Personalization; Feature extraction
A collaborative robot system for active grid marketing and service process thereof
CN122434575A: Personalization; Customer requirements
Model training method, planning and control information acquisition method, and device
WO2026143440A1: Personalization; Control engineering


AI Technical Summary

Technical Problem

In scenarios where data is not independent and identically distributed, existing personalized federated learning methods struggle to effectively balance global knowledge with local characteristics, leading to increased model complexity, high computational and communication overhead, and insufficient adaptability in dynamically changing mobile environments.

Method used

A personalized federated learning approach using a shared model is adopted. By initializing the shared model on the server side and adaptively aggregating it on the client side, a unique weight vector and a personalized prototype regularization term are used to achieve an effective fusion of global knowledge and local characteristics, reducing state dependencies and communication overhead.

Benefits of technology

It improves the robustness and adaptability of the model in complex scenarios, protects client privacy, and enhances the performance and algorithm efficiency of personalized models.

✦ Generated by Eureka AI based on patent content.


Abstract

The application provides a personalized federated learning method and system based on a shared model, comprising: a server saving and initializing the shared models of the clients, a global model, and global class prototypes, and each client initializing a local learning weight vector; the server sending the shared models of the other clients to a client, the client setting its local shared model to the global model, training it, and uploading the trained shared model and the local class prototypes to the server; and the server calculating the global class prototypes and the global model from all received shared models and local class prototypes. The application solves the knowledge-sharing problem in federated learning, improves the performance of the personalized model, and corrects the bias in local model training on the client. The resulting personalized prototype contains richer global information, so that global knowledge and local characteristics are effectively fused while state dependence and communication overhead are reduced, improving the robustness and adaptability of the model in complex scenarios.


Description

Technical Field

[0001] This invention belongs to the field of federated learning technology and specifically relates to a personalized federated learning method and system based on a shared model.

Background Technology

[0002] Federated learning, as a distributed machine learning framework, enables collaborative training of a global model across multiple clients while protecting user data privacy. However, in scenarios where data is not independently and identically distributed (Non-IID), the global model struggles to cater to the individual needs of each client, leading to performance degradation in local tasks. Therefore, personalized federated learning, by combining global knowledge with local data characteristics to generate customized models for each client, has become a current research hotspot.

[0003] Currently, some studies combine public and private models, allowing each client to personalize the global model. However, this approach struggles to handle situations with significant differences in data distribution among clients. Some studies introduce a personalization layer on top of the global model to achieve personalized learning, but this can increase model complexity, leading to higher computational and communication costs. Another adaptive local aggregation method relies heavily on local data adjustments during aggregation, resulting in slower global model convergence. Separating feature information through conditional policies and controlling these policies for appropriate feature sharing and adjustment between the global and client-side local models can be hampered by data imbalances or inconsistencies, affecting the effectiveness of the conditional policies and consequently the personalization results. Other studies utilize feature alignment and classifier collaboration to achieve personalized federated learning. However, feature alignment can be affected by data heterogeneity and non-independent identically distributed data, leading to poor alignment results. Classifier collaboration can also increase communication overhead, especially with a large number of clients.

[0004] The patent document "Personalized Federated Learning Method Based on Hybrid Expert Model" (CN112560991A) discloses a method that dynamically fuses a global classification layer and a personalized classification layer through a gating mechanism, and optimizes the gating decision using the output of the feature extraction layer. This method balances the contradiction between global and local knowledge to some extent. However, it requires the client to continuously maintain intermediate states during training, resulting in insufficient adaptability in large-scale mobile federated environments and limited expressive power when processing high-dimensional data. In particular, it has compatibility barriers with stateless clients, affecting the personalization effect. The hybrid expert model divides the local model into a feature extraction layer and a classification layer through a layered approach, with the feature extraction layer as a fixed base layer. Essentially, only the classification layer realizes personalization, i.e., a personalized classification layer.

[0005] The patent document "Personalized Federated Learning Method and System for Hybrid Multi-Stage Private Models" (CN117708877A) discloses a method that corrects local training bias through prototype-learned regularization terms and fuses historical models and personalized models by weighting to retain global information. However, its multi-stage hybrid mechanism requires frequent transmission of local prototypes and model parameters, and it is essentially limited to weighted aggregation of the local model from different historical periods. Moreover, the client prototype is generated from a personalized model that is itself trained from the local model. The prototype aggregation process is highly sensitive to the distribution of client data and is prone to introducing bias in class-imbalanced scenarios, affecting the representativeness of the global prototype.

[0006] Hybrid multi-stage private models and hybrid expert models both achieve personalization by continuously improving the local model so that it learns global knowledge. Applying these methods to shared models is meaningless, because a shared model serves merely as a medium for sharing information. Personalized models are formed by weighted aggregation of shared models, a process in which the knowledge of each client, i.e., global knowledge, is already learned. Furthermore, traditional personalized federated learning schemes (such as model fine-tuning and multi-task learning) typically face the following challenges: over-reliance on local fine-tuning causes the model to lose global knowledge and reduces its generalization ability; complex model structures or multi-stage training processes increase client resource consumption; and some methods require clients to retain intermediate states for extended periods, making them difficult to adapt to dynamically changing mobile environments.

[0007] Therefore, there is an urgent need for an efficient and lightweight personalized federated learning method that can effectively integrate global knowledge and local characteristics while reducing state dependence and communication overhead, so as to improve the robustness and adaptability of the model in complex scenarios.

Summary of the Invention

[0008] In view of the shortcomings of the prior art, the purpose of this invention is to provide a personalized federated learning method and system for shared models.

[0009] The personalized federated learning method for shared models provided by the present invention includes:

[0010] Step S1: The server saves and initializes the client's shared model, global model, and global class prototype, while the client initializes its local learning weight vector.

[0011] Step S2: The server sends the shared models of the other clients to one client. The client sets its local shared model to the global model, then trains and updates it. It then uploads the trained shared model and the local class prototypes to the server.

[0012] Step S3: The server calculates the global class prototype and global model based on all received shared models and local class prototypes;

[0013] Repeat steps S2 and S3 to iterate through all clients.

[0014] Preferably, in step S1, the local learning weight vector $w_i = (w_{i,1}, w_{i,2}, \dots, w_{i,N})$ of client i is determined based on the similarity of data distribution among the N clients. The shared model of client i is $\theta_i$, the global model is $\theta_g$, and the global class prototype of category j is $C_j$;

[0015] where $w_{i,j}$ represents the proportion of knowledge that client i plans to learn from the shared model of client j, which is equivalently the contribution of the shared model $\theta_j$ of another client to the personalized model $\hat{\theta}_i$ of client i, i = 1, 2, ..., N, j = 1, 2, ..., N.

[0016] Preferably, in step S2, the server sends to client i the shared models $\theta_j$ of the other clients, the global model $\theta_g$, and the global class prototype $C_j$ of category j. Meanwhile, client i sets its local shared model to the global model, i.e., $\theta_i = \theta_g$.

[0017] The personalized model is divided into a representation layer $g(x; \phi_i)$ and a prediction layer $h(\cdot\,; \psi_i)$;

[0018] where x represents the input space;

[0019] $\phi_i$ represents the parameters of the representation layer;

[0020] $\psi_i$ represents the parameters of the prediction layer.

[0021] Client i updates only its own shared model $\theta_i$ and freezes the shared models $\theta_j$ (j ≠ i) that it downloaded from the other clients.

[0022] Preferably, in step S2, the client i performs the training step, including:

[0023] Step S2.1: Calculate the personalized model $\hat{\theta}_i$:

[0024] $$\hat{\theta}_i = \sum_{j=1}^{N} w_{i,j}\,\theta_j$$

[0025] where $w_{i,j}$ represents the proportion of knowledge that client i plans to learn from the shared model of client j;

[0026] $\theta_j$ represents the shared model of client j.

[0027] Step S2.2: Calculate the local class prototype $c_{i,j}$ of category j in client i:

[0028] $$c_{i,j} = \frac{1}{|D_{i,j}|} \sum_{(x,y) \in D_{i,j}} g(x; \phi_i)$$

[0029] where x represents the input space;

[0030] y represents the label space of the categories;

[0031] $|D_{i,j}|$ represents the number of samples in the local dataset $D_i$ of client i that belong to category j, with $D_{i,j} \subseteq D_i$ denoting those samples;

[0032] $g(x; \phi_i)$ represents feature extraction from a sample (x, y) in client i;

[0033] $\phi_i$ represents the parameters of the representation layer.

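[0034] Step S2.3: Calculate the local empirical loss $\mathcal{L}_i$:

[0035] $$\mathcal{L}_i = \ell\big(f(x; \hat{\theta}_i),\, y\big) + \lambda(r)\,\frac{\mu}{2}\,\lVert \theta_i - \theta_c \rVert^2 + \gamma\,\mathcal{L}_{\mathrm{reg}}, \qquad \mathcal{L}_{\mathrm{reg}} = \frac{1}{K} \sum_{j=1}^{K} d\big(c_{i,j},\, C_j\big)$$

[0036] where $\ell$ represents the classification loss function;

[0037] r represents the current training round;

[0038] λ(r) = (cos(rπ / R) + 1) / 2, which is a monotonically decreasing function of r;

[0039] R represents the total number of training rounds;

[0040] μ represents the coefficient of the proximal term;

[0041] $\theta_c$ represents the proximal center (its defining expression is missing from this text);

[0042] $\gamma$ represents the regularization term coefficient;

[0043] $\mathcal{L}_{\mathrm{reg}}$ represents the regularization loss term, where $d(\cdot,\cdot)$ is a distance between the local class prototype $c_{i,j}$ and the global class prototype;

[0044] $C_j$ represents the global class prototype of category j;

[0045] K represents the number of class prototypes in the current input space x;

[0046] $w_i$ represents the learning weight vector of client i, on which the personalized model $\hat{\theta}_i$ depends;

[0047] $f(x; \hat{\theta}_i)$ represents the output space obtained after the input space x is processed by the personalized model $\hat{\theta}_i$.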

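[0048] Step S2.4: Update the shared model $\theta_i$:

[0049] $$\theta_i \leftarrow \theta_i - \eta\,\nabla_{\theta_i} \mathcal{L}_i$$

[0050] where $\eta$ represents the model learning rate and $\mathcal{L}_i$ is the local empirical loss of step S2.3.

[0051] Step S2.5: Update the learning weight vector $w_i$:

[0052] $$w_i \leftarrow w_i - \eta_w\,\nabla_{w_i} \mathcal{L}_i$$

[0053] where $\eta_w$ represents the weight learning rate.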

[0054] Preferably, step S3 includes:

[0055] Step S3.1: Calculate the global class prototype $C_j$ of category j:

[0056] $$C_j = \frac{1}{|\mathcal{N}_j|} \sum_{i \in \mathcal{N}_j} \frac{|D_{i,j}|}{M_j}\, c_{i,j}$$

[0057] where $\mathcal{N}_j$ represents the set of clients that possess data samples of category j;

[0058] $|\mathcal{N}_j|$ represents the number of clients that have data samples of category j;

[0059] $M_j$ represents the number of samples of category j among all data samples;

[0060] $|D_{i,j}|$ represents the number of samples in the local dataset $D_i$ of client i that belong to category j;

[0061] $c_{i,j}$ represents the local class prototype of category j in client i.

[0062] Step S3.2: Calculate the global model $\theta_g$:

[0063] $$\theta_g = \sum_{i=1}^{N} \frac{n_i}{n}\,\theta_i$$

[0064] where $n_i = |D_i|$ represents the size of the local dataset $D_i$ of client i;

[0065] $\theta_i$ represents the shared model of client i;

[0066] N represents the total number of clients;

[0067] $n = \sum_{i=1}^{N} n_i$, representing the sum of the dataset sizes of the N clients.

[0068] A personalized federated learning system for a shared model, provided by the present invention, includes: a server and N clients.

[0069] The client includes a shared model, a global model, and a global class prototype.

[0070] The server saves and initializes the client's shared model, global model, and global class prototype, while the client initializes its local learning weight vector.

[0071] The server sends the shared models of the other clients to one client. The client sets its local shared model to the global model, trains and updates it, and uploads the trained shared model and the local class prototypes to the server.

[0072] The server calculates the global class prototype and global model based on all received shared models and local class prototypes.

[0073] The server iterates through all clients, repeatedly sending shared models, training updates, and computations.

[0074] Preferably, the local learning weight vector $w_i = (w_{i,1}, w_{i,2}, \dots, w_{i,N})$ of each of the N clients is determined based on the similarity of their data distributions. The shared model of client i is $\theta_i$, the global model is $\theta_g$, and the global class prototype of category j is $C_j$;

[0075] where $w_{i,j}$ represents the proportion of knowledge that client i plans to learn from the shared model of client j, which is equivalently the contribution of the shared model $\theta_j$ of another client to the personalized model $\hat{\theta}_i$ of client i, i = 1, 2, ..., N, j = 1, 2, ..., N.

[0076] Preferably, the server sends to client i the shared models $\theta_j$ of the other clients, the global model $\theta_g$, and the global class prototype $C_j$ of category j. Meanwhile, client i sets its local shared model to the global model, i.e., $\theta_i = \theta_g$.

[0077] The personalized model is divided into a representation layer $g(x; \phi_i)$ and a prediction layer $h(\cdot\,; \psi_i)$;

[0078] where x represents the input space;

[0079] $\phi_i$ represents the parameters of the representation layer;

[0080] $\psi_i$ represents the parameters of the prediction layer.

[0081] Client i updates only its own shared model $\theta_i$ and freezes the shared models $\theta_j$ (j ≠ i) that it downloaded from the other clients.

[0082] Preferably, the client i triggers the training module, including:

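[0083] Module M2.1: Calculate the personalized model $\hat{\theta}_i$:

[0084] $$\hat{\theta}_i = \sum_{j=1}^{N} w_{i,j}\,\theta_j$$

[0085] where $w_{i,j}$ represents the proportion of knowledge that client i plans to learn from the shared model of client j;

[0086] $\theta_j$ represents the shared model of client j.

[0087] Module M2.2: Calculate the local class prototype $c_{i,j}$ of category j in client i:

[0088] $$c_{i,j} = \frac{1}{|D_{i,j}|} \sum_{(x,y) \in D_{i,j}} g(x; \phi_i)$$

[0089] where x represents the input space;

[0090] y represents the label space of the categories;

[0091] $|D_{i,j}|$ represents the number of samples in the local dataset $D_i$ of client i that belong to category j, with $D_{i,j} \subseteq D_i$ denoting those samples;

[0092] $g(x; \phi_i)$ represents feature extraction from a sample (x, y) in client i;

[0093] $\phi_i$ represents the parameters of the representation layer.

[0094] Module M2.3: Calculate the local empirical loss $\mathcal{L}_i$:

[0095] $$\mathcal{L}_i = \ell\big(f(x; \hat{\theta}_i),\, y\big) + \lambda(r)\,\frac{\mu}{2}\,\lVert \theta_i - \theta_c \rVert^2 + \gamma\,\mathcal{L}_{\mathrm{reg}}, \qquad \mathcal{L}_{\mathrm{reg}} = \frac{1}{K} \sum_{j=1}^{K} d\big(c_{i,j},\, C_j\big)$$

[0096] where $\ell$ represents the classification loss function;

[0097] r represents the current training round;

[0098] λ(r) = (cos(rπ / R) + 1) / 2, which is a monotonically decreasing function of r;

[0099] R represents the total number of training rounds;

[0100] μ represents the coefficient of the proximal term;

[0101] $\theta_c$ represents the proximal center (its defining expression is missing from this text);

[0102] $\gamma$ represents the regularization term coefficient;

[0103] $\mathcal{L}_{\mathrm{reg}}$ represents the regularization loss term, where $d(\cdot,\cdot)$ is a distance between the local class prototype $c_{i,j}$ and the global class prototype;

[0104] $C_j$ represents the global class prototype of category j;

[0105] K represents the number of class prototypes in the current input space x;

[0106] $w_i$ represents the learning weight vector of client i, on which the personalized model $\hat{\theta}_i$ depends;

[0107] $f(x; \hat{\theta}_i)$ represents the output space obtained after the input space x is processed by the personalized model $\hat{\theta}_i$.

[0108] Module M2.4: Update the shared model $\theta_i$:

[0109] $$\theta_i \leftarrow \theta_i - \eta\,\nabla_{\theta_i} \mathcal{L}_i$$

[0110] where $\eta$ represents the model learning rate.

[0111] Module M2.5: Update the learning weight vector $w_i$:

[0112] $$w_i \leftarrow w_i - \eta_w\,\nabla_{w_i} \mathcal{L}_i$$

[0113] where $\eta_w$ represents the weight learning rate.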

[0114] Preferably, the server calculates the global class prototype and global model including:

[0115] Module M3.1: Calculate the global class prototype $C_j$ of category j:

[0116] $$C_j = \frac{1}{|\mathcal{N}_j|} \sum_{i \in \mathcal{N}_j} \frac{|D_{i,j}|}{M_j}\, c_{i,j}$$

[0117] where $\mathcal{N}_j$ represents the set of clients that possess data samples of category j;

[0118] $|\mathcal{N}_j|$ represents the number of clients that have data samples of category j;

[0119] $M_j$ represents the number of samples of category j among all data samples;

[0120] $|D_{i,j}|$ represents the number of samples in the local dataset $D_i$ of client i that belong to category j;

[0121] $c_{i,j}$ represents the local class prototype of category j in client i.

[0122] Module M3.2: Calculate the global model $\theta_g$:

[0123] $$\theta_g = \sum_{i=1}^{N} \frac{n_i}{n}\,\theta_i$$

[0124] where $n_i = |D_i|$ represents the size of the local dataset $D_i$ of client i;

[0125] $\theta_i$ represents the shared model of client i;

[0126] N represents the total number of clients;

[0127] $n = \sum_{i=1}^{N} n_i$, representing the sum of the dataset sizes of the N clients.

[0128] Compared with the prior art, the present invention has the following beneficial effects:

[0129] 1. The personalized model of this invention is formed by aggregating the shared models with a unique set of weights. Because the personalized model is obtained through adaptive aggregation on the client side, client privacy is protected, the knowledge-sharing problem in federated learning is solved, and the performance of personalized models is improved.

[0130] 2. This invention corrects the bias in local model training on the client side by using a personalized prototype regularization module. The resulting personalized prototype contains richer global information, enabling the model to learn more global knowledge and achieve better algorithm performance.

[0131] 3. This invention reduces state dependencies and communication overhead while effectively integrating global knowledge with local characteristics, breaking through the limitation of locality. The entire model is personalized, which improves the robustness and adaptability of the model in complex scenarios.

Description of the Drawings

[0132] Other features, objects, and advantages of the present invention will become more apparent from the following detailed description of non-limiting embodiments with reference to the accompanying drawings:

[0133] Figure 1 is a schematic diagram of the personalized federated learning method for shared models.

Detailed Implementation

[0134] The present invention will now be described in detail with reference to specific embodiments. These embodiments will help those skilled in the art to further understand the present invention, but do not limit the invention in any way. It should be noted that those skilled in the art can make several changes and improvements without departing from the concept of the present invention. These all fall within the protection scope of the present invention.

[0135] To address the statistical heterogeneity issue in federated learning, a personalized federated learning method with a shared model is proposed. Each client trains a shared model rich in personalized and global information under the guidance of a local personalized prototype. At the same time, using its own unique set of local learning weights, each client adaptively learns from the other shared models, thereby personalizing the model for each client. As shown in Figure 1, the method specifically includes:

[0136] Step S1: The server saves and initializes the shared models $\theta_i$ of the N clients, the global model $\theta_g$, and the global class prototypes $C_j$. Client i (i = 1, 2, …, N) initializes its local learning weight vector $w_i = (w_{i,1}, w_{i,2}, \dots, w_{i,N})$.

[0137] Specifically, $w_i$ is determined based on the similarity of data distribution among clients: $w_{i,j}$ is the proportion of knowledge that client i plans to learn from the shared model of client j, which is also the degree to which the shared model $\theta_j$ contributes to the personalized model $\hat{\theta}_i$ of client i.

[0138] Step S2: The server sends to client i the shared models $\theta_j$ of the other clients, the global model $\theta_g$, and the global class prototype $C_j$ of category j. Meanwhile, client i sets its local shared model to the global model, i.e., $\theta_i = \theta_g$. Then, client i performs the following training steps:

[0139] Step S2.1: Calculate the personalized model $\hat{\theta}_i$. The specific calculation expression is as follows:

[0140] $$\hat{\theta}_i = \sum_{j=1}^{N} w_{i,j}\,\theta_j$$

[0141] In the hybrid multi-stage private model, the current local model and the historical local models are weighted and mixed to obtain a new model, which is then trained on local data to obtain a personalized model. In the present method, by contrast, the personalized model is obtained by adaptively weighting and aggregating the shared models of all clients, which overcomes the limitation of locality. Furthermore, personalization is performed at the level of the whole model, so the entire model is personalized.
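The weighted aggregation of step S2.1 can be illustrated with a minimal sketch. It assumes each shared model is flattened into a plain list of floats; the function name `aggregate_personalized` and the toy values are hypothetical and do not come from the patent.

```python
def aggregate_personalized(weights, shared_models):
    """Return sum_j weights[j] * shared_models[j], computed element-wise.

    weights       -- learning weight vector of one client (one entry per client)
    shared_models -- list of flattened shared models, all of the same length
    """
    if len(weights) != len(shared_models):
        raise ValueError("need exactly one weight per shared model")
    dim = len(shared_models[0])
    personalized = [0.0] * dim
    for w, theta in zip(weights, shared_models):
        for k in range(dim):
            personalized[k] += w * theta[k]
    return personalized


# Toy example: three clients, two parameters per model (illustrative values).
shared = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]
w_i = [0.5, 0.25, 0.25]
personalized = aggregate_personalized(w_i, shared)
print(personalized)  # [2.5, 3.5]
```

Because the weight vector stays on the client, the server only ever sees the shared models, never the aggregated result.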

[0142] Step S2.2: Calculate the local class prototype $c_{i,j}$ of category j. The specific calculation expression is as follows:

[0143] $$c_{i,j} = \frac{1}{|D_{i,j}|} \sum_{(x,y) \in D_{i,j}} g(x; \phi_i)$$

[0144] where x represents the input space and y represents the label space of the categories; $|D_{i,j}|$ is the number of samples in the local dataset $D_i$ of client i that belong to category j (the subset $D_{i,j}$); and $g(x; \phi_i)$ represents feature extraction on a sample (x, y) in client i, where $\phi_i$ are the representation-layer parameters.

[0145] The personalized model is divided into a representation layer $g(x; \phi_i)$ and a prediction layer $h(\cdot\,; \psi_i)$, where x represents the input space, $\phi_i$ the representation-layer parameters, and $\psi_i$ the prediction-layer parameters. The local class prototype is therefore computed by the representation layer of the personalized model, which is itself aggregated from the shared models of all clients. As a result, the personalized prototype contains richer global information and provides better regularization.
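A minimal sketch of the prototype computation in step S2.2, assuming a prototype is the per-class mean of the representation-layer features. The feature extractor `toy_features` is a hypothetical stand-in for the representation layer, and the data values are illustrative only.

```python
def local_class_prototypes(samples, extract_features):
    """Average the extracted features of all samples sharing a label.

    samples          -- iterable of (x, y) pairs from one client's local dataset
    extract_features -- callable standing in for the representation layer
    Returns (prototypes, counts), both keyed by class label.
    """
    sums, counts = {}, {}
    for x, y in samples:
        feat = extract_features(x)
        if y not in sums:
            sums[y] = [0.0] * len(feat)
            counts[y] = 0
        for k, value in enumerate(feat):
            sums[y][k] += value
        counts[y] += 1
    prototypes = {y: [s / counts[y] for s in sums[y]] for y in sums}
    return prototypes, counts


def toy_features(x):
    # Hypothetical fixed feature map; a real system would run the representation layer.
    return [x[0] + x[1], x[0] - x[1]]


data = [([1.0, 1.0], 0), ([3.0, 1.0], 0), ([0.0, 2.0], 1)]
protos, counts = local_class_prototypes(data, toy_features)
print(protos)  # {0: [3.0, 1.0], 1: [2.0, -2.0]}
print(counts)  # {0: 2, 1: 1}
```

Returning the per-class counts alongside the prototypes is a design choice of this sketch: the server-side aggregation needs the number of samples each client holds per category.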

[0146] Step S2.3: Calculate the local empirical loss $\mathcal{L}_i$. The specific calculation expression is as follows:

[0147] $$\mathcal{L}_i = \ell\big(f(x; \hat{\theta}_i),\, y\big) + \lambda(r)\,\frac{\mu}{2}\,\lVert \theta_i - \theta_c \rVert^2 + \gamma\,\mathcal{L}_{\mathrm{reg}}, \qquad \mathcal{L}_{\mathrm{reg}} = \frac{1}{K} \sum_{j=1}^{K} d\big(c_{i,j},\, C_j\big)$$

[0148] where $\ell$ represents the classification loss function, such as cross-entropy loss, and r is the current training round. μ and λ(r) are the proximal term coefficients, where λ(r) is a monotonically decreasing function of r, defined as λ(r) = (cos(rπ / R) + 1) / 2, with R the total number of training rounds. $\theta_c$ is the proximal center (its defining expression is missing from this text), and $\gamma$ is the regularization coefficient. $\mathcal{L}_{\mathrm{reg}}$ is the regularization loss term, which is measured by a distance $d(\cdot,\cdot)$ between the local class prototype and the global class prototype. $c_{i,j}$ represents the local class prototype of category j in client i, $C_j$ the global class prototype of category j, K the number of class prototypes in the current input space x, and $w_i$ the learning weight vector of client i. $f(x; \hat{\theta}_i)$ represents the output space obtained after the input space x is processed by the personalized model $\hat{\theta}_i$.
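The sketch below shows how the three parts of the local loss in step S2.3 might be combined. Only the cosine schedule λ(r) = (cos(rπ / R) + 1) / 2 is stated explicitly in the text; the squared-norm proximal term, the Euclidean prototype distance, and every function and argument name here are assumptions made for illustration.

```python
import math


def lam(r, R):
    """Cosine schedule from the text: 1 at r = 0, decaying to 0 at r = R."""
    return (math.cos(r * math.pi / R) + 1) / 2


def prototype_reg(local_protos, global_protos):
    """Mean Euclidean distance between local and global prototypes (assumed distance)."""
    common = [j for j in local_protos if j in global_protos]
    if not common:
        return 0.0
    total = 0.0
    for j in common:
        total += math.sqrt(sum((a - b) ** 2
                               for a, b in zip(local_protos[j], global_protos[j])))
    return total / len(common)


def local_loss(cls_loss, theta_i, theta_c, local_protos, global_protos, r, R, mu, gamma):
    """Classification loss + scheduled proximal term + prototype regularization."""
    prox = 0.5 * mu * sum((a - b) ** 2 for a, b in zip(theta_i, theta_c))
    return cls_loss + lam(r, R) * prox + gamma * prototype_reg(local_protos, global_protos)


# Illustrative numbers only.
loss = local_loss(cls_loss=1.0, theta_i=[1.0, 2.0], theta_c=[0.0, 0.0],
                  local_protos={0: [3.0, 4.0]}, global_protos={0: [0.0, 0.0]},
                  r=0, R=10, mu=0.1, gamma=0.2)
print(round(loss, 6))  # 2.25
```

With this schedule the proximal pull is strongest in early rounds and vanishes by the final round, so later rounds are driven by the classification loss and the prototype term.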

[0149] Step S2.4: Update the shared model $\theta_i$. The specific update expression is:

[0150] $$\theta_i \leftarrow \theta_i - \eta\,\nabla_{\theta_i} \mathcal{L}_i$$

[0151] where $\eta$ represents the model learning rate and $\mathcal{L}_i$ is the local empirical loss of step S2.3. Note that during the local update phase, client i updates only its own shared model $\theta_i$ and freezes the other downloaded shared models $\theta_j$ (j ≠ i).

[0152] Steps S2.1-S2.5 all belong to the local update phase of client i. The assignment $\theta_i = \theta_g$ is an initialization performed on the client before the local update phase, and step S2.4 is the update performed during the local update phase, starting from that initialization. The shared models $\theta_j$ (j ≠ i) that client i downloads are those of the other clients; client i neither trains nor updates them.

[0153] Step S2.5: Update the learning weight vector $w_i$. The specific update expression is:

[0154] $$w_i \leftarrow w_i - \eta_w\,\nabla_{w_i} \mathcal{L}_i$$

[0155] where $\eta_w$ represents the weight learning rate.

[0156] The personalized model adaptively determines how much to learn from each other client based on the similarity between the local data distribution and the data distributions of the other clients. The personalized model is thus formed by aggregating the shared models with a unique set of weights, which avoids blindly learning global knowledge and the training divergence this can cause. This set of weights is stored only locally on the client, so other clients cannot obtain the client's personalized model, which protects client privacy.
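To make the local update phase concrete, here is a hedged sketch of one gradient step that updates only the client's own shared model and its weight vector while leaving the downloaded models frozen. The quadratic toy loss (half the squared distance between the aggregated model and a `target` vector) is a hypothetical stand-in for the real local loss, chosen so the gradients can be written by hand; computing both gradients from the same forward pass is also an assumption.

```python
def local_step(i, weights, shared_models, target, eta, eta_w):
    """One gradient step on a toy loss 0.5 * ||sum_j w_j * theta_j - target||^2.

    Only shared_models[i] and the weight vector change; the other shared
    models are copied through untouched (frozen).
    """
    n, dim = len(shared_models), len(target)
    personalized = [sum(weights[j] * shared_models[j][k] for j in range(n))
                    for k in range(dim)]
    residual = [p - t for p, t in zip(personalized, target)]
    # Chain rule: d(loss)/d(theta_i) = w_i * residual; d(loss)/d(w_j) = <theta_j, residual>.
    grad_theta_i = [weights[i] * res for res in residual]
    grad_w = [sum(shared_models[j][k] * residual[k] for k in range(dim))
              for j in range(n)]
    new_models = [list(m) for m in shared_models]
    new_models[i] = [shared_models[i][k] - eta * grad_theta_i[k] for k in range(dim)]
    new_weights = [weights[j] - eta_w * grad_w[j] for j in range(n)]
    return new_weights, new_models


shared = [[1.0, 0.0], [0.0, 1.0]]
new_w, new_models = local_step(0, [0.5, 0.5], shared, target=[1.0, 0.0],
                               eta=0.5, eta_w=0.5)
print(new_models)  # [[1.125, -0.125], [0.0, 1.0]]
print(new_w)       # [0.75, 0.25]
```

In the toy run the weight on the client's own model grows because that model is closer to the target. The sketch does not renormalize or clip the weights; whether the patented method does so is not stated in the text.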

[0157] Step S2.6: Upload the shared model $\theta_i$ and the local class prototypes $c_{i,j}$ to the server.

[0158] Step S3: The server calculates the global class prototype and global model, and performs the following steps:

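[0159] Step S3.1: Calculate the global class prototype $C_j$ of category j. The specific calculation expression is as follows:

[0160] $$C_j = \frac{1}{|\mathcal{N}_j|} \sum_{i \in \mathcal{N}_j} \frac{|D_{i,j}|}{M_j}\, c_{i,j}$$

[0161] where $\mathcal{N}_j$ is the set of clients that have data samples of category j, $|\mathcal{N}_j|$ is the number of such clients, $M_j$ is the number of samples of category j among all data samples, $|D_{i,j}|$ is the number of samples of category j in the local dataset of client i, and $c_{i,j}$ is the local class prototype of category j in client i.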
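The server-side prototype aggregation of step S3.1 could be sketched as follows. The sketch assumes a rule in which each client's prototype is weighted by its share of the class-j samples and the weighted sum is then divided by the number of clients holding that class; the function name and the dictionary layout are hypothetical.

```python
def global_class_prototypes(client_protos, client_counts):
    """Aggregate per-client class prototypes into global class prototypes.

    client_protos[i][j] -- local prototype of class j on client i
    client_counts[i][j] -- number of class-j samples on client i
    """
    classes = set()
    for protos in client_protos:
        classes.update(protos)
    result = {}
    for j in classes:
        holders = [i for i, protos in enumerate(client_protos) if j in protos]
        total_j = sum(client_counts[i][j] for i in holders)
        dim = len(client_protos[holders[0]][j])
        weighted = [0.0] * dim
        for i in holders:
            share = client_counts[i][j] / total_j
            for k in range(dim):
                weighted[k] += share * client_protos[i][j][k]
        # Assumed extra factor: divide by the number of clients that hold class j.
        result[j] = [v / len(holders) for v in weighted]
    return result


client_protos = [{0: [2.0, 0.0], 1: [4.0, 4.0]}, {0: [6.0, 8.0]}]
client_counts = [{0: 1, 1: 2}, {0: 3}]
global_protos = global_class_prototypes(client_protos, client_counts)
print(global_protos[0])  # [2.5, 3.0]
print(global_protos[1])  # [4.0, 4.0]
```

Classes held by a single client pass through unchanged, while classes held by several clients are pulled toward the clients with more samples of that class.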

[0162] Step S3.2: Calculate the global model $\theta_g$. The specific calculation expression is as follows:

[0163] $$\theta_g = \sum_{i=1}^{N} \frac{n_i}{n}\,\theta_i$$

[0164] where $n_i = |D_i|$ represents the size of the local dataset $D_i$ of client i, $\theta_i$ represents the shared model of client i, N represents the total number of clients, and $n = \sum_{i=1}^{N} n_i$ represents the sum of the dataset sizes of the N clients.
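Step S3.2 amounts to a dataset-size-weighted average of the uploaded shared models. A minimal sketch, again assuming flattened parameter lists and using hypothetical names and values:

```python
def global_model(shared_models, dataset_sizes):
    """Weighted average of shared models, each weighted by its client's dataset size."""
    n = sum(dataset_sizes)
    dim = len(shared_models[0])
    return [sum(n_i * theta[k] for n_i, theta in zip(dataset_sizes, shared_models)) / n
            for k in range(dim)]


# Two clients holding 1 and 3 samples respectively (illustrative values).
theta_g = global_model([[1.0, 2.0], [3.0, 6.0]], [1, 3])
print(theta_g)  # [2.5, 5.0]
```

Clients with more local data contribute proportionally more to the global model, which is what the ratio of each client's dataset size to the total expresses.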

[0165] Repeat steps S2 and S3 above. From the perspective of a single client, this process is continuously looped to train and obtain a better model. From a global perspective, in each round of training, the server simultaneously distributes the model to all clients, and all clients simultaneously perform this local training. This process is repeated in the next round of training.

[0166] By incorporating personalized prototype regularization, the model learns more global knowledge. Furthermore, the personalized prototype contains richer knowledge to guide the training of the shared model. Therefore, a high-performance personalized model can be obtained, achieving superior algorithmic performance.

[0167] The present invention also provides a personalized federated learning system for a shared model, which can be implemented by executing the process steps of the personalized federated learning method for the shared model. That is, those skilled in the art can understand the personalized federated learning method for the shared model as a preferred embodiment of the personalized federated learning system for the shared model.

[0168] A personalized federated learning system for a shared model, provided by the present invention, includes: a server and N clients.

[0169] The client includes a shared model, a global model, and a global class prototype.

[0170] The server saves and initializes the client's shared model, global model, and global class prototype, while the client initializes its local learning weight vector.

[0171] The server sends the shared models of the other clients to one client. The client sets its local shared model to the global model, trains and updates it, and uploads the trained shared model and the local class prototypes to the server.

[0172] The server calculates the global class prototype and global model based on all received shared models and local class prototypes.

[0173] The server iterates through all clients, repeatedly sending shared models, training updates, and computations.

[0174] In more preferred embodiments, the local learning weight vector $w_i = (w_{i,1}, w_{i,2}, \dots, w_{i,N})$ of each of the N clients is determined based on the similarity of data distribution among the clients. The shared model of client i is $\theta_i$, the global model is $\theta_g$, and the global class prototype of category j is $C_j$;

[0175] where $w_{i,j}$ represents the proportion of knowledge that client i plans to learn from the shared model of client j, which is equivalently the contribution of the shared model $\theta_j$ of another client to the personalized model $\hat{\theta}_i$ of client i, i = 1, 2, ..., N, j = 1, 2, ..., N.

[0176] In more preferred embodiments, the server sends to client i the shared models $\theta_j$ of the other clients, the global model $\theta_g$, and the global class prototype $C_j$ of category j. Meanwhile, client i sets its local shared model to the global model, i.e., $\theta_i = \theta_g$.

[0177] The personalized model is divided into a representation layer $g(x; \phi_i)$ and a prediction layer $h(\cdot\,; \psi_i)$;

[0178] where x represents the input space;

[0179] $\phi_i$ represents the parameters of the representation layer;

[0180] $\psi_i$ represents the parameters of the prediction layer.

[0181] Client i updates only its own shared model $\theta_i$ and freezes the shared models $\theta_j$ (j ≠ i) that it downloaded from the other clients.

[0182] In more preferred embodiments, the client i triggers the training module, including:

[0183] Module M2.1, Calculating Personalized Models :

[0184] =

[0185] in, This represents the proportion of knowledge that client i plans to learn from the shared model of client j;

[0186] This represents a shared model.

[0187] Module M2.2 calculates the local class prototype of category j in client i. :

[0188] =

[0189] Where x represents the input space;

[0190] y represents the label space of the category;

[0191] Represents the local dataset of client i The number of samples belonging to category j;

[0192] g( This indicates feature extraction from sample (x, y) in client i;

[0193] These are the parameters for the presentation layer.

[0194] Module M2.3: calculate the local empirical loss $\mathcal{L}_i$:

[0195] $\mathcal{L}_i = \mathcal{L}_S + \lambda(r)\,\mathcal{L}_P + \beta\,\mathcal{L}_R$

[0196] where $\mathcal{L}_S$ represents the classification loss function;

[0197] r represents the current training round;

[0198] λ(r) = (cos(rπ / R) + 1) / 2, which is a monotonically decreasing function of r over the training rounds;

[0199] R represents the total number of training rounds;

[0200] μ represents the coefficient of the proximal term $\mathcal{L}_P$;

[0201] the proximal term $\mathcal{L}_P$ is measured from the proximal center;

[0202] β represents the regularization term coefficient;

[0203] $\mathcal{L}_R$ represents the regularization loss term;

[0204] $C^j$ represents the global class prototype of category j;

[0205] the regularization loss term also uses the number of class prototypes in the current input space x;

[0206] $w_i$ represents the learned weight vector of client i;

[0207] the loss is evaluated on the output space obtained when the input space x is processed by the personalized model $\theta_i$.
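The schedule λ(r) = (cos(rπ / R) + 1) / 2 stated in Module M2.3 can be evaluated directly. The short sketch below only tabulates that formula; the function name `proximal_schedule` and the choice R = 10 are illustrative assumptions.

```python
import math


def proximal_schedule(r: int, total_rounds: int) -> float:
    """Cosine decay lambda(r) = (cos(r * pi / R) + 1) / 2 from Module M2.3."""
    if total_rounds <= 0:
        raise ValueError("total_rounds must be positive")
    return (math.cos(r * math.pi / total_rounds) + 1.0) / 2.0


R = 10  # illustrative number of training rounds
for r in range(R + 1):
    print(r, round(proximal_schedule(r, R), 4))
```

The weight starts at 1 in round 0 and decays to 0 in round R, so the proximal term is applied most strongly early in training and its influence fades as training proceeds.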

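[0208] Module M2.4: update the shared model $\phi_i$:

[0209] $\phi_i \leftarrow \phi_i - \eta_\phi \nabla_{\phi_i} \mathcal{L}_i$

[0210] where $\eta_\phi$ represents the model learning rate and $\mathcal{L}_i$ is the local empirical loss of Module M2.3.

[0211] Module M2.5: update the learned weight vector $w_i$:

[0212] $w_i \leftarrow w_i - \eta_w \nabla_{w_i} \mathcal{L}_i$

[0213] where $\eta_w$ represents the weight learning rate.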
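Modules M2.4 and M2.5 use two separate learning rates, one for the shared model and one for the learned weight vector. The sketch below assumes both updates are plain gradient-descent steps, which the text does not state explicitly; the helper `gradient_step` and the gradient values are placeholders, since in practice the gradients of the local loss would come from an automatic-differentiation framework.

```python
from typing import List


def gradient_step(params: List[float], grads: List[float], lr: float) -> List[float]:
    """One plain gradient-descent step: p <- p - lr * g (assumed update form)."""
    if len(params) != len(grads):
        raise ValueError("params and grads must have the same length")
    return [p - lr * g for p, g in zip(params, grads)]


# Client i's own shared model and learned weight vector (toy values).
phi_i = [1.0, 2.0]
w_i = [0.5, 0.25, 0.25]

# Placeholder gradients of the local loss; hypothetical numbers.
grad_phi_i = [0.5, -1.0]
grad_w_i = [1.0, -1.0, 0.0]

model_lr = 0.5    # model learning rate (M2.4)
weight_lr = 0.25  # weight learning rate (M2.5)

phi_i = gradient_step(phi_i, grad_phi_i, model_lr)
w_i = gradient_step(w_i, grad_w_i, weight_lr)
print(phi_i)  # [0.75, 2.5]
print(w_i)    # [0.25, 0.5, 0.25]
```

Only client i's own shared model is passed to the update; the downloaded shared models of other clients are never stepped, which is how the freezing described for client i would show up in code.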

[0214] In more preferred embodiments, the server calculates the global class prototype and the global model as follows:

[0215] Module M3.1: calculate the global class prototype $C^j$ of category j:

[0216] $C^j = \frac{1}{|\mathcal{N}_j|} \sum_{i \in \mathcal{N}_j} \frac{|D_i^j|}{n^j}\, c_i^j$

[0217] where $\mathcal{N}_j$ represents the set of clients that possess data samples of category j;

[0218] $|\mathcal{N}_j|$ represents the number of clients that have data samples of category j;

[0219] $n^j$ represents the number of samples of category j among all data samples;

[0220] $|D_i^j|$ represents the number of samples in the local dataset $D_i$ of client i that belong to category j;

[0221] $c_i^j$ represents the local class prototype of category j in client i.
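A minimal sketch of the server-side aggregation in Module M3.1 follows. It assumes the global prototype is the sum of local prototypes, each scaled by its share of the category's samples and then divided by the number of contributing clients, which is one reading consistent with the quantities listed above. The function name `global_prototype` and the toy inputs are hypothetical.

```python
from typing import Dict, List


def global_prototype(local_protos: Dict[int, List[float]],
                     class_counts: Dict[int, int]) -> List[float]:
    """Aggregate local prototypes of one category (assumed form of M3.1).

    local_protos maps client id -> local prototype of the category;
    class_counts maps client id -> number of local samples of the category.
    """
    clients = [i for i in local_protos if class_counts.get(i, 0) > 0]
    if not clients:
        raise ValueError("no client holds samples of this category")
    total = sum(class_counts[i] for i in clients)  # samples of the category overall
    dim = len(local_protos[clients[0]])
    proto = [0.0] * dim
    for i in clients:
        scale = class_counts[i] / (total * len(clients))
        for k in range(dim):
            proto[k] += scale * local_protos[i][k]
    return proto


# Two clients hold the category: 1 sample and 3 samples respectively.
print(global_prototype({0: [2.0, 4.0], 1: [6.0, 8.0]}, {0: 1, 1: 3}))
# [2.5, 3.5]
```

Under this assumed form, clients with more samples of the category pull the global prototype further toward their local prototype, and clients without the category are excluded entirely.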

[0222] Module M3.2: calculate the global model $\Phi$:

[0223] $\Phi = \sum_{i=1}^{N} \frac{|D_i|}{n}\, \phi_i$

[0224] where $|D_i|$ represents the size of the local dataset $D_i$ of client i;

[0225] $\phi_i$ represents the shared model of client i;

[0226] N represents the total number of clients;

[0227] $n = \sum_{i=1}^{N} |D_i|$, representing the sum of the dataset sizes of the N clients.
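The global model of Module M3.2 can be sketched as a dataset-size-weighted average of the clients' shared models, assuming each shared model is flattened to a list of floats. The function name `global_model` and the toy values are illustrative assumptions.

```python
from typing import List


def global_model(shared_models: List[List[float]],
                 dataset_sizes: List[int]) -> List[float]:
    """Average shared models, weighting client i by |D_i| / n (assumed form of M3.2)."""
    if len(shared_models) != len(dataset_sizes):
        raise ValueError("one dataset size is required per shared model")
    n = sum(dataset_sizes)  # total number of samples over all clients
    if n <= 0:
        raise ValueError("total dataset size must be positive")
    dim = len(shared_models[0])
    merged = [0.0] * dim
    for phi_i, size in zip(shared_models, dataset_sizes):
        for k in range(dim):
            merged[k] += (size / n) * phi_i[k]
    return merged


# Two clients with 1 and 3 local samples.
print(global_model([[2.0, 4.0], [6.0, 8.0]], [1, 3]))  # [5.0, 7.0]
```

Because the weights |D_i| / n sum to 1, the result stays on the same scale as the individual shared models.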

[0228] Those skilled in the art will understand that, besides implementing the system and its various devices, modules, and units provided by this invention in the form of purely computer-readable program code, the same functions can be achieved entirely through logical programming of the method steps, making the system and its various devices, modules, and units of this invention function in the form of logic gates, switches, application-specific integrated circuits, programmable logic controllers, and embedded microcontrollers. Therefore, the system and its various devices, modules, and units provided by this invention can be considered as a hardware component, and the devices, modules, and units included therein for implementing various functions can also be considered as structures within the hardware component; alternatively, the devices, modules, and units for implementing various functions can be considered as both software modules implementing the method and structures within the hardware component.

[0229] Specific embodiments of the present invention have been described above. It should be understood that the present invention is not limited to the specific embodiments described above, and those skilled in the art can make various changes or modifications within the scope of the claims, which do not affect the essence of the present invention. Unless otherwise specified, the embodiments and features described in this application can be arbitrarily combined with each other.

Claims

1. A personalized federated learning method for a shared model, characterized in that it comprises:
Step S1: the server saves and initializes the shared model of each client, the global model, and the global class prototype, while each client initializes its local learned weight vector;
Step S2: the server sends the shared models of the other clients to a client; the client sets its local shared model to the global model, trains and updates it, and then uploads the trained shared model and the local class prototype to the server;
Step S3: the server calculates the global class prototype and the global model based on all received shared models and local class prototypes;
steps S2 and S3 are repeated to iterate through all clients;
in step S2, client i performs the training step, including:
Step S2.1: calculate the personalized model $\theta_i$: $\theta_i = \sum_{j=1}^{N} w_{ij}\,\phi_j$, where $w_{ij}$ represents the proportion of knowledge that client i plans to learn from the shared model of client j, and $\phi_j$ represents the shared model of client j;
Step S2.2: calculate the local class prototype $c_i^j$ of category j in client i: $c_i^j = \frac{1}{|D_i^j|} \sum_{(x, y) \in D_i^j} g(x; \theta_i^g)$, where x represents the input space; y represents the label space of the category; $|D_i^j|$ represents the number of samples in the local dataset $D_i$ of client i that belong to category j; $g(x; \theta_i^g)$ represents feature extraction from sample (x, y) in client i; $\theta_i^g$ represents the parameters of the representation layer;
Step S2.3: calculate the local empirical loss $\mathcal{L}_i$: $\mathcal{L}_i = \mathcal{L}_S + \lambda(r)\,\mathcal{L}_P + \beta\,\mathcal{L}_R$, where $\mathcal{L}_S$ represents the classification loss function; r represents the current training round; λ(r) = (cos(rπ / R) + 1) / 2, which is a monotonically decreasing function of r over the training rounds; R represents the total number of training rounds; μ represents the coefficient of the proximal term $\mathcal{L}_P$, which is measured from the proximal center; β represents the regularization term coefficient; $\mathcal{L}_R$ represents the regularization loss term; $C^j$ represents the global class prototype of category j; the regularization loss term also uses the number of class prototypes in the current input space x; $w_i$ represents the learned weight vector of client i; the loss is evaluated on the output space obtained when the input space x is processed by the personalized model $\theta_i$;
Step S2.4: update the shared model $\phi_i$: $\phi_i \leftarrow \phi_i - \eta_\phi \nabla_{\phi_i} \mathcal{L}_i$, where $\eta_\phi$ represents the model learning rate;
Step S2.5: update the learned weight vector $w_i$: $w_i \leftarrow w_i - \eta_w \nabla_{w_i} \mathcal{L}_i$, where $\eta_w$ represents the weight learning rate.

2. The personalized federated learning method for shared models according to claim 1, characterized in that, in step S1, the local learned weight vector $w_i = (w_{i1}, w_{i2}, \dots, w_{iN})$ of each of the N clients is determined based on the similarity of data distribution among the clients; the shared model of client j is $\phi_j$, the global model is $\Phi$, and the global class prototype of category j is $C^j$; where $w_{ij}$ represents the proportion of knowledge that client i plans to learn from the shared model of client j; the shared models of the other clients contribute equally to the personalized model $\theta_i$ of client i; i = 1, 2, ..., N, j = 1, 2, ..., N.

3. The personalized federated learning method for shared models according to claim 2, characterized in that, in step S2, the server sends to client i the shared models $\phi_j$ of the other clients, the global model $\Phi$, and the global class prototype $C^j$ of each category j, and client i sets its local shared model to the global model, $\phi_i = \Phi$; the personalized model $\theta_i$ is divided into a representation layer $g(x; \theta_i^g)$ and a prediction layer $h(\cdot\,; \theta_i^h)$, where x represents the input space, $\theta_i^g$ represents the parameters of the representation layer, and $\theta_i^h$ represents the parameters of the prediction layer; client i updates only its own shared model $\phi_i$ and freezes the downloaded shared models $\phi_j$ of the other clients, i ≠ j.

4. The personalized federated learning method for shared models according to claim 2, characterized in that step S3 includes:
Step S3.1: calculate the global class prototype $C^j$ of category j: $C^j = \frac{1}{|\mathcal{N}_j|} \sum_{i \in \mathcal{N}_j} \frac{|D_i^j|}{n^j}\, c_i^j$, where $\mathcal{N}_j$ represents the set of clients that possess data samples of category j; $|\mathcal{N}_j|$ represents the number of clients that have data samples of category j; $n^j$ represents the number of samples of category j among all data samples; $|D_i^j|$ represents the number of samples in the local dataset $D_i$ of client i that belong to category j; $c_i^j$ represents the local class prototype of category j in client i;
Step S3.2: calculate the global model $\Phi$: $\Phi = \sum_{i=1}^{N} \frac{|D_i|}{n}\, \phi_i$, where $|D_i|$ represents the size of the local dataset $D_i$ of client i; $\phi_i$ represents the shared model of client i; N represents the total number of clients; $n = \sum_{i=1}^{N} |D_i|$, representing the sum of the dataset sizes of the N clients.

5. A personalized federated learning system with a shared model, characterized in that it comprises a server and N clients;
the server saves and initializes the shared model of each client, the global model, and the global class prototype, while each client initializes its local learned weight vector;
the server sends the shared models of the other clients to a client; the client sets its local shared model to the global model, trains and updates it, and uploads the trained shared model and the local class prototype to the server;
the server calculates the global class prototype and the global model based on all received shared models and local class prototypes;
the server iterates through all clients, repeating the sending of shared models, the training updates, and the computations;
client i runs the training module, which includes:
Module M2.1: calculate the personalized model $\theta_i$: $\theta_i = \sum_{j=1}^{N} w_{ij}\,\phi_j$, where $w_{ij}$ represents the proportion of knowledge that client i plans to learn from the shared model of client j, and $\phi_j$ represents the shared model of client j;
Module M2.2: calculate the local class prototype $c_i^j$ of category j in client i: $c_i^j = \frac{1}{|D_i^j|} \sum_{(x, y) \in D_i^j} g(x; \theta_i^g)$, where x represents the input space; y represents the label space of the category; $|D_i^j|$ represents the number of samples in the local dataset $D_i$ of client i that belong to category j; $g(x; \theta_i^g)$ represents feature extraction from sample (x, y) in client i; $\theta_i^g$ represents the parameters of the representation layer;
Module M2.3: calculate the local empirical loss $\mathcal{L}_i$: $\mathcal{L}_i = \mathcal{L}_S + \lambda(r)\,\mathcal{L}_P + \beta\,\mathcal{L}_R$, where $\mathcal{L}_S$ represents the classification loss function; r represents the current training round; λ(r) = (cos(rπ / R) + 1) / 2, which is a monotonically decreasing function of r over the training rounds; R represents the total number of training rounds; μ represents the coefficient of the proximal term $\mathcal{L}_P$, which is measured from the proximal center; β represents the regularization term coefficient; $\mathcal{L}_R$ represents the regularization loss term; $C^j$ represents the global class prototype of category j; the regularization loss term also uses the number of class prototypes in the current input space x; $w_i$ represents the learned weight vector of client i; the loss is evaluated on the output space obtained when the input space x is processed by the personalized model $\theta_i$;
Module M2.4: update the shared model $\phi_i$: $\phi_i \leftarrow \phi_i - \eta_\phi \nabla_{\phi_i} \mathcal{L}_i$, where $\eta_\phi$ represents the model learning rate;
Module M2.5: update the learned weight vector $w_i$: $w_i \leftarrow w_i - \eta_w \nabla_{w_i} \mathcal{L}_i$, where $\eta_w$ represents the weight learning rate.

6. The personalized federated learning system for shared models according to claim 5, characterized in that the local learned weight vector $w_i = (w_{i1}, w_{i2}, \dots, w_{iN})$ of each of the N clients is determined based on the similarity of data distribution among the clients; the shared model of client j is $\phi_j$, the global model is $\Phi$, and the global class prototype of category j is $C^j$; where $w_{ij}$ represents the proportion of knowledge that client i plans to learn from the shared model of client j; the shared models of the other clients contribute equally to the personalized model $\theta_i$ of client i; i = 1, 2, ..., N, j = 1, 2, ..., N.

7. The personalized federated learning system for shared models according to claim 6, characterized in that the server sends to client i the shared models $\phi_j$ of the other clients, the global model $\Phi$, and the global class prototype $C^j$ of each category j, and client i sets its local shared model to the global model, $\phi_i = \Phi$; the personalized model $\theta_i$ is divided into a representation layer $g(x; \theta_i^g)$ and a prediction layer $h(\cdot\,; \theta_i^h)$, where x represents the input space, $\theta_i^g$ represents the parameters of the representation layer, and $\theta_i^h$ represents the parameters of the prediction layer; client i updates only its own shared model $\phi_i$ and freezes the downloaded shared models $\phi_j$ of the other clients, i ≠ j.

8. The personalized federated learning system for shared models according to claim 6, characterized in that the server calculates the global class prototype and the global model, including:
Module M3.1: calculate the global class prototype $C^j$ of category j: $C^j = \frac{1}{|\mathcal{N}_j|} \sum_{i \in \mathcal{N}_j} \frac{|D_i^j|}{n^j}\, c_i^j$, where $\mathcal{N}_j$ represents the set of clients that possess data samples of category j; $|\mathcal{N}_j|$ represents the number of clients that have data samples of category j; $n^j$ represents the number of samples of category j among all data samples; $|D_i^j|$ represents the number of samples in the local dataset $D_i$ of client i that belong to category j; $c_i^j$ represents the local class prototype of category j in client i;
Module M3.2: calculate the global model $\Phi$: $\Phi = \sum_{i=1}^{N} \frac{|D_i|}{n}\, \phi_i$, where $|D_i|$ represents the size of the local dataset $D_i$ of client i; $\phi_i$ represents the shared model of client i; N represents the total number of clients; $n = \sum_{i=1}^{N} |D_i|$, representing the sum of the dataset sizes of the N clients.

Citation Information

Patent Citations

Personalized federated learning method based on hybrid expert model
CN112560991A
Personalized federal learning method and system of mixed multi-stage private model
CN117708877A
