Federated learning method using synonym data

By using a synonymous data generator to generate synonymous data in federated learning, the method addresses the degradation in training performance caused by prolonged client absence and preserves the continuity and accuracy of model training across a variety of client departure scenarios.

CN116796862B · Active · Publication Date: 2026-06-19 · INVENTEC PUDONG TECH CORPORATION +1

Patent Information

Authority / Receiving Office
CN · China
Patent Type
Patents(China)
Current Assignee / Owner
INVENTEC PUDONG TECH CORPORATION
Filing Date
2022-03-09
Publication Date
2026-06-19


Abstract

This invention provides a federated learning method using synonymous data, comprising: a coordinating device sending a general model to each client device; each client device executing a training procedure, including: an encoder encoding private data into a summary; training a client model based on the private data, the summary, and the general model; and sending the summary and client parameters of the client model to the coordinating device; the coordinating device identifying absent client devices among the client devices; generating synonymous data using a synonymous data generator based on the summary corresponding to the absent client device; training an alternative model based on the synonymous data and the summary corresponding to the absent client device; and performing an aggregation operation based on the alternative model parameters and the client parameters of each client device other than the absent client device to generate update parameters to update the general model; this invention addresses the problem of client departure by synthesizing representative client data in a coordinator.

Description

Technical Field

[0001] This invention relates to the field of federated learning technology, and in particular to a federated learning method using synonymous data.

Background Technology

[0002] Federated learning (FL) addresses many privacy and data-sharing issues through centrally coordinated, distributed learning across devices. Most existing federated learning methods assume that the collaborative setup among clients can tolerate a client temporarily disconnecting from the coordinator (moderator).

[0003] In reality, however, prolonged client absence or departure may occur because of business competition or other non-technical reasons. The resulting performance degradation can be severe when data is unbalanced, skewed, or not independent and identically distributed (non-IID) across clients.

[0004] Another problem arises when the coordinator needs to evaluate the model and release it to consumers. Because the coordinator has no access to the clients' private data, representative data is lost when clients cease collaboration, which leads to significant bias in gradient updates and long-term training degradation in federated learning. Simply memorizing gradients during training is not a suitable solution, because gradients quickly become unrepresentative as iterations progress.

Summary of the Invention

[0005] In view of the shortcomings of the prior art described above, the purpose of this invention is to propose a federated learning method using synonymous data, which is a federated learning framework that addresses the problem of client departure by synthesizing representative client data at the coordinator.

[0006] To achieve the above and other related objectives, the present invention provides a federated learning method using synonymous data, comprising: a coordinating device sending a general model to each of a plurality of client devices; each client device executing a training procedure, including: an encoder removing privacy portions from private data and encoding the private data into a summary; training a client model based on the private data, the summary, and the general model; and sending the summary and client parameters of the client model to the coordinating device, the client parameters being associated with weights in the client model; the coordinating device identifying absent client devices; a synonymous data generator generating synonymous data based on the summary corresponding to the absent client device; the coordinating device training an alternative model based on the synonymous data and the summary corresponding to the absent client device; and the coordinating device performing an aggregation operation based on the alternative parameters of the alternative model and the client parameters of each client device other than the absent client device to generate update parameters to update the general model.

[0007] The foregoing description of the invention and the following description of the embodiments are intended to demonstrate and explain the spirit and principles of the invention, and to provide a further explanation of the scope of the patent application.

Description of the Drawings

[0008] Figure 1 is a block diagram of a federated learning system using synonymous data according to an embodiment of the present invention;

[0009] Figure 2 is a schematic diagram of the relationship between the private data, summary, and synonymous data of the present invention in one embodiment;

[0010] Figures 3 and 4 are schematic overviews of an embodiment of the federated learning system using synonymous data according to the present invention;

[0011] Figure 5 is an internal architecture diagram of the client model of the present invention in one embodiment;

[0012] Figure 6 is an internal architecture diagram of the alternative model of the present invention in one embodiment;

[0013] Figure 7 is a schematic diagram of the spaces and mappings of the private data, summaries, synonymous data, and concatenated features of the present invention in one embodiment;

[0014] Figure 8 is a flowchart of the federated learning method using synonymous data of the present invention in one embodiment;

[0015] Figure 9 is a detailed flowchart of step S2 in Figure 8 in one embodiment;

[0016] Figure 10 is a detailed flowchart of step S21 in Figure 9 in one embodiment;

[0017] Figure 11 is a detailed flowchart of step S5 in Figure 8 in one embodiment;

[0018] Figure 12 is a flowchart of the federated learning method using synonymous data of the present invention in another embodiment;

[0019] Figure 13 is a detailed flowchart of step S7 in Figure 12 in one embodiment; and

[0020] Figures 14, 15, 16, and 17 show the performance of the general model of the present invention in four training scenarios in one embodiment.

[0021] Symbol Explanation

[0022] Ci, Cj: Client device

[0023] Mo: Coordination device

[0024] g: Synonym data generator

[0025] ε: Encoder

[0026] M1, i1, j1: Processors

[0027] M2, i2, j2: Communication circuit

[0028] M3, i3, j3: Storage circuit

[0029] P_i, P_j: Private data

[0030] D, D_i, D_j: Summary

[0031] S, S_i, S_j: Synonymous data

[0032] M_i, M_j: Client model

[0033] M: General Model

[0034] Alternative Model

[0035] U: Consumer

[0036] First feature extractor of customer model

[0037] Second feature extractor of customer model

[0038] The first feature output by the first feature extractor of the customer model

[0039] The second feature output by the second feature extractor of the customer model

[0040] C_i: Classifier of the client model

[0041] Prediction results

[0042] F_P: First feature extractor of the alternative model

[0043] F_D: Second feature extractor of the alternative model

[0044] f_P: First feature output by the first feature extractor of the alternative model

[0045] f_D: Second feature output by the second feature extractor of the alternative model

[0046] C: Classifier for the alternative model

[0047] A, A0, B, F: Space

[0048] ε(S_i): Summary of the synonymous data

[0049] S1-S7, S21-23, S211-S214, S51-S54, S71-S74: Steps

[0050] C0, C1, C2, C3: Client devices

Detailed Implementation

[0051] The following detailed description of the features and characteristics of the present invention in the embodiments is sufficient to enable anyone skilled in the art to understand the technical content of the present invention and implement it accordingly. Based on the disclosure of this specification, the claims, and the drawings, anyone skilled in the art can easily understand the relevant concepts and characteristics of the present invention. The following embodiments further illustrate the viewpoints of the present invention in detail, but are not intended to limit the scope of the present invention in any way.

[0052] The detailed description of the embodiments of the present invention includes several technical terms, the definitions of which are as follows:

[0053] Client: An endpoint that provides data to join distributed training or federated learning; also known as a client device.

[0054] Coordinator: A service provider that collects models from multiple clients to aggregate them into a general model for providing services; also known as a coordinating device.

[0055] Private data: Data held by the client that needs to be protected.

[0056] Summary (digest): A shareable, representative piece of data used to stand in for private data. A summary does not contain the private portion of that data. The dimensionality of a summary is typically lower than that of the private data, but this is not a limitation.

[0057] Synonymous data: A substitute for private data that raises no privacy concerns. Synonymous data and private data usually share the same domain.

[0058] Client model: The model that each client owns.

[0059] General Model: The model owned by the coordinator, which is aggregated from the client models.

[0060] Stochastic gradient descent (SGD): An optimization procedure that updates the parameters of a machine learning model based on a predefined loss function.

[0061] Federated Learning (FL): A collaborative training architecture used to train machine learning models without sharing customer data to protect data privacy.

[0062] Machine learning: A field of research that enables computers to learn without being explicitly programmed.

[0063] Loss function: The objective function of the optimization procedure, used to train machine learning models.

[0064] This invention proposes a federated learning system and a federated learning method that use synonymous data.

[0065] Figure 1 is a block diagram of an embodiment of the federated learning system using synonymous data according to the present invention. As shown in Figure 1, the system includes a coordinating device Mo and multiple client devices Ci, Cj. The coordinating device Mo is communicatively connected to each of the client devices Ci, Cj. In one embodiment, any of the following may serve as the coordinating device Mo or the client devices Ci, Cj: a server, a personal computer, a mobile computing device, or any electronic device used for training machine learning models.

[0066] The coordination device Mo includes a processor M1, a communication circuit M2, and a storage circuit M3. The processor M1 is electrically connected to the communication circuit M2, and the storage circuit M3 is electrically connected to both the processor M1 and the communication circuit M2.

[0067] The synonym generator g is used to generate synonym data based on the summary corresponding to the absent client device. In one embodiment, the synonym generator g is software running on the processor M1, but the present invention is not limited to the hardware used to execute the synonym generator g. The synonym generator g may be stored in the storage circuit M3 or the internal memory of the processor M1. Details of the synonym generator g will be described later when discussing the encoder ε.

[0068] Processor M1 is used to identify absent client devices among the client devices Ci, Cj. In one embodiment, processor M1 checks the connection status between communication circuit M2 and each client device Ci, Cj to determine whether one or more of them are disconnected and have thus become absent client devices. Processor M1 is further used to initialize the general model, to train an alternative model based on the synonymous data and the summary corresponding to each absent client device, and to perform an aggregation operation based on the alternative parameters of the alternative model and the client parameters of each client device other than the absent client devices, generating update parameters that update the general model. In one embodiment, the alternative parameters, client parameters, and update parameters are the gradients of the corresponding neural network models. Specifically, the alternative parameters are the gradients of the alternative model; the client parameters are associated with the weights of the client model (for example, they are the gradients of the client model); and the update parameters are the gradients of the general model. In one embodiment, the aggregation operation uses the FedAvg algorithm. In other embodiments, the aggregation operation uses the FedProx algorithm or the FedNova algorithm.
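The absence check and the aggregation described above can be sketched as follows. This is a minimal illustration only, not the patented implementation: the function names `find_absent` and `aggregate`, the flat list representation of parameters, and the weighting by per-client sample counts (the usual FedAvg weighting) are all assumptions introduced for the example.

```python
from typing import Dict, List, Set

Params = List[float]  # hypothetical flat representation of gradients


def find_absent(connected: Dict[str, bool]) -> Set[str]:
    """Return the ids of client devices whose connection is down."""
    return {cid for cid, is_up in connected.items() if not is_up}


def aggregate(client_params: Dict[str, Params],
              alternative_params: Dict[str, Params],
              sample_counts: Dict[str, int]) -> Params:
    """FedAvg-style weighted mean of client and alternative parameters.

    Alternative parameters stand in for the absent client devices.
    Weighting by sample count is an assumption of this sketch.
    """
    contributions = {**client_params, **alternative_params}
    total = sum(sample_counts[cid] for cid in contributions)
    dim = len(next(iter(contributions.values())))
    update = [0.0] * dim
    for cid, params in contributions.items():
        weight = sample_counts[cid] / total
        for k, value in enumerate(params):
            update[k] += weight * value
    return update


absent = find_absent({"Ci": True, "Cj": False})
update = aggregate({"Ci": [1.0, 2.0]}, {"Cj": [3.0, 4.0]}, {"Ci": 1, "Cj": 3})
```

Here client device Cj is absent, so its contribution comes from the alternative model and is weighted in exactly as a present client's would be.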

[0069] The federated learning system using synonym data proposed in this invention (which may be referred to as the FedSyn architecture) can train a synonym data generator g simultaneously with the training of a general model. In one embodiment, processor M1 is responsible for training both the general model and the synonym data generator g. In another embodiment, since the synonym data generator g can synthesize synonym data from the digest, the synonym data generator g should be protected against unintended access from any client devices Ci, Cj, thereby preventing potential data leakage or adversarial attacks. For example, access can be restricted using the account type or key of client device Ci.

[0070] Communication circuit M2 is used to send the general model to each client device Ci, Cj. Storage circuit M3 is used to store the summaries sent by all client devices Ci, Cj to coordinating device Mo, together with the synonymous data, the general model, and the alternative models. In one embodiment, storage circuit M3 is further used to store the encoder ε.

[0071] The hardware architecture of each of the client devices Ci and Cj is basically the same, so the client device Ci in Figure 1 is taken as an example. The client device Ci includes a processor i1, a communication circuit i2, and a storage circuit i3. The processor i1 is electrically connected to the encoder ε and the communication circuit i2, and the storage circuit i3 is electrically connected to the processor i1 and the communication circuit i2.

[0072] The encoder ε is used to remove the private portion of private data and encode the private data into a digest. This invention does not limit the type of private data. For example, the private data is an integrated circuit diagram, and the private portion is a key circuit design within the integrated circuit diagram. For example, the private data is a product design diagram, and the private portion is a product logo. When the private data is an image, the encoder ε is, for example, an image processing tool that provides the function of cropping the private portion. When the private data is text containing personally identifiable information, the encoder ε is used to transform the original data, such as reducing the data dimensionality or masking specific strings. It should be noted that the encoder ε should not excessively perturb the data, such as adding excessive noise, rendering it unusable. In one embodiment, the encoder ε described in this invention can be implemented using an encoder in an autoencoder. In one embodiment, the dimension of the synonymous data is equal to the dimension of the private data. Furthermore, in one embodiment, the aforementioned communication circuit M2 is further used to send the encoder ε to each client device Ci, Cj. In other words, the coordinating device Mo and each client device Ci, Cj have the same encoder ε. In one embodiment, the encoder ε is software running on the processor i1, but the present invention is not limited to the hardware used to execute the encoder ε. The encoder ε may be stored in the storage circuit i3 or the internal memory of the processor i1.
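For the text case mentioned above, masking specific strings can be sketched as below. This is only an illustration under stated assumptions: the function name `encode_text`, the two regular expressions, and the placeholder tokens are hypothetical and are not taken from the patent, which leaves the encoder's form open (it may equally be an autoencoder's encoder). The function is deterministic, so the same private data always yields the same summary.

```python
import re

# Hypothetical patterns for the private portion of a text record.
EMAIL = re.compile(r"[\w.+-]+@[\w-]+\.[\w.-]+")
LONG_NUMBER = re.compile(r"\d{4,}")


def encode_text(private_text: str) -> str:
    """Mask e-mail addresses and long digit runs, keeping the rest intact."""
    masked = EMAIL.sub("[EMAIL]", private_text)
    masked = LONG_NUMBER.sub("[NUMBER]", masked)
    return masked


summary = encode_text("Contact alice@example.com, ID 12345678.")
# summary == "Contact [EMAIL], ID [NUMBER]."
```

Only the identifying strings are replaced, which matches the requirement that the encoder must not perturb the data so heavily that it becomes unusable.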

[0073] Processors i1 and j1 are used to train the client models based on the private data, the summaries, and the general model. In one embodiment, any of the following may serve as processors i1 and j1: an Application Specific Integrated Circuit (ASIC), a Digital Signal Processor (DSP), a Field Programmable Gate Array (FPGA), a System-on-a-Chip (SOC), or a deep learning accelerator. The processor M1 of the coordinating device Mo can also be one of the above devices.

[0074] Communication circuits i2 and j2 are used to send summaries and client parameters to the coordination device. In one embodiment, communication circuits i2 and j2 can employ a wired or wireless network. The communication circuit M2 of the coordination device Mo typically uses the same type of network as communication circuits i2 and j2.

[0075] Storage circuits i3 and j3 are used to store private data, summaries, general models, and client models. In one embodiment, one of the following devices can be used as storage circuits i3 and j3: Dynamic Random Access Memory (DRAM), Static Random Access Memory (SRAM), Double Data Rate Synchronous Dynamic Random Access Memory (DDR SDRAM), flash memory, and hard disk. The storage circuit M3 of the coordination device Mo can also use one of the above devices.

[0076] Please refer to Figure 2, which illustrates the relationship between the private data, summary, and synonymous data of the present invention in one embodiment, using client device Ci as an example. As shown in Figure 2, the encoder ε takes the private data P_i of client device Ci as input and outputs the summary D_i. The synonymous data generator g takes the summary D_i as input and outputs the synonymous data S_i. The summary D_i can therefore be shared or stored outside client device Ci and used for training in federated learning. Summaries and synonymous data can be used in many ways and fit most existing architectures, enabling federated learning training in a variety of applications.

[0077] Figures 3 and 4 are overviews of an embodiment of the federated learning system using synonymous data according to the present invention, and represent two distinct points in time during training. The time point of Figure 4 is later than that of Figure 3.

[0078] At the time point of Figure 3, both client devices Ci and Cj are present and carrying out training.

[0079] The encoder ε encodes the private data P_i into the summary D_i, which is then sent to coordinating device Mo. Client device Ci trains the client model M_i based on the private data P_i, the summary D_i, and the general model M. Note that before the time point of Figure 3, client device Ci has already received the general model M from coordinating device Mo.

[0080] The encoder ε encodes the private data P_j into the summary D_j, which is then sent to coordinating device Mo. Client device Cj trains the client model M_j based on the private data P_j, the summary D_j, and the general model M. Note that before the time point of Figure 3, client device Cj has already received the general model M from coordinating device Mo.

[0081] Coordinating device Mo receives the summaries D_i, D_j from client devices Ci and Cj and stores them. Coordinating device Mo also receives the client parameters of the client models M_i, M_j from client devices Ci and Cj and performs an aggregation operation on these parameters to generate update parameters that update the general model M. Finally, the trained general model can be deployed on the device of consumer U.

[0082] At the time point of Figure 4, client device Ci is still present and operates as in Figure 3, but client device Cj has left and become an absent client device. To keep the accuracy of the general model M from being affected by the absent client device Cj, the synonymous data generator g of coordinating device Mo generates the synonymous data S_j from the summary D_j corresponding to the absent client device Cj. Coordinating device Mo then trains the alternative model based on the synonymous data S_j and the summary D_j corresponding to the absent client device Cj, and performs the aggregation operation based on the alternative parameters of the alternative model and the client parameters of each client device Ci other than the absent client device to generate the update parameters that update the general model M.

[0083] As shown in Figures 3 and 4, during federated learning training the above embodiment handles potential client absence by encoding the private data P_i, P_j of each client device Ci, Cj into the summaries D_i, D_j. When a client device (such as Cj) leaves, coordinating device Mo generates the synonymous data S_j from the stored summary D_j to stand in for the private data P_j, so that training can continue.

[0084] In the federated learning system using synonymous data proposed in this invention, each client device Ci, Cj encodes the private data used for training locally to generate a summary, and all summaries are sent to the coordinating device Mo for storage. Therefore, even if any client device leaves later, the training of federated learning can still continue.

[0085] Please refer to Figures 4, 5, and 6. Figure 5 is an internal architecture diagram of the client model of the present invention in one embodiment, using the client model M_i of client device Ci as an example. Figure 6 is an internal architecture diagram of the alternative model of the present invention in one embodiment, using the alternative model of the absent client device shown in Figure 4 as an example.

[0086] As shown in Figure 5, the client model M_i of client device Ci includes a first feature extractor, a second feature extractor, and a classifier C_i. The first feature extractor takes the private data P_i as input and outputs a first feature. The second feature extractor takes the summary D_i as input and outputs a second feature. The classifier C_i takes the concatenation of the first feature and the second feature as input and outputs a prediction result.

[0087] As shown in Figure 6, the alternative model of coordinating device Mo includes a first feature extractor F_P, a second feature extractor F_D, and a classifier C. The first feature extractor F_P takes the synonymous data S_j as input and outputs the first feature f_P. The second feature extractor F_D takes the summary D_j as input and outputs the second feature f_D. The classifier C takes the concatenation of the first feature f_P and the second feature f_D as input and outputs a prediction result.

[0088] As shown in Figures 5 and 6, the alternative model has the same structure as the client model M_i, consisting mainly of two feature extractors and one classifier, but the two differ in data access. Because coordinating device Mo cannot access the private data P_j but can access the summary D_j, it uses the synonymous data generator g to generate the synonymous data S_j and continue training.
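The shared structure of the client model and the alternative model (two feature extractors whose outputs are concatenated and passed to a classifier) can be sketched as below. This is a minimal sketch under assumptions: each extractor is a single linear layer with ReLU, the classifier is a linear layer with softmax, and the class name `TwoBranchModel` is hypothetical. The patent does not specify layer types or sizes.

```python
import math
from typing import List

Vector = List[float]
Matrix = List[List[float]]


def linear(weights: Matrix, x: Vector) -> Vector:
    """Matrix-vector product: one output per weight row."""
    return [sum(w * v for w, v in zip(row, x)) for row in weights]


def relu(v: Vector) -> Vector:
    return [max(0.0, a) for a in v]


def softmax(logits: Vector) -> Vector:
    peak = max(logits)  # subtract the maximum for numerical stability
    exps = [math.exp(a - peak) for a in logits]
    total = sum(exps)
    return [e / total for e in exps]


class TwoBranchModel:
    """First extractor on data, second extractor on summary, then classifier."""

    def __init__(self, w_first: Matrix, w_second: Matrix, w_classifier: Matrix):
        self.w_first = w_first
        self.w_second = w_second
        self.w_classifier = w_classifier

    def forward(self, data: Vector, summary: Vector) -> Vector:
        first_feature = relu(linear(self.w_first, data))
        second_feature = relu(linear(self.w_second, summary))
        concatenated = first_feature + second_feature  # list concatenation
        return softmax(linear(self.w_classifier, concatenated))
```

On a client device, `data` would be the private data P_i; at the coordinating device, the same class would be fed the synonymous data S_j instead, with the summary branch unchanged.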

[0089] In typical federated learning, each client model is trained at its own client device. The client parameters (such as gradients) of each client model are sent to the coordinating device and aggregated to update the general model. In the federated learning system using synonymous data, when client device Ci is available, the client model M_i is trained using the private data P_i and its summary D_i, as shown in Figure 5. The client parameters of the client model are sent to coordinating device Mo, and the client parameters of all client devices Ci, Cj are aggregated to generate the update parameters for the general model M. The client parameters that an absent client device (for example, client device Cj) would have provided are supplied instead by the alternative model generated by coordinating device Mo, as shown in Figure 6.

[0090] In the federated learning system using synonymous data, the alternative model and the client model M_i have the same architecture and differ in data access. When client device Cj is available, its private data P_j is used to generate the summary D_j for training. Whenever client device Cj is absent, coordinating device Mo reconstructs the synonymous data S_j for the absent client device Cj from the summary D_j so that training can continue. In this way, training in the federated learning system using synonymous data proposed in this invention is not interrupted, whether or not client device Cj is present.

[0091] Figure 7 illustrates the spaces and mappings of the private data, summaries, synonymous data, and concatenated features of the present invention in one embodiment, and shows the encoder ε, the synonymous data generator g, and the feature extractors F_D, F_P that project data into the different spaces.

[0092] As shown in Figure 7, the private data P_i and the synonymous data S_i of all client devices Ci form space A. All data in space A belonging to the same class form space A0. All data in space A, including the private data P_i and the synonymous data S_i, form space B after being transformed by the encoder ε. The private data P_i resides on client device Ci, while the synonymous data S_i resides on coordinating device Mo and is generated by the synonymous data generator g from the summary D_i of the private data P_i.

[0093] Coordinating device Mo uses the first feature extractor F_P to generate the first feature f_P and the second feature extractor F_D to generate the second feature f_D. The concatenation results {f_P, f_D} of the first feature f_P and the second feature f_D form space F.

[0094] As shown in Figure 7, client device Ci uses the first feature extractor of its client model to extract the first feature from the private data P_i in space A0, uses the second feature extractor to extract the second feature from the summary D_i in space B, and concatenates the two features. Coordinating device Mo uses the first feature extractor F_P to extract the first feature f_P from the synonymous data S_i in space A0 and the second feature extractor F_D to extract the second feature f_D from the summary D_i in space B; the concatenation of these two features is {f_P, f_D}. All concatenation results form space F. Space F0 is the part of space F formed by the concatenation results that share the same class. As Figure 7 shows, the client model trained on client device Ci yields the same classification results as the alternative model trained on coordinating device Mo, even though the two use different training data. In other words, even though coordinating device Mo has no access to the private data P_i of client device Ci, and even if client device Ci becomes an absent client device, coordinating device Mo can still train the alternative model with the synonymous data S_i and the summary D_i and achieve the same training effect as if it had the private data P_i.

[0095] Figure 8 The flowchart shown is an embodiment of the federated learning method using synonym data according to the present invention, including steps S1 to S6. Step S1 is that the coordinating device sends a general model to each of the multiple client devices; step S2 is that each client device executes a training procedure; step S3 is that the coordinating device identifies absent client devices; step S4 is that a synonym data generator generates synonym data based on the summaries corresponding to the absent client devices; step S5 is that the coordinating device trains an alternative model based on the synonym data and the summaries corresponding to the absent client devices; and step S6 is that the coordinating device performs an aggregation operation based on the alternative parameters of the alternative model and the client parameters of each client device other than the absent client device to generate update parameters to update the general model.
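The control flow of steps S1 to S6 for one iteration can be sketched as follows. This is an illustrative sketch only: the function `federated_round`, its callable arguments, the unweighted mean used for aggregation, and the learning-rate update of the general model are assumptions made for the example, and the training routines themselves are passed in as placeholders.

```python
from typing import Callable, Dict, List, Optional

Params = List[float]
Data = List[float]


def federated_round(
    general: Params,
    clients: Dict[str, Optional[Callable[[Params], Params]]],
    summaries: Dict[str, Data],
    generator: Callable[[Data], Data],
    train_alternative: Callable[[Params, Data, Data], Params],
    lr: float = 1.0,
) -> Params:
    """One iteration; a client mapped to None is treated as absent."""
    gradients: List[Params] = []
    for cid, local_training in clients.items():
        if local_training is not None:
            # S1 + S2: the client receives the general model and trains locally.
            gradients.append(local_training(general))
        else:
            # S3: the client is absent. S4: synthesize synonymous data from
            # its stored summary. S5: train the alternative model on both.
            synonymous = generator(summaries[cid])
            gradients.append(train_alternative(general, synonymous, summaries[cid]))
    # S6: aggregate (unweighted mean here) and update the general model.
    mean = [sum(g[k] for g in gradients) / len(gradients) for k in range(len(general))]
    return [w - lr * g for w, g in zip(general, mean)]
```

Passing the training routines as callables keeps the sketch independent of any particular model, so the same loop applies whichever aggregation variant an embodiment uses.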

[0096] The method proposed in one embodiment of the present invention can be seen as an extension of federated learning. The extension includes the newly introduced summaries and synonymous data, as well as a new loss function for updating the synonymous data generator. These features help maintain model performance during training even when client devices leave.

[0097] Federated learning training involves multiple iterations, and Figure 8 shows the details of one iteration. Please refer to Figure 1 and Figure 8 together. In one embodiment, the method shown in Figure 8 can be carried out by the system shown in Figure 1.

[0098] In step S1, the coordinating device Mo pushes the general model M to each client device (hereinafter referred to as client device Ci to represent each client device).

[0099] In one embodiment, to ensure that all client devices Ci have the same encoder ε, step S1 further includes two steps: the coordinating device Mo sends the encoder ε to each client device Ci, and the coordinating device Mo stores the encoder ε. This invention fixes the encoder ε so that the summary D_i does not vary with changes to the encoder ε and remains constant across iterations.

[0100] For step S2, please refer to Figure 9, which is a detailed flowchart of step S2 in Figure 8 in one embodiment and includes steps S21 to S23. In step S21, the encoder removes the privacy portion from the private data and encodes the private data into a summary. In step S22, the client model is trained based on the private data, the summary, and the general model. In step S23, the summary and the client parameters of the client model are sent to the coordinating device.

[0101] In step S2, as shown in step S21, each client device Ci encodes its private data P_i into the summary D_i. As shown in step S22, the client model M_i is trained with SGD using the private data P_i and the summary D_i as input data. As shown in step S23, the summary D_i and the client parameters are sent to coordinating device Mo; the summary D_i needs to be transmitted only once, at the start of training. If the private data P_i is updated, client device Ci must regenerate the corresponding summary D_i from the updated private data P_i and send it to coordinating device Mo.
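The rule that the summary is transmitted once and again only after the private data changes can be sketched as below. The class `ClientSession`, the message dictionary layout, and the comparison against the last summary sent are assumptions introduced for illustration; the patent does not prescribe a message format.

```python
from typing import Callable, Dict, List, Optional

Data = List[float]


class ClientSession:
    """Tracks which summary the coordinating device already holds."""

    def __init__(self, encoder: Callable[[Data], Data]):
        self.encoder = encoder
        self._sent_summary: Optional[Data] = None

    def payload(self, private_data: Data, client_params: List[float]) -> Dict[str, List[float]]:
        """Build the step S23 message for one iteration."""
        summary = self.encoder(private_data)
        message = {"params": client_params}
        # Attach the summary on the first round, or when updated private
        # data has produced a different summary.
        if summary != self._sent_summary:
            message["summary"] = summary
            self._sent_summary = summary
        return message
```

Because the encoder is fixed, an unchanged private data set always produces the same summary, so the comparison is enough to detect an update.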

[0102] For step S21, please refer to Figure 5 and Figure 10 . Figure 10 Displayed as Figure 9 A detailed flowchart of step S21 in one embodiment includes steps S211 to S214. Step S211 involves inputting private data into a first feature extractor to generate a first feature; step S212 involves inputting the summary into a second feature extractor to generate a second feature; step S213 involves inputting the concatenation result of the first and second features into a classifier to generate a prediction result; and step S214 involves inputting the prediction result and the actual result into a loss function, and adjusting the weights of at least one of the first feature extractor, the second feature extractor, and the classifier based on the output of the loss function.

[0103] For details of steps S211 to S213, please refer to the earlier paragraphs describing Figure 5. In one embodiment of step S214, the training data in each client device Ci consists of the private data P_i and the summary D_i generated by the encoder ε. When each client device Ci trains its client model M_i with the standard FedAvg algorithm, the client classification loss is used, as shown in Formula 1 below.

[0104] Formula 1: $L_{\text{client}} = L_{\text{CE}}(M_i(P_i, D_i),\ y)$

[0105] Here, L_CE is the cross-entropy loss, M_i(P_i, D_i) is the prediction result, and y is the actual result.
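The forward pass of steps S211 to S213 and the loss of Formula 1 can be sketched in plain Python as follows. The single linear layer with ReLU per feature extractor, the dictionary keys `F_P`, `F_D`, and `C`, and all weight values are assumptions chosen for illustration; the weight update of step S214 is not shown.

```python
import math

def linear(weights, vec):
    # Matrix-vector product: one output value per weight row.
    return [sum(w * x for w, x in zip(row, vec)) for row in weights]

def relu(vec):
    return [max(0.0, v) for v in vec]

def softmax(logits):
    m = max(logits)  # subtract the maximum for numerical stability
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

def client_loss(model, private_sample, summary, label):
    f_p = relu(linear(model["F_P"], private_sample))  # S211: first feature
    f_d = relu(linear(model["F_D"], summary))         # S212: second feature
    logits = linear(model["C"], f_p + f_d)            # S213: concatenate, classify
    probs = softmax(logits)
    return -math.log(probs[label])                    # Formula 1: cross-entropy

# Toy client model M_i (hypothetical weights).
model = {
    "F_P": [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]],  # 3 inputs -> 2 features
    "F_D": [[1.0, 1.0]],                          # 2 inputs -> 1 feature
    "C": [[0.0, 0.0, 0.0], [0.0, 0.0, 0.0]],      # 3 features -> 2 classes
}
loss = client_loss(model, [0.2, 0.4, 0.6], [0.1, 0.3], label=1)
```

With an all-zero classifier the two classes are equally likely, so the loss equals ln 2, which is a convenient sanity check before any training step.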

[0106] In steps S3 to S6, the coordinating device Mo collects all client parameters and checks whether any client device is absent. If client device Cj is absent (due to intentional departure or network congestion), the coordinating device Mo generates an alternative model and uses the synonymous data S_j, generated from the summary D_j, to compute the alternative parameters of the alternative model. The coordinating device Mo then aggregates the client parameters and the alternative parameters to update the general model M.
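The aggregation with an absent client can be sketched as a sample-weighted average in the style of FedAvg. This is a minimal sketch under stated assumptions: parameters are flat lists of floats, the function names are hypothetical, and the coordinating device is assumed to know each client's sample count, including that of the absent client.

```python
def fedavg(param_sets, sample_counts):
    """Sample-weighted average of flat parameter vectors."""
    total = sum(sample_counts)
    dim = len(param_sets[0])
    return [
        sum(p[k] * n for p, n in zip(param_sets, sample_counts)) / total
        for k in range(dim)
    ]

def aggregate_round(received, substitutes, sample_counts):
    """Use client parameters when present, alternative parameters otherwise.

    received:      {client_id: params} sent by clients that took part
    substitutes:   {client_id: params} of alternative models trained by the
                   coordinating device on synonymous data
    sample_counts: {client_id: number of samples} (assumed known)
    """
    params, counts = [], []
    for cid in sorted(sample_counts):
        params.append(received[cid] if cid in received else substitutes[cid])
        counts.append(sample_counts[cid])
    return fedavg(params, counts)

# C2 is absent, so its slot is filled by the alternative parameters.
received = {"C0": [1.0, 2.0], "C1": [3.0, 4.0]}
substitutes = {"C2": [5.0, 6.0]}
sample_counts = {"C0": 10, "C1": 10, "C2": 20}
update = aggregate_round(received, substitutes, sample_counts)
```

Filling the absent client's slot keeps its weight in the average, rather than letting the remaining clients dominate the update.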

[0107] Please refer to Figure 5, Figure 6, and Figure 11. Figure 11 is a detailed flowchart of step S5 of Figure 8 in one embodiment, and includes steps S51 to S54. In step S51, the synonymous data is input into a first feature extractor to generate a first feature. In step S52, the summary corresponding to the absent client device is input into a second feature extractor to generate a second feature. In step S53, the concatenation result of the first feature and the second feature is input into a classifier to generate a prediction result. In step S54, the prediction result and the actual result are input into a loss function, and the weights of at least one of the first feature extractor, the second feature extractor, and the classifier are adjusted based on the output of the loss function.

[0108] As shown in Figure 5 and Figure 6, the client model M_i and the alternative model have similar structures; the only difference is the input data. Therefore, for details of steps S51 to S53, please refer to the earlier paragraphs describing Figure 6. In one embodiment of step S54, the loss function is the same as the loss function used in step S214.

[0109] Figure 12 is a flowchart of a further embodiment of the federated learning method using synonymous data according to the present invention, in which steps S1 to S6 are similar to those of Figure 10. This embodiment further includes step S7: updating the general model and the synonymous data generator based on the output of the loss function.

[0110] Figure 13 is a detailed flowchart of step S7 of Figure 12 in one embodiment, and includes steps S71 to S74. In step S71, the synonymous data is input into the encoder of the coordinating device to generate a synonymous data summary. In step S72, the synonymous data summary and the summary corresponding to the absent client device are input into a first loss function to generate a data similarity loss. In step S73, the predicted data generated by the general model and the actual data are input into a second loss function to generate a synonymous data classification loss. In step S74, the weighted sum of the synonymous data classification loss and the data similarity loss is calculated as the coordinating device loss, and the general model and the synonymous data generator are updated based on the coordinating device loss.

[0111] In one embodiment of step S71, since the coordinating device Mo has stored the encoder ε in step S1, it can generate the synonymous data summary ε(S_i) from the synonymous data S_i.

[0112] In step S72, to ensure that the projection of the private data P_i and the projection of the synonymous data S_i are similar, the data similarity loss is defined as shown in Formula 2 below.

[0113] Formula 2: $L_{\text{DSL}} = L_{\text{MSE}}(\varepsilon(S),\ D)$

[0114] Here, L_MSE is the mean squared error loss, and S and D denote all of the synonymous data and all of the summaries held by the coordinating device Mo, respectively. Note that synonymous data S is not generated only when a client is absent; rather, corresponding synonymous data S is generated for every summary D collected by the coordinating device Mo.
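Formula 2 can be illustrated with a small Python sketch that encodes each piece of synonymous data and compares it with the stored summary. The `toy_encoder`, the sample values, and the choice to average the per-client errors are assumptions for illustration; Formula 2 itself only states that the mean squared error is taken between ε(S) and D.

```python
def mse(a, b):
    """Mean squared error between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)

def data_similarity_loss(encoder, synonymous_data, summaries):
    # Encode every synonymous sample and compare it with its stored summary.
    per_client = [mse(encoder(s), d) for s, d in zip(synonymous_data, summaries)]
    return sum(per_client) / len(per_client)

def toy_encoder(sample):
    # Hypothetical fixed encoder ε: sums adjacent pairs, 4 values -> 2 values.
    return [sample[0] + sample[1], sample[2] + sample[3]]

S = [[1.0, 1.0, 2.0, 2.0], [0.0, 1.0, 1.0, 1.0]]  # synonymous data, one per summary
D = [[2.0, 4.0], [1.0, 0.0]]                      # summaries held by the coordinator
loss = data_similarity_loss(toy_encoder, S, D)
```

The first pair matches its summary exactly and contributes zero, while the second pair is off in one coordinate, so only the second drives the loss.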

[0115] In step S73, to ensure that the synonymous data S_i and the summary D_i are correctly classified by the general model M, the synonymous data classification loss is defined as shown in Formula 3 below.

[0116] Formula 3: $L_{\text{SCL}} = L_{\text{CE}}(M(S_i, D_i),\ y)$

[0117] Here, L_CE is the cross-entropy loss and y is the actual result. Since the synonymous data S_i is generated by the synonymous data generator g, driving L_SCL and L_DSL to convergence is equivalent to training the synonymous data generator g.
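To make the link between loss convergence and generator training concrete, the following toy sketch minimizes only the data similarity loss for a one-parameter generator. Everything here is an assumption for illustration: the scalar encoder ε(s) = A·s, the scalar generator g(d) = w·d, the learning rate, and the omission of the classification loss L_SCL.

```python
A = 2.0  # fixed scalar encoder: ε(s) = A * s (hypothetical)

def train_generator(summaries, steps=200, lr=0.05):
    """Gradient descent on mean((ε(g(d)) - d)^2) with g(d) = w * d."""
    w = 0.0  # the generator's only trainable weight
    for _ in range(steps):
        # d/dw of (A*w*d - d)^2 is 2 * (A*w*d - d) * A * d, averaged over d.
        grad = sum(2.0 * (A * w * d - d) * A * d for d in summaries) / len(summaries)
        w -= lr * grad
    return w

summaries = [1.0, -0.5, 2.0]       # toy summaries D
w = train_generator(summaries)     # approaches 1 / A = 0.5
synonymous = [w * d for d in summaries]  # generated synonymous data g(D)
```

As the loss approaches zero, the generator weight approaches the value for which encoding the generated data reproduces the summary, which is the sense in which loss convergence trains g.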

[0118] In step S74, the weighted sum is the coordinating device loss L_server used to jointly train the synonymous data generator g and the general model M, as shown in Formula 4 below.

[0119] Formula 4: $L_{\text{server}} = L_{\text{DSL}} + \lambda L_{\text{SCL}}$

[0120] Here, λ is a balancing hyperparameter, which is set to 1 in one embodiment.
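As a worked instance of Formula 4, the short sketch below combines the two losses with the balancing hyperparameter. The function name and the loss values are hypothetical and chosen only to make the arithmetic easy to follow.

```python
def coordinating_device_loss(l_dsl, l_scl, lam=1.0):
    """Formula 4: L_server = L_DSL + lam * L_SCL."""
    return l_dsl + lam * l_scl

# With λ = 1 both losses contribute equally.
loss_default = coordinating_device_loss(0.25, 0.5)
# A smaller λ reduces the weight of the classification loss.
loss_weighted = coordinating_device_loss(0.25, 0.5, lam=0.5)
```

Lowering λ shifts the joint update toward matching the summaries, and raising it favors classification accuracy on the synonymous data.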

[0121] Specifically, the present invention aims to enable the general model M to learn from the synonymous data S_i generated by the synonymous data generator g. To facilitate this learning process, the present invention introduces an additional training procedure into the coordinating device Mo, as shown in steps S71 to S74. The present invention proposes two concepts for the joint training of the general model M and the synonymous data generator g. Specifically, the present invention aims for the general model M to learn how best to generate appropriate synonymous data S_i, and how best to perform classification, that is, to determine the prediction result from the synonymous data S_i and the summary D_i. The first concept is to train the general model M so that it can correctly classify the information obtained from the summary D_i and the synonymous data S_i. In Figure 7, the two sets of arrows labeled F_P and F_D implement this concept. The second concept is to make the synonymous data summary ε(S_i) as similar as possible to the summary D_i of the private data, as shown by the two arrows labeled ε in spaces A and B of Figure 7. The two loss functions proposed in steps S72 and S73 realize these two concepts.

[0122] The following is pseudocode for the federated learning method using synonymous data according to an embodiment of the present invention:

[0123]

[0124] Here, M denotes the general model, g denotes the synonymous data generator, t denotes the iteration number, M_i denotes the client model of client device Ci, P_i denotes the private data of client device Ci, L_client denotes the client classification loss, S_j denotes the synonymous data of the absent client device Cj, and L_server denotes the coordinating device loss. The pseudocode also uses symbols for the client parameters (gradients) of the client model M_i, for the alternative model, and for the alternative parameters (gradients) of the alternative model.

[0125] In summary, the present invention proposes a federated learning method using synonymous data, a federated learning framework that addresses client departure by synthesizing representative client data in the coordinating device. The present invention also proposes a data memory mechanism to handle client absence effectively. Specifically, the present invention addresses the following three scenarios: 1. unreliable clients; 2. training after a client is removed; 3. training after clients are added.

[0126] In the federated learning training process, the following four training scenarios are common: 1. a client temporarily leaves during training; 2. a client permanently leaves training; 3. all clients leave training sequentially; 4. multiple client groups join training at different time periods. Please refer to Figure 14, Figure 15, Figure 16, and Figure 17. These four figures correspond to the four scenarios described above and present the accuracy of the general model, where C0, C1, C2, and C3 denote different client devices. In these experiments, the client device holding the most samples (such as C2) is forced to leave training in order to highlight the impact on performance. As can be observed from Figures 14 to 17, common federated learning algorithms such as FedAvg, FedNova, and FedProx fail to maintain stable test accuracy across the four scenarios. In contrast, the federated learning method proposed in the present invention achieves stable test accuracy in all four scenarios. These experimental results demonstrate the robustness of the proposed federated learning method.

[0127] While the present invention has been disclosed above with reference to the foregoing embodiments, it is not intended to limit the invention. Any modifications and refinements made without departing from the spirit and scope of the invention are within the scope of patent protection of the present invention. For the scope of protection defined in this invention, please refer to the claims.

Claims

1. A federated learning method using synonymous data, characterized in that it comprises:
a coordinating device sending a general model to each of a plurality of client devices;
each of the plurality of client devices executing a training procedure, including:
  an encoder removing a private portion of private data and encoding the private data into a summary;
  training a client model based on the private data, the summary, and the general model; and
  sending the summary and a client parameter of the client model to the coordinating device, the client parameter being associated with a weight of the client model;
the coordinating device determining an absent client device among the plurality of client devices;
a synonymous data generator generating synonymous data based on the summary corresponding to the absent client device;
the coordinating device training an alternative model based on the synonymous data and the summary corresponding to the absent client device; and
the coordinating device performing an aggregation operation based on an alternative parameter of the alternative model and the client parameter of each of the plurality of client devices other than the absent client device, to generate an update parameter for updating the general model;
wherein the coordinating device training the alternative model based on the synonymous data and the summary corresponding to the absent client device comprises:
  inputting the synonymous data into a first feature extractor to generate a first feature;
  inputting the summary corresponding to the absent client device into a second feature extractor to generate a second feature;
  inputting a concatenation result of the first feature and the second feature into a classifier to generate a prediction result; and
  inputting the prediction result and an actual result into a loss function, and adjusting the weights of at least one of the first feature extractor, the second feature extractor, and the classifier based on the output of the loss function.

2. The federated learning method using synonymous data according to claim 1, characterized in that the method further comprises:
the coordinating device sending the encoder to each of the plurality of client devices; and
storing the encoder in the coordinating device.

3. The federated learning method using synonymous data according to claim 2, characterized in that, after the general model is updated, the method further comprises:
inputting the synonymous data into the encoder of the coordinating device to generate a synonymous data summary;
inputting the synonymous data summary and the summary corresponding to the absent client device into a first loss function to generate a data similarity loss;
inputting predicted data generated by the general model and actual data into a second loss function to generate a synonymous data classification loss; and
calculating a weighted sum of the synonymous data classification loss and the data similarity loss as a coordinating device loss, and updating the general model and the synonymous data generator based on the coordinating device loss.

4. The federated learning method using synonymous data according to claim 3, characterized in that the first loss function is the mean squared error, and the second loss function is the cross-entropy.

5. The federated learning method using synonymous data according to claim 1, characterized in that training the client model based on the private data, the summary, and the general model comprises:
inputting the private data into the first feature extractor to generate a first feature;
inputting the summary into the second feature extractor to generate a second feature;
inputting a concatenation result of the first feature and the second feature into a classifier to generate a prediction result; and
inputting the prediction result and an actual result into a loss function, and adjusting the weights of at least one of the first feature extractor, the second feature extractor, and the classifier based on the output of the loss function.

6. The federated learning method using synonymous data according to claim 5, characterized in that the loss function is the cross-entropy.

7. The federated learning method using synonymous data according to claim 1, characterized in that the dimension of the synonymous data is equal to the dimension of the private data.

8. The federated learning method using synonymous data according to claim 1, characterized in that the client model is trained using stochastic gradient descent based on the private data and the summary.

9. The federated learning method using synonymous data according to claim 1, characterized in that the aggregation operation uses the FedAvg algorithm, the FedProx algorithm, or the FedNova algorithm.

Citation Information

Patent Citations

  • Apparatuses, computer program products, and computer-implemented methods for privacy-preserving federated learning

    US20210256309A1