Prototype-based Vertical Federated Learning System and Method for Multiple Heterogeneous Participants
By employing prototype aggregation in vertical federated learning, the problems of low training accuracy and data leakage of heterogeneous models are solved, achieving efficient collaborative training of multiple heterogeneous models and privacy protection.
Patent Information
- Authority / Receiving Office
- CN · China
- Patent Type
- Patents(China)
- Current Assignee / Owner
- BEIJING INST OF TECH
- Filing Date
- 2023-10-10
- Publication Date
- 2026-06-30
Figure CN117272049B_ABST
Abstract
Description
Technical Field
[0001] This invention relates to a prototype-based vertical federated learning system and method for multiple heterogeneous participants, belonging to the fields of information security protection and federated learning technology.
Background Technology
[0002] In real-world applications, data is often distributed across multiple stakeholders (such as banks and e-commerce platforms). To train a superior model, it is desirable for these stakeholders to share local data features to collaboratively perform data analysis and modeling. However, individual data holders worry about exposing their data to other parties or third parties, potentially leading to privacy breaches and the theft of trade secrets.
[0003] Currently, traditional machine learning methods aggregate all data onto a single server for global model training. However, these methods are prone to data leaks and privacy violations, necessitating a new data collaboration technology to ensure data security.
[0004] Vertical federated learning, as a novel machine learning technique, achieves collaborative training among participants by sharing models rather than data, which to some extent protects the security of the original data. Furthermore, vertical federated learning enables model training by jointly aligning features from samples across all participants, thus attracting widespread attention from academia and industry. However, existing vertical federated learning methods still have some limitations. When participants possess heterogeneous models locally, it is impossible to train a single model with superior performance. Additionally, existing vertical federated learning techniques cannot train multiple heterogeneous models simultaneously, failing to meet the requirements for model training when participants possess heterogeneous models.
[0005] Currently, there are two main types of vertical federated learning methods:
[0006] Type 1: Aggregation-based Vertical Federated Learning Training Method. First, the global model is distributed to both the active and passive parties. During training, the active and passive parties train their models using their own data, calculating the output predictions. The passive party sends its intermediate results to the active party in encrypted form, allowing the active party to assist in calculating the loss and gradients. The active party uses its own model's loss and the loss sent by the passive party to calculate the encrypted aggregated loss. Both the active and passive parties use the aggregated loss to calculate the gradients and update their local models.
[0007] However, this method fails to train high-performing model parameters when the participating local models are heterogeneous. This is because the aggregation-based vertical federated learning method aggregates the prediction results of the local models, and these prediction results contain both local model information and local feature information. Aggregating local model information negatively impacts the model's accuracy, resulting in slow convergence. Secondly, this method can only train a single global model and cannot train multiple heterogeneous models simultaneously, which presents significant limitations in practical applications.
[0008] Type 2: Split-Based Vertical Federated Learning Training Method. This method divides a global model into a top-level model and a bottom-level model. Each participant trains its bottom-level model using its own data. The passive participant sends its model output to the active participant, which aggregates the outputs from the passive participant and its own bottom-level model, using the aggregated result as input to the top-level model to continue the forward propagation process. After forward propagation, the active participant calculates the loss function and completes the gradient descent process of the top-level model. Finally, the active participant feeds back its gradients to the passive participant, which uses the received gradients to update its bottom-level model.
[0009] Like Type 1, this method can train only one global model, yet the local models of participating parties are often heterogeneous. With heterogeneous local models it converges slowly and yields poor model performance, which limits the application of vertical federated learning.
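To make the Type 2 data flow concrete, the following is a minimal sketch of the forward pass of split-based vertical federated learning. It is an illustration only: the bias-free linear layers, the use of concatenation as the aggregation of bottom-level outputs, and the names `linear` and `split_forward` are assumptions, not details given in this document.

```python
from typing import List

Vector = List[float]
Matrix = List[Vector]


def linear(weights: Matrix, x: Vector) -> Vector:
    """Apply a bias-free linear layer: one output per weight row."""
    return [sum(w * v for w, v in zip(row, x)) for row in weights]


def split_forward(bottom_models: List[Matrix],
                  party_features: List[Vector],
                  top_model: Matrix) -> Vector:
    """Forward pass: per-party bottom models, then one shared top model."""
    # Each party runs its own bottom-level model on its own feature slice.
    bottom_outputs = [linear(w, x) for w, x in zip(bottom_models, party_features)]
    # Assumption: the active party aggregates by concatenating the outputs.
    joined = [v for out in bottom_outputs for v in out]
    # The aggregated result is the input to the single top-level model.
    return linear(top_model, joined)


# Two parties, each holding two features of the same aligned sample.
bottoms = [[[1.0, 0.0]], [[0.0, 2.0]]]
features = [[3.0, 4.0], [5.0, 6.0]]
top = [[1.0, 1.0]]
prediction = split_forward(bottoms, features, top)
```

Because there is only one top-level model, every party ends up contributing to the same global model, which is the limitation the paragraph above points out.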
[0010] Therefore, in scenarios where the participants' local models are heterogeneous, how to effectively and simultaneously train multiple heterogeneous models and improve their performance is a key issue we currently face.
Summary of the Invention
[0011] The purpose of this invention is to address the problems and defects of existing technologies, and to effectively solve the technical problems of data leakage and privacy infringement risks when sharing data in the case of heterogeneous clients, by creatively proposing a prototype-based vertical federated learning system and method for multiple heterogeneous participants.
[0012] First, the concepts and contents involved in this invention will be explained.
[0013] Participants: These refer to clients that possess local data and participate in federated learning training. Furthermore, participants are divided into two categories: active participants, which are clients possessing label values and local features, and passive participants, which are clients possessing only local features.
[0014] Vertical federated learning: refers to the federated learning training process in which participants achieve sample alignment and feature joint processing while ensuring that the local data of each participant does not leave the domain.
[0015] Representation layer: This refers to the part of the local heterogeneous network before the activation function. The representation layer embeds all local features into the same space and outputs local embedding values.
[0016] Decision layer: refers to the part after the activation function of the local heterogeneous network. The prediction results of the local heterogeneous network are obtained through the decision layer.
[0017] Prototype: Refers to the local embedding value output by the representation layer of a heterogeneous network. Since all participants embed their local features into the same space, the local embedding value is called the prototype.
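The three definitions above can be illustrated with a small sketch in which two structurally different local models map differently sized feature vectors into the same embedding space. Everything concrete here is an assumption made for illustration: the class name `LocalModel`, the embedding size `EMBED_DIM`, the toy weights, and the choice of ReLU and softmax inside the decision layer.

```python
import math
from typing import Callable, List

Vector = List[float]
EMBED_DIM = 2  # hypothetical size of the shared embedding space


def make_linear(weights: List[Vector]) -> Callable[[Vector], Vector]:
    """Return a bias-free linear layer defined by its weight rows."""
    def layer(x: Vector) -> Vector:
        return [sum(w * v for w, v in zip(row, x)) for row in weights]
    return layer


def softmax(z: Vector) -> Vector:
    """Numerically stable softmax."""
    peak = max(z)
    exps = [math.exp(v - peak) for v in z]
    total = sum(exps)
    return [e / total for e in exps]


class LocalModel:
    """A local heterogeneous model split into two parts."""

    def __init__(self, representation: Callable[[Vector], Vector],
                 decision: Callable[[Vector], Vector]) -> None:
        self.representation = representation  # local features -> prototype
        self.decision = decision              # embedding -> prediction


def make_decision(weights: List[Vector]) -> Callable[[Vector], Vector]:
    """Decision layer: activation (assumed ReLU), linear map, softmax."""
    head = make_linear(weights)
    return lambda e: softmax(head([max(0.0, v) for v in e]))


# Participant with 3 local features: one linear map into the shared space.
model_a = LocalModel(
    representation=make_linear([[0.1, 0.2, 0.3], [0.0, -0.1, 0.4]]),
    decision=make_decision([[1.0, 0.0], [0.0, 1.0]]),
)

# Participant with 2 local features: two stacked maps (different architecture).
hidden = make_linear([[1.0, -1.0], [0.5, 0.5], [0.0, 2.0]])
project = make_linear([[0.2, 0.1, 0.0], [0.0, 0.3, 0.1]])
model_b = LocalModel(
    representation=lambda x: project(hidden(x)),
    decision=make_decision([[0.5, -0.5], [-0.5, 0.5]]),
)

prototype_a = model_a.representation([1.0, 2.0, 3.0])
prototype_b = model_b.representation([4.0, 5.0])
```

Since both prototypes have length `EMBED_DIM`, either decision layer can consume an embedding produced from them, regardless of which architecture produced it.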
[0018] The present invention is achieved using the following technical solution.
[0019] On the one hand, this invention proposes a prototype-based vertical federated learning system with multiple heterogeneous participants, including a system initialization module, a multi-heterogeneous model training module, and a system output module.
[0020] The system initialization module is used to initialize training tasks and select participants for vertical federated learning. It includes a training task initialization module and a participant selection module. The training task initialization module allows task publishers to publish relevant training tasks according to their needs, while the participant selection module allows task publishers to select participants for vertical federated learning training based on task requirements.
[0021] A multi-heterogeneous model training module is used for collaborative training and heterogeneous model updates among participants in vertical federated learning.
[0022] The system output module is used to output the convergent multi-heterogeneous model trained by the participants.
[0023] The task publisher calls the system initialization module to initialize the training task and select participants for vertical federated learning; the participants call the multi-heterogeneous model training module to update their local heterogeneous models; the system output module outputs the participants' updated local heterogeneous models.
[0024] The connection relationships between the above-mentioned components are as follows:
[0025] The output of the system initialization module is connected to the input of the multi-heterogeneous model training module, sending the contents produced by the system initialization module to the multi-heterogeneous model training module. The output of the multi-heterogeneous model training module is connected to the input of the system output module, sending the heterogeneous models trained by the multi-heterogeneous model training module to the system output module.
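The connection relationships above amount to a three-stage pipeline, which can be sketched as follows. This is only a structural illustration: the function names, the `TrainingTask` record, the "take the first candidates" selection policy, and the placeholder model handles are all hypothetical and stand in for the real module logic.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class TrainingTask:
    """What the initialization module hands to the training module."""
    name: str
    participants: List[str]


def system_initialization(task_name: str, candidates: List[str],
                          required: int) -> TrainingTask:
    """Initialize the task and select participants.

    Assumption: selection simply takes the first `required` candidates.
    """
    return TrainingTask(task_name, candidates[:required])


def multi_heterogeneous_training(task: TrainingTask) -> Dict[str, str]:
    """Placeholder for collaborative training: one model per participant."""
    return {p: f"trained-model-of-{p}" for p in task.participants}


def system_output(models: Dict[str, str]) -> Dict[str, str]:
    """Output the converged heterogeneous models."""
    return dict(models)


def run_system(task_name: str, candidates: List[str],
               required: int) -> Dict[str, str]:
    # Initialization output feeds training; training output feeds the output module.
    task = system_initialization(task_name, candidates, required)
    models = multi_heterogeneous_training(task)
    return system_output(models)


result = run_system("image-classification", ["bank", "shop", "telco"], 2)
```

The result holds one model per selected participant rather than a single global model, mirroring the multi-model output described above.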
[0026] On the other hand, this invention proposes a prototype-based vertical federated learning method for multiple heterogeneous participants, comprising the following steps:
[0027] Step 1: Model Initialization. Participants initialize their local model parameters and local optimization algorithms respectively.
[0028] Step 2: Without exposing any participant's local data, the participants use a private set intersection technique to find the intersection of all participants' data. The participants include passive and active parties.
[0029] Step 3: Each participant uses the local data and the representation layer of the local model to compute local embedding values.
[0030] Specifically, it includes the following steps:
[0031] Step 3.1: Each participant uses its local data features as input to the local model representation layer. After computation by the representation layer, the local embedding value is output.
[0032] Step 3.2: Each participant adds a random number to its local embedding value to protect the security of the original local data.
[0033] Step 3.3: Each participant sends its local embedding value with the random number to the active party.
[0034] Step 4: The active party calculates the global embedding value.
[0035] Specifically, it includes the following steps:
[0036] Step 4.1: The active party collects all the local embedding values with random numbers sent by the passive parties.
[0037] Step 4.2: The active party uses average aggregation to aggregate the local embedding values of all participants, obtaining a global embedding value with a random number.
[0038] Step 4.3: The active party subtracts the random number from the global embedding value with the random number to obtain the true global embedding value, or simply the global embedding value.
[0039] Step 4.4: The active party sends the global embedding value to all participants.
[0040] Step 5: All participants compute their local prediction results in parallel.
[0041] Specifically, it includes the following steps:
[0042] Step 5.1: Participants obtain the global embedding value.
[0043] Step 5.2: Participants use the global embedding value and the decision layer of the local model to calculate their local prediction results.
[0044] Step 5.3: Each participant sends its local prediction result to the active party.
[0045] Step 6: The active party calculates the loss value.
[0046] Specifically, it includes the following steps:
[0047] Step 6.1: The active party obtains the local prediction results from the participants.
[0048] Step 6.2: The active party uses the local label value and the local prediction results of the participants as inputs to the loss function to calculate the loss value of each participant.
[0049] Step 6.3: The active party sends each participant its loss value.
[0050] Step 7: Each participant updates its local model through parallel backpropagation.
[0051] Specifically, it includes the following steps:
[0052] Step 7.1: The participants receive their loss values;
[0053] Step 7.2: The participants update and optimize the local model parameters based on the loss value and the local optimization function.
[0054] Step 8: Repeat steps 3 to 7 until the training model converges or reaches the pre-negotiated maximum number of training rounds, at which point training is stopped.
[0055] This enables vertical federated learning among multiple heterogeneous participants without data leakage or privacy violations.
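Steps 3.2 through 4.3 of the method above can be sketched as masked average aggregation. The sketch below rests on stated assumptions: masks are drawn from a uniform distribution over [-1, 1], and the active party learns only the average of the masks, never an individual mask. How that average reaches the active party is not specified in this document, so it is simply computed in place here. The helper names `add_mask`, `average`, and `aggregate` are hypothetical.

```python
import random
from typing import List

Vector = List[float]


def add_mask(embedding: Vector, mask: Vector) -> Vector:
    """Step 3.2: hide a local embedding value behind a random mask."""
    return [e + m for e, m in zip(embedding, mask)]


def average(vectors: List[Vector]) -> Vector:
    """Element-wise mean of equally sized vectors."""
    count = len(vectors)
    return [sum(column) / count for column in zip(*vectors)]


def aggregate(masked_embeddings: List[Vector], mask_average: Vector) -> Vector:
    """Steps 4.2 and 4.3: average the masked values, then remove the mask term."""
    masked_global = average(masked_embeddings)
    return [g - m for g, m in zip(masked_global, mask_average)]


rng = random.Random(0)  # fixed seed so the sketch is reproducible
local_embeddings = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0], [7.0, 8.0]]

# Each participant draws its own mask and sends only the masked embedding.
masks = [[rng.uniform(-1.0, 1.0) for _ in e] for e in local_embeddings]
masked = [add_mask(e, m) for e, m in zip(local_embeddings, masks)]

# Assumption: the active party receives this average, not the individual masks.
mask_average = average(masks)
global_embedding = aggregate(masked, mask_average)
```

Because averaging is linear, subtracting the average mask from the average of the masked values recovers the average of the unmasked local embedding values, up to floating-point rounding.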
[0056] Beneficial effects
[0057] The method of the present invention has the following advantages compared with the prior art:
[0058] 1. This invention improves the accuracy of the models trained in vertical federated learning when participants' local models are heterogeneous. It employs prototype aggregation to gather the local knowledge of all participants, reducing reliance on heterogeneous model information. This lessens the impact of other participants' heterogeneous model information on the trained model, thereby improving its accuracy.
[0059] 2. This invention can train multiple heterogeneous models simultaneously, meaning that each participant's local model can be trained and a convergent model can be obtained.
[0060] 3. This invention offers strong privacy protection. It aggregates prototype data rather than raw data, thus reducing the risk of raw data leakage. Furthermore, because random numbers are injected into the local embedding values, the active party cannot obtain a participant's true local embedding value, which further protects that participant's raw data.
Attached Figure Description
[0061] Figure 1 is a schematic diagram of the system of the present invention.
[0062] Figure 2 is a schematic diagram of the method of the present invention.
Detailed Implementation
[0063] The technical solution of the present invention will now be clearly and completely described with reference to the accompanying drawings and embodiments. Obviously, the described embodiments are merely some embodiments of the present invention, and not all embodiments.
[0064] Example
[0065] As shown in Figure 1, this embodiment provides a prototype-based vertical federated learning system with multiple heterogeneous participants, including a system initialization module, a multi-heterogeneous model training module, and a system output module. The task publisher calls the system initialization module to publish an image classification task and randomly selects C participants, divided into 1 active participant and K passive participants. The C participants call the multi-heterogeneous model training module to train multiple convergent heterogeneous models. The system output module outputs the trained convergent heterogeneous models.
[0066] Furthermore, in this embodiment, C is set to 4 and K is set to 3.
[0067] Furthermore, the local dataset is MNIST, containing images with known image classification labels. The image classification labels range from 0 to 9.
[0068] Furthermore, three heterogeneous local models are set up: a fully connected neural network (MLP), a convolutional neural network (CNN), and the convolutional neural network LeNet.
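In a vertical setting, the C = 4 participants hold different features of the same samples. This embodiment does not state how the MNIST pixels are divided, so the following sketch assumes one plausible scheme purely for illustration: each 28 x 28 image is cut into four vertical strips of 7 columns, one per participant. The function name `split_columns` and the synthetic image are hypothetical.

```python
from typing import List

Image = List[List[float]]


def split_columns(image: Image, parts: int) -> List[Image]:
    """Split an image into `parts` vertical strips of equal width."""
    width = len(image[0])
    if width % parts != 0:
        raise ValueError("image width must be divisible by the number of parts")
    step = width // parts
    return [[row[i * step:(i + 1) * step] for row in image]
            for i in range(parts)]


# Synthetic stand-in for one 28 x 28 MNIST image (pixel value = flat index).
image = [[float(r * 28 + c) for c in range(28)] for r in range(28)]

# Strip 0 would go to the active party, strips 1 to 3 to the passive parties.
strips = split_columns(image, 4)
```

Under this assumption each participant's representation layer sees 28 x 7 = 196 input features, while only the active party also holds the label.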
[0069] As shown in Figure 2, a prototype-based vertical federated learning method for multiple heterogeneous participants includes the following steps:
[0070] Step 1: Model initialization. Each participant selects one model from the three heterogeneous local models as its local model and chooses stochastic gradient descent as its local optimization algorithm.
[0071] Step 2: Without exposing the participants' local data, the four participants use private set intersection technology to find the intersection of all participants' data.
[0072] Step 3: Each participant computes its local embedding value using its local data and the representation layer of its local model;
[0073] Step 3.1: The k-th participant feeds its local data features into the representation layer of its local model. The representation layer processes this input and outputs the local embedding value.
[0074] Step 3.2: The k-th participant adds a random number to its local embedding value, obtaining a local embedding value with a random number. This protects the security of the original local data.
[0075] Step 3.3: Each participant sends its local embedding value with the random number to the active party A.
[0076] Step 4: The active party calculates the global embedding value;
[0077] Step 4.1: The active party A collects all the local embedding values with random numbers sent by the passive parties, where k = 0 denotes the local embedding value of the active party A;
[0078] Step 4.2: The active party aggregates the local embedding values of all participants using average aggregation to obtain a global embedding value with a random number.
[0079] Step 4.3: The active party subtracts the random number from the global embedding value containing the random number to obtain the true global embedding value (referred to as the global embedding value).
[0080] Step 4.4: The active party sends the global embedding value to all participants.
[0081] Step 5: All participants compute their local prediction results in parallel.
[0082] Step 5.1: The k-th participant obtains the global embedding value E;
[0083] Step 5.2: The k-th participant computes its local prediction result from the global embedding value E and the decision layer of its local model.
[0084] Step 5.3: Each participant sends its local prediction result to the active party A.
[0085] Step 6: The active party calculates the loss value;
[0086] Step 6.1: The active party obtains the local prediction results from the participants.
[0087] Step 6.2: The active party feeds the local label value Y and each participant's local prediction result into the loss function LF to calculate the loss value of each participant.
[0088] Step 6.3: The active party sends each participant k its loss value.
[0089] Step 7: Each participant updates its local model through parallel backpropagation;
[0090] Step 7.1: The k-th participant receives the loss value.
[0091] Step 7.2: The k-th participant updates and optimizes its local model parameters based on its loss value and its local optimization function. The update is expressed in terms of the local learning rate of the k-th participant, the gradient descent step that the k-th participant performs with its locally chosen optimization function, and the local model parameters obtained from the k-th participant's update.
[0092] Step 8: Repeat steps 3 to 7 until the local models of all participants converge, at which point training is stopped.
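Steps 5.3 through 6.2 of this embodiment can be sketched for the ten MNIST classes as follows. The loss function LF is not specified in this document, so cross-entropy over softmax outputs is assumed here as a common choice for classification. The participant names, the logits, and the helper names are hypothetical.

```python
import math
from typing import Dict, List


def softmax(logits: List[float]) -> List[float]:
    """Numerically stable softmax over class scores."""
    peak = max(logits)
    exps = [math.exp(v - peak) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def cross_entropy(probabilities: List[float], label: int) -> float:
    """Assumed loss function LF: negative log-probability of the true class."""
    return -math.log(probabilities[label])


def participant_losses(predictions: Dict[str, List[float]],
                       label: int) -> Dict[str, float]:
    """Step 6.2: the active party scores each participant's prediction."""
    return {name: cross_entropy(p, label) for name, p in predictions.items()}


NUM_CLASSES = 10  # image classification labels 0 to 9
true_label = 3    # held only by the active party

# Step 5.3: local prediction results as received by the active party.
confident_logits = [0.0] * NUM_CLASSES
confident_logits[true_label] = 4.0
predictions = {
    "active": softmax(confident_logits),
    "passive_1": softmax([0.0] * NUM_CLASSES),  # uninformative prediction
}

losses = participant_losses(predictions, true_label)
```

Each participant receives its own loss value, so a participant whose decision layer predicts poorly (here `passive_1`) gets a larger loss than one that predicts well, independently of the other participants' architectures.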
[0093] The training results are summarized in Tables 1 and 2.
[0094] Table 1 Comparison of Algorithm Performance
[0095]
[0096] Table 2 Performance Comparison of Prototypes
[0097]
[0098] The above description is merely a preferred embodiment of the present invention, and the present invention should not be limited to the content disclosed in this embodiment and the accompanying drawings. Any equivalent or modified embodiments made without departing from the spirit of the present invention fall within the scope of protection of the present invention.
Claims
1. A prototype-based vertical federated learning system with multiple heterogeneous participants, characterized in that it includes a system initialization module, a multi-heterogeneous model training module, and a system output module, all implemented using computing devices and network communication interfaces.
The system initialization module includes a training task initialization module and a participant selection module. The training task initialization module receives the training requirement parameters from the task publisher through the processor of the computing device and generates standardized training task instructions. The participant selection module obtains the device computing power information, local model type, and data feature dimension information of the candidate participants through the network communication interface, and selects vertical federated learning participants that meet the training task requirements through a compatibility verification algorithm.
The multi-heterogeneous model training module is used for collaborative training and heterogeneous model updates among participants in vertical federated learning. It includes a privacy protection module, a prototype aggregation module, and a model update module.
The privacy protection module adopts private set intersection technology from the field of cryptography to compute, on the participants' computing devices, the intersection of the participants' local data without exposing the original local data.
The prototype aggregation module receives the local embedding values with random numbers sent by each participant through the network communication interface, and uses a distributed average aggregation algorithm to calculate the global embedding value with random numbers. The random numbers are generated by the random number generation unit of each participant's computing device and injected into the local embedding value.
The model update module executes the backpropagation algorithm through the processors of the participants and updates the local model parameters based on the loss value.
The system output module receives the converged heterogeneous model output by the multi-heterogeneous model training module through the network communication interface, stores it in the memory of the computing device, and outputs the converged heterogeneous model to the task publisher and participants through the network communication interface.
The output of the system initialization module is connected to the input of the multi-heterogeneous model training module via a network communication interface, sending training task instructions and participant information to the multi-heterogeneous model training module. The output of the multi-heterogeneous model training module is connected to the input of the system output module via a network communication interface, sending the trained converged heterogeneous model to the system output module. Each participant's computing device interacts with each module of the system via a network communication interface.
The system is implemented using distributed computing devices and network communication technology, and includes the following steps:
Step 1: Model initialization. Participants initialize local model parameters and configure the computation parameters of local optimization algorithms through the processors of their local computing devices;
Step 2: Each participant performs private set intersection from the field of cryptography through the privacy computing unit of its local computing device to obtain the data intersection of all participants without exposing the original local data; the participants include the active party, which has label values and local features, and the passive parties, which have only local features;
Step 3: Local embedding value computation and privacy protection;
Step 3.1: Each participant inputs local data features into the representation layer of the local model through the processor of its local computing device, and calculates and outputs the local embedding value;
Step 3.2: Each participant generates a random number that conforms to a uniform distribution through the random number generation unit of its local computing device, and injects the random number into the local embedding value to form a local embedding value with a random number;
Step 3.3: Each participant sends its local embedding value with a random number to the active party's computing device via a network communication interface;
Step 4: The active party calculates the global embedding value;
Step 4.1: The active party collects all the local embedding values with random numbers sent by the passive parties through the communication unit of its local computing device;
Step 4.2: The active party executes the average aggregation algorithm through the processor of its local computing device to calculate the average of the local embedding values with random numbers of all participants, obtaining the global embedding value with random numbers;
Step 4.3: The active party uses the processor of its local computing device to subtract the average of the random numbers injected by each participant from the global embedding value containing random numbers, thereby eliminating random number interference and obtaining the true global embedding value;
Step 4.4: The active party sends the global embedding value to the computing devices of all participants through the network communication interface;
Step 5: All participants compute their local prediction results in parallel;
Step 5.1: Participants obtain the global embedding value through the communication units of their local computing devices;
Step 5.2: Participants input the global embedding value into the decision layer of the local model through the processors of their local computing devices, and obtain the local prediction results through classifier calculations;
Step 5.3: The participants send their local prediction results to the active party's computing device via the network communication interface;
Step 6: Loss value calculation and distribution;
Step 6.1: The active party receives the local prediction results from all participants through the communication unit of its local computing device;
Step 6.2: The active party uses the processor of its local computing device to input the local label value and the local prediction result of each participant into the loss function to calculate the loss value of each participant;
Step 6.3: The active party distributes the loss value of each participant to the corresponding participant's computing device through the network communication interface;
Step 7: Local model update;
Step 7.1: Each participant receives its own loss value through the communication unit of its local computing device;
Step 7.2: The participants use the processors of their local computing devices to perform backpropagation based on the loss value and the local optimization algorithm to update the local model parameters;
Step 8: Iterative training; repeat steps 3 to 7 until the local model converges or the pre-negotiated maximum number of training rounds is reached, then stop training.