Blockchain-based federated learning method and system

By introducing blockchain technology into federated learning and building a trusted blockchain network, parameter aggregation is achieved through consensus among multiple aggregation nodes, thus solving the problems of cheating and security risks in traditional federated learning and achieving higher reliability and security.

CN118504707B · Active · Publication Date: 2026-06-19 · HUNAN TIAN HE GUO YUN TECH CO LTD


Patent Information

Authority / Receiving Office: CN · China
Patent Type: Patents(China)
Current Assignee / Owner: HUNAN TIAN HE GUO YUN TECH CO LTD
Filing Date: 2024-05-09
Publication Date: 2026-06-19

Application Information

Patent Timeline

09 May 2024: Application
19 Jun 2026: Publication (CN118504707B)

IPC: G06N20/00; G06F16/23; G06F16/27; G06F21/60; H04L9/00; H04L67/1095

AI Tagging

Application Domain

Database updating; Database distribution/replication


AI Technical Summary

Technical Problem

The parameter aggregation process in traditional federated learning is susceptible to collusion and cheating, leading to security risks and reliability issues.

Method used

By adopting a blockchain-based federated learning approach, a trusted blockchain network is constructed. The consensus of multiple aggregation nodes is used to complete the storage of the summary of the aggregation results of parameters. Homomorphic encryption technology and secret sharing scheme are used to generate the summary and store it on the blockchain, ensuring the trustworthiness and security of the aggregation results.

Benefits of technology

This improves the reliability and security of federated learning, effectively tolerating even cheating by a few participants and ensuring the stable execution of federated tasks.

✦ Generated by Eureka AI based on patent content.


Abstract

This application relates to a blockchain-based federated learning method and system, applied to a blockchain-based federated learning system. The federated learning system includes multiple federated learning aggregation devices, each comprising a blockchain node. The blockchain nodes in each device construct a trusted blockchain network. In this method, the aggregation task is not executed by a fixed one or more nodes, but rather fully utilizes blockchain technology, with consensus among multiple aggregation nodes. Even if a few participants are compromised and cheat during the aggregation process, the method of this application can effectively tolerate faults to a certain extent, ensuring the stable and correct execution of the federated task, thereby improving the reliability and security of federated learning.


Description

Technical Field

[0001] This application relates to the fields of federated learning technology and blockchain technology, and in particular to a blockchain-based federated learning method and system.

Background Technology

[0002] Federated learning is a distributed learning approach in machine learning that allows for model training on data distributed across different locations while protecting data privacy. In traditional centralized machine learning, all data is typically sent to a central server for model training, which can raise potential privacy and data security risks. Federated learning, however, improves models without exposing the original data by training them locally on local devices and sharing only model updates. In federated learning, each participant (e.g., mobile devices, sensors) maintains its own model locally and trains it using local data. Updates to these local models are then sent to a central server, which aggregates and integrates these updates into a global model update. This global model update is then sent back to each participant, resulting in model improvements.

[0003] In federated learning, parameter aggregation refers to the process of integrating local model parameters from different devices or data centers to update the global model. This process occurs on a central server, where model parameter updates from various participants are collected and integrated to generate new global model parameters.

[0004] However, traditional aggregation processes are susceptible to collusion and cheating, so the intermediate parameter aggregation process of federated learning lacks reliability and carries security risks.

Summary of the Invention

[0005] Therefore, it is necessary to provide a reliable and secure blockchain-based federated learning method and system to address the aforementioned technical issues.

[0006] Firstly, this application provides a blockchain-based federated learning method applied to a blockchain-based federated learning system. The federated learning system includes multiple federated learning aggregation devices, each of which includes a blockchain node. The blockchain node in each device constructs a trusted blockchain network. The method includes:

[0007] After participating nodes in each federated learning aggregation device complete one round of local training using the dataset, they form intermediate data for model training.

[0008] Each participating node encrypts the intermediate data and distributes it to the aggregation node;

[0009] The aggregation nodes in each federated learning aggregation device perform aggregation processing on the intermediate data according to the aggregation algorithm to obtain the aggregation result and generate a summary for the aggregation result;

[0010] Each aggregation node sends a summary of the aggregation result to a consensus node on the blockchain. After the consensus node reaches a consensus, it stores the summary of the aggregation result on the blockchain. The consensus node is the blockchain node of the aggregation node.

[0011] Each participating node obtains a summary of the aggregated results of the most recent tasks from its local blockchain node;

[0012] Each aggregation node sends a summary of its aggregation results to the participating nodes that need to execute the next round of federated tasks;

[0013] Each participating node obtains a summary of the previous round of local aggregation results from the blockchain, compares it with the summary of the previous round of aggregation results sent back by the aggregation node, determines the previous round of aggregation results, and obtains the global model for this round of federated learning.

[0014] Secondly, this application also provides a blockchain-based federated learning system, which includes multiple federated learning aggregation devices, each of which includes a blockchain node, and the blockchain node in each device constructs a trusted blockchain network.

[0015] The participating nodes in each federated learning aggregation device are used to form intermediate data for model training after completing one round of local training using the dataset;

[0016] Each participating node is used to encrypt the intermediate data and distribute it to the aggregation node;

[0017] The aggregation nodes in each federated learning aggregation device are used to aggregate the intermediate data according to the aggregation algorithm, obtain the aggregation result, and generate a summary for the aggregation result;

[0018] Each aggregation node is used to send a summary of the aggregation result to a consensus node on the blockchain. After the consensus node reaches a consensus, it stores the summary of the aggregation result on the blockchain. The consensus node is the blockchain node of the aggregation node.

[0019] Each participating node is used to obtain a summary of the aggregated results of the most recent tasks from its local blockchain node;

[0020] Each aggregation node is used to send a summary of its aggregation results to the participating nodes that need to execute the next round of federated tasks;

[0021] Each participating node is used to obtain a summary of the previous round of local aggregation results from the blockchain, compare it with the summary of the previous round of aggregation results sent back by the aggregation node, determine the previous round of aggregation results, and obtain the global model of the current round of federated learning.

[0022] In the aforementioned blockchain-based federated learning method and system, the aggregation task is not executed by one or more fixed nodes. Instead, the method fully utilizes blockchain technology, and aggregation is completed through consensus among multiple aggregation nodes. Even if a few participants are compromised and cheat during the aggregation process, the method in this application can tolerate faults up to a certain proportion, enabling the federated task to be carried out stably and correctly, thereby improving the reliability and security of federated learning.

Attached Figure Description

[0023] Figure 1 is a schematic diagram of the structure of a blockchain-based federated learning system in one embodiment;

[0024] Figure 2 is a flowchart illustrating a blockchain-based federated learning method in one embodiment;

[0025] Figure 3 is a flowchart illustrating the steps of storing a summary of the aggregation result on the blockchain after consensus is reached by a consensus node in one embodiment;

[0026] Figure 4 is a schematic diagram of the consensus process for the aggregation results of federated learning in one embodiment;

[0027] Figure 5 is a schematic diagram of an aggregation node election mechanism in one embodiment;

[0028] Figure 6 is a schematic diagram of the state machine of the participating nodes in federated learning in one embodiment;

[0029] Figure 7 is a schematic diagram of an atomic multicast model based on model parameter subset cutting in one embodiment;

[0030] Figure 8 is an internal structural diagram of a computer device in one embodiment.

Detailed Implementation

[0031] To make the objectives, technical solutions, and advantages of this application clearer, the following detailed description is provided in conjunction with the accompanying drawings and embodiments. It should be understood that the specific embodiments described herein are merely illustrative and not intended to limit the scope of this application.

[0032] The blockchain-based federated learning method provided in this application embodiment can be applied to, for example, the blockchain-based federated learning system shown in Figure 1. As shown in Figure 1, the federated learning system includes multiple federated learning aggregation devices. Each federated learning aggregation device includes a local learning framework for machine learning / neural networks, a secure aggregation engine, a blockchain SDK, and a blockchain node. A trusted blockchain network is built by the blockchain nodes in each device. The blockchain node in a device plays a different role depending on the role of the device and the resources it has. For example: 1) If the device does not participate in the aggregation task, the blockchain node in the device is an observer node that only synchronizes data; 2) If the device participates in the aggregation task, the blockchain node in the device is a consensus node.

[0033] Based on this federated learning system, as shown in Figure 1 and Figure 2, the blockchain-based federated learning method includes:

[0034] Step 1: After participating nodes in each federated learning aggregation device complete one round of local training using the dataset, they form intermediate data for model training.

[0035] Step 2: Each participating node encrypts the intermediate data and distributes it to the aggregation node.

[0036] Specifically, the secure aggregation engine of each federated learning aggregation device encrypts the intermediate model data and distributes it to the secure aggregation engines of other devices. During this process, if the aggregation algorithm uses homomorphic encryption, the aggregation engine sends the encrypted data in full to each aggregation node; if the aggregation algorithm uses secret sharing, the aggregation engine generates shares for the secret and distributes the shares to the aggregation engines of other devices according to the specific secret sharing algorithm.
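The secret-sharing branch described above can be illustrated with a minimal sketch. The patent does not name a specific secret sharing algorithm, so additive sharing over a prime field is an assumption here, as are the modulus `PRIME`, the function names, and the idea that model parameters have already been quantized to integers.

```python
import secrets

# Illustrative field modulus (assumption, not taken from the patent).
PRIME = 2**61 - 1

def make_shares(value: int, n_shares: int) -> list[int]:
    """Split an integer into n_shares additive shares modulo PRIME."""
    shares = [secrets.randbelow(PRIME) for _ in range(n_shares - 1)]
    last = (value - sum(shares)) % PRIME
    return shares + [last]

def aggregate(all_shares: list[list[int]]) -> int:
    """Aggregation node j sums the j-th share of every participant;
    combining the partial sums reveals only the total."""
    n_aggregators = len(all_shares[0])
    partial_sums = [
        sum(participant[j] for participant in all_shares) % PRIME
        for j in range(n_aggregators)
    ]
    return sum(partial_sums) % PRIME

# Three participants, four aggregation nodes (hypothetical values).
updates = [12, 30, 7]
shared = [make_shares(u, 4) for u in updates]
total = aggregate(shared)
```

No single aggregation node sees an individual update in this sketch, only a random-looking share of it, which is why this style of aggregation trades computation for extra network traffic.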

[0037] Step 3: The aggregation nodes in each federated learning aggregation device perform aggregation processing on the intermediate data according to the aggregation algorithm to obtain the aggregation result and generate a summary for the aggregation result.

[0038] Specifically, an aggregation node refers to a federated learning aggregation device that participates in the aggregation task. Each participating device completes the aggregation task according to a specific aggregation algorithm (which may require multiple rounds of network communication) and generates a summary for the aggregation result.

[0039] Step 4: Each aggregation node sends a summary of the aggregation result to a consensus node on the blockchain. After the consensus node reaches a consensus, it stores the summary of the aggregation result on the blockchain. The consensus node is the blockchain node of the aggregation node.

[0040] Specifically, because blockchain has the characteristics of multi-party consensus and credible results, the aggregation behavior is carried out by the aggregation nodes that participate in the consensus, and the aggregation result is stored on the blockchain. Therefore, the aggregation result summary stored on the blockchain has been reached by a certain proportion (such as 70%) of the aggregation nodes and is credible.

[0041] Step 5: Each participating node obtains a summary of the aggregated results of the most recent tasks from its local blockchain node.

[0042] Step 6: Each aggregation node sends a summary of its aggregation results to the participating nodes that need to execute the next round of federated tasks.

[0043] Step 7: Each participating node obtains a summary of the previous round of local aggregation results from the blockchain, compares it with the summary of the previous round of aggregation results sent back by the aggregation node, determines the previous round of aggregation results, and obtains the global model for this round of federated learning.

[0044] Furthermore, with a global model, the next round of local training can be performed.
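The check in Step 7 can be sketched roughly as follows. This is only an illustration under stated assumptions: the summary is taken to be a SHA-256 digest over a JSON serialization of the aggregated parameters, and the aggregation node is assumed to return the aggregation result itself so the participant can recompute the digest. The function names `digest` and `accept_global_model` are hypothetical.

```python
import hashlib
import json
from typing import Optional

def digest(aggregated_params: list[float]) -> str:
    """Summary of an aggregation result (SHA-256 over JSON is an assumption)."""
    payload = json.dumps(aggregated_params, separators=(",", ":")).encode("utf-8")
    return hashlib.sha256(payload).hexdigest()

def accept_global_model(on_chain_digest: str,
                        received_params: list[float]) -> Optional[list[float]]:
    """Accept the received result only if it matches the digest synchronized
    from the participant's local blockchain node; otherwise reject it."""
    if digest(received_params) != on_chain_digest:
        return None
    return received_params

# Hypothetical round: the digest below stands in for the on-chain record.
agreed = [0.5, 1.25]
chain_record = digest(agreed)
accepted = accept_global_model(chain_record, [0.5, 1.25])
rejected = accept_global_model(chain_record, [0.5, 1.0])
```

A tampered result fails the comparison and is discarded, so a single cheating aggregation node cannot push a different global model to participants.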

[0045] The aforementioned blockchain-based federated learning method does not have the aggregation task executed by one or more fixed nodes. Instead, it fully utilizes blockchain technology and is completed through consensus among multiple aggregation nodes. Even if a few participants are compromised and cheat during the aggregation process, the method described in this application can effectively tolerate faults to a certain extent, enabling the federated task to proceed stably and correctly, thereby improving the reliability and security of federated learning.

[0046] In another embodiment, each working node may employ a Byzantine fault-tolerant aggregation consensus algorithm. As shown in Figure 3 and Figure 4, storing a summary of the aggregation result on the blockchain after the consensus nodes reach a consensus includes:

[0047] Step 302: The consensus node on the blockchain selects the blockchain block-producing node.

[0048] Step 304: The block-producing node packages the summary of the aggregation result into a block, signs it with its own private key, and broadcasts the block to other consensus nodes.

[0049] Step 306: After receiving the block, the consensus node verifies the signature. If the signature verification is successful, it broadcasts the verification result to other consensus nodes.

[0050] Step 308: When a consensus node has received 2f+1 messages reporting successful signature verification, the block is considered valid. Each consensus node then verifies whether the aggregation result of the block-producing node is the same as its own aggregation result. If they are the same, the node broadcasts the summary of the aggregation result. Here, f represents the number of cheating and faulty nodes among the intermediate data aggregation nodes participating in federated learning.

[0051] Step 310: When each consensus node receives 2f+1 digests identical to its own, it confirms that the aggregation result is consistent with the aggregation execution result of its own node, indicating that a block can be produced. It then sends a block production success message to the block producing node and stores the digest on disk.

[0052] Step 312: After receiving 2f+1 acknowledgment messages, the block-producing node will write the digest to disk for storage.

[0053] Furthermore, after the summary information of the aggregation results is written to disk at each consensus node, local training participants in other federated tasks can synchronize the summary information from their local nodes to verify whether the received aggregation results are correct.

[0054] In this embodiment, by introducing blockchain consensus concepts into traditional federated learning, multiple parties supervise the parameter aggregation process and reach consensus on the aggregation results. This addresses the cheating and failure issues of centralized aggregation servers, ensuring the security, reliability, and trustworthiness of the intermediate parameter aggregation process in federated learning. This method can achieve a fault tolerance rate of 30% for aggregation nodes.
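The 2f+1 counting rule used in Steps 308 to 312 can be sketched as below. The relation n ≥ 3f+1 between the number of consensus nodes n and the number of tolerated faulty nodes f is the usual Byzantine fault-tolerance assumption and is not stated explicitly in the text; the function names are hypothetical, and the list of received digests is assumed to contain every message the node counts.

```python
from collections import Counter

def max_faulty(n: int) -> int:
    """Largest f such that n >= 3f + 1 (standard BFT assumption)."""
    return (n - 1) // 3

def quorum(n: int) -> int:
    """Number of matching messages required: 2f + 1."""
    return 2 * max_faulty(n) + 1

def digest_confirmed(own_digest: str, received: list[str], n: int) -> bool:
    """True when at least 2f + 1 received digests equal the node's own digest."""
    matching = Counter(received)[own_digest]
    return matching >= quorum(n)

# Four consensus nodes tolerate one faulty node and need three matching digests.
ok = digest_confirmed("a", ["a", "a", "a", "b"], n=4)
not_ok = digest_confirmed("a", ["a", "a", "b", "b"], n=4)
```

Under this assumption, ten aggregation nodes give f = 3, which is in line with the roughly 30% fault tolerance stated above.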

[0055] In another embodiment, research and analysis reveal a significant difference between federated learning and traditional distributed learning in their definitions: distributed learning utilizes relatively centralized, high-performance servers for data training, while federated learning considers training on mobile devices, whose battery life, network connectivity, and computing resources are extremely limited compared to high-performance servers. In federated learning, employing homomorphic encryption algorithms for secure aggregation places high demands on computing resources; similarly, using secret-sharing schemes for secure aggregation places high demands on network resources. Therefore, how to utilize resource-constrained devices for federated learning is a problem that needs to be addressed for its practical application.

[0056] This embodiment improves the algorithm for determining aggregation nodes, which can solve the problem that federated learning tasks cannot be performed normally when computing and network resources are limited.

[0057] As can be seen from Figure 1, this application is based on reaching consensus on aggregation results through multiple aggregation nodes, which raises the question of which nodes should serve as aggregation nodes. Specifically, the method for determining aggregation nodes includes: identifying candidate aggregation nodes; continuously performing hash operations on the current block height of the blockchain to obtain a first hash value for each node position required by the secure aggregation consensus; fixing each first hash value onto the ring, where it is called a node; performing a hash operation on the unique identifier of each candidate aggregation node to generate a second hash value, and placing the second hash value onto the ring, where it is called a stub; and rotating each node in a preset direction to find the nearest stub, whereupon the candidate whose stub is matched becomes an aggregation node participating in the next round of aggregation consensus.

[0058] Figure 5 illustrates the election of aggregation nodes participating in consensus. The hash values calculated using a cryptographic hash function (SHA256 in this example) are treated as a ring space. Assume that a secure aggregation consensus using homomorphic encryption algorithms, or a secure aggregation consensus using a secret-sharing scheme, consists of four nodes. First, hash operations are continuously performed on the current block height of the blockchain to obtain four hash values. These four hash values are fixed onto the ring (represented by circles in the example) and are called nodes. Then, hash operations are performed on the unique identifiers of the candidate nodes for secure aggregation, generating hash values. These hash values are placed on the ring (represented by ellipses in the example) and are called stubs. Each node is rotated clockwise (or counter-clockwise) to find the nearest stub. The candidate whose stub is matched becomes a node participating in the aggregation consensus. In the example, “c_node_3”, “c_node_4”, “c_node_5”, and “c_node_1” are the nodes participating in the next round of secure aggregation.

[0059] This application introduces a consistent hash ring technology, which enables all nodes to reach a consensus on the consensus nodes for this round.
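The hash-ring election can be sketched as follows. Several details are assumptions made for illustration: "continuously hashing" the block height is read as a SHA-256 chain seeded with the decimal block height, clockwise rotation is read as moving toward larger hash values with wrap-around, and a stub that has already been matched is skipped so that the elected nodes are distinct. The function names are hypothetical.

```python
import bisect
import hashlib

def ring_points(block_height: int, count: int) -> list[int]:
    """Hash the block height repeatedly to obtain `count` positions on the ring."""
    points = []
    data = str(block_height).encode("utf-8")
    for _ in range(count):
        data = hashlib.sha256(data).digest()
        points.append(int.from_bytes(data, "big"))
    return points

def stub_position(candidate_id: str) -> int:
    """Ring position of a candidate, derived from its unique identifier."""
    return int.from_bytes(hashlib.sha256(candidate_id.encode("utf-8")).digest(), "big")

def elect(block_height: int, candidate_ids: list[str], count: int) -> list[str]:
    """Match each ring point to the nearest stub in the clockwise direction."""
    candidates = sorted(set(candidate_ids))
    if count > len(candidates):
        raise ValueError("not enough candidate aggregation nodes")
    stubs = sorted((stub_position(cid), cid) for cid in candidates)
    positions = [pos for pos, _ in stubs]
    elected: list[str] = []
    for point in ring_points(block_height, count):
        idx = bisect.bisect_left(positions, point) % len(stubs)
        # Assumption: skip stubs that were already matched in this round.
        while stubs[idx][1] in elected:
            idx = (idx + 1) % len(stubs)
        elected.append(stubs[idx][1])
    return elected

candidates = [f"c_node_{i}" for i in range(1, 7)]
round_a = elect(100, candidates, 4)
round_b = elect(100, list(reversed(candidates)), 4)
```

Because the inputs are only the block height and the candidate identifiers, every node that runs the same procedure on the same chain state arrives at the same set of aggregation nodes without extra communication.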

[0060] In another embodiment, determining candidate aggregation nodes includes: monitoring the node state switching of each node using a node state machine; and determining nodes whose node states are task-safe aggregation nodes, strong network collaborative aggregation nodes, and computing power collaborative aggregation nodes as candidate aggregation nodes based on the node state machine.

[0061] In the node state machine shown in Figure 6, "IT" denotes the local training node state. A node in this state only uses the dataset to train a machine learning / deep learning model locally and generates intermediate model data.

[0062] “TF” denotes the this-task secure aggregation node state, in which the node is responsible for secure aggregation of the encrypted intermediate model data for the tasks it participates in; “OSNF” denotes the strong network collaborative aggregation node state, in which the node has high network bandwidth and low uplink and downlink network occupancy; “OSCF” denotes the strong computing power collaborative aggregation node state, in which the node has strong computing power and low CPU load.

[0063] In another embodiment, as shown in Figure 6, a node switches among the "IT", "TF", "OSNF" and "OSCF" states according to certain conditions.

[0064] The switching conditions between the local training node "IT" and the this-task secure aggregation node "TF" are as follows:

[0065] Switching condition C1: When a homomorphic encryption algorithm is used to achieve secure aggregation and the participant is a computing party, the node state of the participant switches from the local training node "IT" to the this-task secure aggregation node "TF"; when a secret sharing scheme is used to achieve secure aggregation and the node of the participant is a secret share processing node, the node state of the participant switches from the local training node "IT" to the this-task secure aggregation node "TF".

[0066] Switching condition C2: When a homomorphic encryption algorithm is used to achieve secure aggregation and the participant reverts to the key custodian role, the node state of the participant switches from the this-task secure aggregation node "TF" to the local training node "IT".

[0067] The switching conditions between the this-task secure aggregation node "TF" and the strong computing power collaborative aggregation node "OSCF" are as follows:

[0068] Switching condition C3: When a node has become the this-task secure aggregation node "TF" and its CPU load has not exceeded the set threshold, it switches from "TF" to the strong computing power collaborative aggregation node "OSCF". In this state, it can participate in the secure aggregation work of homomorphic-encryption tasks.

[0069] Switching condition C4: When a node has become a strong computing power collaborative aggregation node "OSCF" but a task fails to execute, it switches from "OSCF" to the this-task secure aggregation node "TF", and the CPU load threshold is reduced.

[0070] The switching conditions between the this-task secure aggregation node "TF" and the strong network collaborative aggregation node "OSNF" are as follows:

[0071] Switching condition C5: When a node has become the this-task secure aggregation node "TF", its network card speed is gigabit or higher, and its network uplink and downlink bandwidth usage is below the threshold, it switches from "TF" to the strong network collaborative aggregation node "OSNF". In this state, the node can accept secure aggregation work based on the secret sharing scheme.

[0072] Switching condition C6: When a node has become a strong network collaborative aggregation node "OSNF" but a task fails, it switches from "OSNF" to "TF", and the threshold for network uplink and downlink bandwidth is reduced.

[0073] The switching conditions from the strong network collaborative aggregation node "OSNF" and the strong computing power collaborative aggregation node "OSCF" to the local training node "IT" are as follows:

[0074] Switching condition C7: When a node has become a strong network collaborative aggregation node "OSNF", continues to fail to execute federated tasks, and its network capabilities decline, it switches from "OSNF" to the local training node "IT".

[0075] Switching condition C8: When a node has become a strong computing power collaborative aggregation node "OSCF", continues to fail to execute federated tasks, and its computing power declines, it switches from "OSCF" to the local training node "IT".

[0076] Among the participating nodes in the task, the nodes in the states of "TF" (Secure Aggregation Node for this Task), "OSNF" (Strong Network Collaborative Aggregation Node), and "OSCF" (Strong Computing Power Collaborative Aggregation Node) will become candidate aggregation nodes. The "TF" node will perform the secure aggregation for this federated task, the "OSNF" node will participate in the secure aggregation of other secret-sharing scheme federated learning, and the "OSCF" node will participate in the secure aggregation of other homomorphic encryption algorithm federated learning.
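The node state machine and switching conditions C1 to C8 can be summarized as a transition table. The sketch below is a simplification: each condition is reduced to a label, and evaluating the underlying thresholds (CPU load, bandwidth, repeated task failure) is left out. The names `State`, `TRANSITIONS`, `step`, and `is_candidate` are hypothetical.

```python
from enum import Enum

class State(Enum):
    IT = "local training node"
    TF = "this-task secure aggregation node"
    OSNF = "strong network collaborative aggregation node"
    OSCF = "strong computing power collaborative aggregation node"

# (current state, switching condition) -> next state, following C1 to C8.
TRANSITIONS = {
    (State.IT, "C1"): State.TF,
    (State.TF, "C2"): State.IT,
    (State.TF, "C3"): State.OSCF,
    (State.OSCF, "C4"): State.TF,
    (State.TF, "C5"): State.OSNF,
    (State.OSNF, "C6"): State.TF,
    (State.OSNF, "C7"): State.IT,
    (State.OSCF, "C8"): State.IT,
}

CANDIDATE_STATES = {State.TF, State.OSNF, State.OSCF}

def step(state: State, condition: str) -> State:
    """Apply a switching condition; conditions that do not apply leave the state unchanged."""
    return TRANSITIONS.get((state, condition), state)

def is_candidate(state: State) -> bool:
    """Nodes in TF, OSNF, or OSCF are candidate aggregation nodes."""
    return state in CANDIDATE_STATES

# Hypothetical walk: IT -> TF -> OSCF -> IT.
s1 = step(State.IT, "C1")
s2 = step(s1, "C3")
s3 = step(s2, "C8")
```

Only nodes for which `is_candidate` holds would be placed on the hash ring as stubs in the election step.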

[0077] This aggregation node election algorithm consists of two parts: the first part selects candidate aggregation nodes, and the second part selects consensus nodes from among the candidate aggregation nodes. Addressing the issue of uneven resource distribution among heterogeneous nodes, it elects aggregation nodes based on multiple dimensions, including communication, computing, and storage capabilities. It integrates local information from weak clients into federated aggregation, resolving the "scattered" problem; and it ensures that federated participants are compatible with low-resource clients, addressing the "overutilization of resources" problem. Ultimately, it resolves the issues of single global models failing to converge or converging slowly in differentiated resource federated learning scenarios, as well as suboptimal model performance and unfair performance.

[0078] In federated learning with differentiated resources, the computing and network capabilities of the devices participating in training are uneven. The federated learning process includes a stage of aggregating intermediate model data, and different aggregation schemes place different demands on the aggregation server. If homomorphic encryption algorithms are used to aggregate intermediate data, the aggregation server needs substantial computing power. If secret sharing schemes are used, a large amount of network communication is required during the aggregation process, placing high demands on the server's bandwidth. However, regardless of whether a homomorphic encryption algorithm or a secret sharing aggregation scheme is used, a problem arises when training large-scale federated models: such training generates a large amount of intermediate model data (even exceeding the GB level). In a Byzantine fault-tolerant aggregation system, the local training device needs to transmit the intermediate data to every aggregation consensus node, which results in a huge amount of network traffic, greatly reducing the efficiency of federated learning and even making federated learning unusable. This application designs a transmission method for large-scale model training based on atomic multicast of model parameter subsets and erasure coding technology. While preserving the aggregation fault tolerance of federated learning, it reduces the network transmission complexity and transmission volume of intermediate model data, and can ensure the availability of the federated learning system in weak network scenarios.

[0079] Specifically, communication between aggregation devices can be achieved using multicast technology. Multicast is a communication method in computer networks used to send data from one source node to multiple destination nodes. It is the opposite of unicast and broadcast. In unicast communication, data is sent from one source node to a specific destination node. In broadcast communication, data is sent from one source node to all nodes in the network. Multicast communication, however, sends data from one source node to multiple nodes in a predefined target group. Multicast communication can effectively reduce network transmission load and bandwidth consumption because data only needs to be sent once, rather than being sent to each receiving node.

[0080] Specifically, Figure 7 illustrates an atomic multicast model based on subsets of model parameters. Each participating node encrypts the intermediate data and distributes it to the aggregation nodes. This process includes: each participating node encrypting the intermediate data to obtain the ciphertext of the model parameters; splitting the ciphertext into subsets of model parameters, the number of subsets being equal to the number of aggregation nodes participating in secure aggregation; and sending each subset to one of the aggregation nodes participating in secure aggregation via an atomic multicast system. Upon receiving a subset, an aggregation node multicasts the subset to the other aggregation nodes.
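The splitting step can be sketched as follows, assuming the ciphertext is available as a byte string and is cut into m nearly equal, contiguous subsets. The patent does not specify the cutting rule, so the chunking scheme and the function names are illustrative only, and the atomic multicast transport itself is not modeled.

```python
def split_into_subsets(ciphertext: bytes, m: int) -> list[bytes]:
    """Cut the ciphertext into m contiguous subsets, one per aggregation node."""
    if m <= 0:
        raise ValueError("m must be positive")
    size = -(-len(ciphertext) // m)  # ceiling division
    return [ciphertext[i * size:(i + 1) * size] for i in range(m)]

def reassemble(subsets: list[bytes]) -> bytes:
    """An aggregation node rebuilds the ciphertext once it holds all m subsets."""
    return b"".join(subsets)

# Hypothetical 10-byte ciphertext distributed over 4 aggregation nodes.
data = bytes(range(10))
parts = split_into_subsets(data, 4)
restored = reassemble(parts)
```

In the first round each aggregation node receives only its own subset; after the second round of multicast among aggregation nodes, each of them can call `reassemble` on the full set.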

[0081] Assume that the amount of intermediate model data generated in a device after each round of local training is |B|, the number of consensus nodes participating in secure aggregation is m, and the number of training nodes is n. After the data is split into subsets, each training node sends m subsets of size |B|/m, so the communication volume of the first round of multicast in the federated learning network is n×m×(|B|/m) = n|B|. Before the second round of multicast, the total amount of data received by each consensus node participating in secure aggregation is n|B|/m. During the second round of multicast, the total amount of data sent by each consensus node participating in secure aggregation is (m-1)×n|B|/m, and the total communication volume generated by the entire federated learning network during the second round of multicast is m×(m-1)×n|B|/m = n(m-1)|B|. Therefore, the total communication volume of the atomic multicast transmission method based on model parameter subsets is n×|B|+n(m-1)|B|, i.e., nm|B|. This is the same as the total communication volume without using intermediate data subsets for transmission, but data subsets have already been formed during the intermediate data transmission process. Based on these data subsets, this application optimizes the transmission method using erasure coding technology. This prevents the loss of subsets during transmission and reduces the communication volume. The communication volume after erasure coding optimization is reduced to n|B| log m, and the method can tolerate the loss of f copies of data (f being a redundancy value) during the second round of multicast.
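The two-round accounting in the paragraph above can be checked numerically. The sketch assumes each training node splits |B| into m equal subsets and that every consensus node forwards everything it received to the other m-1 consensus nodes; the function name `multicast_volumes` is hypothetical. The erasure-coded figure n|B| log m is not computed here because the base of the logarithm is not stated.

```python
def multicast_volumes(b: float, m: int, n: int) -> dict[str, float]:
    """Communication volumes for subset-based atomic multicast.

    b: intermediate data per training node (|B|)
    m: number of consensus nodes in secure aggregation
    n: number of training nodes
    """
    subset = b / m
    first_round = n * m * subset               # n|B|
    received_per_node = n * subset             # n|B|/m
    sent_per_node = (m - 1) * received_per_node
    second_round = m * sent_per_node           # n(m-1)|B|
    return {
        "first_round": first_round,
        "received_per_node": received_per_node,
        "sent_per_node": sent_per_node,
        "second_round": second_round,
        "total": first_round + second_round,   # nm|B|
    }

# Hypothetical sizes: |B| = 1000 units, m = 4 consensus nodes, n = 10 trainers.
v = multicast_volumes(1000, 4, 10)
```

For these values the total equals n×m×|B| = 40000 units, matching the statement that splitting alone does not reduce the total volume and that the saving comes from the erasure coding step.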

[0082] In this embodiment, the erasure coding technique based on atomic multicast of model parameter subsets reduces the communication volume and complexity of the client under weak network conditions by segmenting and multicasting the intermediate data produced by local model training. This reduction does not sacrifice the security or fairness of the federated network, and it effectively improves the stability of federated learning in wide area networks and when IoT nodes participate. The method can reduce the communication volume of the federated learning network in scenarios with large-scale model data and provides error correction for data loss, ensuring the availability of federated learning in weak networks.

[0083] This application proposes a blockchain-based federated learning method. First, it proposes a Byzantine fault-tolerant aggregation system for federated learning, which effectively addresses cheating and malicious behavior by aggregation nodes in Byzantine scenarios. Second, it designs an aggregation node election algorithm for the Byzantine fault-tolerant aggregation system. This algorithm takes differentiated computing and network resources into account, performs state transitions based on the resource availability of participating nodes, and selects the elected nodes based on node state through a blockchain smart contract. Finally, to address the network transmission problem of intermediate data in large-scale federated learning under weak network conditions, it designs an atomic multicast technique that segments the model parameters, effectively reducing the network complexity and transmission volume of large-scale data transmission in the Byzantine fault-tolerant aggregation system and making the Byzantine fault-tolerant federated learning system usable in weak network environments.

[0084] This application introduces blockchain consensus principles into traditional federated learning, enabling multiple parties to supervise the parameter aggregation process and reach consensus on the results, thus ensuring the security, reliability, and trustworthiness of the intermediate parameter aggregation process in federated learning. Addressing the issue of uneven resource distribution among heterogeneous nodes, it elects aggregation nodes based on multiple dimensions, including communication, computing, and storage capabilities. Furthermore, by segmenting and multicasting intermediate data trained on local models, it enhances the stability of federated learning in wide area networks and with the participation of IoT nodes. In summary, this application can be applied to the sharing and circulation of data elements in fields such as healthcare, industry, and finance, and can be used for data value mining in differentiated resource environments while prioritizing privacy data protection. This further ensures the practical security, availability, reliability, trustworthiness, and stability of cross-organizational and cross-institutional data modeling and prediction in differentiated resource environments in healthcare, industry, and finance.

[0085] It should be understood that although the steps in the flowcharts of the embodiments described above are shown sequentially according to the arrows, these steps are not necessarily executed in the order indicated by the arrows. Unless explicitly stated herein, there is no strict order restriction on the execution of these steps, and they can be executed in other orders. Moreover, at least some steps in the flowcharts of the embodiments described above may include multiple steps or multiple stages. These steps or stages are not necessarily completed at the same time, but can be executed at different times. The execution order of these steps or stages is not necessarily sequential, but can be performed alternately or in turn with other steps or at least some of the steps or stages of other steps.

[0086] Based on the same inventive concept, this application also provides a blockchain-based federated learning system for implementing the blockchain-based federated learning method described above. The solution provided by this system is similar to the implementation described in the above method; therefore, the specific limitations of the one or more blockchain-based federated learning system embodiments provided below can be found in the limitations of the blockchain-based federated learning method described above, and will not be repeated here.

[0087] In one embodiment, the federated learning system includes multiple federated learning aggregation devices, each of which includes a blockchain node, and the blockchain node in each device constructs a trusted blockchain network.

[0088] The participating nodes in each federated learning aggregation device are used to form intermediate data for model training after completing one round of local training using the dataset;

[0089] Each participating node is used to encrypt the intermediate data and distribute it to the aggregation node;

[0090] The aggregation nodes in each federated learning aggregation device are used to aggregate the intermediate data according to the aggregation algorithm, obtain the aggregation result, and generate a summary for the aggregation result;

[0091] Each aggregation node is used to send a summary of the aggregation result to a consensus node on the blockchain. After the consensus node reaches a consensus, it stores the summary of the aggregation result on the blockchain. The consensus node is the blockchain node of the aggregation node.

[0092] Each participating node is used to obtain a summary of the aggregated results of the most recent tasks from its local blockchain node;

[0093] Each aggregation node is used to send a summary of its aggregation results to the participating nodes that need to execute the next round of federated tasks;

[0094] Each participating node is used to obtain a summary of the previous round of local aggregation results from the blockchain, compare it with the summary of the previous round of aggregation results sent back by the aggregation node, determine the previous round of aggregation results, and obtain the global model of the current round of federated learning.
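The comparison performed by a participating node in the last step can be sketched as follows. This is a hedged illustration, not the application's method: SHA-256 is assumed as the summary function, the names `digest` and `accept_aggregation` are hypothetical, and recomputing the summary from the returned result is an extra check added here for illustration.

```python
import hashlib
import hmac


def digest(aggregation_result: bytes) -> str:
    """Summary of an aggregation result (SHA-256 is an assumption)."""
    return hashlib.sha256(aggregation_result).hexdigest()


def accept_aggregation(on_chain_digest: str,
                       returned_result: bytes,
                       returned_digest: str) -> bool:
    """Accept the previous round's result only if all summaries agree.

    on_chain_digest: summary read from the node's local blockchain node
    returned_result: aggregation result sent back by the aggregation node
    returned_digest: summary sent back by the aggregation node
    """
    recomputed = digest(returned_result)
    return (hmac.compare_digest(recomputed, returned_digest)
            and hmac.compare_digest(returned_digest, on_chain_digest))
```

A participating node that gets `False` from this check would reject the returned result rather than use it as the global model for the current round.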

[0095] In one embodiment, the consensus nodes on the blockchain are used to select the block-producing node; the block-producing node packages the digest of the aggregation result into a block, signs the block with its own private key when producing it, and broadcasts the block to the other consensus nodes; after receiving the block, each consensus node verifies the signature, and if the signature verification is successful, it broadcasts the verification result to the other consensus nodes; if a consensus node receives 2f+1 messages confirming successful signature verification, the block is considered valid, and the consensus node verifies whether the aggregation result of the block-producing node is the same as its own, and if they are the same, it broadcasts the digest of the aggregation result; where f represents the number of cheating nodes and faulty nodes among the intermediate data aggregation nodes participating in federated learning; when a consensus node receives 2f+1 digests that are the same as its own, it confirms that the aggregation result is consistent with its own aggregation execution result, indicating that the block can be produced, sends a block production success message to the block-producing node, and stores the digest on disk; after receiving 2f+1 block production confirmation messages, the block-producing node stores the digest on disk.
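The two 2f+1 thresholds in this embodiment can be sketched as simple counting checks. The sketch is illustrative only: the function names are hypothetical, message transport and signature verification are omitted, and it relies on the usual Byzantine fault tolerance assumption (not stated in the text) that there are at least 3f+1 consensus nodes.

```python
from collections import Counter


def quorum(f: int) -> int:
    """Number of matching messages required when up to f nodes may misbehave."""
    return 2 * f + 1


def block_is_valid(signature_ok_votes: int, f: int) -> bool:
    """A block is treated as valid once 2f+1 nodes confirm its signature."""
    return signature_ok_votes >= quorum(f)


def can_commit(own_digest: str, received_digests: list[str], f: int) -> bool:
    """A node commits once 2f+1 received digests equal its own digest."""
    matching = Counter(received_digests)[own_digest]
    return matching >= quorum(f)
```

With f = 1, a node needs three matching digests before it sends a block production success message and stores the digest on disk.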

[0096] In one embodiment, the system further includes an aggregation node analysis module, used to determine candidate aggregation nodes, continuously perform hash operations on the current block height of the blockchain to obtain the first hash value of each node in the secure aggregation consensus; fix each first hash value on the ring, called a node; calculate a hash operation based on the unique identifier of the candidate consensus node to generate a second hash value, and allocate the second hash value on the ring, called a stub; rotate the node in a preset direction to find the nearest stub, and the stub that is hit is the aggregation node to participate in the next round of aggregation consensus.
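The ring-based election described here resembles consistent hashing and can be sketched as below. Several details are assumptions made for illustration: SHA-256 as the hash function, a 32-bit ring, increasing position as the preset direction, chained hashing of the block height as the meaning of "continuously perform hash operations", and skipping a stub that has already been hit so that each seat goes to a distinct candidate. The function name `elect` and its parameters are hypothetical.

```python
import hashlib
from bisect import bisect_left


def _h(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()


def _pos(h: bytes) -> int:
    """Map a hash to a position on a 32-bit ring (ring size is an assumption)."""
    return int.from_bytes(h[:4], "big")


def elect(block_height: int, candidate_ids: list[str], seats: int) -> list[str]:
    """Pick `seats` aggregation nodes from the candidates for the next round."""
    if not 0 < seats <= len(candidate_ids):
        raise ValueError("seats must be between 1 and the number of candidates")
    # Stubs: second hash values derived from each candidate's unique identifier.
    stubs = sorted((_pos(_h(cid.encode())), cid) for cid in candidate_ids)
    positions = [p for p, _ in stubs]
    elected: list[str] = []
    h = str(block_height).encode()
    for _ in range(seats):
        h = _h(h)  # next first hash value, chained from the block height
        # Walk in the preset direction to the nearest stub, wrapping around.
        idx = bisect_left(positions, _pos(h)) % len(stubs)
        while stubs[idx][1] in elected:  # assumption: skip stubs already hit
            idx = (idx + 1) % len(stubs)
        elected.append(stubs[idx][1])
    return elected
```

Because the only inputs are the block height and the candidate identifiers, every node that runs this sketch on the same chain state arrives at the same set of aggregation nodes.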

[0097] In one embodiment, the aggregation node analysis module is further configured to monitor the node state switching of each node using a node state machine; and to determine the nodes whose node states are task security aggregation node, strong network collaborative aggregation node, and computing power collaborative aggregation node as candidate aggregation nodes based on the node state machine.
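A node state machine of this kind can be sketched as a transition table plus a candidate filter. The states follow the node states named in this embodiment, and the transitions follow those recited in claim 4; the event names, the `step` and `candidates` functions, and the reduction of each condition (CPU load, network card speed, bandwidth, task failure) to a single event are simplifying assumptions for illustration.

```python
from enum import Enum, auto


class State(Enum):
    LOCAL_TRAINING = auto()
    TASK_SECURE_AGG = auto()       # secure aggregation node for this task
    STRONG_NETWORK_AGG = auto()    # strong network collaborative aggregation node
    STRONG_COMPUTE_AGG = auto()    # computing power collaborative aggregation node


class Event(Enum):                 # hypothetical event names
    BECAME_AGG_PARTY = auto()      # computing party or secret share processor
    RESTORED_KEY_CUSTODIAN = auto()
    CPU_LOAD_OK = auto()
    NETWORK_OK = auto()
    TASK_FAILED = auto()
    REPEATED_FAILURE_OR_DECLINE = auto()


TRANSITIONS = {
    (State.LOCAL_TRAINING, Event.BECAME_AGG_PARTY): State.TASK_SECURE_AGG,
    (State.TASK_SECURE_AGG, Event.RESTORED_KEY_CUSTODIAN): State.LOCAL_TRAINING,
    (State.TASK_SECURE_AGG, Event.CPU_LOAD_OK): State.STRONG_COMPUTE_AGG,
    (State.TASK_SECURE_AGG, Event.NETWORK_OK): State.STRONG_NETWORK_AGG,
    (State.STRONG_COMPUTE_AGG, Event.TASK_FAILED): State.TASK_SECURE_AGG,
    (State.STRONG_NETWORK_AGG, Event.TASK_FAILED): State.TASK_SECURE_AGG,
    (State.STRONG_COMPUTE_AGG, Event.REPEATED_FAILURE_OR_DECLINE): State.LOCAL_TRAINING,
    (State.STRONG_NETWORK_AGG, Event.REPEATED_FAILURE_OR_DECLINE): State.LOCAL_TRAINING,
}

CANDIDATE_STATES = frozenset(
    {State.TASK_SECURE_AGG, State.STRONG_NETWORK_AGG, State.STRONG_COMPUTE_AGG}
)


def step(state: State, event: Event) -> State:
    """Apply one event; events with no listed transition leave the state unchanged."""
    return TRANSITIONS.get((state, event), state)


def candidates(node_states: dict[str, State]) -> list[str]:
    """Nodes whose current state qualifies them as candidate aggregation nodes."""
    return sorted(n for n, s in node_states.items() if s in CANDIDATE_STATES)
```

Nodes in the local training state are excluded from the candidate list, so only nodes that are already acting as aggregation nodes of some kind can be elected.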

[0098] The modules in the aforementioned blockchain-based federated learning system can be implemented entirely or partially through software, hardware, or a combination thereof. These modules can be embedded in or independent of the processor in a computer device, or stored in the memory of the computer device as software, so that the processor can invoke and execute the corresponding operations of each module.

[0099] In one embodiment, a computer device corresponding to a federated learning device is provided, the internal structure of which may be as shown in Figure 8. The computer device includes a processor, memory, and a network interface connected via a system bus. The processor provides computing and control capabilities. The memory includes non-volatile storage media and internal memory. The non-volatile storage media stores the operating system, computer programs, and databases. The internal memory provides an environment for the operation of the operating system and computer programs stored in the non-volatile storage media. The network interface is used for communication with external terminals via a network connection.

[0100] Those skilled in the art will understand that the structure shown in Figure 8 is merely a block diagram of a portion of the structure related to the present application and does not constitute a limitation on the computer device to which the present application is applied. Specific computer devices may include more or fewer components than those shown in the figure, or combine certain components, or have different component arrangements.

[0101] Those skilled in the art will understand that all or part of the processes in the methods of the above embodiments can be implemented by a computer program instructing related hardware. The computer program can be stored in a non-volatile computer-readable storage medium. When executed, the computer program can include the processes of the embodiments of the above methods. Any references to memory, databases, or other media used in the embodiments provided in this application can include at least one of non-volatile and volatile memory. Non-volatile memory can include read-only memory (ROM), magnetic tape, floppy disk, flash memory, optical memory, high-density embedded non-volatile memory, resistive random access memory (ReRAM), magnetic random access memory (MRAM), ferroelectric random access memory (FRAM), phase change memory (PCM), graphene memory, etc. Volatile memory can include random access memory (RAM) or external cache memory, etc. By way of illustration and not limitation, RAM can take many forms, such as Static Random Access Memory (SRAM) or Dynamic Random Access Memory (DRAM). The databases involved in the embodiments provided in this application may include at least one type of relational database and non-relational database. Non-relational databases may include, but are not limited to, blockchain-based distributed databases. The processors involved in the embodiments provided in this application may be general-purpose processors, central processing units, graphics processing units, digital signal processors, programmable logic devices, quantum computing-based data processing logic devices, etc., and are not limited to these.

[0102] The technical features of the above embodiments can be combined in any way. For the sake of brevity, not all possible combinations of the technical features in the above embodiments are described. However, as long as there is no contradiction in the combination of these technical features, they should be considered to be within the scope of this specification.

[0103] The embodiments described above are merely illustrative of several implementation methods of this application, and while the descriptions are specific and detailed, they should not be construed as limiting the scope of this patent application. It should be noted that those skilled in the art can make various modifications and improvements without departing from the concept of this application, and these all fall within the protection scope of this application. Therefore, the protection scope of this application should be determined by the appended claims.

Claims

1. A blockchain-based federated learning method, characterized in that, the method is applied to a blockchain-based federated learning system, the federated learning system comprising multiple federated learning aggregation devices, each of which includes a blockchain node, and the blockchain node in each device constructs a trusted blockchain network; the method includes: after participating nodes in each federated learning aggregation device complete one round of local training using the dataset, they form intermediate data for model training; each participating node encrypts the intermediate data and distributes it to the aggregation node; the aggregation node in each federated learning aggregation device aggregates the intermediate data according to the aggregation algorithm to obtain the aggregation result and generates a summary for the aggregation result; each aggregation node sends the summary of the aggregation result to the consensus node on the blockchain, and after the consensus node reaches consensus, it stores the summary of the aggregation result on the blockchain, the consensus node being the blockchain node of the aggregation node; each participating node obtains the summary of the aggregation result of the most recent task from its local blockchain node; each aggregation node sends the summary of its own aggregation result to the participating node that needs to execute the next round of federated tasks; each participating node obtains the summary of the previous round's local aggregation result from the blockchain, compares it with the summary of the previous round's aggregation result sent back by the aggregation node, determines the previous round's aggregation result, and obtains the global model for this round of federated learning; wherein the consensus node storing the summary of the aggregation result on the blockchain after reaching consensus includes: consensus nodes on the blockchain selecting a block-producing node; the block-producing node packaging the summary of the aggregation result into a block, signing it with its own private key, and broadcasting the block to other consensus nodes; upon receiving the block, the consensus nodes verify the signature, and if the signature verification is successful, broadcast the verification result to other consensus nodes; if each consensus node receives 2f+1 messages confirming successful signature verification, the block is considered valid, and each consensus node verifies whether the aggregation result of the block-producing node is the same as its own; if they are the same, it broadcasts a summary of the aggregation result; where f represents the number of cheating and faulty nodes among the intermediate data aggregation nodes participating in federated learning; when each consensus node receives 2f+1 summaries identical to its own, it confirms that the aggregation result is consistent with its own aggregation execution result, indicating that a block can be produced, sends a block production success message to the block-producing node, and stores the summary on disk; after receiving 2f+1 confirmation messages, the block-producing node stores the summary on disk.

2. The method according to claim 1, characterized in that, determining the aggregation nodes includes: determining candidate aggregation nodes; continuously performing hash operations on the current block height of the blockchain to obtain the first hash value of each node in the secure aggregation consensus; fixing each first hash value on the ring, where it is called a node; performing a hash operation based on the unique identifier of each candidate consensus node to generate a second hash value, and allocating the second hash value on the ring, where it is called a stub; and rotating each node in the preset direction to find the nearest stub, the stub that is hit being the aggregation node that participates in the next round of aggregation consensus.

3. The method according to claim 2, characterized in that, The process of determining candidate aggregation nodes includes: Utilize a node state machine to monitor the node state transitions of each node; Based on the node state machine, nodes whose states are task-safe aggregation node, strong network collaborative aggregation node, and computing power collaborative aggregation node are identified as candidate aggregation nodes.

4. The method according to claim 3, characterized in that, using a node state machine to monitor the node state transitions includes: when homomorphic encryption is used to achieve secure aggregation and a participant acts as a computing party, the node state of the participant switches from local training node to secure aggregation node for this task; when a secret sharing scheme is used to achieve secure aggregation and a participating node acts as a secret share processing node, the node state of the participating node switches from local training node to secure aggregation node for this task; when a homomorphic encryption algorithm is used to achieve secure aggregation and a participant is restored to the key custodian, the node state of the participant switches from secure aggregation node for this task to local training node; when a node has become a secure aggregation node for this task and its CPU load does not exceed the set threshold, it switches from secure aggregation node for this task to strong computing power collaborative aggregation node; when a node has become a strong computing power collaborative aggregation node but a task fails to execute, it switches from strong computing power collaborative aggregation node to secure aggregation node for this task; when a node has become a secure aggregation node for this task, and the network card speed is gigabit or higher and the network uplink and downlink bandwidth is below the threshold, it switches from secure aggregation node for this task to strong network collaborative aggregation node; when a node has become a strong network collaborative aggregation node but a task fails to execute, it switches from strong network collaborative aggregation node to secure aggregation node for this task; when a node has become a strong network collaborative aggregation node or a strong computing power collaborative aggregation node and continues to fail in executing federated tasks, or its computing power or network capability declines, it switches from strong network collaborative aggregation node or strong computing power collaborative aggregation node to local training node.

5. The method according to any one of claims 1 to 4, characterized in that, each participating node encrypting the intermediate data and distributing it to the aggregation node includes: each participating node encrypts the intermediate data to obtain the ciphertext of the model parameters; the ciphertext of the model parameters is split into subsets of model parameters, the number of subsets being equal to the number of aggregation nodes participating in secure aggregation; each subset is sent to the aggregation nodes participating in secure aggregation via an atomic multicast system; after receiving each subset, the aggregation nodes participating in secure aggregation multicast the subset to other aggregation nodes.

6. A blockchain-based federated learning system, characterized in that, the federated learning system includes multiple federated learning aggregation devices, each of which includes a blockchain node, and the blockchain node in each device constructs a trusted blockchain network; the participating nodes in each federated learning aggregation device are used to form intermediate data for model training after completing one round of local training using the dataset; each participating node is used to encrypt the intermediate data and distribute it to the aggregation node; the aggregation nodes in each federated learning aggregation device are used to aggregate the intermediate data according to the aggregation algorithm, obtain the aggregation result, and generate a summary for the aggregation result; each aggregation node is used to send a summary of the aggregation result to a consensus node on the blockchain, and after the consensus node reaches a consensus, it stores the summary of the aggregation result on the blockchain, the consensus node being the blockchain node of the aggregation node; each participating node is used to obtain a summary of the aggregated results of the most recent tasks from its local blockchain node; each aggregation node is used to send a summary of its aggregation results to the participating nodes that need to execute the next round of federated tasks; each participating node is used to obtain a summary of the previous round of local aggregation results from the blockchain, compare it with the summary of the previous round of aggregation results sent back by the aggregation node, determine the previous round of aggregation results, and obtain the global model of the current round of federated learning; the consensus nodes on the blockchain are used to select the block-producing node; the block-producing node packages the digest of the aggregation result into a block, signs it with its own private key, and broadcasts the block to other consensus nodes; upon receiving the block, each consensus node verifies the signature, and if the signature verification is successful, it broadcasts the verification result to other consensus nodes; if each consensus node receives 2f+1 messages confirming successful signature verification, the block is considered valid, and each consensus node then verifies whether the aggregation result of the block-producing node is the same as its own, and if they are the same, it broadcasts the digest of the aggregation result; where f represents the number of cheating and faulty nodes among the intermediate data aggregation nodes participating in federated learning; when each consensus node receives 2f+1 digests identical to its own, it confirms that the aggregation result is consistent with its own aggregation execution result, indicating that a block can be produced, sends a block production success message to the block-producing node, and stores the digest on disk; the block-producing node, upon receiving 2f+1 confirmation messages, stores the digest on disk.

7. The system according to claim 6, characterized in that, the system also includes an aggregation node analysis module, used to determine candidate aggregation nodes; continuously perform hash operations on the current block height of the blockchain to obtain the first hash value of each node in the secure aggregation consensus; fix each first hash value on the ring, where it is called a node; perform a hash operation based on the unique identifier of each candidate consensus node to generate a second hash value, and allocate the second hash value on the ring, where it is called a stub; and rotate each node in the preset direction to find the nearest stub, the stub that is hit being the aggregation node that participates in the next round of aggregation consensus.

8. The system of claim 7, wherein, The aggregation node analysis module is also used to monitor the node state switching of each node using the node state machine; and to determine the nodes whose node state is a task security aggregation node, a strong network collaborative aggregation node, or a computing power collaborative aggregation node as candidate aggregation nodes based on the node state machine.