An enhanced structured federated graph learning method
By calculating client contribution scores and optimizing model parameter aggregation, the enhanced structured federated graph learning method addresses the inefficiency of federated graph learning on non-independent and identically distributed (non-IID) and imbalanced datasets, achieving high accuracy and privacy protection, responding to user forgetting requests, and improving model synchronization and collaboration.
Patent Information
- Authority / Receiving Office
- CN · China
- Patent Type
- Patents(China)
- Current Assignee / Owner
- DALIAN UNIV OF TECH
- Filing Date
- 2024-07-25
- Publication Date
- 2026-06-26
Abstract figure: CN118886482B_ABST
Abstract
Description
Technical Field
[0001] This invention belongs to the field of artificial intelligence and machine learning, and specifically relates to an enhanced structured federated graph learning method.
Background Technology
[0002] In the field of modern data science, especially in sensitive industries involving personal privacy and data protection (such as healthcare and cybersecurity), a crucial question arises: how to effectively utilize dispersed data resources while protecting user privacy. Traditional graph convolutional networks typically require centralized storage of large-scale graph data, but this practice may conflict with privacy regulations and encryption protocols. This leads to a problem known as data silos, where data is stored in isolation and cannot be shared across sectors, thus hindering scientific innovation and the effective use of resources.
[0003] To address this issue, federated graph learning emerged as a novel federated learning method specifically designed for graph neural networks. Through federated learning techniques, federated graph learning allows data from various data holders to be utilized securely and compliantly, optimizing graph convolutional neural network models. In federated graph learning, each client only shares the training parameters of its local model, not the data itself. This protects data privacy while improving the model's learning performance through collaboration among clients.
[0004] However, federated graph learning faces more challenges compared to traditional federated learning. Due to the need to process and optimize structural information among clients, federated graph learning is particularly complex in terms of parameter optimization. To address this, researchers have explored various model aggregation processes, mainly categorized into centralized aggregation and fully distributed transmission. Centralized aggregation relies on a central server to update the parameters of all clients and naturally considers the structural information among clients. In fully distributed transmission, to solve the potential communication bottleneck problem of the central server, researchers have proposed a serverless solution where each client directly exchanges model parameters with its neighbors and performs model aggregation and updates locally.
[0005] While these strategies each have their advantages and disadvantages, none has fully considered the user's "right to be forgotten." With the development of federated graph learning technology, it has become increasingly important to allow users to request the deletion of their data while protecting user privacy. To this end, this invention proposes an enhanced structured federated graph learning method. This method not only improves the efficiency of federated graph learning on non-independent and identically distributed (non-IID) and imbalanced datasets, but also, for the first time, addresses the user's right to be forgotten. Through an innovative forgetting method, the global model can quickly respond to data holders' requests for data deletion.
Summary of the Invention
[0006] To overcome the shortcomings of existing federated graph learning methods, such as poor performance on non-IID and imbalanced datasets and the neglect of the user's right to be forgotten, this invention proposes an enhanced structured federated graph learning method for handling non-IID data. This method combines federated learning and federated forgetting learning techniques, aiming to improve the accuracy of federated graph learning in various complex scenarios and to handle user-initiated forgetting requests, thus improving model synchronization and optimization in federated learning environments.
[0007] The technical solution proposed by this invention to address the above-mentioned technical problems is as follows:
[0008] An enhanced structured federated graph learning method includes:
[0009] Initialize a basic reputation score for each client, and calculate four indicators for each client: data value, effort level, risk assessment, and cost-effectiveness of investment.
[0010] The reputation score of the client before each round of aggregation is calculated based on the aforementioned indicators. The reputation score is then converted into a contribution level using a normal distribution, and the contribution level is normalized to obtain the final contribution score.
[0011] The client's local model parameters are input into a channel attention mechanism using discrete cosine transform to generate an attention vector; the attention vector is then converted into a client similarity score using entropy weighting, while ensuring that high weights are assigned to clients with excellent performance.
[0012] The server performs a weighted aggregation of global model parameters based on the client contribution score and similarity score.
[0013] Furthermore, the method also includes:
[0014] When a client makes a forgetting request, each local client's dataset is treated as a set of triples; the client dataset is divided into a forgetting set containing all the data requested to be forgotten and its complement, the retained set.
[0015] Combine hard and soft obfuscation losses to calculate the cumulative interference loss; minimize the weighted local forgetting target loss to optimize the local model; minimize the sum of weighted local forgetting target losses to update the global model; train on the retained set to restore the model's generalization ability.
[0016] Furthermore, the initialization of each client's basic reputation score and the calculation of the four indicators specifically include:
[0017] Initialize basic reputation score: For a set of federated graph learning clients N = {1, 2, 3, ..., n}, the server sets an initial basic reputation score for each client i;
[0018] Calculate data value: In the t-th iteration, the accuracy of the local model is used to quantify the data value v_i of client i;
[0019] Assess effort level: The difference between the accuracy of the local model in this round and the average accuracy of the local models is taken as the effort level e_i of client i; an accuracy higher than the average indicates that the client is performing well;
[0020] Risk assessment: The difference between the accuracy of the local model and the accuracy of the average-aggregated global model serves as the risk indicator g_i;
[0021] Calculate the cost-benefit ratio of the investment: The difference between the accuracy of the global model in the current round and the accuracy of the global model in the previous round is used as the cost-effectiveness of investment c_i of client i.
[0022] Furthermore, the contribution score is calculated as follows:
[0023] Calculate the reputation score of each client i before each round of aggregation, balancing the impact of historical information and the number of current model updates, where P_i represents the contribution cost ratio, P_i = e_i × c_i;
[0024] Using a normal distribution, the reputation score of client i in each round is divided into five different regions, and the reputation score of each client i is multiplied by the coverage area of its corresponding region to assess the contribution level ω_i;
[0025] The client contribution level integrated with the risk assessment, φ(i), is then calculated; after normalizing φ(i), we obtain the client's contribution score Φ(i).
[0026] Furthermore, the formula for calculating the client contribution level is as follows:
[0027]
[0028] Furthermore, a channel attention mechanism is employed to analyze the similarity between global model updates from different clients and to calculate client similarity scores. The specific method is as follows:
[0029] The server retrieves local model parameters from each client, denoted as P_a = [p_1a, p_2a, ..., p_na], a ∈ {1, 2, ..., m}, where m is the number of model layers; the server calculates the average value of the model parameters of each layer over all clients, and calculates the difference between the model parameters of each layer on different clients and that layer's average, forming an n-dimensional parameter vector P′_a for that layer;
[0030] P′_a is input into the channel attention mechanism to obtain the similarity vector att_a = (x_1a, x_2a, ..., x_na);
[0031] Based on the similarity vector att_a, the contribution of each client is evaluated using the entropy weight method; first, the proportionality coefficient and the entropy of each element x_ia are calculated;
[0032] The entropy weight is calculated and normalized to obtain the client similarity score ψ(i), as shown in the following formula:
[0033]
[0034] To ensure that high weights are assigned to high-performing clients, Φ(i) and ψ(i) are classified with a function h, where AVG(x) is the function for calculating the average, and the judgment flag is ||h(Φ(i)) ⊙ h(ψ(i))||. If the flag condition is met, ψ(i) is updated as follows: ψ(i) = AVG(ψ(i)) - (ψ(i) - AVG(ψ(i))).
[0035] Furthermore, in the channel attention mechanism, the compression function uses discrete cosine transform for channel compression. The input X of this mechanism has dimensions C, L, and W, which represent the number of clients, the length of the weight parameter, and the width of the weight parameter, respectively. X is divided into n equal blocks; each block x_i yields the corresponding Fra_i after discrete cosine transform, and these blocks are concatenated to form a complete compressed vector: Fra = compress(X) = cat([Fra_1, Fra_2, ..., Fra_n]). The final channel attention mechanism framework is expressed by the formula att = sigmoid(fc(Fra(X))), where fc is a linear layer.
[0036] Furthermore, the server performs a weighted aggregation of global model parameters based on the client contribution score and similarity score, using the following formula:
[0037] W(i)=Φ(i)*ξ+ψ(i)*(1-ξ)
[0038] Where ξ is the scaling factor for adjusting the ratio of Φ(i) and ψ(i), and W(i) represents the weights when aggregating the global model.
[0039] Furthermore, when a client requests to forget, the process of dividing the client's dataset into a forgotten set and a retained set is as follows:
[0040] The dataset G_i of each local client is represented as a set of triples G_i = {(e, r, h) | e, h ∈ E_i, r ∈ R_i}, where E_i is the entity set and R_i is the relation set;
[0041] After the client requests that data be forgotten, G_i is divided into a forgetting set containing all the data requested to be forgotten and its complement, the retained set. For each triple to be forgotten, negative samples are generated, where k represents the number of negative samples.
[0042] Furthermore, the process for handling forgetting requests is as follows:
[0043] Calculate the hard confusion loss. The formula is:
[0044]
[0045] Calculate the soft confusion loss. The formula is:
[0046]
[0047] Where S is the function for calculating the triplet score, S(e, r, h) = -||e + r - h||;
[0048] Calculate the cumulative interference loss by combining the losses from hard and soft obfuscation:
[0049]
[0050] Minimize the weighted local forgetting target loss to optimize the local model, so that its loss on the forgetting set is minimized while its loss on the retained set is kept low to maintain performance. The formula is as follows:
[0051]
[0052] where the two loss terms represent the loss functions used to train the local model M_i on the forgetting set and on the retained set, respectively;
[0053] Minimize the sum of the weighted local forgetting target losses to update the global model, where G represents the entire dataset.
[0054] Training is then performed on the retained set to restore the model's generalization ability.
[0055] The beneficial effects of this invention are as follows: This invention comprehensively improves the performance of federated graph learning, ensures fair evaluation of client contributions, and optimizes the collaboration process; it introduces attention mechanisms and the entropy weight method to enhance the model's focus on key features, improving model convergence and adaptability; it incorporates forgetting theory to handle data forgetting requests and mitigates the impact of specific knowledge on the model through soft and hard obfuscation methods, thus restoring model performance. Overall, this invention improves the accuracy of federated graph learning on non-independent and identically distributed (non-IID) and imbalanced datasets, enhances user privacy protection, and improves model synchronization and optimization.
Attached Figure Description
[0056] Figure 1 is a schematic diagram illustrating the enhanced structured federated graph learning of this invention.
[0057] Figure 2 is a sequence diagram of the federated graph learning and forgetting learning methods provided by this invention.
[0058] Figure 3 is a schematic diagram of the client contribution evaluation method provided by this invention.
[0059] Figure 4 is a flowchart of the client contribution evaluation algorithm provided by this invention.
[0060] Figure 5 is a schematic diagram of the parameter adjustment method provided by this invention.
[0061] Figure 6 is a flowchart of the parameter adjustment algorithm provided by this invention.
[0062] Figure 7 shows the accuracy of the global model provided by this invention on different datasets, where (a) to (d) represent the IMDB-BINARY dataset, YooChoose dataset, COLLAB dataset, and PROTEINS dataset, respectively.
[0063] Figure 8 is an experimental diagram showing the impact of the number of clients on the accuracy provided by this invention.
[0064] Figure 9 shows the accuracy of the global model provided by this invention on different forgetting sets, where (a) to (d) represent the IMDB-BINARY dataset, YooChoose dataset, COLLAB dataset, and PROTEINS dataset, respectively.
Detailed Implementation
[0065] To make the objectives, technical solutions, and advantages of the present invention clearer, the specific embodiments of the present invention will be described in detail below with reference to the accompanying drawings.
[0066] In distributed machine learning with large-scale graph data, federated graph learning can effectively solve the data silo problem. However, because the data is non-independent and identically distributed (non-IID) and the features of local models are diverse, federated graph learning faces challenges in parameter processing, neighbor information integration, and parameter update mechanism optimization. In practical applications, such as traffic prediction systems for smart cities, traffic data from each city is collected independently and is unevenly distributed. By using the enhanced structured federated graph learning method, the quality and contribution of each city's data can be evaluated. Based on this, an attention mechanism is applied to analyze the similarity between cities, thereby optimizing the performance of the global model. When a city requests the deletion of its data, the method ensures that the data is forgotten in the global model while maintaining the overall performance of the model as much as possible. This makes the method the first federated graph learning algorithm capable of responding to client requests to revoke local data models, and it provides an effective privacy protection solution while maintaining the accuracy of the global model.
[0067] As shown in Figure 1, the enhanced structured federated graph learning comprises three main components: client contribution evaluation, parameter readjustment, and knowledge forgetting learning. Figure 2 is a sequence diagram of the federated graph learning and forgetting learning methods. Specifically, the method includes the following steps:
[0068] Step 1: Initialize the basic reputation score for each client. Calculate four metrics for each client: data value, effort level, risk assessment, and investment cost-effectiveness.
[0069] Step 2: Calculate the client's reputation score before each round of aggregation based on the aforementioned metrics. The reputation score is then converted into a contribution level using a normal distribution, and the client contribution level is normalized to obtain the final contribution score Φ(i).
[0070] Step 3: Input the local model parameters of the client into the channel attention mechanism using discrete cosine transform to generate an attention vector; convert the attention vector into a client similarity score ψ(i) using the entropy weight method, while ensuring that high weights are assigned to clients with excellent performance.
[0071] Step 4: The server performs weighted aggregation of global model parameters based on the calculated client contribution score and similarity score: W(i)=Φ(i)*ξ+ψ(i)*(1-ξ), where ξ is the scaling factor for adjusting the ratio of Φ(i) and ψ(i), and W(i) represents the weight when aggregating the global model.
[0072] Step 5: When a client makes a forgetting request, the dataset G_i of each local client is treated as a set of triples (e, r, h); the client dataset G_i is divided into a forgetting set containing all the data requested to be forgotten and its complement, the retained set.
[0073] Step 6: Combine the losses from hard and soft obfuscation to calculate the cumulative interference loss; optimize the local model M_i by minimizing the weighted local forgetting target loss; minimize the sum of the weighted local forgetting target losses to update the global model M_global; train on the retained set to restore the model's generalization ability.
[0074] Figures 3 and 4 are, respectively, the schematic diagram and the flowchart of the client contribution evaluation method. The specific process is as follows:
[0075] In step 1, the specific steps for evaluating the client's contribution are as follows:
[0076] Step 1.1: Initialize the basic reputation score.
[0077] For a set of federated graph learning clients N = {1, 2, 3, ..., n}, the server sets an initial basic reputation score for each client i.
[0078] Step 1.2: Calculate the value of the data.
[0079] In the t-th iteration, the accuracy of the local model is used to quantify the data value v_i of client i.
[0080] Step 1.3: Assess effort level.
[0081] The difference between the accuracy of the local model in this round and the average accuracy of the local models is taken as the effort level e_i of each client i; an accuracy higher than the average indicates that the client is performing well.
[0082] Step 1.4: Risk assessment.
[0083] The difference between the accuracy of the local model and the accuracy of the average-aggregated global model serves as the risk indicator g_i. This helps to identify and mitigate the adverse effects of poor-quality models.
[0084] Step 1.5: Calculate the cost-benefit ratio of the investment.
[0085] The difference between the global model accuracy of the current round and the global model accuracy of the previous round is used as the cost-effectiveness of investment c_i of client i. Model performance typically improves gradually as training progresses, which usually indicates a positive return on investment.
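The four indicators of Steps 1.2 to 1.5 can be sketched in a few lines of Python. This is a minimal illustration rather than the patented implementation: the function name `client_indicators`, the argument names, and the sign convention of each difference (local minus reference, current minus previous) are assumptions, because the source formulas are not reproduced in the text.

```python
from statistics import mean

def client_indicators(local_acc, global_acc, prev_global_acc):
    """Return one (v, e, g, c) tuple per client for a single round.

    local_acc       : list of local model accuracies, one per client
    global_acc      : accuracy of the average-aggregated global model
    prev_global_acc : global model accuracy of the previous round
    """
    avg_local = mean(local_acc)
    indicators = []
    for acc in local_acc:
        v = acc                            # data value: local accuracy
        e = acc - avg_local                # effort: gap to the mean local accuracy
        g = acc - global_acc               # risk: gap to the global model (assumed sign)
        c = global_acc - prev_global_acc   # cost-effectiveness: round-over-round gain
        indicators.append((v, e, g, c))
    return indicators
```

With this convention, a client whose accuracy is above the average gets a positive effort level, matching the statement in Step 1.3.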
[0086] In step 2, the contribution score is calculated as follows:
[0087] Step 2.1: Calculate reputation score.
[0088] Combining the three metrics of data value v_i, effort level e_i, and cost-effectiveness of investment c_i, the reputation score of each client i is calculated before each round of aggregation. P_i represents the contribution cost ratio of client i, where P_i = e_i × c_i; it enters the reputation score through a difference term that balances the impact of historical information and the number of current model updates.
[0089] Step 2.2: Convert reputation score into contribution level.
[0090] The reputation score of client i in each round is converted into a contribution level using a normal distribution, which is chosen because it imposes the fewest prior assumptions on the model. The reputation scores are divided into five different regions, and the reputation score of each client i is multiplied by the coverage area of its corresponding region to assess its contribution level. For example, the probability that a variable lies within the range μ±σ is 0.682, so the corresponding weighting factor is assigned to that region. ω_i represents the contribution level of client i; the formula for calculating the client's contribution level is:
[0091]
[0092] The client contribution level integrated with the risk assessment is denoted φ(i), where g_i represents the risk assessment of client i; after normalizing φ(i), we obtain the contribution score Φ(i) of the client.
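The conversion from reputation score to contribution score can be illustrated as follows. This is a sketch under stated assumptions: the five regions are taken to be split at μ-2σ, μ-σ, μ+σ, and μ+2σ, the coverage area is the standard normal probability mass of the region, and the risk term g_i is left out because its formula is not reproduced in the text. The names `region_coverage` and `contribution_scores` are hypothetical.

```python
import math
from statistics import mean, pstdev

def normal_cdf(z):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

# Assumed region boundaries, in units of sigma around the mean.
EDGES = [-math.inf, -2.0, -1.0, 1.0, 2.0, math.inf]

def region_coverage(z):
    """Probability mass of the region that contains the z-score z."""
    for lo, hi in zip(EDGES, EDGES[1:]):
        if lo <= z < hi:
            return normal_cdf(hi) - normal_cdf(lo)
    raise ValueError("z must be a finite number")

def contribution_scores(reputation):
    """Weight each positive reputation score by its region, then normalize."""
    mu, sigma = mean(reputation), pstdev(reputation)
    levels = []
    for r in reputation:
        z = (r - mu) / sigma if sigma > 0 else 0.0
        levels.append(r * region_coverage(z))   # contribution level
    total = sum(levels)
    return [lvl / total for lvl in levels]      # normalized contribution score
```

The central region carries a mass of about 0.683, so clients whose reputation is close to the mean keep most of their score, while outliers in either tail are scaled down.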
[0093] Figures 5 and 6 are, respectively, the schematic diagram and the flowchart of the parameter adjustment method. The specific steps are as follows:
[0094] In step 3, the specific steps for calculating the client similarity score are as follows:
[0095] Step 3.1: The server retrieves the local model parameters from each client, denoted as P_a = [p_1a, p_2a, ..., p_na], a ∈ {1, 2, ..., m}, where m is the number of model layers; the server calculates the average value of the model parameters of each layer over all clients, and calculates the difference between the model parameters of each layer on different clients and that layer's average, forming an n-dimensional parameter vector P′_a for that layer.
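Step 3.1 amounts to centering each layer's parameters across clients. A minimal sketch, assuming each client's layer parameters are flattened into a list of floats of equal length (the function name `layer_deviation` is hypothetical):

```python
def layer_deviation(layer_params):
    """Return P'_a: each client's deviation from the layer-wise mean.

    layer_params[i] is the flattened parameter list of one layer on client i.
    """
    n = len(layer_params)
    width = len(layer_params[0])
    # Element-wise mean of this layer over all n clients.
    layer_mean = [sum(p[j] for p in layer_params) / n for j in range(width)]
    # One deviation vector per client, so the result has n entries.
    return [[p[j] - layer_mean[j] for j in range(width)] for p in layer_params]
```

Calling this once per layer a ∈ {1, ..., m} gives the inputs that are passed to the channel attention mechanism in Step 3.2.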
[0096] Step 3.2: P′_a is input into the channel attention mechanism to obtain the similarity vector att_a = (x_1a, x_2a, ..., x_na).
[0097] In the channel attention mechanism, the compression function uses discrete cosine transform for channel compression. The input X of this mechanism has dimensions C, L, and W, which represent the number of clients, the length of the weight parameter, and the width of the weight parameter, respectively. X is divided into n equal blocks; each block x_i yields the corresponding Fra_i after discrete cosine transform, and these blocks are concatenated to form a complete compressed vector: Fra = compress(X) = cat([Fra_1, Fra_2, ..., Fra_n]). The entire channel attention mechanism framework can be represented by the formula att = sigmoid(fc(Fra(X))), where fc is a linear layer.
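The pipeline att = sigmoid(fc(Fra(X))) can be sketched in plain Python. Several details are assumptions made for illustration: each block is a flat list, the DCT is the unnormalized type-II transform, only the first `keep` coefficients of each block are retained as Fra_i, and the linear layer's `weights` and `bias` are supplied by the caller rather than learned.

```python
import math

def dct2(x):
    """Unnormalized DCT-II of a 1-D sequence."""
    n = len(x)
    return [sum(x[t] * math.cos(math.pi * (t + 0.5) * k / n) for t in range(n))
            for k in range(n)]

def compress(blocks, keep=2):
    """Fra = cat([Fra_1, ..., Fra_n]): low-frequency DCT terms of each block."""
    fra = []
    for block in blocks:
        fra.extend(dct2(block)[:keep])
    return fra

def linear(vec, weights, bias):
    """fc: one output per row of `weights`."""
    return [sum(w * v for w, v in zip(row, vec)) + b
            for row, b in zip(weights, bias)]

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def channel_attention(blocks, weights, bias, keep=2):
    """att = sigmoid(fc(compress(X))); returns one value per fc output."""
    return [sigmoid(z) for z in linear(compress(blocks, keep), weights, bias)]
```

For example, with `keep=1` each block is reduced to its zero-frequency coefficient, which is the block sum, so the attention value depends only on the overall magnitude of each client's deviation.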
[0098] Step 3.3: Based on the similarity vector att_a, the contribution of each client is evaluated using the entropy weight method; first, the proportionality coefficient and the entropy of each element x_ia are calculated.
[0099] Step 3.4: Calculate the entropy weight and normalize it to obtain the client similarity score ψ(i), as shown in the following formula:
[0100]
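Steps 3.3 and 3.4 can be illustrated with the textbook entropy weight method. The version below is an assumption, since the source formula is not reproduced here: proportions are taken per layer, the entropy is normalized by ln(n), layers with lower entropy receive a larger weight, and ψ(i) is the weighted sum of client i's proportions. The function name `entropy_weight_scores` is hypothetical.

```python
import math

def entropy_weight_scores(att):
    """att[a][i] > 0 is the attention value of client i in layer a (n >= 2).

    Returns psi, one similarity score per client, summing to 1.
    """
    n = len(att[0])
    props, diversities = [], []
    for row in att:
        total = sum(row)
        p = [x / total for x in row]                    # proportionality coefficients
        entropy = -sum(q * math.log(q) for q in p if q > 0) / math.log(n)
        props.append(p)
        diversities.append(max(0.0, 1.0 - entropy))     # low entropy, high weight
    div_sum = sum(diversities)
    if div_sum <= 1e-12:                                # all layers uniform
        weights = [1.0 / len(att)] * len(att)
    else:
        weights = [d / div_sum for d in diversities]    # normalized entropy weights
    return [sum(w * p[i] for w, p in zip(weights, props)) for i in range(n)]
```

A layer in which all clients have the same attention value carries no information and gets a weight near zero, so ψ(i) is driven by the layers that actually separate the clients.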
[0101] Step 3.5: To ensure that high weights are assigned to high-performing clients, Φ(i) and ψ(i) are classified with a function h, and the judgment flag is ||h(Φ(i)) ⊙ h(ψ(i))||. If the flag condition is met, ψ(i) is updated as follows: ψ(i) = AVG(ψ(i)) - (ψ(i) - AVG(ψ(i))).
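The update ψ(i) = AVG(ψ) - (ψ(i) - AVG(ψ)) is a reflection about the mean, equivalent to ψ(i) = 2·AVG(ψ) - ψ(i): a score below the average moves the same distance above it, and vice versa. The sketch below shows only this arithmetic. The `flip` mask stands in for the flag condition, which is not reproduced in the text, so it is a hypothetical input supplied by the caller.

```python
def reflect_about_mean(psi, flip):
    """Apply psi(i) = AVG(psi) - (psi(i) - AVG(psi)) where flip[i] is True."""
    avg = sum(psi) / len(psi)
    return [avg - (p - avg) if f else p for p, f in zip(psi, flip)]
```

For instance, with ψ = [0.2, 0.3, 0.5] the average is 1/3, so flipping only the first client raises its score from 0.2 to about 0.467.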
[0102] Step 4: The server performs weighted aggregation of global model parameters based on the calculated client contribution score and similarity score: W(i)=Φ(i)*ξ+ψ(i)*(1-ξ), where ξ is the scaling factor for adjusting the ratio of Φ(i) and ψ(i), and W(i) represents the weight when aggregating the global model.
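The aggregation weight of Step 4 follows directly from the formula W(i) = Φ(i)*ξ + ψ(i)*(1-ξ). How the weights are applied to the parameters is not spelled out in the text, so the weighted average in `aggregate` (including the division by the weight sum) is an assumption, and both function names are hypothetical.

```python
def aggregation_weights(phi, psi, xi=0.5):
    """W(i) = phi(i) * xi + psi(i) * (1 - xi) for every client i."""
    return [p * xi + s * (1.0 - xi) for p, s in zip(phi, psi)]

def aggregate(client_params, weights):
    """Weighted average of flattened client parameter lists (assumed form)."""
    total = sum(weights)
    width = len(client_params[0])
    return [sum(w * p[j] for w, p in zip(weights, client_params)) / total
            for j in range(width)]
```

Setting ξ = 1 makes the aggregation depend only on the contribution score Φ(i), and ξ = 0 only on the similarity score ψ(i).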
[0103] To demonstrate the effectiveness, the accuracy of the global model obtained using the method of this invention is compared with that of global models obtained using other comparative algorithms. As shown in Figure 7, the method of this invention is significantly superior to the other comparative algorithms in classification tasks. After approximately 30 rounds, this method consistently maintains a high level of accuracy, whereas the other algorithms exhibit lower accuracy and slower convergence. The success of this invention can be attributed to its design, which fully considers the accuracy of the local model on each client and effectively compensates for the poorly scored model parameter values uploaded by clients during the aggregation of the global model. The YooChoose dataset contains a large number of data samples, and the method of this invention consistently demonstrates superior performance compared to the other algorithms. Notably, in the absence of real-world graph data, this method shows a significant accuracy advantage when applied to datasets such as PROTEINS and IMDB-BINARY, even with a small number of data samples. When evaluated on the COLLAB dataset, this method outperforms the FedAvg and FedGCN algorithms by 11.57% and 9.67%, respectively.
[0104] To verify whether the method of this invention can maintain a relatively stable accuracy when the number of clients increases, its performance was compared with that of existing benchmark algorithms under different numbers of clients. As shown in Figure 8, the method of this invention excels in processing heterogeneous data due to its flexibility and adaptability in dynamically changing data environments. Through contribution score calculation and similarity analysis, the method can more accurately understand and utilize the unique characteristics of each client, thereby making the global model more powerful and effectively addressing the challenges of various data distributions.
[0105] The specific steps of step 5 are as follows:
[0106] Step 5.1: The dataset G_i of each local client can be represented as a set of triples G_i = {(e, r, h) | e, h ∈ E_i, r ∈ R_i}, where E_i is the entity set and R_i is the relation set.
[0107] Step 5.2: After the client requests that data be forgotten, G_i is divided into a forgetting set containing all the data requested to be forgotten and its complement, the retained set. For each triple to be forgotten, negative samples are generated.
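Step 5 can be sketched with Python sets. The split itself follows the text; the negative sampling strategy is an assumption, since the constraint on the generated samples is not reproduced here. In this sketch the last element of the triple is replaced by another entity such that the corrupted triple does not occur in G_i. The names `split_dataset` and `negative_samples` are hypothetical.

```python
import random

def split_dataset(graph, forget_request):
    """Split G_i into the forgetting set and its complement, the retained set."""
    forget = set(graph) & set(forget_request)
    retained = set(graph) - forget
    return forget, retained

def negative_samples(triple, graph, entities, k, seed=0):
    """Return up to k corrupted copies of `triple` that are not in `graph`."""
    e, r, _ = triple
    # Candidate replacements for the last element (assumed corruption scheme).
    candidates = sorted(t for t in entities if (e, r, t) not in graph)
    rng = random.Random(seed)   # fixed seed keeps the sketch reproducible
    chosen = rng.sample(candidates, min(k, len(candidates)))
    return [(e, r, t) for t in chosen]
```

Usage: `forget, retained = split_dataset(G_i, requested)` followed by `negative_samples(t, G_i, E_i, k)` for each `t` in `forget`.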
[0108] The specific steps of step 6 are as follows:
[0109] Step 6.1: Calculate the hard confusion loss, using the following formula:
[0110]
[0111] The formula for calculating soft confusion loss is:
[0112]
[0113] Where S is the function for calculating the triplet score, S(e, r, h) = -||e + r - h||.
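The triplet score is read here as a translation-based distance, S(e, r, h) = -||e + r - h||, which is an interpretation of the source formula. A minimal sketch, assuming embeddings are equal-length lists of floats and the norm is Euclidean (the choice of norm is an assumption):

```python
import math

def triple_score(e, r, h):
    """S(e, r, h) = -||e + r - h|| with the Euclidean norm (assumed)."""
    return -math.sqrt(sum((ei + ri - hi) ** 2 for ei, ri, hi in zip(e, r, h)))
```

A triple whose embeddings satisfy e + r = h reaches the maximum score of 0, and the score becomes more negative as the translated entity e + r moves away from h.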
[0114] Step 6.2: Combine the losses from hard and soft obfuscation to calculate the cumulative interference loss:
[0115]
[0116] Step 6.3: Optimize the local model so that its loss on the forgetting set is minimized while its loss on the retained set is kept low to maintain performance; then minimize the sum of the weighted local forgetting target losses to update the global model, where G represents the entire dataset.
[0117] Step 6.4: Training is performed on the retained set to restore the model's generalization ability.
[0118] As the proportion of forgotten data increases, the model's accuracy inevitably suffers. To examine this effect, the forgetting algorithm was evaluated on four different datasets, where it exhibits superior performance. Both RapidRetraining and FedRetraining re-initialize and retrain the model, ensuring that the impact of the forgotten data on the federated learning (FL) model is completely removed. Compared to these two baselines, the proposed forgetting algorithm integrates backtracking interference and passive decay, employing a hybrid approach of soft and hard obfuscation. Specific knowledge is removed from the local client and the change is passed to the global model, which subsequently undergoes a correction operation based on the updated contribution score set. As shown in Figure 9, averaged over the four datasets, the accuracy of this algorithm on the forgotten dataset is 13.42% lower than that of RapidRetraining and 14.39% lower than that of FedRetraining, which indicates that the algorithm achieves a better forgetting effect.
[0119] The enhanced structured federated graph learning method described in this invention comprises three main parts: client contribution evaluation, parameter tuning, and knowledge forgetting. Client contribution evaluation utilizes reputation theory to assess client contributions. Parameter tuning employs an attention mechanism to adjust key parameters, reducing aggregation errors and optimizing global model performance. Knowledge forgetting handles data forgetting requests by combining soft and hard obfuscation losses, ensuring that specific knowledge is forgotten without affecting overall performance. In summary, this invention not only enhances model synchronization and optimization but also provides an effective solution for privacy protection in federated learning environments.
[0120] Finally, it should be noted that the above embodiments are intended to illustrate the technical solutions of the present invention and do not constitute any limitation on the present invention. Those skilled in the art should fully understand that modifications to the technical solutions described in the foregoing embodiments or equivalent substitutions for any part or all of the technical features are entirely feasible. Such modifications or substitutions, as long as they do not depart from the scope of protection defined by the claims of the present invention, should be considered reasonable extensions of the present invention.
Claims
1. An enhanced structured federated graph learning method, characterized in that it includes: initializing a basic reputation score for each client, and calculating four indicators for each client: data value, effort level, risk assessment, and cost-effectiveness of investment; calculating the reputation score of the client before each round of aggregation based on the aforementioned indicators, converting the reputation score into a contribution level using a normal distribution, and normalizing the contribution level to obtain the final contribution score; inputting the client's local model parameters into a channel attention mechanism using discrete cosine transform to generate an attention vector, and converting the attention vector into a client similarity score using the entropy weight method, while ensuring that high weights are allocated to high-performing clients, specifically: the server retrieves local model parameters from each client, denoted as P_a = [p_1a, p_2a, ..., p_na], a ∈ {1, 2, ..., m}, where m is the number of model layers; the server calculates the average value of the model parameters of each layer over all clients, and then calculates the difference between the model parameters of each layer on different clients and that layer's average, forming an n-dimensional parameter vector P′_a for that layer; P′_a is input into the channel attention mechanism to obtain the similarity vector att_a = (x_1a, x_2a, ..., x_na); in the channel attention mechanism, the compression function uses discrete cosine transform for channel compression, and the input X of the mechanism has dimensions C, L, and W, which represent the number of clients, the length of the weight parameters, and the width of the weight parameters, respectively; X is divided into n equal blocks, each block x_i yields the corresponding Fra_i after discrete cosine transform, and these blocks are concatenated to form a complete compressed vector Fra = compress(X) = cat([Fra_1, Fra_2, ..., Fra_n]); the framework of the entire channel attention mechanism is expressed by the formula att = sigmoid(fc(Fra(X))), where fc is a linear layer; based on the similarity vector att_a, the contribution of each client is evaluated using the entropy weight method, first calculating the proportionality coefficient and the entropy of each element x_ia; the entropy weights are calculated and normalized to obtain the client similarity score ψ(i); to ensure that high weights are allocated to high-performing clients, Φ(i) and ψ(i) are classified with a function h, where AVG(x) is the function for calculating the average value, and the judgment flag is ||h(Φ(i)) ⊙ h(ψ(i))||; if the flag condition is met, ψ(i) is updated as ψ(i) = AVG(ψ(i)) - (ψ(i) - AVG(ψ(i))); the server performs a weighted aggregation of global model parameters based on the client contribution score and similarity score; when a client makes a forgetting request, each local client's dataset is treated as a set of triples, and the client dataset is divided into a forgetting set containing all the data requested to be forgotten and its complement, the retained set; the hard and soft obfuscation losses are combined to calculate the cumulative interference loss; the weighted local forgetting target loss is minimized to optimize the local model; the sum of the weighted local forgetting target losses is minimized to update the global model; and training is performed on the retained set to restore the model's generalization ability.
2. The enhanced structured federated graph learning method according to claim 1, characterized in that the initialization of each client's basic reputation score and the calculation of the four indicators specifically comprise:
for a set of federated graph learning clients ..., the server sets an initial basic reputation score ... for each client, where ... denotes the initial basic reputation score of the i-th client;
in each iteration ..., the accuracy ... of the local model is used to quantify the data value ... of client ...;
the difference between the accuracy of the client's local model in this round and the average accuracy of the client models is used as the client's effort level ...;
the difference between ... and the accuracy ... of the average-aggregated global model is used as the risk indicator ...;
the difference between the global model accuracy ... of the current round and the global model accuracy ... of the previous round is used as the client's cost-effectiveness of investment ... .
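As a rough illustration of the four indicators in claim 2, the sketch below computes them from accuracy values. The function name `client_indicators` and its parameter names are hypothetical. The risk indicator is assumed here to be the client's local accuracy minus the accuracy of the average-aggregated global model; the claim text does not preserve the first operand or the sign convention, so that line in particular is a guess.

```python
def client_indicators(local_acc, avg_global_acc, global_acc, prev_global_acc):
    """Return one dict of indicators per client.

    local_acc:       local model accuracy of each client in this round
    avg_global_acc:  accuracy of the average-aggregated global model
    global_acc:      global model accuracy in the current round
    prev_global_acc: global model accuracy in the previous round
    """
    mean_local = sum(local_acc) / len(local_acc)
    # Same value for every client: improvement of the global model over the last round.
    cost_effectiveness = global_acc - prev_global_acc
    return [
        {
            "data_value": acc,                      # local accuracy itself
            "effort": acc - mean_local,             # gap to the average client accuracy
            "risk": acc - avg_global_acc,           # ASSUMPTION: operand and sign not in the source
            "cost_effectiveness": cost_effectiveness,
        }
        for acc in local_acc
    ]


indicators = client_indicators(
    local_acc=[0.5, 0.75, 1.0],
    avg_global_acc=0.5,
    global_acc=0.75,
    prev_global_acc=0.5,
)
```

With these toy accuracies, the second client sits exactly at the client average, so its effort level is zero.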
3. The enhanced structured federated graph learning method according to claim 2, characterized in that the contribution score is calculated as follows:
before each round of aggregation, the reputation score ... of each client is calculated so as to balance the influence of historical information and the number of current model updates: ..., where ... denotes the contribution cost ratio ...;
using the normal distribution, the per-round reputation scores of client i are divided into five different regions, and the reputation score ... of each client i is multiplied by the coverage area of its corresponding region to assess its contribution level ...;
the formula for calculating the client contribution level with the risk assessment integrated is as follows: ...;
... is normalized to obtain the client's contribution score ... .
4. The enhanced structured federated graph learning method according to claim 3, characterized in that the formula for calculating the client's contribution level is as follows: ... .
5. The enhanced structured federated graph learning method according to claim 3, characterized in that the server performs the weighted aggregation of the global model parameters based on the client contribution score and the client similarity score using the following formula: ..., where ... is the proportionality coefficient that adjusts the relative proportion of ... and ..., and ... denotes the weights used when aggregating the global model.
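One possible reading of the weighted aggregation in claim 5 is sketched below. The claim's formula is not reproduced in this text, so the convex combination `lam * contribution + (1 - lam) * similarity`, the renormalization of the weights, and the names `aggregation_weights`, `aggregate`, and `lam` are all assumptions made for illustration only.

```python
def aggregation_weights(contribution, similarity, lam):
    """Blend contribution and similarity scores into per-client aggregation weights.

    ASSUMPTION: a convex combination controlled by lam in [0, 1], renormalized to sum to 1.
    """
    if not 0.0 <= lam <= 1.0:
        raise ValueError("lam must lie in [0, 1]")
    raw = [lam * c + (1.0 - lam) * s for c, s in zip(contribution, similarity)]
    total = sum(raw)
    return [r / total for r in raw]


def aggregate(client_params, weights):
    """Weighted sum of flattened client parameter vectors."""
    size = len(client_params[0])
    return [sum(w * p[j] for w, p in zip(weights, client_params)) for j in range(size)]


weights = aggregation_weights(contribution=[0.5, 0.5], similarity=[0.25, 0.75], lam=0.5)
global_params = aggregate([[0.0, 0.0], [8.0, 16.0]], weights)
```

Setting `lam` closer to 1 would let the contribution score dominate the weights, and closer to 0 the similarity score, which matches the role the claim assigns to the proportionality coefficient.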
6. The enhanced structured federated graph learning method according to claim 1, characterized in that, when a client makes a forgetting request, the client's dataset is divided into a forgetting set and a retained set as follows:
the dataset ... of each local client is represented as a set of triples ..., where ... is the entity set and ... is the relation set;
after the client requests that data be forgotten, ... is divided into the forgetting set ..., which contains all data requested to be forgotten, and its complement, the retained set ...;
for each triple ... to be forgotten, negative samples ... are generated, ensuring that ..., where ... denotes the number of negative samples.
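The partition and negative-sampling steps of claim 6 might look like the following sketch. The names `split_forget_retain` and `negative_samples` are hypothetical. The condition that negative samples must satisfy is not preserved in this text; the sketch assumes they are produced by replacing the head or tail entity and must not already appear in the client's dataset.

```python
import random


def split_forget_retain(dataset, forget_requests):
    """Partition a set of (head, relation, tail) triples into forgetting and retained sets."""
    forget = set(forget_requests) & set(dataset)
    retain = set(dataset) - forget
    return forget, retain


def negative_samples(triple, dataset, entities, k, seed=0):
    """Draw k corrupted versions of a triple.

    ASSUMPTION: negatives replace the head or the tail entity and are excluded
    if they already occur in the dataset.
    """
    head, rel, tail = triple
    known = set(dataset)
    candidates = [(e, rel, tail) for e in sorted(entities) if e != head]
    candidates += [(head, rel, e) for e in sorted(entities) if e != tail]
    candidates = [c for c in candidates if c not in known]
    if len(candidates) < k:
        raise ValueError("not enough distinct negative candidates")
    return random.Random(seed).sample(candidates, k)


dataset = [("a", "likes", "b"), ("b", "likes", "c"), ("a", "knows", "c")]
entities = {"a", "b", "c"}
forget_set, retain_set = split_forget_retain(dataset, [("a", "likes", "b")])
negatives = negative_samples(("a", "likes", "b"), dataset, entities, k=2)
```

Enumerating the candidates before sampling avoids a retry loop that could fail to terminate on a small entity set.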
7. The enhanced structured federated graph learning method according to claim 6, characterized in that the forgetting request is handled as follows:
the hard confusion loss ... is calculated by the formula: ...;
the soft confusion loss ... is calculated by the formula: ..., where ... is the function for calculating triple scores, ...;
the cumulative interference loss is calculated by combining the hard and soft confusion losses: ...;
the weighted local forgetting target loss is minimized to optimize the local model, adapting it to the forgetting set while maintaining its performance on the retained set, by the following formula: ..., where ... denotes the loss of the local model ... trained on the dataset ..., and ... denotes the loss of the local model ... trained on the dataset ...;
the sum of the weighted local forgetting target losses is minimized to update the global model: ..., where G denotes the entire dataset;
the model is then trained on the retained set ... to restore its generalization ability.