A rumor propagation prediction method across domains based on transfer learning

By constructing a user topic network based on transfer learning, extracting features using Node2Vec and BERT models, and combining the G-GCN model and evolutionary game theory, the problems of data sparsity and feature complexity in rumor propagation in social networks are solved, achieving highly accurate prediction of rumor propagation.

CN115511181B | Active | Publication Date: 2026-06-26 | CHONGQING UNIV OF POSTS & TELECOMM

Patent Information

Authority / Receiving Office
CN · China
Patent Type
Patents(China)
Current Assignee / Owner
CHONGQING UNIV OF POSTS & TELECOMM
Filing Date
2022-09-27
Publication Date
2026-06-26

AI Technical Summary

Technical Problem

Rumors spread rapidly and have a wide impact on social networks. Existing models suffer from insufficient predictive ability due to the scarcity of effective data in the early stages of rumor topics, the complexity and diversity of feature spaces, and the coexistence and competition between rumors and debunking messages.

Method used

This study employs a transfer learning-based approach to construct a user topic network by acquiring topic data. Feature vectors are extracted using the Node2Vec algorithm and the BERT model. Evolutionary game theory is then used to quantify the spread of rumors and debunking information. The G-GCN model is trained and adapted through transfer learning to predict user dissemination behavior.

Benefits of technology

It improves the accuracy of rumor topic prediction, solves the problems of data sparsity and cross-domain adaptation, and enhances the model's ability to predict rumor spread on social networks.

✦ Generated by Eureka AI based on patent content.


Abstract

The application belongs to the field of social network modeling and information dissemination, and particularly relates to a cross-domain rumor propagation prediction method based on transfer learning. The method comprises the following steps: obtaining topic data and constructing a user topic network; defining a random walk strategy and using the Node2Vec algorithm to represent the user topic network as a feature vector matrix; using the BERT model to represent the topic text information and obtain a text feature vector of the topic; using evolutionary game theory to measure the dissemination influence of rumor information and of debunking information; training a prediction model on the feature vector matrix, the text feature vector, and the two dissemination influences, and adjusting the model parameters using a new model loss function; and using transfer learning to correct the prediction model, then using the corrected model to predict user behavior in spreading rumor topics. The method has high prediction accuracy, helps suppress rumor information or spread debunking information, and is highly practical.

Description

Technical Field

[0001] This invention belongs to the field of social network modeling and information dissemination, specifically involving a cross-domain rumor propagation prediction method based on transfer learning.

Background Technology

[0002] In recent years, social networks have developed rapidly and drastically changed people's lifestyles. People can express their opinions, record their lives, and access information and knowledge through more diverse channels. At the same time, because this information is disseminated without the verification applied by traditional media, rumors have become more widespread. Rumors are statements or interpretations that spread without verification and have a certain social impact. They generally present false information through fabrication, subjective concoction, misrepresentation, or historical distortion. Compared with rumors in everyday society, rumors on social networks spread rapidly, have a wide impact, and involve numerous uncontrollable factors. Their large-scale spread not only degrades people's social network experience but also causes misunderstandings or triggers negative emotions among the public, and can even increase the likelihood of cybercrime and affect social stability. Therefore, using technology to predict and intervene in the spread of rumors on social networks is crucial for guiding public opinion correctly, identifying and resolving social problems in a timely manner, and creating a healthy and positive public opinion environment, and it has significant application value and social significance.

[0003] Many scholars have conducted extensive research on topic propagation prediction models and achieved considerable results, but some problems still exist:

[0004] 1. Sparse effective data in the early stages of rumor topics. When a rumor first breaks out, the volume of data is huge, but most of it is invalid data that contributes nothing to modeling dissemination. Too little of the data is truly helpful to the model, which significantly degrades the model's input.

[0005] 2. The feature space of rumor topics is complex and diverse. The feature space of rumor topics needs to take into account many influencing factors such as users' past behavior, users' network topology, and the characteristics of the rumors themselves, which brings difficulties to feature compression and effective expression.

[0006] 3. The coexistence and competition between rumors and debunking information. During the spread of rumors, user behavior is influenced by the outcome of the interaction between these two factors. Quantifying their influence on rumor topics is a crucial aspect of improving model predictive capabilities.

Summary of the Invention

[0007] To address the shortcomings of existing technologies, this invention proposes a cross-domain rumor propagation prediction method based on transfer learning, which includes:

[0008] S1: Acquire topic data and construct a user topic network based on the topic data. The topic data includes topic dissemination information, user basic information, and topic text information.

[0009] S2: Extract relevant attributes from topic data, including user attributes, user's historical forwarding rate, and user interaction level;

[0010] S3: Define the random walk strategy based on the relevant attributes;

[0011] S4: Based on the random walk strategy, the Node2Vec algorithm is used to represent the user topic network as a low-rank dense feature vector matrix;

[0012] S5: Use the BERT model to represent the topic text information and obtain the topic text feature vector;

[0013] S6: Use evolutionary game theory to measure the spread and impact of rumors and debunking information;

[0014] S7: Train the G-GCN-based rumor-debunking information dissemination group behavior prediction model based on the feature vector matrix, text feature vector, influence of rumor information dissemination, and influence of debunking information dissemination, and adjust the model parameters using the new model loss function to obtain the trained G-GCN-based rumor-debunking information dissemination group behavior prediction model.

[0015] S8: Use transfer learning to revise the trained G-GCN-based rumor-debunking information dissemination group behavior prediction model to obtain the final G-GCN-based rumor-debunking information dissemination group behavior prediction model, and use this model to predict user behavior in spreading rumor topics.

[0016] The preferred formula for extracting a user's historical forwarding rate is:

[0017]

[0018] Where $Rate(u_i)$ represents the historical forwarding rate of user $u_i$, $browser(u_i)$ represents the number of Weibo posts viewed by user $u_i$, and $forward(u_i)$ represents the number of Weibo posts retweeted by user $u_i$.

[0019] The preferred formula for extracting user interaction is:

[0020]

[0021] Where $intract(u_i, u_j)$ represents the interaction degree of user $u_i$ toward user $u_j$; $I_{i,j}$ indicates whether user $u_i$ follows user $u_j$; $K$ represents the total number of Weibo posts of user $u_i$; $b$ represents the behavior category; $t$ represents the time of the current trending topic; $t_k$ represents the time at which user $u_i$ published the $k$-th Weibo post; and the term subscripted $kb$ indicates whether user $u_i$ performed behavior $b$ toward user $u_j$ on the $k$-th Weibo post.

[0022] Preferably, the process of defining a random walk strategy includes: calculating the similarity between the current node and the next node in the user's topic network based on the user's own attributes; calculating the edge weights between the current node and the next node in the user's topic network based on the user's interaction degree; and defining a random walk strategy based on the similarity and edge weights.

[0023] Furthermore, the random walk strategy is expressed as:

[0024]

[0025] Where $P(c_i = x \mid c_{i-1} = w)$ represents the probability that the walk moves from node $w$ to the next node $x$, $\alpha_{p,q}(w, x)$ represents the weight adjustment parameter between node $w$ and node $x$, and $\beta(w, x)$ represents the similarity between node $w$ and node $x$. The expression also contains the edge weight between node $w$ and node $x$ and a normalization constant $z$.

[0026] Preferably, the process of measuring the dissemination influence of rumors and debunking information includes: calculating internal influencing factors based on user attributes and historical forwarding rates; calculating external influencing factors based on user interaction and topic text corpus; constructing influence functions for rumors and debunking information based on internal and external influencing factors; defining payoff functions for rumors and debunking information using evolutionary game theory based on their influence functions; and calculating the dissemination influence of rumors and debunking information based on their payoff functions.

[0027] Furthermore, the formula for calculating the dissemination impact of rumors and debunking information is as follows:

[0028]

[0029]

[0030] Where $MutualInf_{rumor}(u_i)$ represents the dissemination influence of rumor information on user $u_i$, $MutualInf_{anti\text{-}rumor}(u_i)$ represents the dissemination influence of debunking information on user $u_i$, $Ben_{rumor}(u_i)$ represents the payoff that user $u_i$ obtains from rumor information, and $Ben_{anti\text{-}rumor}(u_i)$ represents the payoff that user $u_i$ obtains from debunking information.

[0031] The preferred prediction model for rumor-debunking information dissemination group behavior based on G-GCN is expressed as follows:

[0032]

[0033] Where $Z$ represents the probability values of the different node categories, $X$ represents the topic network feature input matrix, and $A$ represents the adjacency matrix after completion with user interaction behavior. The expression also contains an intermediate parameter; $ReLU()$ represents the activation function, and $w_i$ is the weight matrix corresponding to the $i$-th layer of the graph convolutional network.

[0034] Preferably, the formula for the loss function of the new model is:

[0035] $L = -\theta_D \, y \log p(Y=0 \mid X) - \theta_R \, y \log p(Y=1 \mid X) - \theta_A \, y \log p(Y=-1 \mid X)$

[0036]

[0037]

[0038]

[0039] Where $\theta_D$ represents the first weight, $y$ represents the indicator variable, $Y$ represents the output category, $X$ represents the topic network feature input matrix, $\theta_R$ represents the second weight, $p(Y=i \mid X)$ represents the predicted probability that the output category $Y$ belongs to label $i$, $\theta_A$ represents the third weight, $D$ represents the number of users in the training set who do not forward messages, $R$ represents the number of users in the training set who forward rumor messages, and $A$ represents the number of users in the training set who forward debunking messages.

[0040] The beneficial effects of this invention are as follows:

[0041] 1. This invention addresses the problem of sparse effective data for rumors by utilizing transfer learning to select limited data from a source domain with richer data to compensate for the model input. Simultaneously, considering the differences in the feature spaces between the source and target domains, transfer learning is used to achieve domain adaptation. Based on transfer learning, the problems of sparse effective data and cross-domain adaptation for rumor topics are solved.

[0042] 2. To address the high dimensionality and complexity of the feature space of rumor topics, this invention uses the encoding capability of the BERT model to represent rumor text data. At the same time, considering the complexity of social network topology, the Node2Vec algorithm is used to extract the user network topology. The resulting BERT-Node2Vec method represents the feature space of rumor topics more completely and accurately while keeping the vector dimensionality low.

[0043] 3. Considering the impact of rumors and debunking messages on the spread of rumor topics, evolutionary game theory is introduced to quantify their influence. Furthermore, combining the characteristics of social network topology, a prediction model for the group behavior of rumor-debunking information dissemination based on G-GCN is proposed, demonstrating high accuracy in prediction results.

Attached Figure Description

[0044] Figure 1 is a schematic diagram of the cross-domain rumor propagation prediction method based on transfer learning in this invention.

[0045] Figure 2 is a schematic diagram illustrating the use of the Node2Vec algorithm and the BERT model to represent hidden information in this invention.

[0046] Figure 3 is a schematic diagram of the domain adaptation of the model.

Detailed Implementation

[0047] The technical solutions of the embodiments of the present invention will be clearly and completely described below with reference to the accompanying drawings. Obviously, the described embodiments are only some embodiments of the present invention, and not all embodiments. Based on the embodiments of the present invention, all other embodiments obtained by those skilled in the art without creative effort are within the scope of protection of the present invention.

[0048] This invention proposes a cross-domain rumor propagation prediction method based on transfer learning. As shown in Figure 1, the method includes the following:

[0049] S1: Acquire topic data and construct a user topic network based on the topic data. The topic data includes topic dissemination information, basic user information, and topic text information.

[0050] Acquire topic data (user behavior data) online. Data can be obtained from publicly available data websites or through mature social network public APIs. The required data includes information on all participants and the topic itself throughout its lifecycle. Topic data includes topic dissemination information, basic user information, and topic text information. Topic dissemination information includes the time when the topic was forwarded, commented on, viewed, and liked by users, as well as the topic's publication time. Basic user information includes relationship information between users (including following and being followed) and user attribute information (user nickname, registration information, number of followers, number of friends, etc.). Topic text information includes the topic content and user responses such as forwarding and commenting.

[0051] Based on the topic data, define the topic user groups and the directed edge network of users, and construct the user topic network, which can be represented as:

[0052]

[0053] Where the network symbol denotes the user relationship network of the topic, $U_t$ represents the users who participated in the trending topic within the time period $t$, and $E_u$ represents the set of relationship edges between the users of the topic, where each edge represents a following or followed-by relationship between users.
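As an illustration of the network construction described above, the following is a minimal sketch, not the patented implementation. It assumes that the participants are given as user IDs and that following relationships are given as (follower, followee) pairs; the function name `build_user_topic_network` and the sample data are hypothetical.

```python
from collections import defaultdict


def build_user_topic_network(participants, follow_pairs):
    """Return (users, adjacency) for a directed user topic network.

    participants: user ids active in the topic during the time period t.
    follow_pairs: (follower, followee) tuples; hypothetical input format.
    """
    users = set(participants)
    edges = defaultdict(set)
    for follower, followee in follow_pairs:
        # Keep an edge only when both endpoints took part in the topic.
        if follower in users and followee in users and follower != followee:
            edges[follower].add(followee)
    return users, {u: sorted(edges[u]) for u in sorted(users)}


users, network = build_user_topic_network(
    ["u1", "u2", "u3"],
    [("u1", "u2"), ("u2", "u3"), ("u3", "u1"), ("u1", "u9")],
)
```

The pair `("u1", "u9")` is dropped in this sketch because `u9` did not participate in the topic during the period.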

[0054] S2: Extract relevant attributes (rumor topic features) based on topic data. Relevant attributes include user attributes, user's historical forwarding rate, and user interaction level.

[0055] User attributes include personal information registered by the user. Examples include the user's nickname, registration information (including gender and location), number of followers, and number of friends. This personal information is closely related to whether the user participates in the spread of this rumor. Therefore, user attributes are defined as follows:

[0056] $User(u_i) = [information(u_i),\ fans(u_i),\ follow(u_i)]$

[0057] Where $User(u_i)$ represents the attributes of user $u_i$, $information(u_i)$ represents the user's basic information (user nickname, registration information, etc.), $fans(u_i)$ represents the number of followers of user $u_i$, and $follow(u_i)$ represents the number of the user's friends (the number of users who follow each other).

[0058] A user's historical forwarding rate can reflect their activity level in a topic to some extent and is helpful in predicting user behavior. This invention defines the historical forwarding rate as:

[0059]

[0060] Where $Rate(u_i)$ represents the historical forwarding rate of user $u_i$, $browser(u_i)$ represents the number of Weibo posts viewed by user $u_i$, and $forward(u_i)$ represents the number of Weibo posts retweeted by user $u_i$.
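The formula itself is not reproduced in this text. The sketch below therefore assumes the simplest form consistent with the two quantities defined above, namely the number of retweeted posts divided by the number of viewed posts; this ratio and the function name are assumptions, not the patent's stated formula.

```python
def historical_forwarding_rate(forward_count, browse_count):
    """Assumed form: retweeted posts divided by viewed posts.

    Returns 0.0 when the user has viewed nothing, to avoid dividing by zero.
    """
    if browse_count <= 0:
        return 0.0
    return forward_count / browse_count


rate = historical_forwarding_rate(5, 20)
```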

[0061] Whether a user will forward a rumor is influenced by their own activity level and the level of interaction among users. A user's historical forwarding rate represents their own activity level, while the level of interaction among users is represented by:

[0062]

[0063] Where $intract(u_i, u_j)$ represents the interaction degree of user $u_i$ toward user $u_j$; $I_{i,j}$ indicates whether user $u_i$ follows user $u_j$: if so, $I_{i,j}=1$, otherwise $I_{i,j}=0$; $K$ represents the total number of Weibo posts of user $u_i$; $b$ represents the behavior category ($b=1$ for reposts, $b=2$ for comments, $b=3$ for likes); $t$ represents the time of the current trending topic; $t_k$ represents the time at which user $u_i$ published the $k$-th Weibo post; and the term subscripted $kb$ represents the value of behavior $b$ (forwarding, commenting, liking) between the two users on the $k$-th Weibo post of user $u_i$. This value is set manually according to the importance of each behavior in spreading rumors, for example 6 for forwarding, 4 for commenting, and 2 for liking.

[0064] S3: Define the random walk strategy based on the relevant attributes.

[0065] In the topic propagation space, both the individual attributes of nodes and the relationships between nodes influence the behavior of the group. Existing network representations mostly consider only structural equivalence and homophily and often neglect the correlation between node attributes. Therefore, this invention builds on the strengths of the random-walk-based Node2Vec algorithm in representing network topology and combines it with node attribute information to define a representation method for topic structure features.

[0066] The algorithm aims to learn feature vectors for the network nodes under a hot topic. The input is a weighted network topology, and the output is the vector representation of each node. The objective function of the algorithm is defined as:

[0067]

[0068]

[0069] Where $N_S(w)$ represents the neighbors of node $w$ obtained through sampling strategy $S$, $\Pr(N_S(w) \mid f(w))$ represents the probability of observing the sampled neighborhood given the node vector $f(w)$, $n_j$ represents the $j$-th node among the neighbors of node $w$, and $\Pr(n_j \mid f(w))$ represents the probability of observing neighbor $n_j$ given the feature representation of node $w$; the objective sums the logarithms of these probabilities.

[0070] To achieve the above objectives, considering the influencing factors of node attributes, this invention redefines the node walking strategy. Specifically, the starting node is defined as $c_0$, the $i$-th node of the random walk is defined as $c_i$, and the process of defining the random walk strategy includes:

[0071] The similarity between the current node and the next node in the user topic network is calculated based on the user's own attributes. The similarity between nodes β(w, x) represents the similarity of attributes between the current node w and the next node x. This invention selects some information from the user's own attributes, such as the user's gender, location information, number of followers, etc., and maps them to the feature vector space, using the Euclidean distance between vectors to represent the similarity.
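The text states that similarity is represented by the Euclidean distance between attribute vectors but does not give the mapping from distance to similarity. The following sketch assumes the common choice $1 / (1 + \text{distance})$; the attribute encoding, the choice of attributes, and the function names are likewise hypothetical.

```python
import math


def attribute_vector(gender, followers, friends):
    """Hypothetical encoding of a few user attributes as a numeric vector.

    Counts are log-scaled so that very large accounts do not dominate.
    """
    return [float(gender), math.log1p(followers), math.log1p(friends)]


def similarity(vec_w, vec_x):
    """Assumed mapping: Euclidean distance d becomes 1 / (1 + d), in (0, 1]."""
    return 1.0 / (1.0 + math.dist(vec_w, vec_x))


beta = similarity(attribute_vector(1, 120, 30), attribute_vector(0, 95, 42))
```

Identical attribute vectors give a similarity of 1.0 under this mapping, and the value decreases toward 0 as the vectors move apart.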

[0072] Since the interaction intensity between potential users and trending users influences their tendency to participate in trending topics, a factor is used to assign edge weights to the topic network based on user interaction. The formula for calculating the edge weight between the current node and the next node in the user topic network based on user interaction is as follows:

[0073]

[0074]

[0075] Where $p$ represents the first hyperparameter, $q$ represents the second hyperparameter, $d_{w,x}$ represents the distance between node $w$ and node $x$, $intract(w_i, x)$ represents the interaction degree between the $i$-th node $w$ and node $x$, $v$ represents an edge between nodes, and $u$ represents a user node.

[0076] Based on similarity and edge weights, a random walk strategy is defined, and the transition probability of a node is expressed as:

[0077]

[0078] Where $P(c_i = x \mid c_{i-1} = w)$ represents the probability that the walk moves from node $w$ to the next node $x$, $\alpha_{p,q}(w, x)$ represents the weight adjustment parameter between node $w$ and node $x$, and $\beta(w, x)$ represents the similarity between node $w$ and node $x$. The expression also contains the edge weight between node $w$ and node $x$ and a normalization constant $z$.
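The transition formula is not reproduced in this text. As a sketch only, the code below assumes that the unnormalized score of each candidate node is the product of the weight adjustment parameter, the similarity, and the edge weight, divided by a normalization constant. The weight adjustment parameter is taken from the standard node2vec search bias ($1/p$ at distance 0, $1$ at distance 1, $1/q$ at distance 2); the patent's version may additionally depend on the interaction degree. All function names are hypothetical.

```python
import random


def search_bias(distance, p, q):
    """Standard node2vec bias: 1/p to return, 1 to stay near, 1/q to move away."""
    if distance == 0:
        return 1.0 / p
    if distance == 1:
        return 1.0
    return 1.0 / q


def transition_probabilities(candidates, p, q):
    """candidates maps next node x -> (distance, similarity, edge_weight)."""
    scores = {
        x: search_bias(d, p, q) * beta * weight
        for x, (d, beta, weight) in candidates.items()
    }
    total = sum(scores.values())  # plays the role of the normalization constant
    if total <= 0:
        raise ValueError("no reachable candidate node")
    return {x: s / total for x, s in scores.items()}


def next_node(candidates, p, q, rng=random):
    """Sample the next node of the walk according to the probabilities."""
    probs = transition_probabilities(candidates, p, q)
    nodes = list(probs)
    return rng.choices(nodes, weights=[probs[n] for n in nodes], k=1)[0]


candidates = {"a": (0, 0.5, 2.0), "b": (1, 1.0, 1.0), "c": (2, 1.0, 4.0)}
probs = transition_probabilities(candidates, p=2.0, q=4.0)
```

With these inputs the unnormalized scores are 0.5, 1.0, and 1.0, so the walk moves to `a` with probability 0.2 and to `b` or `c` with probability 0.4 each.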

[0079] As shown in Figure 2, the Node2Vec algorithm and the BERT model are used to extract hidden information from topics, as follows:

[0080] S4: Based on the random walk strategy, the Node2Vec algorithm is used to represent the user topic network as a low-rank dense feature vector matrix.

[0081] Based on the random walk strategy, the Node2Vec algorithm represents the topic topology as a low-rank dense feature vector matrix $E_s \in \mathbb{R}^{N \times d}$.

[0082] S5: Use the BERT model to represent the topic text information and obtain the topic text feature vector.

[0083] The BERT model recognizes the importance of bidirectional pre-training for topic representation. It uses a masked language model to achieve deep bidirectional representation during pre-training, which simultaneously considers and decomposes the contextual relationships of the input topic, embedding these relationships into each word vector. Furthermore, during sentence decomposition, the BERT model introduces the relative position of each word within the sentence as a benchmark to more comprehensively analyze the relationships between words, resulting in sentence vectors that are more consistent with context and logic.

[0084] First, based on UInfo (a paragraph or a sentence), let $S = \{s_1, s_2, s_3, \ldots, s_{|S|}\}$ represent the original topic group. The text information in the rumor topic is segmented into words, and stop words are removed to reduce the noise they introduce into subsequent models. Since the messages in the topic are of medium length, and punctuation marks have a certain impact on the severity and veracity of the content expressed in the text, this invention retains regular punctuation marks as part of the text features. Each $s_i \in S$ is a short text composed of a character sequence, and $l$ is the length of $s_i$.

[0085] The processed text corpus of the topic can be represented as $Iattribute(I_i)$, where $I_i$ represents the $i$-th topic. The processed text corpus is input into the BERT model for representation, and the BERT model outputs a text feature vector $V_{text} \in \mathbb{R}^d$ containing the current topic information.
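The preprocessing described above (word segmentation, stop-word removal, and retention of punctuation) can be sketched as follows. This is an illustrative example on English text with a regular-expression tokenizer; the stop-word list, the pattern, and the function name are hypothetical, and Weibo text would require a Chinese word segmenter instead. The BERT encoding step itself is not shown.

```python
import re

# Hypothetical stop-word list; a real system would load a curated one.
STOP_WORDS = {"the", "a", "an", "is", "of", "to", "and"}

# A token is either a run of word characters or a single punctuation mark,
# so punctuation is kept as its own token rather than discarded.
TOKEN_PATTERN = re.compile(r"\w+|[^\w\s]")


def preprocess(text):
    """Tokenize, drop stop words, and keep punctuation as features."""
    tokens = TOKEN_PATTERN.findall(text.lower())
    return [tok for tok in tokens if tok not in STOP_WORDS]


tokens = preprocess("Is the vaccine recalled?!")
```

The resulting token list would then be joined or re-tokenized as required by the BERT model in use.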

[0086] S6: Use evolutionary game theory to measure the spread and impact of rumors and debunking information.

[0087] The internal influencing factors of a user are calculated based on the user's own attributes and historical forwarding rate. The calculation formula is as follows:

[0088] $Infactors(u_i) = User(u_i) \times Rate(u_i)$

[0089] Where $Infactors(u_i)$ represents the internal influencing factors that the user's own attributes exert on user $u_i$.

[0090] External influencing factors of users are calculated based on user interaction and topic text corpus. The calculation formula is as follows:

[0091] $Outfactors(u_i) = intract(u_i, x) \times Iattribute(I_i)$

[0092] Where $Outfactors(u_i)$ represents the external influencing factors that the topic information exerts on user $u_i$, and $Iattribute(I_i)$ represents the basic information attributes of the $i$-th topic, i.e., the text corpus of the topic.

[0093] Based on internal and external user influencing factors, the influence functions for rumor information and debunking information are constructed using a multiple linear regression algorithm as follows:

[0094] $Iinfluence_{rumor}(u_i) = \rho_0 + \rho_1 \times Infactors(u_i) + \rho_2 \times Outfactors_{rumor}(u_i)$

[0095] $Iinfluence_{anti\text{-}rumor}(u_i) = \rho_0 + \rho_1 \times Infactors(u_i) + \rho_2 \times Outfactors_{anti\text{-}rumor}(u_i)$

[0096] Where $\rho_0$, $\rho_1$, and $\rho_2$ are the first, second, and third partial regression coefficients obtained using the multiple linear regression algorithm, respectively. $\rho_1$ and $\rho_2$ reflect the weight proportions of internal and external factors in the information's influence. $Outfactors_{rumor}(u_i)$ and $Outfactors_{anti\text{-}rumor}(u_i)$ represent the external influencing factors of rumor information and debunking information on user $u_i$, respectively.

[0097] Due to the complexity of social networks, the spread of rumors is often accompanied by debunking messages, such as official clarification announcements. These messages have a mutually reinforcing and antagonistic relationship, which is a significant factor influencing user forwarding behavior. Therefore, this invention uses evolutionary game theory to quantify the mutual influence of rumor and debunking information. First, two game strategies are defined: "forwarding rumor information" and "forwarding debunking information." Using evolutionary game theory, based on the influence functions of rumor and debunking information, the payoff functions for rumor and debunking information are defined. The payoff functions for the two strategies are as follows:

[0098] $Ben_{rumor}(u_i) = P_1 \times Iinfluence_{rumor}(u_i)$

[0099] $Ben_{anti\text{-}rumor}(u_i) = P_2 \times Iinfluence_{anti\text{-}rumor}(u_i)$

[0100] Where $Ben_{rumor}(u_i)$ represents the payoff that user $u_i$ obtains from rumor information, and $Ben_{anti\text{-}rumor}(u_i)$ represents the payoff that user $u_i$ obtains from debunking information. $P_1$ and $P_2$ are the proportions of users spreading rumor information and debunking information, respectively, among the friends and followers of user $u_i$. User nodes that do not participate in forwarding messages do not affect other user nodes and are therefore ignored, so $P_1 + P_2 = 1$.
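The influence and payoff functions above can be sketched as follows. For simplicity this example treats the internal and external influencing factors as scalar summaries, which is an assumption: in the text $User(u_i)$ is a vector. The regression coefficients are passed in rather than fitted, and the function names and action labels are hypothetical.

```python
def influence(in_factor, out_factor, rho0, rho1, rho2):
    """Linear influence function: rho0 + rho1 * internal + rho2 * external."""
    return rho0 + rho1 * in_factor + rho2 * out_factor


def payoffs(neighbor_actions, infl_rumor, infl_anti):
    """Return (rumor payoff, debunking payoff) for one user.

    neighbor_actions: one of 'rumor', 'anti', or 'none' for each friend or
    follower. Neighbors that forward nothing are ignored, so P1 + P2 = 1.
    """
    rumor = neighbor_actions.count("rumor")
    anti = neighbor_actions.count("anti")
    active = rumor + anti
    if active == 0:
        return 0.0, 0.0
    p1 = rumor / active
    p2 = anti / active
    return p1 * infl_rumor, p2 * infl_anti


infl = influence(0.5, 2.0, rho0=0.1, rho1=0.4, rho2=0.2)
ben_rumor, ben_anti = payoffs(["rumor", "rumor", "anti", "none"], 0.9, 0.6)
```

Here two of the three active neighbors spread the rumor, so $P_1 = 2/3$ and $P_2 = 1/3$.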

[0101] The dissemination influence of rumors and debunking information is calculated based on the revenue function of the rumors and debunking information. The calculation formula is as follows:

[0102]

[0103]

[0104] Where $MutualInf_{rumor}(u_i)$ represents the dissemination influence of rumor information on user $u_i$ after the mutual interaction, and $MutualInf_{anti\text{-}rumor}(u_i)$ represents the dissemination influence of debunking information on user $u_i$ after the mutual interaction.

[0105] S7: The rumor-debunking information dissemination group behavior prediction model based on G-GCN is trained based on the feature vector matrix, text feature vector, influence of rumor information dissemination, and influence of debunking information dissemination. The model parameters are then adjusted using a new model loss function to obtain the trained rumor-debunking information dissemination group behavior prediction model based on G-GCN.

[0106] Given that social networks have a non-Euclidean network topology, traditional discrete convolution cannot adequately handle them. Graph Convolutional Neural Networks (GCNs), based on graph theory, implement convolution operations on graphs. Therefore, this invention chooses GCNs to process rumor-related data. Considering the competitive and cooperative relationship between rumor and debunking topics in their dissemination, a rumor-debunking information dissemination group behavior prediction model based on G-GCN (Geo-Game Theory-Based Graph Convolutional Neural Network) is proposed. The goal of the prediction task is to predict the participation of potential user nodes in the rumor topic. If a user participates, the model determines whether they will not forward the rumor, forward the rumor, or forward the debunking message, thus transforming the task into a three-class classification task.

[0107] The process of training the G-GCN-based rumor-debunking information dissemination group behavior prediction model based on the feature vector matrix, text feature vector, the influence of rumor information dissemination, and the influence of debunking information dissemination includes:

[0108] The edge weights between users in the user topic network are completed by adding the dimension of user interaction edge weights based on user interaction, resulting in the completed user topic network; the adjacency matrix A is obtained based on the completed user topic network.

[0109] The text feature vector $V_{text}$, the user's own attributes, and the forwarding rate are concatenated to obtain the input feature dimension $F$ of the user node. The topic network feature input matrix is then obtained from the feature vector matrix $E_s$ and the input feature dimension: $X = E_s * F$.

[0110] This invention uses a two-layer graph convolutional neural network with an intermediate dropout layer as the model for predicting the forwarding of online rumors. First, the weights $W$ and bias $b$ are randomly initialized. The topic network feature input matrix $X$ and the adjacency matrix $A$ are then input into the prediction model. $X$ is multiplied by $W$, the bias $b$ is added, and the result is multiplied by the intermediate parameter calculated from the adjacency matrix. The dissemination influences of rumor information and debunking information on user $u_i$ after their mutual interaction are added as an influence dimension. The ReLU function is used as the activation function of this layer, and dropout is applied during model training. Finally, the SoftMax activation function converts the convolutional output into probability values for the different node categories, giving the prediction of user behavior in spreading rumor topics. The rumor-debunking information dissemination group behavior prediction model based on G-GCN is expressed as follows:

[0111]

[0112] Where $Z$ represents the probability values of the different node categories, $f(X, A)$ represents the user's behavior in the current topic, $X$ represents the topic network feature input matrix, and $A$ represents the adjacency matrix after completion with user interaction behavior. The intermediate parameter is a matrix calculated from the adjacency matrix. $D$ represents the training parameters, $ReLU()$ represents the activation function, and $w_i$ is the weight matrix corresponding to the $i$-th layer of the graph convolutional network.
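The model expression is not reproduced in this text. The sketch below shows a generic two-layer graph convolutional forward pass of the widely used form $Z = \mathrm{softmax}(\hat{A}\,\mathrm{ReLU}(\hat{A} X W_0)\,W_1)$, where $\hat{A}$ is the adjacency matrix with self-loops under symmetric degree normalization. Treating $\hat{A}$ as the intermediate parameter is an assumption, and bias, dropout, and the influence dimension are omitted for brevity. All names and numeric values are illustrative.

```python
import math


def matmul(a, b):
    """Plain-Python matrix product of two lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def normalize_adjacency(adj):
    """Return D^{-1/2} (A + I) D^{-1/2}, the usual GCN renormalization."""
    n = len(adj)
    a_hat = [[adj[i][j] + (1.0 if i == j else 0.0) for j in range(n)] for i in range(n)]
    deg = [sum(row) for row in a_hat]
    return [[a_hat[i][j] / math.sqrt(deg[i] * deg[j]) for j in range(n)] for i in range(n)]


def relu(m):
    return [[max(0.0, v) for v in row] for row in m]


def softmax_rows(m):
    """Row-wise softmax, shifted by the row maximum for numerical stability."""
    out = []
    for row in m:
        peak = max(row)
        exps = [math.exp(v - peak) for v in row]
        total = sum(exps)
        out.append([e / total for e in exps])
    return out


def gcn_forward(x, adj, w0, w1):
    """Two graph convolutions: ReLU after the first, softmax after the second."""
    a_norm = normalize_adjacency(adj)
    hidden = relu(matmul(a_norm, matmul(x, w0)))
    return softmax_rows(matmul(a_norm, matmul(hidden, w1)))


x = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]                  # 3 users, 2 features
adj = [[0.0, 1.0, 0.0], [0.0, 0.0, 1.0], [1.0, 0.0, 0.0]]  # weighted edges
w0 = [[0.5, -0.2], [0.1, 0.3]]
w1 = [[0.2, -0.1, 0.4], [0.3, 0.6, -0.5]]
z = gcn_forward(x, adj, w0, w1)
```

Each row of `z` holds three class probabilities for one user, matching the three-class setting of the prediction task.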

[0113] Since this invention addresses a three-class prediction problem, let the model output be $Z = P(r, a, d \mid u_i)$, defined as follows:

[0114]

[0115] Where $p(r \mid u_i)$ represents the probability that user $u_i$ forwards the rumor, $p(d \mid u_i)$ represents the probability that user $u_i$ does not participate in the rumor topic, and $p(a \mid u_i)$ represents the probability that user $u_i$ forwards the debunking information. $P(r, a, d \mid u_i)$ represents the final action of user $u_i$ ($r$ for forwarding the rumor, $a$ for forwarding the debunking information, and $d$ for not forwarding the rumor topic). If $Y = 1$, potential user $u_i$ is judged to forward the rumor in the next time period; if $Y = -1$, potential user $u_i$ is judged to forward the debunking information in the next time period; otherwise, potential user $u_i$ is judged not to participate in the rumor topic in the next time period.

[0116] Cross-entropy loss is commonly used in multi-class classification. However, this loss struggles to learn the positive instances (forwarding of rumors and of debunking messages) within a rumor topic, because the labeled samples within a rumor topic are imbalanced across categories. During training, the loss function favors the category with more samples. Although the training loss is small, the model is prone to overfitting and has low accuracy in identifying categories with fewer samples. Therefore, this invention defines a new model training loss function to alleviate this problem; the formula for the new model loss function is:

[0117] $L = -\theta_D\, y \log p(Y=0 \mid X) - \theta_R\, y \log p(Y=1 \mid X) - \theta_A\, y \log p(Y=-1 \mid X)$

[0118]

[0119]

[0120]

[0121] Where y represents an indicator variable, which is 1 if the category matches the sample's category and 0 otherwise; Y represents the output category; X represents the topic network feature input matrix; p(Y=i|X) represents the predicted probability that the output category Y belongs to label i; θ_D represents the first weight; θ_R represents the second weight; θ_A represents the third weight; D represents the number of users in the training set who do not forward messages; R represents the number of users in the training set who forward rumor messages; and |M| represents the number of users in the training set who forward debunking messages.
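A class-weighted cross-entropy of the form given in [0117] can be sketched as follows. Two points are assumptions rather than part of the source: the loss is averaged over the samples, and the weights are computed by an inverse-frequency rule that merely stands in for the weight definitions of this invention. All names are hypothetical.

```python
import math

CATEGORIES = (0, 1, -1)  # 0: no forwarding, 1: forward rumor, -1: forward debunking

def weighted_loss(probs, labels, theta):
    """Weighted cross-entropy.

    probs:  one dict per sample mapping each category to its predicted probability
    labels: the true category of each sample
    theta:  dict mapping each category to its weight (theta_D, theta_R, theta_A)
    """
    total = 0.0
    for p, label in zip(probs, labels):
        for category in CATEGORIES:
            y = 1.0 if category == label else 0.0  # indicator variable
            total -= theta[category] * y * math.log(p[category])
    return total / len(labels)  # averaging over samples is an assumption

def inverse_frequency_weights(labels):
    """Assumed weighting: rarer categories get larger weights.

    Requires every category to appear at least once in labels.
    """
    n = len(labels)
    counts = {c: labels.count(c) for c in CATEGORIES}
    return {c: n / (len(CATEGORIES) * counts[c]) for c in CATEGORIES}

# Imbalanced toy training set: four non-forwarders, one rumor, one debunking.
labels = [0, 0, 0, 0, 1, -1]
theta = inverse_frequency_weights(labels)
uniform = [{0: 1 / 3, 1: 1 / 3, -1: 1 / 3} for _ in labels]
loss = weighted_loss(uniform, labels, theta)
```

With these toy weights, an error on a rumor or debunking sample costs four times as much as an error on a non-forwarding sample, which is the rebalancing effect the new loss function aims for.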

[0122] By adjusting the model parameters using a new model loss function, a well-trained prediction model for the spread of rumors and debunking information based on G-GCN is obtained.

[0123] S8: Use transfer learning to revise the trained G-GCN-based rumor-debunking information dissemination group behavior prediction model to obtain the final G-GCN-based rumor-debunking information dissemination group behavior prediction model, and use this final model to predict the rumor topics that users will spread.

[0124] Social media platforms are rife with rumors, each potentially focusing on different aspects. The core terminology differs significantly across topic areas, such as politics or food safety, so the distribution of rumor characteristics and the users sharing them also vary. After a deep learning model completes its training, the topic feature space, distribution, and inherent geometric information are fixed into the model. Consequently, when training data for the rumor topic to be predicted is insufficient, directly using the previously proposed basic model inevitably leads to a loss of prediction accuracy. This invention therefore proposes a model domain adaptation scheme based on graph structure and parameter transfer, as shown in Figure 3.

[0125] This invention uses transfer learning to relax the conventional assumption that training and test data must come from the same feature space and distribution. Transfer learning between different rumor topic domains reduces the burden of training models for new topics. Given the importance of topic structural features in graph analysis, the core idea is to transfer the structural features learned by the base model in the source domain to the target domain, and to fine-tune the parameters of each layer based on the features of the target topic itself.

[0126] A domain in transfer learning consists of a feature space and a probability distribution. Given a domain, a task can be represented by a label space and a prediction function learned from the training data, where x∈X. The general goal of transfer learning is to leverage knowledge from the source domain D_S and the source task T_S to improve the prediction task T_T in the target domain D_T. Next, the inherent geometric information in the learned source topic domain graph and the knowledge in the source domain are transferred to the target domain.

[0127] Therefore, the steps of generating the graph in the target domain and of extracting its structural features can be skipped. When the source and target graphs are structurally similar, the convolutional and pooling layers trained in the source domain can be copied, and the features of the target topic are then used to fine-tune the parameters and weights of the model in the target domain. This transfer learning approach improves learning efficiency and helps minimize problems caused by a lack of data and incomplete structural information for new tasks.
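The parameter-transfer idea above (copy the trained parameters, then fine-tune on target data) can be sketched as follows. As a loudly labeled simplification, a single softmax layer stands in for the copied G-GCN layers, and the fine-tuning uses plain stochastic gradient descent on cross-entropy; the function names, learning rate, and toy data are all hypothetical.

```python
import copy
import math

def softmax(values):
    """SoftMax over a list, shifted by the maximum for numerical stability."""
    peak = max(values)
    exps = [math.exp(v - peak) for v in values]
    total = sum(exps)
    return [e / total for e in exps]

def predict(weights, features):
    """weights holds one row of coefficients per category."""
    return softmax([sum(w * f for w, f in zip(row, features)) for row in weights])

def cross_entropy(weights, data):
    """Mean cross-entropy over (features, category_index) pairs."""
    return -sum(math.log(predict(weights, f)[c]) for f, c in data) / len(data)

def fine_tune(source_weights, target_data, lr=0.1, epochs=50):
    """Start from a copy of the source parameters and adapt them to the target."""
    weights = copy.deepcopy(source_weights)  # parameter transfer; source stays intact
    for _ in range(epochs):
        for features, category in target_data:
            probs = predict(weights, features)
            for k, row in enumerate(weights):
                # Gradient of cross-entropy w.r.t. the k-th logit is p_k - y_k.
                grad = probs[k] - (1.0 if k == category else 0.0)
                for j in range(len(row)):
                    row[j] -= lr * grad * features[j]
    return weights

# Hypothetical parameters learned on a source topic, and a small target-topic set.
source_weights = [[1.0, 0.0], [0.0, 1.0], [0.0, 0.0]]
target_data = [([1.0, 0.0], 0), ([0.0, 1.0], 1), ([1.0, 1.0], 2)]

before = cross_entropy(source_weights, target_data)
tuned = fine_tune(source_weights, target_data)
after = cross_entropy(tuned, target_data)
```

Starting from the source parameters rather than a random initialization is what lets a small amount of target-topic data suffice; in a full model, some copied layers could also be frozen instead of fine-tuned.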

[0128] After using transfer learning to modify the trained G-GCN-based rumor-debunking information dissemination group behavior prediction model, the final G-GCN-based rumor-debunking information dissemination group behavior prediction model is obtained.

[0129] Obtain topic data for the topic to be predicted and perform data processing steps S1 to S6 above. Based on the processing results, use the final G-GCN-based rumor-debunking information dissemination group behavior prediction model to predict rumor dissemination and obtain the prediction results for the rumor topics that users will spread. Based on these prediction results, the relevant users can be monitored and effective measures taken to suppress rumor dissemination.

[0130] The above-described embodiments further illustrate the purpose, technical solution, and advantages of the present invention. It should be understood that the above-described embodiments are merely preferred embodiments of the present invention and are not intended to limit the present invention. Any modifications, equivalent substitutions, improvements, etc., made to the present invention within the spirit and principles of the present invention should be included within the protection scope of the present invention.

Claims

1. A cross-domain rumor propagation prediction method based on transfer learning, characterized by comprising:
S1: Acquire topic data and construct a user topic network based on the topic data, the topic data including topic dissemination information, basic user information, and topic text information;
S2: Extract relevant attributes from the topic data, including the user's own attributes, the user's historical forwarding rate, and the user interaction degree;
S3: Define the random walk strategy based on the relevant attributes; the process of defining the random walk strategy includes: calculating the similarity between the current node and the next node in the user topic network based on the user's own attributes; calculating the edge weight between the current node and the next node in the user topic network based on the user interaction degree; and defining the random walk strategy based on the similarity and the edge weight; the random walk strategy is expressed by a formula whose terms denote: the probability that node w moves to the next node x, the weight adjustment parameter between node w and node x, the similarity between node w and node x, the edge weight between node w and node x, and the normalization constant;
S4: Based on the random walk strategy, use the Node2Vec algorithm to represent the user topic network as a low-rank dense feature vector matrix;
S5: Use the BERT model to represent the topic text information and obtain the topic text feature vector;
S6: Use evolutionary game theory to measure the dissemination influence of the rumor information and of the debunking information;
S7: Train the G-GCN-based rumor-debunking information dissemination group behavior prediction model based on the feature vector matrix, the text feature vector, the dissemination influence of the rumor information, and the dissemination influence of the debunking information; adjust the model parameters using a new model loss function to obtain the trained G-GCN-based rumor-debunking information dissemination group behavior prediction model; the formula for the new model loss function is: $L = -\theta_D\, y \log p(Y=0 \mid X) - \theta_R\, y \log p(Y=1 \mid X) - \theta_A\, y \log p(Y=-1 \mid X)$; where θ_D represents the first weight, y represents an indicator variable, Y represents the output category, X represents the topic network feature input matrix, θ_R represents the second weight, p(Y=i|X) represents the predicted probability that the output category Y belongs to label i, θ_A represents the third weight, and the remaining three terms represent the numbers of users in the training set who do not forward messages, who forward rumor messages, and who forward debunking messages, respectively;
S8: Use transfer learning to revise the trained G-GCN-based rumor-debunking information dissemination group behavior prediction model to obtain the final G-GCN-based rumor-debunking information dissemination group behavior prediction model, and use this final model to predict the rumor topics that users will spread.

2. The method for predicting cross-domain rumor propagation based on transfer learning according to claim 1, characterized in that the user's historical forwarding rate is extracted by a formula whose terms denote: the historical forwarding rate of the user, the number of Weibo posts viewed by the user, and the number of Weibo posts retweeted by the user.

3. The method for predicting cross-domain rumor propagation based on transfer learning according to claim 1, characterized in that the user interaction degree is extracted by a formula whose terms denote: the interaction degree of one user toward another user, whether the user follows the other user, the total number of Weibo posts of the user, the category of behavior, the time period of the current trending topic, the time at which the user posted the k-th Weibo post, and the behavior that the user generates toward the other user.

4. The method for predicting cross-domain rumor propagation based on transfer learning according to claim 1, characterized in that the process of measuring the dissemination influence of the rumor information and of the debunking information includes: calculating internal user influencing factors based on the user's own attributes and historical forwarding rate; calculating external user influencing factors based on the user interaction degree and the topic text corpus; constructing influence functions for the rumor information and the debunking information based on the internal and external user influencing factors; defining payoff functions for the rumor information and the debunking information using evolutionary game theory based on their influence functions; and calculating the dissemination influence of the rumor information and of the debunking information based on their payoff functions.

5. The method for predicting cross-domain rumor propagation based on transfer learning according to claim 4, characterized in that the dissemination influence of the rumor information and of the debunking information is calculated by formulas whose terms denote: the dissemination influence of the rumor information on the user, the dissemination influence of the debunking information on the user, the user's payoff from the rumor information, and the user's payoff from the debunking information.

6. The method for predicting cross-domain rumor propagation based on transfer learning according to claim 1, characterized in that the G-GCN-based rumor-debunking information dissemination group behavior prediction model is expressed by a formula whose terms denote: the probability values of the different node categories, the topic network feature input matrix, the adjacency matrix after user interaction behavior completion, the intermediate parameter, the activation function, and the weight matrix corresponding to the i-th layer of the graph convolutional network.