Graph convolutional recommendation system fusing neighbor importance and feature learning
By employing neighbor importance sampling and feature cross-pooling strategies, the efficiency and accuracy issues of neighborhood aggregation in knowledge graph recommendation systems are addressed, resulting in more efficient recommendation performance.
Patent Information
- Authority / Receiving Office: CN · China
- Patent Type: Patents(China)
- Current Assignee / Owner: KUNMING UNIV OF SCI & TECH
- Filing Date: 2022-05-27
- Publication Date: 2026-06-16
Figure CN115048530B_ABST
Abstract
Description
Technical Field
[0001] This invention relates to the field of knowledge graph recommendation system technology, and in particular to a graph convolutional recommendation system that integrates neighbor importance and feature learning.
Background Technology
[0002] With the rapid development of social media, personalized recommendation systems model user preferences for items, effectively addressing the information overload problem in the internet age. Recently, many scholars have introduced knowledge graphs into recommendation systems to model user interests. By mining the multi-hop relationships (i.e., paths) between users and interactive items in the knowledge graph, and extracting implicit user preferences and other auxiliary information, the system can more accurately learn the similarity between users and items, improving recommendation accuracy.
[0003] Existing recommendation models that integrate knowledge graphs can be broadly categorized into three types: embedding-based methods, path-based methods, and hybrid embedding-path methods. Among embedding-based methods, CoFM is a fusion recommendation model that combines the collaborative filtering model FM with the graph embedding model TransE. It introduces information about multiple entities and their relationships within the knowledge graph as auxiliary information for the recommendation system, improving recommendation accuracy and alleviating the sparsity of users' historical interaction data. However, the TransE model used in CoFM cannot effectively handle 1-N, N-1, and N-N relations. To address this issue, FMH improves CoFM by replacing TransE with TransH, capturing the rich structure between multiple related entities and better modeling user interests. Embedding-based methods can embed corresponding attributes into the knowledge graph according to the specific application scenario, enriching the feature representation of entities; however, this also limits the scenarios to which they apply. Among path-based methods, SAMREC proposes a personalized recommendation method based on semantic meta-paths in order to leverage heterogeneous information and mine users' higher-order interests. It introduces ratings to design meta-paths and uses weight regularization to measure the importance of each meta-path, capturing users' personalized weight preferences and alleviating the sparsity of rating data. However, designing meta-paths requires rich expertise in the relevant domain, placing high demands on the designer. PinSage combines an efficient random walk strategy with graph convolution, using random walks for path selection to generate node (i.e., item) embeddings that contain both graph structure and node feature information. This method does not require manual design of meta-paths, but its random walk strategy introduces uncertainty into sampling. Hybrid models combining embeddings and paths can effectively address the problems of the above two methods. IPAKG introduces knowledge graphs to mine users' implicit preference expressions and combines recurrent neural networks and attention mechanisms to capture users' constantly changing interests and the relationships between different items in a sequence. However, when using the knowledge graph to mine user preferences, it does not distinguish the importance of different neighbors to an entity, so the selected neighbors may not fully represent the entity's neighborhood features and may introduce noise. KGNN-LS uses a trainable function to identify the knowledge graph relationships that are important for a given user. This transforms the knowledge graph into a user-specific weighted graph, to which a graph neural network is then applied to compute personalized item embeddings.
[0004] The aforementioned methods have two main problems. First, when using knowledge graphs to aggregate entity neighborhoods, an excessive number of neighbor nodes can introduce invalid information that affects recommendation results, and it increases computational overhead and consumes system resources. Existing models such as KGCN use a "fixed neighborhood" sampling method, but this method cannot fully utilize all neighborhood information, so the aggregation result is incomplete. Second, during training, as the order of entity features increases, the introduced noise and the number of system parameters also increase, posing a risk of convergence failure. KGFER samples from the one-hop neighbors and relationships of items that interact with the user, uses a CNN to learn item features from entity relationships, aggregates item features with interacting items using an MLP, and finally embeds the refined items into the user's latent space to predict the probability that the user interacts with an item. This method samples only the one-hop neighborhood and relationships of entities in the knowledge graph. It fails to use the multi-hop, higher-order relationships of entities in the graph to learn the user's potential remote interests, and thus does not directly address the aforementioned problems.
Summary of the Invention
[0005] The purpose of this invention is to provide a graph convolutional recommendation system that integrates neighbor importance and feature learning to solve the above-mentioned problems.
[0006] The above-mentioned technical objective of the present invention is achieved through the following technical solution:
[0007] A graph convolutional recommendation system that integrates neighbor importance and feature learning includes the following steps:
[0008] (1) Neighbor sampling module
[0009] The knowledge graph consists of triples (h, r, t), where h represents the head entity, t represents the tail entity, and r represents the relationship between entities. When calculating the importance of entity nodes, the user's preference for those nodes is considered: the user-entity score is added to the user-relationship score. The initial score for entity node i is:
[0010] s(i)=(1-α)s(u,r)+αs(u,v)
[0011] Where s(i) represents the initial score of entity node i, the first term s(u,r) is the score between the user node and the relation, the second term s(u,v) is the score between the user node and the entity, and α is a hyperparameter that weights the user-relation score against the user-entity score.
[0012] In knowledge graphs, centrality can be used as an indicator to judge the importance or influence of nodes. Centrality can be further divided into degree centrality, betweenness centrality, and closeness centrality. Degree centrality measures the degree to which a node is connected to the other nodes in the graph; betweenness centrality characterizes the importance of a node by the number of shortest paths passing through it; and closeness centrality reflects how close a node is to the other nodes in the graph. Based on the characteristics of knowledge graphs, the more nodes a node is connected to, the richer the implicit information it may contain. Therefore, this invention uses degree centrality to measure the importance of a node and assumes that the importance of an entity node is positively correlated with its centrality in the knowledge graph; that is, a more central node is more important than other nodes. The centrality of entity node i is expressed as:
[0013] c(i) = log(d(i) + ε)
[0014] Where d(i) represents the in-degree of entity node i, and ε is a very small constant.
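As an illustration of the centrality term, the following minimal Python sketch counts in-degrees from a list of triples and applies c(i) = log(d(i) + ε). The function name `degree_centrality`, the value ε = 1e-8, and the toy entity and relation names are assumptions made for this example, not values given in the specification.

```python
import math
from collections import Counter

def degree_centrality(triples, eps=1e-8):
    """Return c(i) = log(d(i) + eps), where d(i) is the in-degree of entity i."""
    # In-degree: number of triples whose tail is the entity.
    in_degree = Counter(t for _, _, t in triples)
    # Entities that appear only as heads have in-degree 0.
    entities = {h for h, _, _ in triples} | set(in_degree)
    return {e: math.log(in_degree.get(e, 0) + eps) for e in entities}

# Toy knowledge graph; all names here are hypothetical.
triples = [
    ("film_a", "genre", "suspense"),
    ("film_b", "genre", "suspense"),
    ("film_a", "actor", "actor_x"),
]
centrality = degree_centrality(triples)
```

In this toy graph the entity with the most incoming edges ("suspense") receives the largest centrality, while entities with no incoming edges receive a strongly negative value because log(ε) is far below zero.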
[0015] Finally, the final importance score of the entity node is obtained by combining the initial score and centrality of the entity node:
[0016] $s(i) = \sigma_s\big(c(i) \cdot s(i)\big)$
[0017] Where $\sigma_s$ is a non-linear activation function. The neighborhood list of the target entity node is obtained by sorting the nodes according to their final importance scores.
[0018] (2) Aggregation method based on feature cross pooling
[0019] Introducing a feature cross-pooling layer to aggregate entity neighborhoods:
[0020]-[0021] $f_{\text{Bi-Interaction}} = \mathrm{LeakyReLU}\big(W_1(e_h + e_{N_h})\big) + \mathrm{LeakyReLU}\big(W_2(e_h \odot e_{N_h})\big)$
[0022] Where $W_1, W_2 \in \mathbb{R}^{d' \times d}$ are trainable weight matrices, $e_h$ is the entity feature vector, $e_{N_h}$ is the neighborhood feature vector of the entity, and $\odot$ denotes the element-wise product. The element-wise product operation for the k-th dimension is as follows:
[0023] $(e_h \odot e_{N_h})_k = e_{h,k}\, e_{N_h,k}$
[0024] By performing pairwise crosses on all feature domains, a series of crossed feature vectors is obtained, and sum pooling is then performed on all results. When the model learns each feature, that feature must be crossed with the other features. For example, the gender of a song's singer and the song's release time should be unrelated, yet the model is inevitably affected by the time feature when learning the gender feature, which increases the computational load to a certain extent. Therefore, dropout is introduced to prevent overfitting.
[0025] (3) Rating prediction and model optimization
[0026] The dot product of the user feature vector $z_u$ and the item feature vector $z_i$ obtained from the final aggregation is used to obtain the user's predicted rating for the item:
[0027]
[0028] The model is updated using gradient descent and optimized with a cross-entropy loss, which measures the distance between the prediction and the correct result y. The smaller this distance, the more accurate the prediction and the better the model performance.
[0029]
[0030] Compared with the prior art, the present invention has the following beneficial effects:
[0031] This invention addresses the above problems from two perspectives: neighborhood sampling and neighborhood aggregation. Building on the KGCN model, it proposes a neighbor importance sampling strategy and a feature cross-pooling strategy. When selecting neighbors, the importance of a node to the target node is obtained from the neighbor node's score and its centrality-aware score, and neighbors are ranked by this importance before sampling. This approach finds the most valuable neighbors of an entity by traversing the entire knowledge graph, which not only makes full use of the edge information of the knowledge graph but also overcomes the uncertainty of the random walk sampling used previously. In neighborhood aggregation, this invention uses Bi-Interaction for feature cross-pooling aggregation, which both learns the rich feature information implicit in the vectors and reduces noise. Finally, the user feature vector and the aggregated entity feature vector are fed into the prediction function to predict the probability that the user interacts with an item. The improved model of this invention, KGCN-NP, was tested on the MovieLens-1M, Book-Crossing, and Last.FM datasets. The results show that it improves AUC, Recall, and F1 scores over the baseline model.
Attached Figure Description
[0032] Figure 1 is a model architecture diagram of the present invention.
Detailed Implementation
[0033] The present invention will be further described in detail below with reference to the accompanying drawings.
[0034] Embodiment: referring to Figure 1, a graph convolutional recommendation system that integrates neighbor importance and feature learning includes the following steps:
[0035] (1) Neighbor sampling module
[0036] Knowledge graphs consist of triples (h, r, t), where h represents the head entity, t represents the tail entity, and r represents the relationship between entities. Most existing graph convolution algorithms select entity neighborhoods by calculating scores between users and relationships (e.g., the relationship between an actor and a suspense film), but they fail to consider users' preferences for the nodes themselves (e.g., a user's preference for the suspense genre). This easily introduces invalid information when learning the embedding of target items, which affects user preference learning and consequently degrades the system's recommendation performance. This invention considers users' preferences for entity nodes when calculating their importance, adding the user-entity score to the user-relationship score. The initial score for entity node i is:
[0037] s(i)=(1-α)s(u,r)+αs(u,v)
[0038] Where s(i) represents the initial score of entity node i, the first term s(u,r) is the score between the user node and the relation, the second term s(u,v) is the score between the user node and the entity, and α is a hyperparameter that weights the user-relation score against the user-entity score.
[0039] In knowledge graphs, centrality can be used as an indicator to judge the importance or influence of nodes. Centrality can be further divided into degree centrality, betweenness centrality, and closeness centrality. Degree centrality measures the degree to which a node is connected to the other nodes in the graph; betweenness centrality characterizes the importance of a node by the number of shortest paths passing through it; and closeness centrality reflects how close a node is to the other nodes in the graph. Based on the characteristics of knowledge graphs, the more nodes a node is connected to, the richer the implicit information it may contain. Therefore, this invention uses degree centrality to measure the importance of a node and assumes that the importance of an entity node is positively correlated with its centrality in the knowledge graph; that is, a more central node is more important than other nodes. The centrality of entity node i is expressed as:
[0040] c(i) = log(d(i) + ε)
[0041] Where d(i) represents the in-degree of entity node i, and ε is a very small constant.
[0042] Finally, the final importance score of the entity node is obtained by combining the initial score and centrality of the entity node:
[0043] $s(i) = \sigma_s\big(c(i) \cdot s(i)\big)$
[0044] Where $\sigma_s$ is a non-linear activation function. The neighborhood list of the target entity node is obtained by sorting the nodes according to their final importance scores.
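The three scoring steps of the neighbor sampling module can be sketched in Python as follows. This is only an illustration under stated assumptions: the specification does not name the activation $\sigma_s$, so a sigmoid is assumed here; the cutoff `k`, the function name `rank_neighbors`, and all numeric scores and in-degrees are hypothetical.

```python
import math

def sigmoid(x):
    # Assumed choice for the non-linear activation sigma_s.
    return 1.0 / (1.0 + math.exp(-x))

def rank_neighbors(candidates, alpha=0.5, eps=1e-8, k=2):
    """Rank candidate neighbors by final importance score and keep the top k.

    Each candidate is (entity, s_ur, s_uv, in_degree), where s_ur is the
    user-relation score and s_uv is the user-entity score.
    """
    scored = []
    for entity, s_ur, s_uv, in_degree in candidates:
        initial = (1 - alpha) * s_ur + alpha * s_uv   # initial score s(i)
        centrality = math.log(in_degree + eps)        # centrality c(i)
        final = sigmoid(centrality * initial)         # sigma_s(c(i) * s(i))
        scored.append((entity, final))
    # Sort by final importance, most important first.
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]

# Hypothetical candidates: (entity, user-relation score, user-entity score, in-degree).
candidates = [
    ("suspense", 0.8, 0.9, 12),
    ("actor_x", 0.6, 0.2, 3),
    ("studio_y", 0.1, 0.1, 40),
]
top_neighbors = rank_neighbors(candidates, alpha=0.5, k=2)
```

With these made-up inputs, "studio_y" is dropped despite its high in-degree because its user-related scores are low, which illustrates how the product of centrality and initial score trades connectivity off against user preference.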
[0045] (2) Aggregation method based on feature cross pooling
[0046] The three aggregation methods proposed in KGCN only perform simple summation or concatenation of entity vectors and neighborhood vectors followed by nonlinear transformations, without considering feature combination information, which may result in the loss of important feature information. When incorporating features, a ranking model needs to consider not only each feature individually but also the interactions between features, and the Bi-Interaction feature aggregator addresses this need. The Bi-Interaction aggregator combines two features to achieve a nonlinear transformation of the sample space, increasing the model's nonlinear capability so that it predicts effectively for different feature combinations and generalizes better to samples with unseen feature combinations. Furthermore, the Bi-Interaction pooling operation reduces network complexity and accelerates network training. This invention introduces a feature cross-pooling layer to aggregate entity neighborhoods:
[0047] $f_{\text{Bi-Interaction}} = \mathrm{LeakyReLU}\big(W_1(e_h + e_{N_h})\big) + \mathrm{LeakyReLU}\big(W_2(e_h \odot e_{N_h})\big)$
[0048] Where $W_1, W_2 \in \mathbb{R}^{d' \times d}$ are trainable weight matrices, $e_h$ is the entity feature vector, $e_{N_h}$ is the neighborhood feature vector of the entity, and $\odot$ denotes the element-wise product. The element-wise product operation for the k-th dimension is as follows:
[0049] $(e_h \odot e_{N_h})_k = e_{h,k}\, e_{N_h,k}$
[0050] By performing pairwise crosses on all feature domains, a series of crossed feature vectors is obtained, and a sum pooling operation is then performed on all results. Unlike the neighbor aggregator in KGCN, which uses only the final aggregated neighborhood to represent an entity, this invention considers that the information carried by the entity node itself better describes the node's features. Therefore, the node itself is also included during aggregation to obtain the final entity feature vector, which integrates the initial features of the entity and the receptive-field features of layer l. When the model learns each feature, that feature must be crossed with the other features. For example, the gender of a song's singer and the song's release time should be unrelated, yet the model is inevitably affected by the time feature when learning the gender feature, which increases the computational load to a certain extent. Therefore, dropout is introduced to prevent overfitting.
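A minimal pure-Python sketch of the Bi-Interaction aggregation formula may help make the two branches concrete. The LeakyReLU slope of 0.01, the helper names, and the toy vectors and weight matrices (d = 2, d' = 3) are assumptions for illustration; in the actual model the weights are trained, and dropout is omitted here for brevity.

```python
def leaky_relu(v, slope=0.01):
    # Element-wise LeakyReLU; the slope value is an assumed default.
    return [x if x > 0 else slope * x for x in v]

def matvec(W, v):
    # Multiply a d' x d matrix (list of rows) by a length-d vector.
    return [sum(w * x for w, x in zip(row, v)) for row in W]

def bi_interaction(e_h, e_nh, W1, W2):
    """LeakyReLU(W1 (e_h + e_Nh)) + LeakyReLU(W2 (e_h * e_Nh)), element-wise product."""
    added = [a + b for a, b in zip(e_h, e_nh)]      # additive branch: e_h + e_Nh
    crossed = [a * b for a, b in zip(e_h, e_nh)]    # cross branch: e_h (element-wise) e_Nh
    left = leaky_relu(matvec(W1, added))
    right = leaky_relu(matvec(W2, crossed))
    return [l + r for l, r in zip(left, right)]

# Toy inputs with d = 2 and d' = 3; all values are hypothetical.
e_h = [1.0, -2.0]
e_nh = [0.5, 1.0]
W1 = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
W2 = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
aggregated = bi_interaction(e_h, e_nh, W1, W2)
```

The additive branch keeps the entity's own features in the result, while the element-wise branch passes on a dimension only to the extent that both the entity and its neighborhood are active in it.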
[0051] (3) Rating prediction and model optimization
[0052] The dot product of the user feature vector $z_u$ and the item feature vector $z_i$ obtained from the final aggregation is used to obtain the user's predicted rating for the item:
[0053]
[0054] The model is updated using gradient descent and optimized with a cross-entropy loss, which measures the distance between the prediction and the correct result y. The smaller this distance, the more accurate the prediction and the better the model performance.
[0055]
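The prediction and loss formulas are not reproduced in this text, so the following Python sketch only illustrates the step as described in words. It assumes that the dot product of the user and item vectors is squashed by a sigmoid to give an interaction probability and that the loss is binary cross-entropy over labels y in {0, 1}; both choices, along with the function names and toy vectors, are assumptions rather than the patent's stated formulas.

```python
import math

def predict(z_u, z_i):
    """Predicted interaction probability: sigmoid of the dot product (assumed form)."""
    logit = sum(a * b for a, b in zip(z_u, z_i))
    return 1.0 / (1.0 + math.exp(-logit))

def binary_cross_entropy(y_true, y_pred, tiny=1e-12):
    """Mean binary cross-entropy between labels and predicted probabilities."""
    total = 0.0
    for y, p in zip(y_true, y_pred):
        p = min(max(p, tiny), 1.0 - tiny)  # clamp to avoid log(0)
        total -= y * math.log(p) + (1 - y) * math.log(1 - p)
    return total / len(y_true)

# Hypothetical user vector, item vectors, and interaction labels.
z_u = [0.5, 1.0]
items = {"item_a": [1.0, 1.0], "item_b": [-1.0, -0.5]}
labels = {"item_a": 1, "item_b": 0}

preds = {name: predict(z_u, z_i) for name, z_i in items.items()}
loss = binary_cross_entropy(
    [labels[name] for name in items],
    [preds[name] for name in items],
)
```

A lower loss means the predicted probabilities sit closer to the labels, which is the "smaller distance" the paragraph above refers to; gradient descent would adjust the embeddings and weight matrices to reduce it.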
[0056] This specific embodiment is merely an explanation of the present invention and is not intended to limit the invention. After reading this specification, those skilled in the art can make modifications to this embodiment without contributing any inventive step, but such modifications are protected by patent law as long as they are within the scope of the claims of the present invention.
Claims
1. A graph convolutional recommendation system that integrates neighbor importance and feature learning, characterized in that it includes the following steps:
(1) Neighbor sampling module
The knowledge graph consists of triples (h, r, t), where h represents the head entity, t represents the tail entity, and r represents the relationship between entities. When calculating the importance of entity nodes, the user's preference for those nodes is considered: the user-entity score is added to the user-relationship score. The initial score for entity node i is:
s(i)=(1-α)s(u,r)+αs(u,v)
Where s(i) represents the initial score of entity node i, the first term s(u,r) is the score between the user node and the relation, the second term s(u,v) is the score between the user node and the entity, and α is a hyperparameter that weights the user-relation score against the user-entity score.
In knowledge graphs, centrality can be used as an indicator to judge the importance or influence of nodes. Centrality can be further divided into degree centrality, betweenness centrality, and closeness centrality. Degree centrality measures the degree to which a node is connected to the other nodes in the graph; betweenness centrality characterizes the importance of a node by the number of shortest paths passing through it; and closeness centrality reflects how close a node is to the other nodes in the graph. Based on the characteristics of knowledge graphs, the more nodes a node is connected to, the richer the implicit information it may contain. Therefore, degree centrality is used to measure the importance of a node, and the importance of an entity node is assumed to be positively correlated with its centrality in the knowledge graph; that is, a more central node is more important than other nodes. The centrality of entity node i is expressed as:
c(i) = log(d(i) + ε)
Where d(i) represents the in-degree of entity node i, and ε is a very small constant.
Finally, the final importance score of the entity node is obtained by combining the initial score and centrality of the entity node:
$s(i) = \sigma_s\big(c(i) \cdot s(i)\big)$
Where $\sigma_s$ is a non-linear activation function. The neighborhood list of the target entity node is obtained by sorting the nodes according to their final importance scores.
(2) Aggregation method based on feature cross pooling
A feature cross-pooling layer is introduced to aggregate entity neighborhoods:
$f_{\text{Bi-Interaction}} = \mathrm{LeakyReLU}\big(W_1(e_h + e_{N_h})\big) + \mathrm{LeakyReLU}\big(W_2(e_h \odot e_{N_h})\big)$
Where $W_1, W_2 \in \mathbb{R}^{d' \times d}$ are trainable weight matrices, $e_h$ is the entity feature vector, $e_{N_h}$ is the neighborhood feature vector of the entity, and $\odot$ denotes the element-wise product. The element-wise product operation for the k-th dimension is as follows:
$(e_h \odot e_{N_h})_k = e_{h,k}\, e_{N_h,k}$
By performing pairwise crosses on all feature domains, a series of crossed feature vectors is obtained, and sum pooling is then performed on all results. When the model learns each feature, that feature must be crossed with the other features. However, when the model learns gender features, it is inevitably affected by time features, which increases the computational load to a certain extent. Therefore, dropout is introduced to prevent overfitting.
(3) Rating prediction and model optimization
The dot product of the user feature vector $z_u$ and the item feature vector $z_i$ obtained from the final aggregation is used to obtain the user's predicted rating for the item. The model is updated using gradient descent and optimized with a cross-entropy loss, which measures the distance between the prediction and the correct result y. The smaller this distance, the more accurate the prediction and the better the model performance.