Multi-behavior recommendation method, device, and medium based on contrastive clustering learning
Using contrastive clustering learning, the method constructs behavior-level, instance-level, and cluster-level embedding tasks to optimize user and item embeddings. This addresses the lack of commonality and difference information between users or items in existing multi-behavior recommendation models and improves recommendation accuracy.
Patent Information
- Authority / Receiving Office
- CN · China
- Patent Type
- Patents(China)
- Current Assignee / Owner
- GUANGXI UNIV
- Filing Date
- 2024-03-11
- Publication Date
- 2026-06-23
Figure CN118094007B_ABST
Abstract
Description
Technical Field
[0001] This invention relates to the field of recommender systems, and more specifically to a multi-behavior recommendation method, apparatus, and medium based on contrastive clustering learning.
Background Technology
[0002] Recommendation systems play a crucial role in many fields, including e-commerce, social media, and online advertising. Through intelligent information filtering and personalized recommendations, they help users quickly find content that matches their interests and needs. Personalized recommendations not only enhance user experience but also facilitate successful business transactions.
[0003] Currently, most recommendation models focus on the relationships between users and items based on a single behavior type. For example, Zheng et al. proposed a recommendation model that mines item attribute information from single-behavior data. Wang et al. explored the higher-order connectivity of user-item interactions in single-behavior data. Li et al. used a meta-aided learning framework to improve the effectiveness of user and item representations for single behaviors. Lan et al. constructed user-item and item-item bipartite graphs from historical interaction data to capture information. However, in real-world scenarios there are various types of interactions between users and items. As shown in Figure 1, user behavior on e-commerce platforms typically includes clicking, browsing, saving, adding to cart, and purchasing. Furthermore, the growing number of items users encounter makes it difficult to infer user preferences from a single behavior. Auxiliary behaviors are therefore needed for recommendation.
[0004] To fully utilize auxiliary behavioral information, an increasing number of multi-behavior recommendation models have been proposed by researchers both domestically and internationally. Gao et al. proposed a neural multi-task recommendation model to integrate various behavioral information. This model is based on the fact that users typically browse related and interesting items before purchasing, and uses a cascading approach to associate each behavior. Jin et al. proposed a multi-behavior recommendation model to extract behavioral semantics hidden in multi-behavior data. They designed a user-item propagation layer to learn behavior strength and an item propagation layer to capture behavioral semantics. Chen et al. proposed a multi-behavior recommendation model that efficiently learns a neural model from the entire dataset while maintaining controllable time complexity. This model employs three effective optimization methods (user-based, item-based, and alternation-based) to capture the complex relationships between different behaviors. Furthermore, considering the high-order collaborative signals between users and items, Chen et al. proposed a graph heterogeneous collaborative filtering model to predict the relationships between users and items under various behaviors. It uses a relation-aware GCN propagation layer to explicitly utilize high-order collaborative signals. Gu et al. addressed the problem of sparse supervised signals by proposing a multi-behavior recommendation model based on a self-supervised graph neural network. This model uses a star-shaped contrastive learning strategy to explore the commonalities between different behaviors. Wei et al. proposed an attention-based multi-behavior recommendation model based on multiple graphs with different types of behaviors to capture hidden relationships in user-item interaction networks. 
It considers both the importance of specific behaviors at the node level and the semantic strength of different behaviors at the behavior level.
[0005] Multi-behavior recommendation models are mainly divided into two categories: 1) Models based on the different influences of auxiliary behaviors on target behaviors. Zhang et al. proposed a multi-neural network model based on multi-network structure and graph representation learning to solve the multi-behavior recommendation problem. Wang et al. proposed a multi-relationship prediction model based on user preferences and inter-item relationships in each session. Xia et al. proposed a graph neural multi-behavior enhanced recommendation method based on a graph messaging architecture to model the dependencies between different types of user-item interactions. Xia et al. proposed a knowledge-enhanced hierarchical graph transformer network model to capture specific types of behavioral features and identify these types of user-item interactions. 2) Models based on comprehensive embeddings of users and items learned from different behaviors. Wang et al. proposed a multi-relationship graph neural network model for predicting target behaviors in a session. This method constructs a multi-relationship item graph to learn global inter-item relationships, overcoming the limitations of single-behavior session recommendation. Yang et al. proposed a multi-behavior model based on a hypergraph enhanced transformer to capture short-term and long-term cross-type behavioral dependencies. This method integrates global multi-behavior dependencies and custom sequential context injection into a hypergraph neural structure.
[0006] Contrastive learning, as a promising unsupervised learning paradigm, has achieved excellent performance in recommender systems, mapping user and item embeddings into a feature space. The goal of contrastive learning is to maximize the similarity of positive sample pairs and minimize the similarity of negative sample pairs. In some contrastive learning methods, the k-nearest neighbor method is used to calculate similarity. Furthermore, contrastive learning can effectively alleviate the data redundancy problem in recommender systems. For example, Xuan et al. proposed a recommendation model based on knowledge-enhanced multi-behavior contrastive learning to alleviate the sparsity of target behavior data. It uses multi-behavior contrastive learning and knowledge-aware contrastive learning to minimize the differences between different behaviors of the same user and maximize the differences between different users. Chen et al. proposed a multi-behavior recommendation model based on heterogeneous graph contrastive learning. It utilizes a meta-network to encode personalized features of users and items. Wu et al. proposed a multi-behavior, multi-view contrastive learning recommendation model that effectively alleviates the cold start problem. This model incorporates multi-behavior contrast, multi-view contrast, and behavior discrimination contrast. Liu et al. proposed a graph contrastive method for self-supervised learning of user and item representations.
[0007] The multi-behavior recommendation methods mentioned above do not consider the commonalities and differences between users or items, and data sparsity may lead to recommendation bias.
Summary of the Invention
[0008] This invention addresses the aforementioned problems in the prior art. Therefore, there is a need for a multi-behavior recommendation method, apparatus, and medium based on contrastive clustering learning to solve the problems of existing multi-behavior recommendation methods lacking consideration of commonalities and differences between users or items, and the potential for recommendation bias due to data sparsity.
[0009] According to a first aspect of the present invention, a multi-behavior recommendation method based on contrastive clustering learning is provided, characterized in that the method comprises:
[0010] The user-item interactions under each behavior are modeled as a bipartite subgraph, where the bipartite subgraph for the k-th behavior is denoted $G_k = (V_k, E_k)$, $V_k$ is the set of subgraph nodes, and $E_k$ is the set of subgraph edges. The edges are given by the real-valued adjacency matrix of the bipartite subgraph, where N is the number of users and M is the number of items. Embeddings, namely user embeddings and item embeddings, are obtained using a lightweight graph convolutional neural network;
[0011] Three tasks are constructed to improve embedding quality: behavior-level embedding, instance-level embedding, and cluster-level embedding. Behavior-level embedding involves using an adaptive parameter learning strategy to obtain the embedding weights of each user's various behaviors and then aggregating the embedding representations of all behaviors for each user through weighted aggregation. Instance-level embedding involves using contrastive learning to optimize user embeddings and item embeddings under different behaviors. Cluster-level embedding involves using contrastive clustering to learn the potential clustering information between user embeddings or item embeddings to obtain new user embedding representations and item embedding representations.
[0012] Combine the three tasks to optimize the embedding.
[0013] Furthermore, embeddings are obtained using a lightweight graph convolutional neural network, including:
[0014] By utilizing multi-layer message propagation and collecting information about connected neighbors, comprehensive node information can be obtained, as defined below:
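[0015] $$e_u^{k,(l+1)} = \sum_{i \in N_u^k} \frac{1}{\sqrt{|N_u^k|}\,\sqrt{|N_i^k|}}\; e_i^{k,(l)}$$
[0016] $$e_i^{k,(l+1)} = \sum_{u \in N_i^k} \frac{1}{\sqrt{|N_i^k|}\,\sqrt{|N_u^k|}}\; e_u^{k,(l)}$$
[0017] where $N_u^k$ and $N_i^k$ denote the neighbors of user u and item i under the k-th behavior, respectively, l denotes the current layer of the graph convolution, $e_u^{k,(l+1)}$ denotes the user embedding at the (l+1)-th layer under the k-th behavior, and $e_i^{k,(l+1)}$ denotes the item embedding at the (l+1)-th layer under the k-th behavior;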
[0018] After obtaining the embedding information for each layer, the information from all layers is aggregated, as defined below:
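[0019] $$e_u^k = \sum_{l=0}^{L} \alpha_l\, e_u^{k,(l)}$$
[0020] $$e_i^k = \sum_{l=0}^{L} \alpha_l\, e_i^{k,(l)}$$
[0021] where $e_u^{k,(l)}$ and $e_i^{k,(l)}$ are the user and item embeddings at the l-th layer under the k-th behavior, $\alpha_l$ denotes the weight of the embedding at the l-th layer and is set to $\alpha_l = 1/(L+1)$, L denotes the number of layers in the graph convolutional network, $e_u^k$ denotes the user embedding, and $e_i^k$ denotes the item embedding.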
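The propagation and layer-aggregation steps above can be sketched in NumPy. This is a minimal illustration that assumes a LightGCN-style symmetric normalization and uniform layer weights 1/(L+1); the function name `lightgcn_embeddings` and its arguments are hypothetical and do not come from the patent text.

```python
import numpy as np

def lightgcn_embeddings(adj, user_emb0, item_emb0, num_layers):
    """Propagate embeddings over one behavior's bipartite subgraph.

    adj: (N, M) binary user-item interaction matrix for one behavior.
    user_emb0, item_emb0: layer-0 embeddings of shape (N, d) and (M, d).
    """
    # Node degrees; isolated nodes get degree 1 to avoid division by zero.
    deg_u = np.maximum(adj.sum(axis=1), 1.0)
    deg_i = np.maximum(adj.sum(axis=0), 1.0)
    # Symmetric normalization 1 / sqrt(|N_u| * |N_i|) on every observed edge.
    norm_adj = adj / np.sqrt(np.outer(deg_u, deg_i))

    users, items = [user_emb0], [item_emb0]
    for _ in range(num_layers):
        next_users = norm_adj @ items[-1]     # users collect from item neighbors
        next_items = norm_adj.T @ users[-1]   # items collect from user neighbors
        users.append(next_users)
        items.append(next_items)

    # Uniform layer weights alpha_l = 1 / (L + 1) (assumed LightGCN setting).
    alpha = 1.0 / (num_layers + 1)
    return alpha * sum(users), alpha * sum(items)
```

Running this once per behavior would give one pair of user and item embedding matrices for each subgraph.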
[0022] Furthermore, an adaptive parameter learning strategy is used to obtain the embedding weights of each user's individual behaviors, and the embedding representations of all user behaviors are aggregated through weighted aggregation, including:
[0023] Based on the distribution of different behavioral data, adaptive parameters are set for learning to obtain the weights of each behavior, defined as follows:
[0024]
[0025] where $\alpha_{uk}$ denotes the weight of the k-th behavior for user u; $w_k$ denotes the importance of the k-th behavior and is the same for all users; $x_{uk}$ denotes the number of items that user u interacted with under the k-th behavior; exp denotes the exponential function; and K denotes the total number of behaviors.
[0026] The final user embedding $e_u$ is obtained as a weighted sum of the user's embeddings under all behaviors, defined as follows:
[0027]
[0028] Where σ represents the nonlinear activation function, and W and b are the weights and biases, respectively.
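The weighting and aggregation described above can be illustrated with a short sketch. The exact weight formula is not legible in this text, so the code assumes a softmax over $w_k \cdot x_{uk}$ across the K behaviors and uses tanh as a stand-in for the unspecified activation σ; both choices, and the function names, are assumptions made for illustration only.

```python
import numpy as np

def behavior_weights(w, x):
    """Per-user behavior weights alpha_uk (assumed softmax form).

    w: (K,) importance of each behavior, shared by all users.
    x: (N, K) number of items each user interacted with under each behavior.
    """
    logits = w[None, :] * x
    logits = logits - logits.max(axis=1, keepdims=True)  # numerical stability
    e = np.exp(logits)
    return e / e.sum(axis=1, keepdims=True)

def aggregate_user_embeddings(alpha, behavior_embs, W, b):
    """Weighted sum over behaviors followed by a nonlinear transform.

    alpha: (N, K); behavior_embs: (K, N, d); W: (d, d); b: (d,).
    """
    weighted = np.einsum("nk,knd->nd", alpha, behavior_embs)
    return np.tanh(weighted @ W + b)  # tanh is an assumed choice for sigma
```

Under this assumed form, a user with many interactions under one behavior gets a larger weight for that behavior, and the weights of each user sum to one.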
[0029] The final item embedding $e_i$ is obtained by concatenating the item's embeddings under all behaviors, defined as follows:
[0030]
[0031] where MLP denotes a multilayer perceptron and Cat denotes the concatenation operation;
[0032] Based on the final user embedding $e_u$ and item embedding $e_i$, optimization uses the Bayesian personalized ranking loss, defined as follows:
[0033] $$L_{BPR} = \sum_{(u,\, i^+,\, i^-) \in D} -\log \sigma\left(e_u^\top e_{i^+} - e_u^\top e_{i^-}\right)$$
[0034] where D is the training dataset, divided into pairs $(u, i^+)$ and $(u, i^-)$ that represent observed interactions and unobserved interactions, respectively; $e_u^\top$ is the transpose of $e_u$; σ denotes the sigmoid activation function; $e_{i^+}$ denotes the positive sample embedding; and $e_{i^-}$ denotes the negative sample embedding.
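As a small illustration of the Bayesian personalized ranking loss, the sketch below computes the standard BPR objective for a batch of (user, positive item, negative item) triples. The function name `bpr_loss` and the batch layout are assumptions; the source does not show how negatives are sampled.

```python
import numpy as np

def bpr_loss(user_emb, pos_item_emb, neg_item_emb):
    """Standard BPR loss summed over a batch of (u, i+, i-) triples.

    All inputs have shape (batch, d); row r of each array belongs to triple r.
    """
    pos_scores = np.sum(user_emb * pos_item_emb, axis=1)  # e_u^T e_{i+}
    neg_scores = np.sum(user_emb * neg_item_emb, axis=1)  # e_u^T e_{i-}
    # -log(sigmoid(x)) = log(1 + exp(-x)), computed stably with logaddexp.
    return np.sum(np.logaddexp(0.0, -(pos_scores - neg_scores)))
```

The loss shrinks as observed items are scored higher than unobserved ones for the same user.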
[0035] Furthermore, contrastive learning is used to optimize user embeddings and item embeddings under different behaviors, including:
[0036] For the k-th auxiliary behavior, calculate the cosine similarity between the auxiliary behavior embedding and the target behavior embedding. The pairwise cosine similarity between users and items is defined as follows:
[0037] $$s\left(e_u^{k_1}, e_{u'}^{k_2}\right) = \frac{\left(e_u^{k_1}\right)^\top e_{u'}^{k_2}}{\left\|e_u^{k_1}\right\|\,\left\|e_{u'}^{k_2}\right\|}$$
[0038] $$s\left(e_i^{k_1}, e_{i'}^{k_2}\right) = \frac{\left(e_i^{k_1}\right)^\top e_{i'}^{k_2}}{\left\|e_i^{k_1}\right\|\,\left\|e_{i'}^{k_2}\right\|}$$
[0039] where $u, u' \in U$, $i, i' \in I$, and $k_1, k_2 \in \{p, k\}$, with p denoting the target behavior. Positive pairs between the target behavior and the k-th auxiliary behavior are selected based on cosine similarity. For user u, with target-behavior embedding $e_u^p$ and k-th auxiliary-behavior embedding $e_u^k$, the pair $(e_u^p, e_u^k)$ is selected as the positive pair and the remaining 2N-2 pairs are negative. For item i, with target-behavior embedding $e_i^p$ and k-th auxiliary-behavior embedding $e_i^k$, the pair $(e_i^p, e_i^k)$ is selected as the positive pair and the remaining 2M-2 pairs are negative. Here $s(e_u^{k_1}, e_{u'}^{k_2})$ denotes the pairwise cosine similarity between users u and u′, $e_u^{k_1}$ denotes the embedding of user u under behavior $k_1$, and $e_{u'}^{k_2}$ denotes the embedding of user u′ under behavior $k_2$; likewise, $s(e_i^{k_1}, e_{i'}^{k_2})$ denotes the pairwise cosine similarity between items i and i′, $e_i^{k_1}$ denotes the embedding of item i under behavior $k_1$, and $e_{i'}^{k_2}$ denotes the embedding of item i′ under behavior $k_2$;
[0040] Considering the target behavior p and the k-th auxiliary behavior, the user loss and the item loss under the target behavior are defined as follows:
[0041]
[0042]
[0043] where τ is the instance-level temperature coefficient that controls softness and exp denotes the exponential function. The remaining terms are the embedding of the user (or item) under auxiliary behavior k, the embedding of the user (or item) under target behavior p, and the pairwise cosine similarity between user (or item) i under target behavior p and user (or item) j′ under auxiliary behavior k;
[0044] The user loss and the item loss under the k-th auxiliary behavior are defined as follows:
[0045]
[0046]
[0047] where the terms are the embedding of the user (or item) under auxiliary behavior k, the embedding of the user (or item) under target behavior p, and the pairwise cosine similarity between user (or item) j′ under target behavior p and user (or item) i under auxiliary behavior k;
[0048] To identify all positive pairs in the dataset, the instance-level loss is calculated over every user and every item, defined as follows:
[0049]
[0050]
[0051] The instance-level embedding loss is obtained by summing all contrastive losses for users and items, and is defined as follows:
[0052]
[0053] where the left-hand side of the equation denotes the instance-level embedding loss.
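The instance-level task can be sketched as follows. The loss equations are not legible in this text, so the code assumes an InfoNCE-style form: for each user (or item), the same entity under the other behavior is the single positive, and the other 2N-2 embeddings across both behaviors are negatives. Function names and the default temperature are hypothetical.

```python
import numpy as np

def cosine_similarity_matrix(a, b):
    """Pairwise cosine similarity between the rows of a and the rows of b."""
    a = a / np.linalg.norm(a, axis=1, keepdims=True)
    b = b / np.linalg.norm(b, axis=1, keepdims=True)
    return a @ b.T

def instance_contrastive_loss(target, auxiliary, tau=0.5):
    """Assumed InfoNCE-style loss between target- and auxiliary-behavior views.

    target, auxiliary: (n, d) embeddings of the same n users (or items).
    """
    n = target.shape[0]
    z = np.concatenate([target, auxiliary], axis=0)       # (2n, d)
    sim = cosine_similarity_matrix(z, z) / tau
    np.fill_diagonal(sim, -np.inf)                        # drop self-pairs
    # Row r pairs with the same entity in the other behavior.
    pos_idx = np.concatenate([np.arange(n, 2 * n), np.arange(0, n)])
    pos = sim[np.arange(2 * n), pos_idx]
    # Stable log-sum-exp over the 2n - 1 remaining candidates per row.
    row_max = sim.max(axis=1, keepdims=True)
    log_denominator = row_max[:, 0] + np.log(np.exp(sim - row_max).sum(axis=1))
    return np.mean(log_denominator - pos)
```

Applying the function once to user embeddings and once to item embeddings, then summing over auxiliary behaviors, would mirror the summation described in the text.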
[0054] Furthermore, contrastive clustering is used to learn latent clustering information between user embeddings or item embeddings to obtain new user embedding representations and item embedding representations, including:
[0055] Both the target behavior and each auxiliary behavior are divided into C clusters; one symbol denotes the c-th cluster of the target behavior and another denotes the c-th cluster of the k-th auxiliary behavior. For user u, the probability that u is assigned to cluster c under the target behavior is calculated using the normalized exponential function (softmax).
[0056] Positive cluster pairs are obtained by selecting users whose target behavior and k-th auxiliary behavior are similar; the other 2C-2 pairs are treated as negative cluster pairs. Cosine distance is used to measure the similarity between clusters, defined as follows:
[0057]
[0058] where $c, c' \in \{1, 2, \ldots, C\}$ and $k_1, k_2 \in \{p, k\}$ denote behaviors drawn from the target behavior p and the k-th auxiliary behavior. The terms are the pairwise cosine similarity between clusters c and c′, the embedding of cluster c under behavior $k_1$, and the embedding of cluster c′ under behavior $k_2$;
[0059] A loss function is used to distinguish the positive cluster pair from all other clusters, defined as follows:
[0060]
[0061] where τ′ is the cluster-level temperature coefficient that controls softness. The three similarity terms are the pairwise cosine similarity between cluster c under target behavior p and cluster c under auxiliary behavior k, the pairwise cosine similarity between clusters c and c′ under target behavior p, and the pairwise cosine similarity between cluster c under target behavior p and cluster c′ under auxiliary behavior k;
[0062] The clustering loss under the k-th auxiliary behavior is defined as follows:
[0063]
[0064] where the three similarity terms are the pairwise cosine similarity between cluster c under target behavior p and cluster c under auxiliary behavior k, the pairwise cosine similarity between clusters c and c′ under auxiliary behavior k, and the pairwise cosine similarity between cluster c′ under target behavior p and cluster c under auxiliary behavior k;
[0065] The cluster-level loss for user embeddings is obtained by traversing all clusters, defined as follows:
[0066]
[0067]
[0068]
[0069] where H(Y) denotes the entropy of the cluster assignment probabilities within each batch of embedded samples, and the remaining term denotes the cluster-level loss of the user embeddings;
[0070] The cluster-level loss of the user embeddings and the cluster-level loss of the item embeddings for each group are combined to obtain the cluster-level embedding loss:
[0071]
[0072] where the newly introduced term denotes the cluster-level loss of the item embeddings.
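A compact sketch of the cluster-level task follows. Because the loss equations are not legible here, the code assumes a common contrastive-clustering formulation: softmax assignments give an (n, C) matrix per behavior, each column is treated as a cluster representation, matching columns across the two behaviors form positive pairs, and the entropy H(Y) of the batch-level assignment distribution is subtracted to discourage collapse into a single cluster. All names and defaults are hypothetical.

```python
import numpy as np

def softmax(logits):
    """Row-wise normalized exponential function."""
    shifted = logits - logits.max(axis=1, keepdims=True)
    e = np.exp(shifted)
    return e / e.sum(axis=1, keepdims=True)

def assignment_entropy(y):
    """Entropy of the batch-level cluster assignment distribution."""
    p = y.sum(axis=0) / y.sum()
    return -np.sum(p * np.log(p + 1e-12))

def cluster_contrastive_loss(logits_target, logits_aux, tau=1.0):
    """Assumed cluster-level loss for one (target, auxiliary) behavior pair.

    logits_target, logits_aux: (n, C) cluster scores of the same n users.
    """
    y_p, y_k = softmax(logits_target), softmax(logits_aux)
    c = y_p.shape[1]
    clusters = np.concatenate([y_p.T, y_k.T], axis=0)      # (2C, n)
    unit = clusters / np.linalg.norm(clusters, axis=1, keepdims=True)
    sim = unit @ unit.T / tau
    np.fill_diagonal(sim, -np.inf)                         # drop self-pairs
    pos_idx = np.concatenate([np.arange(c, 2 * c), np.arange(0, c)])
    pos = sim[np.arange(2 * c), pos_idx]
    row_max = sim.max(axis=1, keepdims=True)
    log_denominator = row_max[:, 0] + np.log(np.exp(sim - row_max).sum(axis=1))
    contrast = np.mean(log_denominator - pos)
    # Subtracting the entropies rewards balanced cluster usage.
    return contrast - (assignment_entropy(y_p) + assignment_entropy(y_k))
```

The same function could be applied to item cluster scores, with the user and item results added to form the combined cluster-level loss.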
[0073] Furthermore, the three tasks are combined to optimize the embedding, including:
[0074] Joint optimization is used to combine the three tasks together, as defined below:
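[0075] $$L = L_{BPR} + \lambda L_{ins} + \mu L_{clu} + \gamma \|\Theta\|_2^2$$
[0076] where $L_{BPR}$ is the Bayesian personalized ranking loss of the behavior-level embedding, $L_{ins}$ is the instance-level embedding loss, $L_{clu}$ is the cluster-level embedding loss, λ and μ are the hyperparameters controlling instance-level embedding and cluster-level embedding, respectively, Θ represents all trainable parameters in the three tasks, and γ represents the regularization hyperparameter.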
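The joint objective can be written as a one-line combination. The sketch below assumes the three task losses are added with weights λ and μ and that regularization is a squared L2 penalty on the trainable parameters scaled by γ; the function name and argument order are hypothetical.

```python
import numpy as np

def joint_loss(bpr, instance, cluster, params, lam, mu, gamma):
    """Weighted sum of the three task losses plus L2 regularization.

    params: iterable of NumPy arrays holding the trainable parameters.
    """
    l2_penalty = sum(np.sum(p ** 2) for p in params)
    return bpr + lam * instance + mu * cluster + gamma * l2_penalty
```

For example, `joint_loss(1.0, 2.0, 3.0, [np.array([1.0, 2.0])], 0.5, 0.1, 0.01)` combines the terms 1.0, 1.0, 0.3, and 0.05.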
[0077] Furthermore, the user-item interactions for each behavior are modeled into bipartite subgraphs, including behaviors such as purchasing, adding to favorites, and browsing.
[0078] Furthermore, in obtaining the embeddings with the lightweight graph convolutional neural network, $E^{(0)}$ is the embedding representation at the first layer of the graph convolutional network, L is the number of layers in the graph convolutional network, $[\alpha_0, \ldots, \alpha_L]$ are the weight coefficients of the layers, E is the representation of the user embeddings or item embeddings, and the propagation matrix is a symmetric normalized matrix.
[0079] According to a second aspect of the present invention, a multi-behavior recommendation device based on contrastive clustering learning is provided, the device comprising:
[0080] The embedding acquisition module is configured to model the user-item interactions under each behavior as a bipartite subgraph, where the bipartite subgraph for the k-th behavior is denoted $G_k = (V_k, E_k)$, $V_k$ is the set of subgraph nodes, and $E_k$ is the set of subgraph edges. The edges are given by the real-valued adjacency matrix of the bipartite subgraph, where N is the number of users and M is the number of items. Embeddings, namely user embeddings and item embeddings, are obtained using a lightweight graph convolutional neural network;
[0081] The multi-task module is configured to construct three tasks to improve embedding quality. The three tasks are behavior-level embedding, instance-level embedding, and cluster-level embedding. The behavior-level embedding includes using an adaptive parameter learning strategy to obtain the embedding weights of each user's various behaviors and aggregating the embedding representations of all behaviors of each user through weighted aggregation. The instance-level embedding includes using contrastive learning to optimize user embeddings and item embeddings under different behaviors. The cluster-level embedding includes using contrastive clustering to learn the potential clustering information between user embeddings or item embeddings to obtain new user embedding representations and item embedding representations.
[0082] The embedding optimization module is configured to combine the three tasks to optimize the embeddings.
[0083] According to a third aspect of the present invention, a readable storage medium is provided, the readable storage medium storing one or more programs, the one or more programs being executable by one or more processors to implement the method described above.
[0084] The present invention has at least the following beneficial effects:
[0085] 1) This invention is simple and effective. Comparison with other methods and testing on known datasets show that it has better performance.
[0086] 2) The importance of different behaviors varies. On e-commerce platforms, for example, purchasing is more important than browsing, and adding items to the cart is more important than browsing. This invention uses an adaptive parameter $\alpha_{uk}$ to learn the weight of each behavior, so that the user embeddings under each behavior are aggregated effectively and a reasonable, comprehensive final user embedding is obtained.
[0087] 3) Contrastive learning maximizes the commonalities of positive pairs of users or items and the differences of negative pairs of users or items. Instance-level contrastive learning considers individual users or items, selecting a sufficient number of negative pairs for each positive pair to obtain local features. Cluster-level contrastive learning uses softmax to divide user or item embeddings into multiple clusters and then performs contrastive learning among these clusters to obtain group features of the data. Together, these two types of contrastive learning mine the latent information in the data more deeply, which can effectively alleviate the data sparsity problem and improve recommendation performance.
Attached Figure Description
[0088] Figure 1 shows a schematic diagram of multiple behaviors;
[0089] Figure 2 shows a framework diagram of MBRCC according to an embodiment of the present invention;
[0090] Figure 3 shows a comparison of MBRCC with other benchmark models according to an embodiment of the present invention;
[0091] Figure 4 shows a performance comparison of MBRCC with MBRCC-cart, MBRCC-view, and MBRCC-click according to an embodiment of the present invention;
[0092] Figure 5 shows the impact of instance-level embedding and cluster-level embedding on the Beibei and Taobao datasets according to an embodiment of the present invention;
[0093] Figure 6 shows a performance comparison of MBRCC on the Beibei dataset under different λ and μ according to an embodiment of the present invention;
[0094] Figure 7 shows a schematic diagram of the impact of batch size on the Beibei dataset according to an embodiment of the present invention.
Detailed Implementation
[0095] To enable those skilled in the art to better understand the technical solutions of the present invention, the embodiments are described in detail below with reference to the accompanying drawings and specific examples, which are not intended to limit the present invention. Where there is no necessary sequential relationship between the steps described herein, the order in which they are described should not be considered a limitation. Those skilled in the art should understand that the order can be adjusted, as long as doing so does not break the logical consistency between the steps and make the overall process impossible to carry out.
[0096] This embodiment provides an application scenario in which multi-behavior data is represented as a heterogeneous graph G = (V, E), where V is the set of all nodes in the graph, including user nodes u ∈ U and item nodes i ∈ I, and U and I are the sets of users and items, respectively. E is the set of edges in the graph, which carry different behavior types. Assuming there are K (K ≥ 2) types of user-item interaction, the set of edges representing user-item interactions under the k-th behavior is denoted $E_k$. Graph G can therefore be divided into K subgraphs by behavior type; for the k-th behavior, the subgraph is defined as $G_k = (V_k, E_k)$. The first behavior (purchase) is taken as the target behavior among the K behaviors, and the remaining K-1 behaviors (browsing, adding to favorites, adding to cart, and so on) are auxiliary behaviors. Multi-behavior recommendation typically optimizes for the target behavior: information from the auxiliary behaviors is used to optimize the embeddings for the target behavior in order to provide a top-N recommendation list.
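To make the subgraph split concrete, the following sketch divides a list of (user, item, behavior) interactions into one binary adjacency matrix per behavior. The triple layout, the behavior labels, and the function name `build_behavior_subgraphs` are illustrative assumptions rather than details taken from the patent.

```python
import numpy as np

def build_behavior_subgraphs(interactions, num_users, num_items, behaviors):
    """Return one (num_users, num_items) adjacency matrix per behavior.

    interactions: iterable of (user_index, item_index, behavior_name) triples.
    behaviors: list of behavior names; the first is treated as the target.
    """
    graphs = {b: np.zeros((num_users, num_items)) for b in behaviors}
    for user, item, behavior in interactions:
        graphs[behavior][user, item] = 1.0  # mark an observed interaction
    return graphs
```

Each returned matrix corresponds to the edge set of one subgraph and can be passed on to the embedding step for that behavior.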
[0097] To address the above application scenarios, this invention proposes a multi-behavior recommendation method based on contrastive clustering learning, also known as MBRCC (Contrastive Clustering Learning for Multi-Behavior Recommendation), to predict the relationship between users and items. In summary, MBRCC first uses a graph convolutional network to learn the user and item embeddings for each behavior, and then designs three types of tasks to improve their embedding quality: a) Behavior-level embedding: An adaptive parameter learning strategy is used to obtain the embedding weights for each user's various behaviors, and then a weighted method is used to aggregate the embedding representations of all behaviors for each user; b) Instance-level embedding: Considering the commonalities and differences between different user and item behaviors, a contrastive learning method is used to optimize the user and item embeddings under different behaviors, ensuring that the commonalities of positive sample pairs are maximized and the differences of negative sample pairs are maximized; c) Cluster-level embedding: A contrastive clustering learning method is used to explore the potential clustering information between user embeddings or item embeddings to obtain more comprehensive user and item embedding representations, alleviating the problem of data sparsity. Finally, these three tasks are jointly learned using a weighted method.
[0098] As shown in Figure 2, MBRCC mainly consists of four parts. In the embedding representation part, the heterogeneous graph G is divided into K subgraphs according to behavior type. Three tasks are then designed to obtain complete embedding information. In behavior-level embedding, the method obtains the weights of the user embeddings on each subgraph and aggregates the per-subgraph embeddings with these weights, which better captures the importance of user behaviors across subgraphs. Item embeddings are simply concatenated, because items exhibit static properties. In instance-level embedding, contrastive learning compares the user and item embeddings under the target behavior (i.e., purchase) with those under the auxiliary behaviors, extracting instance-level commonalities between users and items as local features. In cluster-level embedding, a feature group is formed for each auxiliary behavior and the target behavior is added to each group. Softmax is then used to obtain the feature clusters within each feature group, and contrastive learning is performed between the clusters in each group. Finally, the three tasks are combined to optimize the user and item embeddings.
[0099] This multi-behavior recommendation method based on contrastive clustering learning specifically includes steps 1-5, which are detailed below:
[0100] Step 1. Embedding representation.
[0101] On the basis of subgraph $G_k$, a graph convolutional network is used to learn representations of users and items. The model uses multi-layer message propagation to obtain comprehensive node information by collecting information from connected neighbors, defined as follows:
[0102] $$e_u^{k,(l+1)} = \sum_{i \in N_u^k} \frac{1}{\sqrt{|N_u^k|}\,\sqrt{|N_i^k|}}\; e_i^{k,(l)}$$
[0103] $$e_i^{k,(l+1)} = \sum_{u \in N_i^k} \frac{1}{\sqrt{|N_i^k|}\,\sqrt{|N_u^k|}}\; e_u^{k,(l)}$$
[0104] where $N_u^k$ and $N_i^k$ denote the neighbors of user u and item i under the k-th behavior, respectively, l denotes the current layer of the graph convolution, and $e_u^{k,(l+1)}$ and $e_i^{k,(l+1)}$ denote the user and item embeddings at layer (l+1) under the k-th behavior. After the embedding of each layer is obtained, the information from all layers is aggregated to give a better embedding representation of the nodes, defined as follows:
[0105] $$e_u^k = \sum_{l=0}^{L} \alpha_l\, e_u^{k,(l)}$$
[0106] $$e_i^k = \sum_{l=0}^{L} \alpha_l\, e_i^{k,(l)}$$
[0107] where $\alpha_l$ denotes the weight of the embedding at the l-th layer. It follows the LightGCN setting and is set to $\alpha_l = 1/(L+1)$, where L denotes the number of layers in the graph convolutional network.
[0108] Step 2. Behavior-level embedding.
[0109] Multi-behavioral data can provide more information for user and item embeddings, making the embedded representations of users and items more complete. In behavioral-level embedding, adaptive parameter learning is designed based on the distribution of different behavioral data to obtain the weights of each behavior, defined as follows:
[0110]
[0111] where $\alpha_{uk}$ denotes the weight of the k-th behavior for user u and $w_k$ denotes the importance of the k-th behavior. To simplify the model, $w_k$ is assumed to have the same effect for all users under the k-th behavior. $x_{uk}$ denotes the number of items that user u interacted with under the k-th behavior.
[0112] Then, the embeddings of all user behaviors are summed using a weighted method to obtain the final user embedding, defined as follows:
[0113]
[0114] Where σ represents the nonlinear activation function. W and b are the weights and biases, respectively.
[0115] The final item embedding is obtained by concatenating the item's embeddings under all behaviors, defined as follows:
[0116]
[0117] where MLP denotes a multilayer perceptron and Cat denotes the concatenation operation.
[0118] Through these fusion operations, the final user embedding $e_u$ and item embedding $e_i$ are obtained. To ensure that closer nodes achieve higher similarity, the Bayesian personalized ranking (BPR) loss is used for optimization, defined as follows:
[0119] $$L_{BPR} = \sum_{(u,\, i^+,\, i^-) \in D} -\log \sigma\left(e_u^\top e_{i^+} - e_u^\top e_{i^-}\right)$$
[0120] where D is the training dataset. For ease of BPR calculation, it is divided into pairs $(u, i^+)$ and $(u, i^-)$, which represent observed interactions and unobserved interactions, respectively; σ denotes the sigmoid function and $e_u^\top$ is the transpose of $e_u$.
[0121] Step 3. Instance-level embedding.
[0122] In instance-level embeddings, a contrastive learning approach is used to capture embedding information for users and items. Our goal is to optimize these embeddings by maximizing the similarity between positive pairs of users or positive pairs of items and minimizing the similarity between negative pairs of users or negative pairs of items.
[0123] For a specific k-th auxiliary behavior, calculate the cosine similarity between its embedding and the target behavior embedding. The pairwise cosine similarity between users and items is defined as follows:
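[0124] $$s\left(e_u^{k_1}, e_{u'}^{k_2}\right) = \frac{\left(e_u^{k_1}\right)^\top e_{u'}^{k_2}}{\left\|e_u^{k_1}\right\|\,\left\|e_{u'}^{k_2}\right\|}$$
[0125] $$s\left(e_i^{k_1}, e_{i'}^{k_2}\right) = \frac{\left(e_i^{k_1}\right)^\top e_{i'}^{k_2}}{\left\|e_i^{k_1}\right\|\,\left\|e_{i'}^{k_2}\right\|}$$
[0126] where $u, u' \in U$, $i, i' \in I$, and $k_1, k_2 \in \{p, k\}$, with p denoting the target behavior. Positive pairs between the target behavior and the k-th auxiliary behavior are selected based on cosine similarity. For user u, with target-behavior embedding $e_u^p$ and k-th auxiliary-behavior embedding $e_u^k$, the pair $(e_u^p, e_u^k)$ is selected as the positive pair and the remaining 2N-2 pairs are negative. For item i, with target-behavior embedding $e_i^p$ and k-th auxiliary-behavior embedding $e_i^k$, the pair $(e_i^p, e_i^k)$ is selected as the positive pair and the remaining 2M-2 pairs are negative. To optimize pairwise similarity, the target behavior p and the k-th auxiliary behavior are considered together, and the user and item losses under the target behavior are defined as follows: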
[0127]
[0128]
[0129] Where τ is the instance-level temperature coefficient controlling softness. Similarly, considering the k-th auxiliary behavior, the user and item losses are defined as follows:
[0130]
[0131]
[0132] Our goal is to identify all positive pairs in the dataset. We compute the instance-level loss for each user and item, defined as follows:
[0133]
[0134]
[0135] Instance-level embedding loss is obtained by summing all contrastive losses for users and items, and is defined as follows:
[0136]
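As a non-limiting illustration, an instance-level contrastive loss consistent with the description above (one positive pair and 2N-2 negative pairs per instance, cosine similarity scaled by a temperature τ) may be sketched as follows. The exact normalization, the exclusion of the self-pair from the denominator, and the averaging over 2N terms are assumptions; the names `one_sided_loss` and `instance_loss` are hypothetical.

```python
import math

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm

def one_sided_loss(anchor_view, other_view, idx, tau):
    """-log of the softmax score of the positive pair for instance idx."""
    anchor = anchor_view[idx]
    positive = math.exp(cosine(anchor, other_view[idx]) / tau)
    negatives = 0.0
    for j in range(len(anchor_view)):
        if j == idx:
            continue  # skip the instance itself; 2N-2 negatives remain
        negatives += math.exp(cosine(anchor, anchor_view[j]) / tau)  # same behavior
        negatives += math.exp(cosine(anchor, other_view[j]) / tau)   # other behavior
    return -math.log(positive / (positive + negatives))

def instance_loss(target_view, aux_view, tau=0.5):
    """Average of the target-side and auxiliary-side losses over all instances."""
    n = len(target_view)
    total = sum(
        one_sided_loss(target_view, aux_view, i, tau)
        + one_sided_loss(aux_view, target_view, i, tau)
        for i in range(n)
    )
    return total / (2 * n)

# Toy embeddings for three users under the target and one auxiliary behavior.
target = [[1.0, 0.0], [0.0, 1.0], [-1.0, 0.0]]
aux_aligned = [[0.9, 0.1], [0.1, 0.9], [-0.9, -0.1]]
aux_shuffled = [aux_aligned[1], aux_aligned[2], aux_aligned[0]]
loss_aligned = instance_loss(target, aux_aligned)
loss_shuffled = instance_loss(target, aux_shuffled)
```

Under these assumptions, the same function would be applied once to the user embeddings and once to the item embeddings, and the two results would be summed to give the instance-level embedding loss.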
[0137] Step 4. Cluster-level embedding.
[0138] Cluster-level embeddings follow the concept of "grouping by category." In this approach, user and item embeddings are mapped to a space with a dimension equal to the number of clusters. It allows user and item embeddings to be interpreted as probabilities of belonging to a specific cluster. Then, a contrastive clustering learning method is employed to capture clustering features.
[0139] The target behavior and each auxiliary behavior are each divided into C clusters, so that both the target behavior and the k-th auxiliary behavior have a c-th cluster for every c ∈ {1, 2, …, C}. For user u, the softmax function is used to calculate the probability that u is assigned to cluster c under the target behavior.
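As a non-limiting illustration, the softmax assignment of a user embedding to C clusters may be sketched as follows. The linear projection from the embedding dimension to C logits is an assumption, as are the names `cluster_probabilities` and `projection`.

```python
import math

def softmax(logits):
    shift = max(logits)  # subtract the maximum for numerical stability
    exps = [math.exp(v - shift) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]

def cluster_probabilities(embedding, projection):
    """Map a d-dimensional embedding to C logits (one row per cluster), then softmax."""
    logits = [sum(w * x for w, x in zip(row, embedding)) for row in projection]
    return softmax(logits)

# Toy example with d = 2 and C = 3.
e_u = [1.0, 2.0]
projection = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
probs = cluster_probabilities(e_u, projection)  # probabilities over the 3 clusters
```

Each entry of `probs` can be read as the probability that the user belongs to the corresponding cluster under the given behavior.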
[0140] Similar to instance-level embedding, positive clusters are obtained by selecting users whose target behavior and k-th auxiliary behavior are similar. The other 2C-2 pairs are considered negative clusters. Then, cosine distance is used to measure the similarity between these clusters, defined as follows:
[0141]
[0142] Where c, c′ ∈ {1, 2, …, C} and k1, k2 ∈ {p, k}. A loss function is used to distinguish the positive cluster pair from all other clusters, defined as follows:
[0143]
[0144] In the formula, τ′ is the cluster-level temperature coefficient controlling softness. Similarly, consider the clustering loss under the k-th auxiliary behavior, defined as follows:
[0145]
[0146] The cluster-level loss for user embeddings is obtained by traversing all clusters, and is defined as follows:
[0147]
[0148]
[0149] Where H(Y) represents the entropy of the cluster assignment probabilities within each batch of embedded samples. This term helps prevent most samples from being assigned to the same cluster.
[0150] The cluster-level loss for item embeddings is computed in the same way as for user embeddings. Finally, the cluster-level loss of the user embeddings and the cluster-level loss of the item embeddings are combined to obtain the cluster-level embedding loss:
[0151]
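As a non-limiting illustration, a cluster-level loss consistent with the description above may be sketched as follows. Each cluster is represented by a column of the batch assignment matrix, a contrastive term compares the C clusters under the target behavior with the C clusters under the auxiliary behavior, and the entropy H(Y) of the batch-averaged assignments is subtracted. The exact form of the contrastive term, the sign and weight of the entropy term, and all function names are assumptions.

```python
import math

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / (norm + 1e-12)  # small constant guards against empty clusters

def columns(matrix):
    """Turn an N x C assignment matrix into C cluster vectors of length N."""
    return [list(col) for col in zip(*matrix)]

def contrastive(view_a, view_b, tau):
    """Symmetric contrastive loss over paired vectors (2C-2 negatives per anchor)."""
    def one_side(anchor_view, other_view, idx):
        anchor = anchor_view[idx]
        pos = math.exp(cosine(anchor, other_view[idx]) / tau)
        neg = sum(
            math.exp(cosine(anchor, anchor_view[j]) / tau)
            + math.exp(cosine(anchor, other_view[j]) / tau)
            for j in range(len(anchor_view)) if j != idx
        )
        return -math.log(pos / (pos + neg))
    c = len(view_a)
    return sum(one_side(view_a, view_b, k) + one_side(view_b, view_a, k)
               for k in range(c)) / (2 * c)

def assignment_entropy(prob_rows):
    """Entropy of the batch-averaged cluster assignment distribution."""
    n = len(prob_rows)
    marginal = [sum(col) / n for col in zip(*prob_rows)]
    return -sum(p * math.log(p) for p in marginal if p > 0)

def cluster_loss(probs_target, probs_aux, tau_prime=1.0):
    entropy = assignment_entropy(probs_target) + assignment_entropy(probs_aux)
    return contrastive(columns(probs_target), columns(probs_aux), tau_prime) - entropy

# Toy batch of three users and two clusters.
probs_p = [[0.9, 0.1], [0.2, 0.8], [0.7, 0.3]]
probs_k = [[0.8, 0.2], [0.1, 0.9], [0.6, 0.4]]
probs_k_swapped = [row[::-1] for row in probs_k]  # cluster labels exchanged
loss_matched = cluster_loss(probs_p, probs_k)
loss_swapped = cluster_loss(probs_p, probs_k_swapped)
```

Subtracting the entropy rewards balanced cluster usage, which reflects the stated purpose of preventing most samples from falling into one cluster.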
[0152] Step 5. Joint optimization.
[0153] Joint optimization is used to combine the three tasks together, as defined below:
[0154] $L = L_{BPR} + \lambda L_{ins} + \mu L_{clu} + \gamma \lVert \Theta \rVert_2^2$
[0155] Where λ and μ are the hyperparameters controlling instance-level embedding and cluster-level embedding, respectively. Θ represents all trainable parameters in these tasks, and γ represents the regularization hyperparameter.
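As a non-limiting illustration, the joint objective may be assembled as sketched below. A weighted sum of the three task losses plus an L2 penalty on the trainable parameters is assumed; the function name `joint_loss`, the argument names, and the numeric values are hypothetical.

```python
def joint_loss(l_bpr, l_ins, l_clu, params, lam=0.1, mu=0.1, gamma=0.05):
    """BPR loss + lam * instance-level loss + mu * cluster-level loss + gamma * ||params||^2."""
    l2_penalty = sum(p * p for p in params)
    return l_bpr + lam * l_ins + mu * l_clu + gamma * l2_penalty

# Toy values for the three task losses and a two-parameter model.
total = joint_loss(0.6, 1.2, 0.8, params=[0.5, -0.5], lam=0.5, mu=0.5, gamma=0.1)
```

Setting `lam` or `mu` to zero removes the corresponding task from the objective, which corresponds to the ablation variants examined in the experiments.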
[0156] The embodiments of the present invention are combined below with specific experimental verification to demonstrate the feasibility and advancement of the present invention.
[0157] During experimental verification, the MBRCC model will be evaluated from the following aspects.
[0158] • RQ1: Does MBRCC outperform other advanced single-behavior and multi-behavior recommendation models?
[0159] • RQ2: In MBRCC, does auxiliary behavior information have a significant impact on model performance?
[0160] • RQ3: Do both the instance-level embedding and cluster-level embedding designs help improve model performance?
[0161] • RQ4: Does the choice of hyperparameters affect the performance of the model?
[0162] • RQ5: Does the choice of training batch size affect the model?
[0163] 1. Dataset and Experiment Configuration
[0164] Dataset Description: Three datasets were selected to evaluate the model's performance, as shown in Table 1. These datasets include Beibei, Taobao, and Tmall. The Beibei dataset contains three types of user behavior: purchase, add to cart, and view. The Taobao dataset also contains three user behaviors: purchase, add to cart, and click. The Tmall dataset contains four user behaviors: purchase, add to cart, collect, and click. In these datasets, two interactions were randomly selected as the test and validation samples for each user, and the remaining interactions were used for training.
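As a non-limiting illustration, the per-user hold-out split may be sketched as follows. It is assumed that one of the two randomly selected interactions serves as the test sample and the other as the validation sample, and that users with fewer than three interactions are kept entirely in the training set; the function name `leave_two_out` and the toy data are hypothetical.

```python
import random

def leave_two_out(interactions, seed=0):
    """interactions: dict mapping user -> list of item ids; returns train, valid, test."""
    rng = random.Random(seed)
    train, valid, test = {}, {}, {}
    for user, items in interactions.items():
        if len(items) < 3:
            train[user] = list(items)  # too few interactions to hold out (assumption)
            continue
        held_out = rng.sample(range(len(items)), 2)
        test[user] = items[held_out[0]]
        valid[user] = items[held_out[1]]
        train[user] = [it for idx, it in enumerate(items) if idx not in held_out]
    return train, valid, test

data = {"u1": [1, 2, 3, 4], "u2": [5, 6]}
train, valid, test = leave_two_out(data)
```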
[0165] Parameter Configuration: For model optimization, the Adam optimizer was used, with the learning rate selected from {1e-4, 1e-5}. The batch size for training was selected from {256, 512, 1024, 2048}. The embedding size was selected from {64, 128, 256}. The hyperparameters controlling the instance-level embedding and cluster-level embedding tasks (λ and μ) were selected from {0.05, 0.1, 0.2, 0.5, 1.0}. The regularization parameter was also selected from {0.05, 0.1, 0.2, 0.5, 1.0}. The instance-level temperature parameter τ and the cluster-level temperature parameter τ′ were each selected from {0.1, 0.2, 0.5, 1.0}.
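As a non-limiting illustration, the candidate values listed above may be enumerated as a search space. An exhaustive grid is assumed here only for the sake of the example, since the specification does not state how the candidates were searched; the dictionary keys and the helper `iter_configs` are hypothetical names.

```python
from itertools import product

search_space = {
    "learning_rate": [1e-4, 1e-5],
    "batch_size": [256, 512, 1024, 2048],
    "embedding_size": [64, 128, 256],
    "lam": [0.05, 0.1, 0.2, 0.5, 1.0],    # weight of the instance-level task
    "mu": [0.05, 0.1, 0.2, 0.5, 1.0],     # weight of the cluster-level task
    "gamma": [0.05, 0.1, 0.2, 0.5, 1.0],  # regularization parameter
    "tau": [0.1, 0.2, 0.5, 1.0],          # instance-level temperature
    "tau_prime": [0.1, 0.2, 0.5, 1.0],    # cluster-level temperature
}

def iter_configs(space):
    """Yield one dict per combination of candidate values."""
    keys = list(space)
    for values in product(*(space[k] for k in keys)):
        yield dict(zip(keys, values))

num_configs = sum(1 for _ in iter_configs(search_space))
first_config = next(iter_configs(search_space))
```

The full grid is large, so in practice a subset of the combinations would typically be evaluated on the validation samples.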
[0166] Evaluation Metrics: We selected two representative metrics (Recall@K and NDCG@K) to evaluate the performance of our model. Higher metric values indicate better model performance.
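As a non-limiting illustration, Recall@K and NDCG@K for a single user may be computed as sketched below. Binary relevance and the common logarithmic discount are assumed; the function names and the toy ranking are hypothetical.

```python
import math

def recall_at_k(ranked, relevant, k):
    """Fraction of the relevant items that appear in the top-k of the ranking."""
    if not relevant:
        return 0.0
    hits = sum(1 for item in ranked[:k] if item in relevant)
    return hits / len(relevant)

def ndcg_at_k(ranked, relevant, k):
    """DCG of the top-k with binary gains, normalized by the ideal DCG."""
    dcg = sum(1.0 / math.log2(rank + 2)
              for rank, item in enumerate(ranked[:k]) if item in relevant)
    ideal = sum(1.0 / math.log2(rank + 2) for rank in range(min(k, len(relevant))))
    return dcg / ideal if ideal > 0 else 0.0

ranked_items = [7, 3, 9, 1]  # items sorted by predicted score, best first
held_out = {3}               # the user's held-out test item
recall_2 = recall_at_k(ranked_items, held_out, 2)
ndcg_2 = ndcg_at_k(ranked_items, held_out, 2)
```

In this toy ranking the held-out item sits at position two, so it counts as a hit for K = 2 but receives a discounted NDCG contribution.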
[0167] Environment Setup: Our model was implemented in a PyTorch framework environment, using Python 3.9.16 and PyTorch 1.13.1. Hardware included an Intel(R) Xeon(R) Silver 4210R @ 2.40GHz CPU, an NVIDIA Quadro RTX 5000 16G GPU, and 64GB of RAM.
[0168] Table 1 Dataset Table
[0169]
Dataset | Number of users | Number of items | Number of interactions | Behavior types
Beibei | 21716 | 7977 | 3.36×10^6 | View, Cart, Purchase
Taobao | 48749 | 39493 | 2.0×10^6 | Click, Cart, Purchase
Tmall | 41738 | 11953 | 2.29×10^6 | Click, Cart, Collect, Purchase
[0170] 2. Results Analysis
[0171] Our model MBRCC was compared with eight state-of-the-art models, including four single-behavior models and four multi-behavior models.
[0172] The single-behavior models are:
[0173] • NCF: A generalized collaborative filtering framework that interprets matrix factorization as neural collaborative filtering;
[0174] • LightGCN: A collaborative filtering recommendation model that removes activation functions and transformation matrices from graph neural networks;
[0175] • IMP-GCN: A graph neural network recommendation model that incorporates interest-aware subgraphs into message passing;
[0176] • ENMF: An efficient non-sampling neural matrix factorization recommendation model.
[0177] The multi-behavior models are:
[0178] • MB-GMN: A multi-behavior recommendation model that automatically extracts behavioral heterogeneity and interaction diversity;
[0179] • EHCF: An efficient heterogeneous collaborative filtering model based on non-sampling transfer learning;
[0180] • MB-GCN: A multi-behavior recommendation model that learns behavior strength and captures behavior semantics based on the propagation layers of users and items, respectively;
[0181] • S-MBRec: A self-supervised graph collaborative filtering model for multi-behavior recommendation, which uses a star-style contrastive learning task to capture the commonalities between the target behavior and auxiliary behaviors.
[0182] The experimental results are analyzed in detail below:
[0183] Multiple experiments (RQ1) were conducted on the three datasets, and the results are shown in Table 2. MBRCC outperforms the baseline models, with improvements exceeding 20% on the Beibei, Taobao, and Tmall datasets. Notably, on the Recall@10 metric, MBRCC improves on the best-performing baseline model by 72.88%, 69.62%, and 27.67% on the three datasets, respectively. This improvement can be attributed to MBRCC's ability to capture embedding representations of users and items through three tasks: behavior-level embedding, instance-level embedding, and cluster-level embedding. Behavior-level embedding reveals the similarity between the target and auxiliary behaviors, effectively integrating auxiliary behavior information into the feature representation of the target behavior. Instance-level and cluster-level embeddings enable the model to extract latent information from the target and auxiliary behaviors at the instance and cluster levels. This information is then fed into a graph convolutional network to obtain more comprehensive user and item embeddings. In addition, instance-level and cluster-level embeddings together form a dual contrastive learning mechanism, which helps alleviate the problem of sparse supervision signals in the datasets. Table 2 also shows that multi-behavior recommendation models generally outperform single-behavior recommendation models, which highlights the importance of auxiliary information in improving target behavior prediction. The performance of each model on the three datasets is shown in Figure 3. MBRCC achieves a significant improvement in Recall@10 on the Beibei and Taobao datasets, far outperforming the other baseline models. This is likely because MBRCC uses instance-level and cluster-level contrastive learning to acquire instance-level and group-level information about users and items, which helps capture the commonalities and differences between the target and auxiliary behaviors. Furthermore, MBRCC prioritizes higher-order information about users and items over lower-order information. Therefore, MBRCC's advantage is more pronounced on Recall@10 than on Recall@40.
[0184] Table 2. Overall performance comparison of different models on the Beibei, Taobao, and Tmall datasets.
[0185]
[0186]
[0187]
[0188] 3. MBRCC Model Analysis
[0189] A) Ablation experiment.
[0190] 1) Impact of Auxiliary Behaviors (RQ2): To verify the importance of auxiliary behavior information in improving the performance of target behavior prediction, we generated three ablation models and compared them with MBRCC:
[0191] • MBRCC-cart. The "add to cart" behavior has been removed from the Beibei and Taobao datasets.
[0192] • MBRCC-view. Browsing behavior has been removed from the Beibei dataset.
[0193] • MBRCC-click. The click behavior has been removed from the Taobao dataset.
[0194] The experimental results are shown in Figure 4. On the Beibei and Taobao datasets, MBRCC significantly outperforms MBRCC-cart, MBRCC-view, and MBRCC-click, which indicates that auxiliary behaviors improve the model's predictive performance. Furthermore, MBRCC-view and MBRCC-click significantly outperform MBRCC-cart. This suggests that different auxiliary behaviors have different impacts on predictive performance, with the add-to-cart behavior being more important than the view and click behaviors.
[0195] 2) Impact of Instance-Level Embeddings and Cluster-Level Embeddings (RQ3): Because the model is based on multi-behavior data, behavior-level embeddings are fundamental to the method. Therefore, this experiment primarily investigates the impact of instance-level embeddings and cluster-level embeddings. We generated three ablation models and compared them with MBRCC on the Beibei and Taobao datasets:
[0196] • MBRCC-instance. Instance-level embeddings have been removed in MBRCC.
[0197] • MBRCC-cluster. Cluster-level embeddings have been removed from MBRCC.
[0198] • MBRCC-both. Instance-level embedding and cluster-level embedding have been removed in MBRCC.
[0199] The results are shown in Figure 5. MBRCC outperforms MBRCC-instance, MBRCC-cluster, and MBRCC-both in prediction performance. Compared to MBRCC, the prediction performance of MBRCC-instance and MBRCC-cluster is significantly lower, indicating that the instance-level embedding and cluster-level embedding designs greatly improve the model's prediction performance. Based on the above ablation experiments, it can be concluded that the various kinds of auxiliary behavior information effectively improve recommendation performance. Furthermore, the instance-level embedding and cluster-level embedding tasks are essential, as they capture more instance and cluster information to alleviate data sparsity. In addition, the ablation results on the Beibei and Taobao datasets are similar, which shows that the model generalizes well across datasets.
[0200] B) Hyperparameter Analysis (RQ4)
[0201] To verify the impact of hyperparameters λ and μ on MBRCC performance, hyperparameter analysis experiments were conducted on the Beibei dataset. The values of λ and μ were selected from {0.05, 0.1, 0.2, 0.5, 1.0}.
[0202] The results are shown in Figure 6 and Table 3. From Figure 6, it can be seen that: 1) When both λ and μ are 1.0, the Recall@10 value is the best. This indicates that instance-level embeddings and cluster-level embeddings play an important role on the Beibei dataset. 2) When λ and μ are equal or close, MBRCC performs well. This is consistent with the findings for RQ3 and indicates that instance-level embeddings and cluster-level embeddings are almost equally important for MBRCC. Furthermore, Table 3 shows that when λ is 1.0, the performance of MBRCC increases with increasing μ. This further supports the conclusion that, on the Beibei dataset, instance-level embeddings and cluster-level embeddings are equally important for MBRCC.
[0203] Table 3. Performance comparison of different λ and μ values in NDCG@10
[0204]
[0205]
[0206] C) The impact of batch size (RQ5)
[0207] To examine the impact of batch size on prediction performance, batch size experiments were conducted on the Beibei dataset. Batch sizes were selected from {256, 512, 1024, 2048}. The experimental results are shown in Figure 7 and Table 4. MBRCC performs better with a batch size of 256 than with batch sizes of 512, 1024, and 2048, and its performance is similar across batch sizes of 512, 1024, and 2048. This may be because MBRCC focuses on acquiring embedding information for specific users or items, so a smaller batch size helps the model generalize better to new users. In addition, the impact of the number of propagation layers L in the graph convolutional network on MBRCC performance was tested. Table 4 shows that MBRCC performs best when L=1, and its performance gradually decreases as L increases. This may be because, in the contrastive learning of instance-level and cluster-level embeddings, the model already learns sufficient information from the large number of negative pairs available for each positive pair. Performance may therefore decrease as L increases, since higher-order neighbors may introduce additional noise.
[0208] Table 4 Performance at different propagation depths L
[0209]
[0210]
[0211] Furthermore, although exemplary embodiments have been described herein, their scope includes any and all embodiments based on the invention that have equivalent elements, modifications, omissions, combinations (e.g., schemes involving intersections of various embodiments), adaptations, or alterations. Elements in the claims will be interpreted broadly based on the language used in the claims and are not limited to the examples described in this specification or during the implementation of this application, and such examples will be interpreted as non-exclusive. Therefore, this specification and examples are intended to be considered illustrative only, and the true scope and spirit are indicated by the following claims and the full scope of their equivalents.
[0212] The above description is intended to be illustrative and not restrictive. For example, the above examples (or one or more of them) can be used in combination with each other. Other embodiments can be used by those skilled in the art when reading the above description. Furthermore, in the above detailed description, various features may be grouped together to simplify the invention. This should not be construed as an intention that a feature of an unclaimed invention is necessary for any claim. Rather, the subject matter of the invention may be less than all the features of a particular embodiment of the invention. Thus, the following claims are incorporated herein by reference as examples or embodiments, wherein each claim is an independent, separate embodiment, and these embodiments are contemplated to be combined with each other in various combinations or arrangements. The scope of the invention should be determined by reference to the appended claims and the full scope of their equivalents.
Claims
1. A multi-behavior recommendation method based on contrastive clustering learning, characterized in that the method includes: modeling the user-item interactions of each behavior as a bipartite subgraph, wherein the first part is a user set and the second part is an item set, the k-th behavior bipartite subgraph is represented by its subgraph nodes and subgraph edges, the adjacency matrix of the bipartite subgraph is a real-valued matrix whose dimensions are given by the number of users N and the number of items M; obtaining embeddings by using a lightweight graph convolutional neural network, wherein the embeddings are user embeddings and item embeddings; constructing three tasks to improve embedding quality, namely behavior-level embedding, instance-level embedding, and cluster-level embedding, wherein behavior-level embedding involves using an adaptive parameter learning strategy to obtain the embedding weights of each user's various behaviors and then aggregating the embedding representations of all behaviors of each user through weighted aggregation, instance-level embedding involves using contrastive learning to optimize user embeddings and item embeddings under different behaviors, and cluster-level embedding involves using contrastive clustering to learn the latent clustering information between user embeddings or item embeddings to obtain new user embedding representations and item embedding representations; and combining the three tasks to optimize the embeddings; wherein optimizing user embeddings and item embeddings under different behaviors using contrastive learning includes: for the k-th auxiliary behavior, calculating the cosine similarity between the auxiliary behavior embedding and the target behavior embedding, the pairwise cosine similarities of users and of items being defined as follows: where u, u′ ∈ U, i, i′ ∈ I, and k1, k2 ∈ {p, k}; selecting, based on cosine similarity, the positive pair between the target behavior p and the k-th auxiliary behavior, wherein for user u the embedding under the target behavior and the embedding under the k-th auxiliary behavior are selected as the positive pair and the remaining 2N-2 pairs are negative pairs, and for the i-th item the embedding under the target behavior and the embedding under the k-th auxiliary behavior are selected as the positive pair and the remaining 2M-2 pairs are negative pairs; the user similarity denotes the pairwise cosine similarity between users u and u′, computed from the embedding of user u under behavior k1 and the embedding of user u′ under behavior k2, and the item similarity denotes the pairwise cosine similarity between items i and i′, computed from the embedding of item i under behavior k1 and the embedding of item i′ under behavior k2; considering the target behavior p and the k-th auxiliary behavior, the user loss and the item loss under the target behavior are defined as follows: where τ is the instance-level temperature coefficient controlling softness, exp denotes the exponential function, and the remaining terms denote the item embedding under the auxiliary behavior k, the item embedding under the target behavior p, and the pairwise cosine similarity between items under the target behavior p and items under the auxiliary behavior k; considering the k-th auxiliary behavior, the user loss and the item loss are defined as follows: where the similarity term denotes the pairwise cosine similarity between items under the target behavior p and items under the auxiliary behavior k; with the goal of identifying all positive pairs in the dataset, the instance-level loss for each user and each item is calculated as follows: the instance-level embedding loss is obtained by summing all contrastive losses for users and items, and is defined as follows: ; wherein, in modeling the user-item interactions of each behavior as a bipartite subgraph, the behaviors include purchasing, adding to favorites, and browsing.
2. The method according to claim 1, characterized in that obtaining embeddings using a lightweight graph convolutional neural network includes: obtaining comprehensive node information by using multi-layer message propagation to collect information from connected neighbors, as defined below: where the neighbor sets are those of user u and item i, respectively, the layer-wise terms denote the user embedding and the item embedding at the l-th layer under the k-th behavior, and the remaining terms denote the user embedding and the item embedding; and, after obtaining the embedding of each layer, aggregating the information from all layers, as defined below: where the coefficient denotes the weight of the l-th layer embedding, which is set to a value determined by L, and L denotes the number of layers in the graph convolutional network.
3. The method according to claim 2, characterized in that using an adaptive parameter learning strategy to obtain the embedding weights of each user's behaviors and aggregating the embedding representations of all behaviors of each user through weighted aggregation includes: setting adaptive parameters to be learned according to the distribution of the different behavior data, so as to obtain the weight of each behavior, defined as follows: where the weight is that of the k-th behavior for user u, the significance term denotes the importance of the k-th behavior and, for a given k-th behavior, is the same for all users, the count term denotes the number of items that user u has interacted with under the k-th behavior, exp denotes the exponential function, and the remaining term denotes the total number of behaviors; obtaining the final user embedding by a weighted sum of the user's embeddings under all behaviors, defined as follows: where the activation term represents a non-linear activation function and the two parameters are the weight and the bias; concatenating the item's embeddings under all behaviors to obtain the final item embedding, defined as follows: where MLP denotes a multilayer perceptron and Cat denotes the concatenation operation; and, based on the final user embedding e_u and item embedding e_i, optimizing with the Bayesian personalized ranking loss, defined as follows: where D is the training dataset, including (u, i^+) and (u, i^-), which represent the observed interactions and the unobserved interactions, respectively, and the remaining terms denote the transpose, the sigmoid activation function, the positive sample embedding, and the negative sample embedding.
4. The method according to claim 3, characterized in that using contrastive clustering to learn latent clustering information between user embeddings or item embeddings to obtain new user embedding representations and item embedding representations includes: dividing the target behavior and each auxiliary behavior into C classes, with one term representing the c-th cluster of the target behavior and another representing the c-th cluster of the k-th auxiliary behavior; calculating, using a normalized exponential function (softmax), the probability that user u is assigned to cluster c under the target behavior; obtaining positive clusters by selecting users whose target behavior and k-th auxiliary behavior are similar, the other 2C-2 pairs being regarded as negative clusters, and measuring the similarity between clusters using cosine distance, as defined below: where c, c′ ∈ {1, 2, …, C} and k1, k2 ∈ {p, k}, with p denoting the target behavior and k the k-th auxiliary behavior, the similarity term denotes the pairwise cosine similarity between clusters c and c′, and the embedding terms denote the embedding of cluster c under behavior k1 and the embedding of cluster c′ under behavior k2; using a loss function to distinguish the positive cluster pair from all other clusters, defined as follows: where τ′ is the cluster-level temperature coefficient controlling softness, and the similarity terms denote, respectively, the pairwise cosine similarity between a cluster under the target behavior p and a cluster under the auxiliary behavior k, between two clusters under the target behavior p, and between a cluster under the target behavior p and a cluster under the auxiliary behavior k; considering the clustering loss under the k-th auxiliary behavior, defined as follows: where the similarity terms denote, respectively, the pairwise cosine similarity between a cluster under the target behavior p and a cluster under the auxiliary behavior k, between two clusters under the auxiliary behavior k, and between a cluster under the target behavior p and a cluster under the auxiliary behavior k; obtaining the cluster-level loss for user embeddings by traversing all clusters, defined as follows: where H(Y) denotes the entropy of the cluster assignment probabilities within each batch of embedded samples; and combining the cluster-level loss of the user embeddings and the cluster-level loss of the item embeddings to obtain the cluster-level embedding loss: where the remaining term denotes the cluster-level loss of the item embeddings.
5. The method according to claim 4, characterized in that combining the three tasks to optimize the embeddings includes: combining the three tasks through joint optimization, as defined below: where λ and μ are the hyperparameters that control instance-level embedding and cluster-level embedding, respectively, Θ represents all trainable parameters across the three tasks, and γ represents the regularization hyperparameter.
6. The method according to claim 5, characterized in that embeddings are obtained using a lightweight graph convolutional neural network as follows: where the terms denote, respectively, the embedding representation at the l-th layer of the graph convolutional network, the number of layers L of the graph convolutional network, the weight coefficient of each layer of the graph convolutional network, the user embedding or item embedding representation, and the symmetric normalized matrix.
7. A multi-behavior recommendation device based on contrastive clustering learning, used to implement the method according to any one of claims 1 to 6, characterized in that the device includes: an embedding acquisition module configured to model the user-item interactions of each behavior as a bipartite subgraph, wherein the k-th behavior bipartite subgraph is represented by its subgraph nodes and subgraph edges, and the adjacency matrix of the bipartite subgraph is a real-valued matrix whose dimensions are given by the number of users N and the number of items M, and to obtain embeddings using a lightweight graph convolutional neural network, the embeddings including user embeddings and item embeddings; a multi-task module configured to construct three tasks to improve embedding quality, the three tasks being behavior-level embedding, instance-level embedding, and cluster-level embedding, wherein the behavior-level embedding includes using an adaptive parameter learning strategy to obtain the embedding weights of each user's various behaviors and aggregating the embedding representations of all behaviors of each user through weighted aggregation, the instance-level embedding includes using contrastive learning to optimize user embeddings and item embeddings under different behaviors, and the cluster-level embedding includes using contrastive clustering to learn the latent clustering information between user embeddings or item embeddings to obtain new user embedding representations and item embedding representations; and an embedding optimization module configured to combine the three tasks to optimize the embeddings.
8. A non-transitory computer-readable storage medium storing instructions that, when executed by a processor, perform the method according to any one of claims 1 to 6.