A resource recommendation method, apparatus, device, medium and program product
By analyzing the interaction sequence features of resources and objects through meta-learning and attention mechanisms, the problem of insufficient data in resource recommendation platforms is solved, enabling more accurate resource recommendations and improving recommendation performance.
Patent Information
- Authority / Receiving Office
- CN · China
- Patent Type
- Patents(China)
- Current Assignee / Owner
- TENCENT TECHNOLOGY (SHENZHEN) CO LTD
- Filing Date
- 2022-11-02
- Publication Date
- 2026-06-19
AI Technical Summary
Existing resource recommendation platforms suffer from poor recommendation performance due to a lack of comprehensive data, failing to accurately recommend resources of interest to users.
By employing meta-learning and attention mechanisms, the interaction preferences of the target object are captured by acquiring the historical interaction sequence features of candidate resources and target objects, and resources that match it are recommended.
It improves the accuracy and efficiency of resource recommendations, ensuring that resources of interest are pushed to the target audience, thus enhancing the resource recommendation experience and click-through rate.
Figure CN116992123B_ABST
Description
Technical Field
[0001] This application relates to the field of computer technology, specifically to the field of machine learning, and in particular to a resource recommendation method, a resource recommendation device, a computer device, a computer-readable storage medium, and a computer program product.
Background Technology
[0002] With the rapid development of internet technology, various internet resources (or simply resources, such as audio, video, or virtual items) are widely disseminated on the internet thanks to its speed and simplicity.
[0003] Currently, resource recommendation platforms recommend resources to users based on data from both the resource side (information about the resource itself, such as its category, name, or publication time) and the user side (information about the user, such as their nickname, gender, or address). However, resource recommendation platforms do not possess rich data on all resources or users, leading to poor recommendation performance. Therefore, how to recommend resources of interest to users has become a crucial issue in the field of resource recommendation.
Summary of the Invention
[0004] This application provides a resource recommendation method, apparatus, device, medium, and program product that can efficiently capture the interaction preferences of target objects and more accurately recommend resources of interest to target objects.
[0005] In one aspect, embodiments of this application provide a resource recommendation method, which includes:
[0006] Obtain a set of candidate resources to be recommended. The set contains M candidate resources and the sequence features of each candidate resource; the j-th candidate resource corresponds to Q_j historical interaction sequences, and the sequence features of the j-th candidate resource are obtained, based on meta-learning and an attention mechanism, by encoding the interaction sequence pairs formed from the Q_j historical interaction sequences and the j-th candidate resource; the sequence features of the j-th candidate resource characterize the sequential dependency, in order of triggering, between the j-th candidate resource and the historical resources contained in the Q_j historical interaction sequences; M, j, and Q_j are all positive integers, and j≤M;
[0007] Obtain the encoded features of the target historical interaction sequence of the target object to be recommended. The encoded features of the target historical interaction sequence characterize the degree of attention the target object pays to the sequential dependencies between the multiple historical resources contained in the target historical interaction sequence;
[0008] Determine, from the candidate resource set, a target resource whose sequence features match the encoded features of the target object's target historical interaction sequence, and recommend the target resource to the target object.
[0009] In another aspect, embodiments of this application provide a resource recommendation device, which includes:
[0010] The acquisition unit is used to acquire a set of candidate resources to be recommended. The set contains M candidate resources and the sequence features of each candidate resource; the j-th candidate resource corresponds to Q_j historical interaction sequences, and the sequence features of the j-th candidate resource are obtained, based on meta-learning and an attention mechanism, by encoding the interaction sequence pairs formed from the Q_j historical interaction sequences and the j-th candidate resource; the sequence features of the j-th candidate resource characterize the sequential dependency, in order of triggering, between the j-th candidate resource and the historical resources contained in the Q_j historical interaction sequences; M, j, and Q_j are all positive integers, and j≤M;
[0011] The processing unit is used to obtain the encoded features of the target historical interaction sequence of the target object to be recommended. The encoded features of the target historical interaction sequence characterize the degree of attention the target object pays to the sequential dependencies between the multiple historical resources contained in the target historical interaction sequence.
[0012] The processing unit is also used to determine, from the candidate resource set, a target resource whose sequence features match the encoded features of the target object's target historical interaction sequence, and to recommend the target resource to the target object.
[0013] In one implementation, the acquisition unit, when acquiring the set of candidate resources to be recommended, is specifically used for:
[0014] Obtain M candidate resources to be recommended, and collect multiple historical interaction sequences for each candidate resource; any historical interaction sequence is obtained based on the operation of multiple historical resources triggered by the same object in a historical time period.
[0015] Based on multiple historical interaction sequences corresponding to each candidate resource, each candidate resource is encoded to obtain the sequence features of each candidate resource;
[0016] The M candidate resources to be recommended and the sequence features of each candidate resource are combined to form a set of candidate resources to be recommended.
[0017] In one implementation, the processing unit, when encoding each candidate resource based on multiple historical interaction sequences corresponding to each candidate resource to obtain the sequence features of each candidate resource, specifically performs the following:
[0018] According to the filling rules of meta-learning, each candidate resource is filled into multiple corresponding historical interaction sequences, resulting in multiple historical interaction sequence pairs corresponding to each candidate resource; a historical interaction sequence pair includes a historical interaction sequence and a candidate resource.
[0019] Encode multiple historical interaction sequence pairs corresponding to each candidate resource to obtain the sequence features of each candidate resource.
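The pairing step described above can be illustrated with a minimal sketch. It assumes that the "filling rule" simply attaches the candidate resource to each of its historical interaction sequences; the source does not spell out the rule at this point, and the names `build_sequence_pairs`, `Sequence`, and `SequencePair` as well as the string resource IDs are hypothetical.

```python
from typing import List, Tuple

# Hypothetical representation: a resource is identified by a string ID.
Sequence = List[str]
SequencePair = Tuple[Sequence, str]  # (historical interaction sequence, candidate resource)


def build_sequence_pairs(candidate: str, histories: List[Sequence]) -> List[SequencePair]:
    """Pair one candidate resource with each of its historical interaction sequences.

    Assumption: the filling rule attaches the candidate to every sequence
    unchanged; a real system may instead insert it at a specific position.
    """
    return [(list(history), candidate) for history in histories]


# One candidate with two historical interaction sequences yields two pairs.
pairs = build_sequence_pairs("video_9", [["video_1", "video_2"], ["video_3"]])
```

Each resulting pair would then be passed to the encoder, so a candidate with Q_j sequences produces Q_j encoded features.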
[0020] In one implementation, the processing unit, when encoding the multiple historical interaction sequence pairs corresponding to the j-th candidate resource to obtain the sequence features of the j-th candidate resource, specifically performs the following:
[0021] For the j-th candidate resource, each historical interaction sequence pair is encoded to obtain the encoded features of each historical interaction sequence pair.
[0022] The encoded features of each historical interaction sequence pair are aggregated to obtain the sequence features of the j-th candidate resource.
[0023] In one implementation, the processing unit, when encoding each historical interaction sequence pair among multiple historical interaction sequence pairs of the j-th candidate resource to obtain the encoded features of each historical interaction sequence pair, specifically performs the following:
[0024] For multiple historical interaction sequence pairs of the j-th candidate resource, short-term preference modeling is performed on each historical interaction sequence pair to obtain the short-term encoding features of each historical interaction sequence pair.
[0025] For multiple historical interaction sequence pairs of the j-th candidate resource, long-term preference modeling is performed on each historical interaction sequence pair to obtain the long-term encoding features of each historical interaction sequence pair.
[0026] Based on the attention mechanism, the short-term encoding features and the corresponding long-term encoding features of each historical interaction sequence pair are fused to obtain the encoded features of each historical interaction sequence pair.
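The attention-based fusion of short-term and long-term features might look like the following sketch. The source only states that fusion is attention-based; the dot-product scoring against a `query` vector, the softmax weighting, and the function names are assumptions made for illustration, not the scheme claimed in this application.

```python
import math
from typing import List

Vector = List[float]


def softmax(scores: List[float]) -> List[float]:
    """Numerically stable softmax over a list of scores."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def dot(a: Vector, b: Vector) -> float:
    return sum(x * y for x, y in zip(a, b))


def fuse_preferences(short_term: Vector, long_term: Vector, query: Vector) -> Vector:
    """Fuse two encodings using attention weights derived from a query vector.

    Assumption: the weights are a softmax over dot-product scores; in a
    trained model the query would be a learned parameter.
    """
    weights = softmax([dot(query, short_term), dot(query, long_term)])
    return [weights[0] * s + weights[1] * l for s, l in zip(short_term, long_term)]


# A zero query scores both encodings equally, so the result is their average.
fused = fuse_preferences([1.0, 0.0], [0.0, 1.0], query=[0.0, 0.0])
```

A query that scores the short-term encoding higher shifts the fused vector toward recent behaviour, which is the intuition behind weighting the two preference horizons.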
[0027] In one implementation, the processing unit, when aggregating the encoded features of each historical interaction sequence pair to obtain the sequence features of the j-th candidate resource, specifically performs the following:
[0028] The encoded features of multiple historical interaction sequence pairs corresponding to the j-th candidate resource are subjected to mean pooling to obtain pooled features.
[0029] The pooling feature is used as the sequence feature of the j-th candidate resource.
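Mean pooling over the encoded features of the sequence pairs can be sketched as follows. The function name `mean_pool` and the use of plain lists of floats for feature vectors are illustrative assumptions.

```python
from typing import List


def mean_pool(pair_encodings: List[List[float]]) -> List[float]:
    """Average equally sized feature vectors element by element."""
    if not pair_encodings:
        raise ValueError("at least one encoded sequence pair is required")
    count = len(pair_encodings)
    # zip(*...) groups the values of each feature dimension together.
    return [sum(column) / count for column in zip(*pair_encodings)]


# Two encoded sequence pairs of one candidate resource.
sequence_feature = mean_pool([[1.0, 2.0], [3.0, 4.0]])
```

Here `sequence_feature` is `[2.0, 3.0]`, a single vector representing the candidate resource regardless of how many sequence pairs it has.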
[0030] In one implementation, the processing unit, when acquiring the encoded features of the target historical interaction sequence of the target object to be recommended, specifically performs the following:
[0031] Obtain the target historical interaction sequence of the target object to be recommended;
[0032] Short-term preference modeling is performed on the target historical interaction sequence to obtain the short-term encoding features of the target historical interaction sequence; and long-term preference modeling is performed on the target historical interaction sequence to obtain the long-term encoding features of the target historical interaction sequence.
[0033] Based on the attention mechanism, the short-term and long-term encoding features of the target historical interaction sequence are fused to obtain the encoded features of the target historical interaction sequence.
[0034] In one implementation, the processing unit, when determining a target resource whose sequence features match the encoded features of the target object's target historical interaction sequence from the candidate resource set, specifically performs the following:
[0035] Similarity is calculated between the encoded features of the target historical interaction sequence and the sequence features of each candidate resource in the candidate resource set, to obtain M similarity calculation results.
[0036] From the M similarity calculation results, identify one or more target similarity calculation results that are greater than a similarity threshold;
[0037] The sequence features corresponding to each target similarity calculation result are determined as target sequence features that match the encoded features of the target historical interaction sequence;
[0038] The candidate resources corresponding to the target sequence features are used as target resources.
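The threshold-based matching described above can be sketched as follows. The source does not name the similarity measure; cosine similarity is assumed here, and `select_target_resources`, the resource IDs, and the threshold value are hypothetical.

```python
import math
from typing import Dict, List


def cosine_similarity(a: List[float], b: List[float]) -> float:
    """Cosine similarity; returns 0.0 if either vector has zero length."""
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return sum(x * y for x, y in zip(a, b)) / (norm_a * norm_b)


def select_target_resources(
    target_encoding: List[float],
    candidate_features: Dict[str, List[float]],
    threshold: float,
) -> List[str]:
    """Return the candidates whose similarity to the target exceeds the threshold."""
    scores = {
        resource_id: cosine_similarity(target_encoding, feature)
        for resource_id, feature in candidate_features.items()
    }  # M similarity calculation results, one per candidate
    return [resource_id for resource_id, score in scores.items() if score > threshold]


candidates = {"video_a": [1.0, 0.0], "video_b": [0.0, 1.0], "video_c": [1.0, 1.0]}
targets = select_target_resources([1.0, 0.1], candidates, threshold=0.7)
```

With these example vectors, `video_a` and `video_c` exceed the threshold while `video_b` does not. Because the candidate sequence features are computed ahead of time, only the M similarity calculations remain at request time.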
[0039] In one implementation, the resource recommendation method is executed by calling a trained sequence model. The training process of the sequence model includes:
[0040] Obtain a training task set, which contains multiple training tasks; each training task contains the sample pairs of multiple training resources, and the sample pair of each training resource contains multiple training interaction sequence pairs and one test interaction sequence pair; each training interaction sequence pair and each test interaction sequence pair contains a historical interaction sequence and the corresponding training resource.
[0041] Select the T_i-th training task from the training task set, where i is a positive integer;
[0042] Call the sequence model to encode the sample pair of each training resource contained in the T_i-th training task, to obtain the sequence feature set and the encoded feature set of the T_i-th training task; the sequence feature set contains the sequence features of each training resource in the T_i-th training task, and the encoded feature set contains the encoded features of each training resource in the T_i-th training task;
[0043] Based on the sequence feature set, the encoded feature set, and the multiple test interaction sequence pairs corresponding to the T_i-th training task, calculate loss information; and update the model parameters of the sequence model in the direction that reduces the loss information, to obtain the updated sequence model.
[0044] Reselect the T_{i+1}-th training task from the training task set, and use the T_{i+1}-th training task to iteratively train the updated sequence model until the sequence model tends to stabilize.
[0045] In one implementation, the process of constructing the T_i-th training task includes:
[0046] For the T_i-th training task, collect multiple training resources, and collect K+1 historical interaction sequences for each training resource; the K+1 historical interaction sequences belong to different objects.
[0047] According to the filling rules of meta-learning, each training resource is filled into the corresponding K+1 historical interaction sequences to obtain K+1 historical interaction sequence pairs for each training resource;
[0048] From the K+1 historical interaction sequence pairs of each training resource, select one historical interaction sequence pair as the test interaction sequence pair of the corresponding training resource, and take the K historical interaction sequence pairs other than the test interaction sequence pair from the K+1 historical interaction sequence pairs as the training interaction sequence pairs of the corresponding training resource.
[0049] The K training interaction sequence pairs and the one test interaction sequence pair of each training resource constitute the T_i-th training task.
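The construction of one training task can be sketched as follows. Choosing the test pair at random, the dictionary layout, and the name `build_training_task` are assumptions; the source only requires that one of the K+1 pairs serve as the test pair and the remaining K as training pairs. The sketch also does not check that the K+1 sequences come from different objects.

```python
import random
from typing import Dict, List, Tuple

Sequence = List[str]
SequencePair = Tuple[Sequence, str]  # (historical interaction sequence, training resource)


def build_training_task(
    histories_by_resource: Dict[str, List[Sequence]],
    rng: random.Random,
) -> Dict[str, Dict[str, object]]:
    """Split the K+1 sequence pairs of every training resource into K train and 1 test."""
    task: Dict[str, Dict[str, object]] = {}
    for resource, histories in histories_by_resource.items():
        if len(histories) < 2:
            raise ValueError("each training resource needs at least two sequences (K+1 >= 2)")
        pairs: List[SequencePair] = [(list(h), resource) for h in histories]
        test_index = rng.randrange(len(pairs))  # assumption: test pair picked at random
        task[resource] = {
            "test": pairs[test_index],
            "train": pairs[:test_index] + pairs[test_index + 1:],
        }
    return task


# One training resource with K+1 = 3 historical interaction sequences.
task = build_training_task(
    {"video_9": [["v1", "v2"], ["v3"], ["v4", "v5"]]},
    rng=random.Random(0),
)
```

In meta-learning terms, the K training pairs play the role of the support set and the test pair plays the role of the query set for that resource.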
[0050] In one implementation, any training resource contained in the T_i-th training task is denoted as a target training resource, and the sample pair corresponding to the target training resource is denoted as a target sample pair;
[0051] The processing unit, when calling the sequence model to encode the target sample pair to obtain the sequence features and encoded features of the target training resource, is specifically used for:
[0052] Encode each training interaction sequence pair among multiple training interaction sequence pairs contained in the target sample pair to obtain the encoded features of each training interaction sequence pair; then aggregate the encoded features of each training interaction sequence pair to obtain the sequence features of the target training resource; and,
[0053] The historical interaction sequences, excluding the target training resources, in the test interaction sequence pairs contained in the target sample pair are encoded to obtain the encoded features of the target training resources.
[0054] In one implementation, the processing unit, when calculating loss information based on the sequence feature set, the encoded feature set, and the multiple test interaction sequence pairs corresponding to the T_i-th training task, is specifically used for:
[0055] Each encoded feature in the encoded feature set is compared with each sequence feature in the sequence feature set to calculate similarity, resulting in multiple similarity calculation results; and,
[0056] The training resource contained in the test interaction sequence pair corresponding to each encoded feature in the encoded feature set is compared with the training resource contained in each test interaction sequence pair corresponding to the T_i-th training task, to obtain multiple comparison results.
[0057] Based on the loss function, multiple similarity calculation results, and corresponding comparison results, loss information is obtained.
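One way to combine the similarity calculation results and the comparison results into loss information is sketched below. The source does not name the loss function; dot-product similarity and a softmax cross-entropy over the candidates are assumptions, as are the name `task_loss` and the convention that position r in each list belongs to the same training resource.

```python
import math
from typing import List


def task_loss(
    encoded_features: List[List[float]],
    sequence_features: List[List[float]],
    resource_ids: List[str],
) -> float:
    """Average cross-entropy between similarities and same-resource labels.

    Assumption: encoded_features[r], sequence_features[r] and resource_ids[r]
    all refer to the same training resource.
    """
    total = 0.0
    for r, encoding in enumerate(encoded_features):
        # Similarity of this encoded feature to every sequence feature.
        sims = [sum(x * y for x, y in zip(encoding, feature)) for feature in sequence_features]
        peak = max(sims)
        log_norm = peak + math.log(sum(math.exp(s - peak) for s in sims))
        for c, sim in enumerate(sims):
            same_resource = resource_ids[r] == resource_ids[c]  # comparison result
            if same_resource:
                total += log_norm - sim  # negative log-probability of the true match
    return total / len(encoded_features)


# Loss is small when each encoding is closest to its own resource's sequence feature.
aligned = task_loss([[2.0, 0.0], [0.0, 2.0]], [[2.0, 0.0], [0.0, 2.0]], ["a", "b"])
swapped = task_loss([[2.0, 0.0], [0.0, 2.0]], [[0.0, 2.0], [2.0, 0.0]], ["a", "b"])
```

In this example `aligned` is lower than `swapped`, so updating parameters in the direction that reduces the loss pulls each test-pair encoding toward the sequence feature of its own training resource.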
[0058] In one implementation, when the resource recommendation method is applied to a cold start scenario, both training resources and candidate resources are cold start resources; cold start resources refer to resources in the resource recommendation platform whose data volume of resource interaction data is less than the data volume threshold.
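Identifying cold start resources by a data volume threshold can be sketched as follows. Measuring data volume as a count of interaction records, and the names `cold_start_resources` and `volume_threshold`, are assumptions made for illustration.

```python
from typing import Dict, List


def cold_start_resources(interaction_counts: Dict[str, int], volume_threshold: int) -> List[str]:
    """Return resources whose interaction data volume is below the threshold."""
    return [
        resource_id
        for resource_id, count in interaction_counts.items()
        if count < volume_threshold
    ]


cold = cold_start_resources(
    {"video_a": 3, "video_b": 5000, "video_c": 0},
    volume_threshold=100,
)
```

Here `video_a` and `video_c` are treated as cold start resources, while `video_b` has enough interaction data to fall outside that category.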
[0059] In another aspect, embodiments of this application provide a computer device, the device comprising:
[0060] A processor is used to load and execute computer programs;
[0061] A computer-readable storage medium storing a computer program that, when executed by a processor, implements the resource recommendation method described above.
[0062] In another aspect, embodiments of this application provide a computer-readable storage medium storing a computer program adapted to be loaded by a processor to execute the resource recommendation method described above.
[0063] In another aspect, embodiments of this application provide a computer program product or computer program that includes computer instructions stored in a computer-readable storage medium. A processor of a computer device reads the computer instructions from the computer-readable storage medium, and when executed by the processor, the computer instructions implement the resource recommendation method described above.
[0064] This application supports determining sequence features for each candidate resource in a set of candidate resources to be recommended, based on meta-learning, an attention mechanism, and historical interaction sequences (each containing multiple historical resources triggered sequentially within a historical time period). The sequence features of any candidate resource characterize the sequential dependency between that candidate resource and each historical resource contained in the historical interaction sequence; for example, the triggering of resource 2 (bait) depends on the triggering of resource 1 (hook). Because the triggering-order dependency between each candidate resource and the historical resources is analyzed in advance, when a resource needs to be distributed (or recommended), for example when a resource distribution request initiated by the target object to be recommended is received, the target resource that matches the target object's interaction preference can be quickly matched from the sequence features of the M candidate resources, where M is a positive integer. The matching is based on the encoded features of the target object's target historical interaction sequence, which reflect the degree of attention (or preference) the target object pays to a given sequential dependency, that is, the target object's preference for resource interaction. This not only improves the accuracy of resource recommendation but also ensures its efficiency.
In summary, this application provides a novel resource recommendation scheme that supports the analysis of sequential dependencies between resources (including candidate resources to be recommended) and, based on the interaction preferences reflected in the target object's target historical interaction sequence, determines the target resource that matches the target object's interaction preferences from multiple candidate resources to be recommended. This enables more accurate delivery of resources of interest to the target object, improving the resource recommendation experience for the target object while also increasing the click-through rate of the target resource.
Attached Figure Description
[0065] To more clearly illustrate the technical solutions in the embodiments of this application or the prior art, the drawings used in the description of the embodiments or the prior art will be briefly introduced below. Obviously, the drawings described below are only some embodiments of this application. For those skilled in the art, other drawings can be obtained based on these drawings without creative effort.
[0066] Figure 1a is a schematic diagram of a model training process provided in an exemplary embodiment of this application;
[0067] Figure 1b is a schematic diagram of another model training process provided in an exemplary embodiment of this application;
[0068] Figure 2 is a schematic diagram of the architecture of a resource recommendation system provided in an exemplary embodiment of this application;
[0069] Figure 3 is a flowchart illustrating a resource recommendation method provided in an exemplary embodiment of this application;
[0070] Figure 4 is a schematic diagram illustrating the construction of a training task set provided in an exemplary embodiment of this application;
[0071] Figure 5 is a schematic diagram illustrating another construction of a training task set provided in an exemplary embodiment of this application;
[0072] Figure 6 is a schematic diagram illustrating yet another construction of a training task set provided in an exemplary embodiment of this application;
[0073] Figure 7 is a schematic diagram of a short-term preference modeling process provided in an exemplary embodiment of this application;
[0074] Figure 8 is a flowchart illustrating the generation of the sequence feature set of the T_i-th training task, provided in an exemplary embodiment of this application;
[0075] Figure 9 is a flowchart illustrating the generation of the encoded feature set of the T_i-th training task, provided in an exemplary embodiment of this application;
[0076] Figure 10 is a flowchart illustrating another resource recommendation method provided in an exemplary embodiment of this application;
[0077] Figure 11 is a schematic diagram illustrating the selection of target resources from a set of candidate resources, provided in an exemplary embodiment of this application;
[0078] Figure 12 is a schematic diagram of the structure of a resource recommendation device provided in an exemplary embodiment of this application;
[0079] Figure 13 is a schematic diagram of the structure of a computer device provided in an exemplary embodiment of this application.
Detailed Implementation
[0080] The technical solutions of the embodiments of this application will be clearly and completely described below with reference to the accompanying drawings. Obviously, the described embodiments are only some embodiments of this application, and not all embodiments. Based on the embodiments of this application, all other embodiments obtained by those of ordinary skill in the art without creative effort are within the scope of protection of this application.
[0081] This application provides a resource recommendation scheme, specifically a resource distribution processing scheme. The technical terms and related concepts involved in the resource recommendation scheme provided in this application are briefly introduced below:
[0082] I. Artificial Intelligence (AI).
[0083] Artificial intelligence (AI) is the theory, methods, technology, and application systems that utilize digital computers or machines controlled by digital computers to simulate, extend, and expand human intelligence, perceive the environment, acquire knowledge, and use that knowledge to achieve optimal results. AI technology is a comprehensive discipline involving a wide range of fields, encompassing both hardware and software technologies. Fundamental AI technologies generally include sensors, dedicated AI chips, cloud computing, distributed storage, big data processing, operating / interactive systems, and mechatronics. AI software technologies mainly include computer vision, speech processing, natural language processing, and machine learning / deep learning. Machine learning (ML) is a multidisciplinary field involving probability theory, statistics, approximation theory, convex analysis, and algorithm complexity theory. It specifically studies how computers can simulate or implement human learning behavior to acquire new knowledge or skills and reorganize existing knowledge structures to continuously improve their performance. Machine learning and deep learning typically include techniques such as artificial neural networks, belief networks, reinforcement learning, transfer learning, inductive learning, formulaic learning, and meta-learning.
[0084] The resource recommendation scheme provided in this application mainly involves artificial neural networks in machine learning and meta-learning in deep learning; the relevant content of artificial neural networks and meta-learning is briefly introduced below; wherein:
[0085] (1) Artificial neural networks are a method for implementing machine learning tasks. In the field of machine learning, when discussing neural networks, it generally refers to "neural network learning." It is a network structure composed of many simple elements. This network structure is similar to a biological nervous system and is used to simulate the interaction between organisms and the natural environment. The more network structures there are, the richer the functions of the neural network tend to be. Neural network is a relatively broad concept. For different learning tasks such as speech, text, and images, neural network models more suitable for specific learning tasks have been derived, such as recurrent neural networks (RNNs). A recurrent neural network (RNN) is a neural network used to process sequential data (or simply sequences). Specifically, it takes sequential data (such as a sequence of multiple resources triggered sequentially according to time in a resource recommendation scenario) as network input and performs recursion in the evolution direction of the sequential data (such as the time sequence direction).
[0086] Furthermore, this application specifically relates to Long Short-Term Memory (LSTM) networks in Recurrent Neural Networks (RNNs). LSTM networks are a special type of temporal recurrent neural network. Compared to ordinary RNNs, LSTM networks, by adding memory unit structures and three gates (input gate, forget gate, and output gate), can extract features from sequential data to selectively memorize important information, filter out noise, and reduce the memory burden. This avoids the gradient vanishing and gradient exploding problems during long sequence training; it also performs better on longer time series, solving the long-term dependency problem inherent in RNNs. For example, in a long text sequence containing a lot of text content, a certain word may have different meanings depending on the preceding text content. When processing this long text sequence using an LSTM network, even if the text sequence is long (containing a lot of text content), the LSTM network can selectively memorize important preceding text content and filter out unimportant text content, thereby helping to better understand the meaning of a certain word in the long text sequence.
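For reference, the memory unit and the three gates mentioned above are commonly written as follows. This is one widely used formulation of the LSTM cell, given here as an assumption; the source does not state which variant is used. The symbols are illustrative: x_t is the input at step t, h_t the hidden state, c_t the memory cell state, σ the logistic sigmoid, ⊙ element-wise multiplication, and W, U, b learned parameters.

```latex
\begin{aligned}
i_t &= \sigma(W_i x_t + U_i h_{t-1} + b_i) && \text{(input gate)} \\
f_t &= \sigma(W_f x_t + U_f h_{t-1} + b_f) && \text{(forget gate)} \\
o_t &= \sigma(W_o x_t + U_o h_{t-1} + b_o) && \text{(output gate)} \\
\tilde{c}_t &= \tanh(W_c x_t + U_c h_{t-1} + b_c) && \text{(candidate memory)} \\
c_t &= f_t \odot c_{t-1} + i_t \odot \tilde{c}_t && \text{(memory cell update)} \\
h_t &= o_t \odot \tanh(c_t) && \text{(hidden state)}
\end{aligned}
```

The forget gate f_t controls how much of the earlier memory c_{t-1} is retained, which is how the network selectively keeps important earlier items of a long sequence.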
[0087] (2) Meta-learning, also known as "learning to learn," can be considered a development in deep learning. It supports using past knowledge and experience to guide the learning of new tasks, enabling the network to learn how to learn. The main idea of meta-learning is to learn from tasks similar to the target task (i.e., the new task) to acquire prior knowledge, allowing for rapid adaptation in test tasks. Compared to traditional machine learning, meta-learning training sets often contain multiple tasks, while machine learning training sets contain only one task. In meta-learning, tasks are used as learning units, and each task contains a training set (also known as a support set) and a test set (also known as a query set). By learning from multiple tasks similar to the target task, meta-learning has the following advantages: even with a small number of training samples for the target task, it can quickly learn from a small number of training samples based on prior knowledge acquired from learning from similar tasks, achieving good performance.
[0088] II. Resource Recommendations.
[0089] Resource recommendation, also known as resource distribution, refers to the process of distributing resources contained on a resource recommendation platform (or system) to platform users (such as one or more resource recipients who have registered a platform account or are temporarily logged into the platform). The resources contained on the resource recommendation platform can be referred to as internet resources (or virtual resources), and may include, but are not limited to: videos (which can be categorized as long or short videos based on their playback length), audio (such as music or voice recordings), animation, or documents (such as journals or academic papers), etc. This application does not limit the specific types and content of the resources described in its embodiments.
[0090] A resource recommendation platform can refer to an application that supports the distribution or recommendation of resources. An application can refer to a computer program designed to perform one or more specific tasks. Classifying applications according to different dimensions (such as how the application runs, its functions, etc.) can yield different types of the same application across these dimensions. For example, based on how the application runs, applications can include, but are not limited to: clients installed on a terminal, small programs that can be used without downloading and installation (as subroutines of the client), and web (World Wide Web) applications opened through a browser. Another example is based on the type of application function, which can include, but is not limited to: IM (Instant Messaging) applications, content interaction applications, etc. Instant messaging applications refer to applications that facilitate instant messaging and social interaction over the internet, and can include, but are not limited to: social applications with communication functions, map applications with social interaction functions, game applications, etc. Content interaction applications refer to applications capable of content interaction, such as online banking, sharing platforms, personal spaces, news applications, etc.
[0091] Furthermore, the resource recommendation platform can also be a plugin (or function) included in the aforementioned applications that supports resource recommendations. For example, if the application is the instant messaging application mentioned above, then the resource recommendation platform can be a resource recommendation plugin included in that instant messaging application; if the resource is a short video, the resource recommendation function provided by the instant messaging application is a short video recommendation function; in this way, the target object (such as any object using the instant messaging application) can also perform functions such as browsing and posting resources while using the instant messaging application for social interaction, without the need for application switching (such as switching from the instant messaging application to a separate resource recommendation application).
[0092] It should be noted that the types of resources distributed by the same resource recommendation platform are not limited to one. For example, the resource types supported by the same resource recommendation platform may simultaneously include videos and documents. Furthermore, this application does not limit the types of resources distributed by the resource recommendation platform, or which type of application provides the resource recommendation function. For ease of explanation, subsequent embodiments will use short videos as an example of resources distributed by the resource recommendation platform (or resource recommendation system).
[0093] It should also be noted that different resource recommendation systems employ different resource recommendation strategies. Traditional recommendation systems extract resource features from information on the resource side and then add the resource to the recommendation list of objects whose preferences match those features. For example, if the resource is a short video, information from the short video side (such as the video's title, type, upload time, or content) can be used to recommend the short video to users who prefer that kind of information (e.g., videos about food), thus recommending short videos that the user might be interested in. Unlike traditional resource recommendation strategies, sequential recommendation systems (SRS) do not focus on information on the resource side or the user side, but rather on the dependency relationship between the user and the resource. Specifically, by modeling the interaction sequence of the target user, an SRS learns the changes in the target user's interest in resources, thereby predicting the target user's next interaction preference. The interaction sequence can include a sequence of short videos that the target user interacts with (or clicks on, triggers) in chronological order; that is, the interaction sequence includes multiple short videos triggered sequentially by the target user. Therefore, sequential recommendation systems do not need to store resource-side information and object-side information; they only need to store resource interaction data between objects and resources.
[0094] Based on the above introduction of relevant terms and concepts, this application proposes a resource recommendation scheme based on meta-learning and long- and short-term preference modeling, where long-term preference modeling learns, for example, an object's long-term interests, and short-term preference modeling learns, for example, an object's short-term interests. When applied to a sequence recommendation system, this scheme enables the sequence recommendation system to achieve more accurate resource recommendations. Specifically, the resource recommendation scheme proposed in this application can include two parts: model training and model application. The general implementation process of model training and model application is briefly introduced below.
[0095] (1) Model training.
[0096] The model training involved in the embodiments of this application includes the training of a sequence model. Model training may be implemented by the following modules (as shown in Figure 1a): a problem abstraction module, a meta-learning module, an encoding module, and a training module. Specifically: The problem abstraction module defines the basic entities (such as resources and objects) and training objectives (e.g., the training objective of a sequence model is to predict the resources that a target object will interact with next based on historical interaction sequences) in a sequence recommendation scenario (or resource recommendation scenario). The meta-learning module applies meta-learning to the sequence recommendation scenario; specifically, it constructs interaction sequence pairs and multiple training tasks to form the training task set required for model training. The encoding module encodes the interaction sequence pairs (converting the input interaction sequence pairs into vector form). The training module calculates the similarity between the sequence feature set (i.e., the set of sequence features corresponding to the training set) and the encoded feature set (i.e., the set of encoded features corresponding to the test set), and from it calculates the loss of the sequence model.
[0097] For a general overview of the model training process, refer to Figure 1b. As shown in Figure 1b: 1) A training task set is constructed according to the meta-learning method (specifically, the training task set can be constructed through the problem abstraction module and meta-learning module given above, and the specific implementation process will be given in subsequent embodiments). This training task set contains multiple training tasks, and each training task contains the sample pairs of multiple training resources (such as sample short videos) to be distributed (or recommended). The sample pair of each training resource includes multiple training interaction sequence pairs and one test interaction sequence pair; each training interaction sequence pair and test interaction sequence pair contains a historical interaction sequence and the corresponding training resource. It should be noted that the historical interaction sequences contained in the training interaction sequence pairs and the test interaction sequence pair may come from different objects; however, the multiple historical resources contained in a single historical interaction sequence belong to the same object, that is, they are multiple resources triggered sequentially by the same object within a historical time period.
[0098] 2) The sequence model is trained using each training task in the training task set as a training unit. Specifically, the model parameters of the sequence model are continuously adjusted so that the trained sequence model can more accurately recommend resources of interest to the target object. Taking the training of the sequence model using any training task as an example, the training process may include: calling the sequence model to encode, one by one, the sample pair of each of the multiple training resources included in the given training task (specifically, the encoding process can be implemented through the encoding module mentioned above), obtaining the sequence features and encoded features of each training resource, and thus obtaining the sequence feature set and encoded feature set of the given training task. Taking the encoding process of the sample pair of any training resource as an example, the encoding process may include:
[0099] ① For multiple training interaction sequence pairs contained in a sample pair, long-term preference modeling and short-term preference modeling are performed. The encoding results of long-term and short-term preference modeling are balanced based on an attention mechanism to obtain the sequence features (or sequence representations) of each training resource. The sequence features of any training resource can be used to characterize the sequential dependency relationship between the corresponding training resource and the corresponding historical interaction sequence (i.e., the interaction sequences contained in multiple training interaction sequence pairs corresponding to that training resource), which are triggered in sequence. Sequential dependency can be simply understood as: the triggering of one resource depends on the triggering of another resource. For example, when an object has a need to travel, it often has a series of needs such as buying airline tickets, booking hotels, booking attraction tickets, and renting transportation. After triggering resource 1 (which includes booking airline tickets), the object will often continue to trigger resource 2 (which includes booking hotels) (or resources that include booking attraction tickets and renting transportation). At this point, it is determined that there is a sequential dependency relationship between resource 1 and resource 2; that is, the triggering of resource 2 depends on the triggering of resource 1. In other words, after interacting with resource 1, the object has a high probability of interacting with resource 2.
[0100] ② For the test interaction sequence pairs contained in the sample pairs, long-term preference modeling and short-term preference modeling are performed on the historical interaction sequences other than the corresponding training resources. The encoding results of the long-term and short-term preference modeling are then balanced based on an attention mechanism to obtain the encoding features of each historical interaction sequence. The encoding features of the historical interaction sequences can be used to characterize: the degree of attention the object that generated the historical interaction sequence pays to the sequential dependencies between the historical resources contained in the historical interaction sequence. The degree of attention to sequential dependencies can be understood as: the object that generated the historical interaction sequence's preference for sequential dependencies; in short, the object that generated the historical interaction sequence has an interest in triggering each resource contained in the historical interaction sequence in sequence according to the sequential dependencies. It should be noted that the sequence features and encoding features can be represented in the form of real-valued vectors (embeddings), that is, this can be achieved through encoding processing, using real-valued vectors to represent the interaction sequence pairs.
[0101] 3) Based on the sequence features and encoding features of each of the multiple training resources included in any given training task, a sequence feature set and an encoding feature set are formed. The sequence feature set contains the sequence features of each training resource, and the encoding feature set contains the encoding features of each training resource. Then, each encoding feature in the encoding feature set is compared, by similarity calculation, with all sequence features in the sequence feature set. Based on the similarity calculation result and the loss function, loss information is calculated (this can be implemented using the training module mentioned above). Finally, if the loss information does not meet the loss conditions (e.g., there is a large difference between consecutive loss information values, or the loss information exceeds a loss threshold), the model parameters of the sequence model are updated according to the loss information. Other training tasks are then selected to continue training the sequence model with the updated model parameters until the sequence model stabilizes (or all training tasks in the training task set have been used).
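The similarity comparison and loss calculation in step 3) can be sketched as follows. This is only an illustration under stated assumptions, not the method of this application: the text above does not name the similarity measure or the loss function, so cosine similarity and a softmax cross-entropy loss are assumed here, and the names `cosine`, `task_loss`, and `should_update` are hypothetical.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors (assumed measure)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def task_loss(encoding_features, sequence_features):
    """Average softmax cross-entropy over one training task (assumed loss).

    encoding_features[j] comes from the query set of training resource j;
    sequence_features[j] is the sequence feature of training resource j.
    Each encoding feature is compared with ALL sequence features, and
    resource j itself is treated as the true label.
    """
    total = 0.0
    for j, enc in enumerate(encoding_features):
        sims = [cosine(enc, seq) for seq in sequence_features]
        m = max(sims)  # subtract the maximum for numerical stability
        log_z = m + math.log(sum(math.exp(s - m) for s in sims))
        total += log_z - sims[j]
    return total / len(encoding_features)


def should_update(loss, loss_threshold):
    """One of the loss conditions named in the text: loss exceeds a threshold."""
    return loss > loss_threshold
```

With this assumed loss, a task whose encoding features point at their own training resources yields a smaller value than a task whose features point at the wrong resources, which is the signal used to update the model parameters.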
[0102] Therefore, during model training, on the one hand, meta-learning leverages its ability to learn quickly and perform well even with limited training resources, enabling the training of high-performance sequence models with minimal resources. On the other hand, it supports long-term preference modeling of historical interaction sequences to capture long-term interests (e.g., the sequential dependencies between multiple resources viewed within a month) and short-term preference modeling to capture short-term interests (e.g., the sequential dependencies between multiple resources viewed within a day). Furthermore, it utilizes attention mechanisms to balance the emphasis on long-term and short-term preferences (e.g., a preference for resources in the long-term resource recommendation process), achieving more accurate capture of object interaction preferences. This allows the trained sequence model to precisely recommend resources of interest to the target object.
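How an attention mechanism might balance long-term and short-term preferences can be illustrated with a small sketch. Everything below is an assumption made for illustration: the text does not specify the encoders or the attention form, so the long-term representation is taken as the mean of all resource embeddings, the short-term representation as the mean of the most recent ones, and the attention weights as a softmax over dot products with a hypothetical `query` vector.

```python
import math


def mean_vector(vectors):
    """Element-wise mean of a non-empty list of equal-length vectors."""
    dim = len(vectors[0])
    return [sum(v[d] for v in vectors) / len(vectors) for d in range(dim)]


def softmax(scores):
    """Numerically stable softmax over a list of scores."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    z = sum(exps)
    return [e / z for e in exps]


def fuse_long_short(item_embeddings, query, recent=2):
    """Fuse long-term and short-term representations with attention weights.

    item_embeddings: embeddings of the historical resources, oldest first.
    query: hypothetical attention query vector (would be learned in practice).
    recent: how many of the latest resources count as short-term (assumed).
    """
    long_term = mean_vector(item_embeddings)
    short_term = mean_vector(item_embeddings[-recent:])
    scores = [
        sum(q * x for q, x in zip(query, rep))
        for rep in (long_term, short_term)
    ]
    w_long, w_short = softmax(scores)
    fused = [w_long * l + w_short * s for l, s in zip(long_term, short_term)]
    return fused, (w_long, w_short)
```

The two weights sum to one, so the fused vector shifts toward whichever representation the attention scores favor for a given sequence.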
[0103] (2) Model application.
[0104] Based on the model training process described in (1) above, a trained sequence model can be obtained. This trained sequence model can then be used to encode the candidate resources to be distributed in the database (i.e., the aforementioned long- and short-term preference modeling), yielding the sequence features of each candidate resource. The sequence features of any candidate resource can be used to characterize the sequential dependency relationship between that candidate resource and the multiple historical resources contained in its corresponding historical interaction sequence. Furthermore, the target historical interaction sequence of the target object to which resources are to be distributed can be obtained; this target historical interaction sequence can include the interaction sequence obtained from the target object sequentially triggering multiple historical resources within a historical time period (such as a time period before the current moment). The trained sequence model can then be used to encode this target historical interaction sequence, obtaining its encoded features. These encoded features can be used to characterize the target object's level of attention (or simply attention, preference, or bias) to the sequential dependency relationship between the multiple historical resources contained in the target historical interaction sequence. Finally, the encoded features of the target historical interaction sequence are compared with the pre-generated sequence features of each candidate resource to be distributed, to determine from the candidate resource set the target resource that matches the target object's preferences; the target resource is then pushed to the target object.
[0105] Therefore, it can be seen that during the application of the model, the sequence features of each candidate resource to be distributed can be pre-calculated. In this way, after obtaining the target historical interaction sequence of the target object of the resource to be distributed, it is only necessary to encode the target historical interaction sequence and perform subsequent similarity calculations. This can ensure the speed of resource push, guarantee the real-time nature of resource push, and improve the resource distribution experience of the target object to a certain extent.
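The benefit of pre-calculating candidate features can be sketched as follows. This is an illustrative assumption rather than the application's implementation: the candidate sequence features are assumed to be stored in a dictionary ahead of time, the encoded feature of the target historical interaction sequence is assumed to be already available, and a plain dot product stands in for the unspecified similarity calculation. The names `dot` and `recommend` are hypothetical.

```python
def dot(a, b):
    """Dot-product similarity between two equal-length vectors (assumed)."""
    return sum(x * y for x, y in zip(a, b))


def recommend(target_encoding, candidate_features, top_k=1):
    """Return the ids of the top_k candidates most similar to the target.

    target_encoding: encoded feature of the target historical interaction
        sequence, computed at request time.
    candidate_features: dict of resource id -> sequence feature, computed
        offline before any request arrives.
    """
    ranked = sorted(
        candidate_features,
        key=lambda rid: dot(target_encoding, candidate_features[rid]),
        reverse=True,
    )
    return ranked[:top_k]


# Hypothetical pre-computed sequence features for three candidate resources.
precomputed = {
    "video_a": [1.0, 0.0],
    "video_b": [0.0, 1.0],
    "video_c": [0.7, 0.7],
}
top_two = recommend([0.1, 0.9], precomputed, top_k=2)
```

At request time only one encoding and one ranking pass are needed, which is what keeps the push fast.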
[0106] To facilitate understanding of the resource recommendation scheme provided in the embodiments of this application, the resource recommendation scenarios involved in the embodiments of this application are briefly introduced below in combination with the resource recommendation system shown in Figure 2. As shown in Figure 2, the resource recommendation system includes a first terminal 201, a second terminal 202, and a server 203. In this embodiment, the number and naming of the first terminal 201, the second terminal 202, and the server 203 are not limited.
[0107] In this embodiment, the first terminal 201 refers to the terminal device used by a resource publisher who publishes or uploads resources on the resource recommendation platform; the second terminal 202 refers to the terminal device used by a resource recipient who receives and browses resources on the resource recommendation platform. It is understood that a resource recipient can also be a resource publisher, and vice versa; this embodiment does not limit this. The terminal devices (such as the first terminal 201 and the second terminal 202) can include, but are not limited to, smartphones (such as smartphones running the Android system or smartphones running iOS), tablet computers, portable personal computers, mobile internet devices (MIDs), in-vehicle devices, head-mounted devices, etc. This embodiment does not limit the type of terminal device. A resource recommendation platform is deployed on the terminal device, specifically in the form of an application that carries the resource recommendation platform; thus, both resource recipients and resource publishers can perform operations such as receiving and publishing resources through the resource recommendation platform deployed on the terminal device.
[0108] Server 203 is the server corresponding to the terminal devices (such as the first terminal 201 and the second terminal 202). Specifically, it is the backend server of the resource recommendation platform deployed in the terminal devices, used to interact with the terminal devices to provide computing and application service support for the resource recommendation platform in the terminal devices. The server can be an independent physical server, a server cluster or distributed system composed of multiple physical servers, or a cloud server that provides basic cloud computing services such as cloud services, cloud databases, cloud computing, cloud functions, cloud storage, network services, cloud communication, middleware services, domain name services, security services, content delivery networks (CDN), and big data and artificial intelligence platforms. The terminal devices (such as the first terminal 201 and the second terminal 202) and server 203 can be directly or indirectly connected through wired or wireless communication, which is not limited herein.
[0109] Furthermore, server 203 also includes database 2031, which can be used to store all resources and objects contained in the resource recommendation platform. Specifically, in the sequence recommendation system, the database stores the resource interaction data of each object-resource interaction; resource interaction data refers to the interaction data between objects and resources, including: objects, resources, and the relationship between the object and the resource (e.g., if the object has clicked on the resource, it is determined that there is an interaction relationship between the object and the resource). For example, if Object 1 is an object registered with a video recommendation platform, and Object 1 watched Video 1, Video 2, and Video 3 through the video recommendation platform during a first historical time period, then the historical interaction sequence between Object 1 and the videos that can be stored in the database is: Video 1 → Video 2 → Video 3. If Object 1 watched Video 4, Video 5, and Video 6 through the video recommendation platform during a second historical time period (different from the first historical time period), then the historical interaction sequence between Object 1 and the videos that can be stored in the database is: Video 4 → Video 5 → Video 6. Based on this, the resource interaction data stored in the database under the sequence recommendation system may include: Object 1, Videos 1-6, and the interaction order between Object 1 and each video. Of course, the data stored in the database may take the form of the interaction sequences mentioned above, or the form of interaction relationships between objects and resources. The embodiments of this application do not limit the specific form of the content stored in the database; the above is merely an example.
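The Object 1 example above can be sketched as a small routine that turns stored interaction records into historical interaction sequences. The record layout `(object_id, period_id, timestamp, resource_id)` and the function name `build_sequences` are assumptions for illustration; the text leaves the storage format open.

```python
from collections import defaultdict


def build_sequences(records):
    """Group interaction records into chronological interaction sequences.

    records: iterable of (object_id, period_id, timestamp, resource_id).
    Returns a dict mapping (object_id, period_id) to the list of resources
    the object triggered in that period, ordered by timestamp.
    """
    grouped = defaultdict(list)
    for object_id, period_id, timestamp, resource_id in records:
        grouped[(object_id, period_id)].append((timestamp, resource_id))
    return {
        key: [resource_id for _, resource_id in sorted(events)]
        for key, events in grouped.items()
    }


# Records for Object 1, deliberately stored out of order.
records = [
    ("object_1", "period_1", 3, "video_3"),
    ("object_1", "period_2", 5, "video_5"),
    ("object_1", "period_1", 1, "video_1"),
    ("object_1", "period_2", 4, "video_4"),
    ("object_1", "period_1", 2, "video_2"),
    ("object_1", "period_2", 6, "video_6"),
]
sequences = build_sequences(records)
```

Here `sequences` holds one sequence per historical time period, matching Video 1 → Video 2 → Video 3 and Video 4 → Video 5 → Video 6.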
[0110] It should be noted that, based on the time a resource is published or uploaded to the resource recommendation platform, the resources contained in the resource recommendation platform's database 2031 can be divided into cold-start resources and hot-start resources. Cold-start resources refer to resources newly added to the resource recommendation platform, such as videos newly uploaded by video publishers to the video recommendation platform; the amount of resource interaction data for cold-start resources on the resource recommendation platform is less than a data volume threshold. Conversely, hot-start resources refer to resources that have been on the resource recommendation platform for some time; the amount of resource interaction data for hot-start resources on the resource recommendation platform is greater than a data volume threshold. Considering that cold-start resources are newly added to the resource recommendation platform, these resources often lack object-resource interaction data, or have only a small amount of resource interaction data. This means that traditional sequence recommendation systems, which require a large amount of resource interaction data to achieve resource recommendation, cannot accurately recommend cold-start resources to the object recommendation list. In contrast, the embodiments of this application combine meta-learning to train the sequence model. Meta-learning has the characteristic of being able to learn quickly and perform well even with only a small number of training samples, making it very suitable for cold start scenarios where cold start resources have only a small amount of training data. 
Therefore, in the cold start scenario of resource recommendation, the resource recommendation scheme provided in the embodiments of this application can accurately recommend cold start resources to the recommendation list of the corresponding objects even when there is no resource interaction data or only a small amount of resource interaction data for the cold start resources, thus effectively solving the cold start problem in the context of sequence recommendation.
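The division into cold-start and hot-start resources by data volume can be sketched as below. The function name, the `(object_id, resource_id)` record layout, and the treatment of a count exactly equal to the threshold (counted as hot-start here) are assumptions; the text only states "less than" and "greater than" the data volume threshold.

```python
from collections import Counter


def split_by_interaction_volume(interactions, all_resources, threshold):
    """Split resources into cold-start and hot-start lists.

    interactions: iterable of (object_id, resource_id) interaction records.
    all_resources: every resource on the platform, including resources
        that have no interaction data at all.
    threshold: data volume threshold; a count equal to it is treated as
        hot-start (an assumption, since the text leaves this case open).
    """
    counts = Counter(resource_id for _, resource_id in interactions)
    cold = [r for r in all_resources if counts[r] < threshold]
    hot = [r for r in all_resources if counts[r] >= threshold]
    return cold, hot


interactions = [
    ("u1", "video_old"), ("u2", "video_old"), ("u3", "video_old"),
    ("u1", "video_new"),
]
cold, hot = split_by_interaction_volume(
    interactions, ["video_old", "video_new", "video_unseen"], threshold=3
)
```

A resource with no interaction data at all, such as `video_unseen`, falls into the cold-start list.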
[0111] In summary, the resource recommendation scheme provided in this application can be applied to resource recommendation scenarios (such as hot start (warm start) or cold start scenarios), and is particularly effective in cold start scenarios compared to current cold start resource recommendation methods. When the resource recommendation scheme is applied to a cold start scenario, the training resources involved in model training and the candidate resources involved in model application mentioned above are all cold start resources (for related concepts, see the foregoing description).
[0112] It should also be noted that, in the architecture shown in Figure 2, the resource recommendation scheme provided in the embodiments of this application can be executed by the terminal device (such as the first terminal 201 and the second terminal 202), by the server, or by both together. That is, the computer device serving as the execution entity in this embodiment can be at least one of the terminal device or the server. Optionally, the training process of the sequence model mentioned above is executed by the server 203. The trained sequence model can be directly deployed on the server 203, so that each time resources are distributed, the server 203 calls the trained sequence model to implement resource recommendation. In this implementation, the computer device that executes the resource recommendation scheme provided in this embodiment is the server 203. Optionally, the trained sequence model can also be deployed on the terminal device. For example, after the sequence model is trained on the server 203, the trained sequence model is sent to the terminal device and deployed there. In this case, the computer device that executes the resource recommendation scheme provided in this embodiment includes the terminal device and the server. Furthermore, if the training of the sequence model is executed by the terminal device, then the trained sequence model can be directly deployed on the terminal device. In this case, the computer device that executes the resource recommendation scheme provided in this embodiment includes the terminal device.
Furthermore, when the embodiments of this application are applied to specific products or technologies, such as when recommending candidate resources, it is inevitable to obtain the historical interaction sequence of the target object (such as multiple historical resources triggered sequentially by the target object within a historical time period). Therefore, it is necessary to obtain the permission or consent of the target object, and the collection, use and processing of related data must comply with the relevant laws, regulations and standards of the relevant countries and regions.
[0113] Based on the resource recommendation scheme described above, it can be seen that the resource recommendation scheme provided in this application mainly involves two aspects: one is model training to obtain a trained sequence model, and the other is using the trained sequence model for resource recommendation (i.e., model application). The following, in conjunction with the accompanying drawings, provides a more detailed description of the resource recommendation method proposed in this application, specifically introducing the model training process and model application process included in the resource recommendation method.
[0114] For the specific process of training the sequence model using the meta-learning method, refer to Figure 3. Figure 3 shows a flowchart of a resource recommendation method provided by an exemplary embodiment of this application; the resource recommendation method can be executed by the aforementioned computer device; the method may include, but is not limited to, steps S301-S304, wherein:
[0115] S301. Obtain the training task set.
[0116] As described above, the embodiments of this application support training sequence models using a meta-learning approach. Therefore, according to the meta-learning approach, the training task set used to train the sequence model in the embodiments of this application includes multiple training tasks. Each training task contains the sample pairs of multiple training resources. In the cold-start short video recommendation scenario, the training resources can be cold-start resources from the video recommendation platform. Furthermore, the sample pair of each training resource contains multiple training interaction sequence pairs and one test interaction sequence pair. Specifically, the multiple training interaction sequence pairs contained in the sample pairs of all training resources in any given training task can form the training set (support set) for that given training task, and the test interaction sequence pairs contained in the sample pairs of all training resources can form the test set (query set) for that given training task. Similarly, the multiple training interaction sequence pairs contained in the sample pair of any training resource can form the support set of that training resource, and the test interaction sequence pair contained in the sample pair of that training resource can form the query set of that training resource. The query set can be understood as the ground-truth labels in machine learning, used for subsequent comparison of true and predicted values. Furthermore, each training interaction sequence pair and each test interaction sequence pair contains a historical interaction sequence and a corresponding training resource; for example, if a training resource in any training task is denoted as training resource $s$, then a historical interaction sequence of training resource $s$ can be represented as $\zeta_s$, and the training interaction sequence pair (or test interaction sequence pair) formed by the historical interaction sequence $\zeta_s$ and the training resource $s$ can be represented as $(\zeta_s, s)$.
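The relationship between sample pairs, support sets, and query sets described above can be sketched with two small data classes. The class and field names (`SamplePair`, `TrainingTask`, `training_pairs`, `test_pair`) are hypothetical and chosen only for illustration; each sequence pair is modeled as a tuple of a historical interaction sequence and its training resource.

```python
from dataclasses import dataclass
from typing import List, Tuple

# (historical interaction sequence, training resource)
SequencePair = Tuple[Tuple[str, ...], str]


@dataclass
class SamplePair:
    """Sample pair of one training resource."""
    resource: str
    training_pairs: List[SequencePair]  # support set of this resource
    test_pair: SequencePair             # query set of this resource


@dataclass
class TrainingTask:
    """One training task: the sample pairs of several training resources."""
    samples: List[SamplePair]

    def support_set(self) -> List[SequencePair]:
        """All training interaction sequence pairs in the task."""
        return [pair for s in self.samples for pair in s.training_pairs]

    def query_set(self) -> List[SequencePair]:
        """All test interaction sequence pairs in the task."""
        return [s.test_pair for s in self.samples]


task = TrainingTask(samples=[
    SamplePair("s1", [(("v1", "v2"), "s1"), (("v3",), "s1")], (("v4", "v5"), "s1")),
    SamplePair("s2", [(("v1",), "s2"), (("v2", "v3"), "s2")], (("v6",), "s2")),
])
```

The task-level support set is simply the union of the per-resource support sets, and likewise for the query set.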
[0117] Based on the above basic introduction to the training task set, the specific implementation process for constructing the training task set is described below; where:
[0118] 1) The problem abstraction module (as shown in Figure 1a) is first used to define the basic entities and training objectives in the resource recommendation scenario. In the specific implementation, the set of basic entities "resources" can first be determined from the resource recommendation platform, such as the resource set represented as $V = \{v_1, v_2, \ldots, v_{N_v}\}$, where $N_v$ represents the total number of resources contained in the resource recommendation platform (such as the total number of short videos); similarly, the set of basic entities "objects" is determined from the resource recommendation platform, such as the object set represented as $U = \{u_1, u_2, \ldots, u_{N_u}\}$, where $N_u$ represents the total number of objects contained in the resource recommendation platform (e.g., the total number of resource recipients). Then, the historical interaction sequence obtained by any object $u_i$ in the object set sequentially triggering multiple historical resources within a historical time period can be represented as $\zeta_i = (v_{i,1}, v_{i,2}, \ldots, v_{i,n})$. This historical interaction sequence indicates that, within a historical time period, the historical resources triggered sequentially by the object $u_i$ are $v_{i,1} \to v_{i,2} \to \ldots \to v_{i,n}$. Therefore, the training objective of the sequence model is: based on the historical interaction sequence $\zeta_i$, predict the resource $v_{i,n+1}$ that the object $u_i$ will prefer to interact with next. In short, the training objective of the sequence model is equivalent to: training the sequence model to predict, from a query interaction sequence $\zeta_s$ whose next resource is unknown, the next item $v_{i,n+1}$.
[0119] 2) A training task set for training the sequence model is constructed based on the meta-learning module; specifically, multiple training tasks are constructed, and these multiple training tasks are combined to obtain the training task set. Considering that meta-learning takes a training task as its learning unit, when meta-learning is applied in the embodiments of this application, each round of training of the sequence model uses one training task as its unit. Therefore, for ease of explanation, the construction of the training task set is described below by taking the construction of training task $T_i$ as an example, where $i$ is a positive integer; wherein:
[0120] In the specific implementation, multiple training resources are collected from the resource set given above for training task $T_i$, and each collected training resource is used as the next-click resource $v_{i,n+1}$ (or simply the next click resource). Then, for each of the multiple training resources, K+1 historical interaction sequences are collected. Specifically, these K+1 historical interaction sequences are determined based on the object set, the resource set, and their resource interaction data. The number of historical interaction sequences collected for each training resource (K+1) can be the same or different; this is not limited. Next, according to the meta-learning filling rule, each training resource is filled into its corresponding K+1 historical interaction sequences, resulting in K+1 historical interaction sequence pairs for each training resource, each consisting of a historical interaction sequence $\zeta_i$ and the training resource. The meta-learning filling rule is: append the training resource after the last historical resource in each historical interaction sequence of that training resource. Then, from the K+1 historical interaction sequence pairs of each training resource, one historical interaction sequence pair is selected as the test interaction sequence pair for the corresponding training resource, and the K historical interaction sequence pairs other than the test interaction sequence pair are used as the training interaction sequence pairs for the corresponding training resource. In this way, the K training interaction sequence pairs and one test interaction sequence pair of each training resource together form training task $T_i$. Training task $T_i$ thus contains a sample pair for each training resource, which includes the K training interaction sequence pairs and one test interaction sequence pair of the corresponding training resource; for example, the sample pair of the target training resource includes: K training interaction sequence pairs of the target training resource and one test interaction sequence pair.
[0121] The construction process of the other training tasks in the training task set is similar to that of training task $T_i$; therefore, according to the number of tasks required by the trainers, the above construction process can be performed for each training task that needs to be constructed to obtain multiple training tasks, which together form the training task set.
[0122] It's important to note that each historical interaction sequence belongs to the same object; that is, each historical interaction sequence consists of multiple historical resources triggered sequentially by the same object. However, the multiple historical interaction sequences collected for each training resource can belong to different objects. For example, the K+1 historical interaction sequences mentioned above can belong to different objects, meaning that these K+1 historical interaction sequences can be generated by resources triggered by different objects. By training the model based on historical interaction sequences from different objects, the trained sequence model can adapt to resource recommendations for different objects, improving the universality of the sequence model.
[0123] To better understand the construction of the training task set described above, the specific implementation is further explained below with reference to Figure 4. The training task set contains multiple training tasks, and the construction of any one of them (such as the training task $T_i$) can be seen in Figure 4. As shown in Figure 4, first, multiple training resources $v_{i,n+1}$ (each serving as a next-click resource), such as $v_{i,7}$ and $v_{i,8}$, are collected for the training task $T_i$ from the resource set $V = \{v_1, v_2, \ldots, v_{N_v}\}$ provided by the resource recommendation platform. Then, based on the object-resource interaction data stored in the resource recommendation platform, the resource set $V = \{v_1, v_2, \ldots, v_{N_v}\}$ and the object set $U = \{u_1, u_2, \ldots, u_{N_u}\}$, K+1 historical interaction sequences are collected for each training resource. For example, the historical interaction sequences collected for the training resource $v_{i,7}$ may include: $\zeta_1 = v_{i,1} \to v_{i,2}$, $\zeta_2 = v_{i,3} \to v_{i,4} \to v_{i,3}$, ..., $\zeta_K = v_{i,6} \to v_{i,5}$, and $\zeta_{K+1} = v_{i,3} \to v_{i,6}$. Finally, each training resource is filled into its K+1 historical interaction sequences to form the K+1 historical interaction sequence pairs of that training resource. K of these pairs form the training set (support set) of the training resource, and the remaining pair, called the test interaction sequence pair, forms its test set (query set, serving as the ground-truth label). For example, filling the training resource $v_{i,7}$ into its historical interaction sequences yields the K+1 historical interaction sequence pairs $\zeta_1 = v_{i,1} \to v_{i,2} \to v_{i,7}$, $\zeta_2 = v_{i,3} \to v_{i,4} \to v_{i,3} \to v_{i,7}$, ..., $\zeta_K = v_{i,6} \to v_{i,5} \to v_{i,7}$, and $\zeta_{K+1} = v_{i,3} \to v_{i,6} \to v_{i,7}$.
[0124] It's worth noting that the training resources used in the aforementioned training task construction process are considered as the next click resource. Therefore, placing this next click resource (i.e., the training resource) into the historical interaction sequence and modeling the historical interaction sequence pairs with the added next click resource helps analyze the sequential dependencies between the next click resource and the historical resources included in the historical interaction sequence, and whether this is what the object prefers. If it is, then the object is more likely to click the next click resource according to the sequential dependencies between the historical resources in the historical interaction sequence pair; conversely, if it is not, then the object is less likely to click the next click resource according to the sequential dependencies between the historical resources in the historical interaction sequence pair. This approach, which assumes that the training resource is the next click resource in the historical interaction sequence, can find highly interactive interaction sequences for the training resource. Furthermore, the sequence model trained in this way can find highly interactive (i.e., more likely to be triggered by the object) interaction sequences for the candidate resources to be distributed, not only achieving accurate recommendations for candidate resources but also improving the click-through rate of candidate resources.
[0125] It should be noted that, in the task construction process shown in Figure 4, the training resources for the training task $T_i$ are collected directly from the resource set of the resource recommendation platform, which includes all resources on the platform. In practice, however, model training involves more than the training process itself, for example model validation or testing, so the resources in the resource set usually need to be further divided into a training set and a test set, with more resources allocated to the training set than to the test set. In that case, the training resources for the training task $T_i$ can be collected specifically from the training set, as shown in Figure 5.
[0126] It should also be noted that, in the task construction process shown in Figure 4, the training resources for the training task $T_i$ can also be collected from a sub-resource set of the resource recommendation platform's resource set. The resources included in the sub-resource set depend on the business requirements. For example, in a cold-start scenario, the sub-resource set may consist of all or part of the cold-start resources in the resource set; in other words, extracting all or part of the cold-start resources from the resource set forms a sub-resource set $V_c$, with $V_c \subseteq V$. Furthermore, as described above, the resources in the sub-resource set can be divided into two parts, namely a training subset $V_c^{\mathrm{train}}$ (resources used for training) and a test subset $V_c^{\mathrm{test}}$ (resources used for testing), which share no resources, i.e., $V_c^{\mathrm{train}} \cap V_c^{\mathrm{test}} = \emptyset$. In this implementation, the training resources for the training task $T_i$ are collected specifically from the training subset $V_c^{\mathrm{train}}$ to form the training task set $T_{\text{meta-train}}$, as shown in Figure 6.
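By way of non-limiting illustration, the extraction of a cold-start sub-resource set and its division into disjoint training and test subsets can be sketched as follows. The criterion used here to decide that a resource is a cold-start resource (at most `max_interactions` recorded interactions), the 80/20 ratio, and all names are assumptions of this sketch; the description does not fix them.

```python
import random

def split_cold_start(resources, interaction_counts, max_interactions=5,
                     train_ratio=0.8, seed=0):
    """Return (V_c, V_c_train, V_c_test) as sets.

    Assumption: a resource is treated as a cold-start resource when it has
    at most `max_interactions` recorded interactions.
    """
    v_c = sorted(r for r in resources
                 if interaction_counts.get(r, 0) <= max_interactions)
    shuffled = v_c[:]
    random.Random(seed).shuffle(shuffled)
    cut = int(len(shuffled) * train_ratio)  # larger share goes to training
    return set(v_c), set(shuffled[:cut]), set(shuffled[cut:])

# Hypothetical resource set and interaction counts.
resources = [f"v{n}" for n in range(1, 11)]
interaction_counts = {"v1": 120, "v2": 3, "v3": 0, "v4": 47, "v5": 2,
                      "v6": 1, "v7": 5, "v8": 300, "v9": 9, "v10": 64}
v_c, v_c_train, v_c_test = split_cold_start(resources, interaction_counts)
```

Splitting a shuffled list at a single cut point guarantees that the two subsets share no resources and together cover the whole sub-resource set.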
[0127] S302: Select the training task $T_i$ from the training task set, and call the sequence model to encode the sample pair of each training resource contained in the training task $T_i$, to obtain the sequence feature set and the encoded feature set of the training task $T_i$.
[0128] Following the specific implementation of step S301, once the training task set has been constructed, it can be used to optimize the sequence model to be trained. Specifically, the sequence model is optimized step by step by iterating over the training tasks in the training task set, until a sequence model with better performance is obtained or all training tasks in the training task set have been executed. For ease of explanation, the training process of the sequence model is introduced below by taking a single training task $T_i$ in the training task set as an example, where i is a positive integer.
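By way of non-limiting illustration, the iterative optimization over training tasks can be sketched as the loop below. The callable `train_on_task`, the scalar stand-in for the model, and the use of a loss threshold as the criterion for "better performance" are hypothetical choices made for this sketch only.

```python
def train(model_state, tasks, train_on_task, target_loss):
    """Iterate over the training tasks; stop when the loss is good enough
    (assumed stopping criterion) or when all tasks have been executed."""
    losses = []
    for task in tasks:
        model_state, loss = train_on_task(model_state, task)
        losses.append(loss)
        if loss <= target_loss:
            break
    return model_state, losses

def toy_train_on_task(state, task):
    """Toy stand-in for one optimization step: halves a scalar 'model'
    and reports the new value as the loss."""
    new_state = state * 0.5
    return new_state, new_state

final_state, losses = train(8.0, range(10), toy_train_on_task, target_loss=1.0)
```

In this toy run the loop stops after the third task because the threshold is reached; with a threshold that is never reached, all ten tasks would be executed.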
[0129] In a specific implementation, this embodiment supports calling the sequence model to encode the sample pair (i.e., the multiple training interaction sequence pairs and the one test interaction sequence pair) of each training resource in the training task $T_i$, to obtain the sequence feature set and the encoded feature set of the training task $T_i$. Specifically, an encoding module (as shown in Figure 1a) can be used to encode, for each sample pair in the training task, the multiple training interaction sequence pairs $(\zeta_s, v_{i,n+1})$ to obtain the sequence features of each training resource, and to encode, for each test interaction sequence pair, the historical interaction sequence $(\zeta_s, ?)$ without the corresponding training resource (i.e., with the next-click resource treated as unknown) to obtain the encoded features of each training resource. Both the encoded features and the sequence features take the form of embeddings (fixed-length real-valued vectors that represent the characteristics of resources or objects), so that sequence pairs (training interaction sequence pairs and test interaction sequence pairs) are represented by vectors. The sequence feature set contains the sequence features of each training resource; the sequence features of a training resource are obtained by aggregating the encoded features of the training interaction sequence pairs in its sample pair, and the encoded features of each training interaction sequence pair are obtained by encoding that pair. The encoded feature set contains the encoded features of each training resource; the encoded features of a training resource are obtained by encoding the test interaction sequence pair in its sample pair.
[0130] For example, suppose the training task $T_i$ contains three training resources (training resource 1, training resource 2, and training resource 3), and the sample pair of each training resource includes two training interaction sequence pairs and one test interaction sequence pair. Then: ① the sequence feature set generated for the training task $T_i$ includes the sequence features of training resource 1, training resource 2, and training resource 3, where the sequence features of each training resource are obtained by aggregating the encoded features of the two training interaction sequence pairs in that resource's sample pair; ② the encoded feature set generated for the training task $T_i$ includes the encoded features of training resource 1, training resource 2, and training resource 3, where the encoded features of each training resource are obtained by encoding the one test interaction sequence pair in that resource's sample pair.
[0131] As described above, the sequence features of any training resource can be used to characterize the sequential dependency relationship between the training resource and each historical resource contained in the historical interaction sequence, which is triggered in turn. In other words, sequence features can be obtained by encoding (i.e., modeling) the training resource (specifically, the multiple training interaction sequence pairs and test interaction sequence pairs contained in the sample pairs of the training resource). These sequence features can then be used to analyze the preference features in the interaction sequence of an object, which is beneficial for more personalized resource recommendations based on these preference features. This application does not limit the specific implementation of encoding the interaction sequence (i.e., training interaction sequence pairs and test interaction sequence pairs). To better capture the preference characteristics in the interaction sequence, this application provides an exemplary modeling method, which includes performing long-term preference modeling (analyzing the long-term interests of the object) and short-term preference modeling (analyzing the short-term interests of the object) on the interaction sequence respectively, to fully explore the object's interest resources over a longer period and within a shorter period, thereby facilitating the recommendation of resources that better match the object's interests.
[0132] For ease of explanation, take any training resource in the training task $T_i$ as the target training resource and its sample pair as the target sample pair. The following describes how the sequence model is called to encode the target sample pair to obtain the sequence features and the encoded features of the target training resource: specifically, how the multiple training interaction sequence pairs in the target sample pair are encoded to obtain the sequence features, and how the test interaction sequence pair in the target sample pair is encoded to obtain the encoded features.
[0133] (1) The process of encoding multiple training interaction sequence pairs may include steps s11-s12:
[0134] s11: Encode each training interaction sequence pair among the multiple training interaction sequence pairs contained in the target sample pair to obtain the encoded features of each training interaction sequence pair.
[0135] As described above, the encoding process for training interaction sequence pairs can include short-term preference modeling and long-term preference modeling; the specific implementation process of the two models will be introduced below.
[0136] 1) Short-term preference modeling.
[0137] Specifically, this application supports short-term preference modeling for each of the multiple training interaction sequence pairs contained in a target sample pair, obtaining short-term encoded features for each training interaction sequence pair. In other words, it supports short-term preference modeling for each training interaction sequence pair to capture the short-term preferences of the objects that generated the historical interaction sequences contained in each training interaction sequence pair. This application does not limit the specific modeling method for short-term preference modeling; for example, this application supports using a Long Short-Term Memory (LSTM) network to implement short-term preference modeling; relevant content regarding LSTM networks can be found in the foregoing description and will not be repeated here.
[0138] The specific formula for modeling short-term preferences for a single training interaction sequence pair using a Long Short-Term Memory (LSTM) network is as follows:
[0139] $f_k = \sigma(x_k W_{f1} + h_{k-1} W_{f2} + b_f)$ (1)
[0140] $i_k = \sigma(x_k W_{i1} + h_{k-1} W_{i2} + b_i)$ (2)
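[0141] $\tilde{c}_k = \Phi(x_k W_{c1} + h_{k-1} W_{c2} + b_c)$ (3)
[0142] $c_k = f_k \odot c_{k-1} + i_k \odot \tilde{c}_k$ (4)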
[0143] $o_k = \sigma(x_k W_{o1} + h_{k-1} W_{o2} + b_o)$ (5)
[0144] $h_k = o_k \odot \Phi(c_k)$ (6)
[0145] In the formulas, $W_*$ denotes the learnable parameters of the sequence model (i.e., the model parameters mentioned above), with $W_* \in \mathbb{R}^{D \times D}$, where $\mathbb{R}^{D \times D}$ denotes the D-order square matrices whose entries are real numbers; $b_*$ denotes a bias vector, with $b_* \in \mathbb{R}^{D}$, where $\mathbb{R}^{D}$ denotes the D-dimensional vectors whose components are real numbers. $f_k$, $i_k$ and $o_k$ are the forget gate, the update (input) gate, and the output gate of the Long Short-Term Memory (LSTM) network. Through these three gates, the LSTM network remembers the important information in a sequence and forgets the unimportant information, which enables selective memorization and better handling of long sequences. $c_k$ denotes the cell state; in the LSTM network, the cell state acts like a conveyor belt that carries data along the entire chain of the network, so that each gate can add data to or remove data from the cell state. $\tilde{c}_k$ denotes the newly learned (candidate) cell state. σ is the sigmoid function and Φ is the tanh function; both are activation functions. $x_k$ denotes the embedding of the k-th resource in the training interaction sequence pair currently being modeled; resource embeddings are pre-trained and only need to be looked up during model training. $h_k$ denotes the short-term preference at the k-th resource; in the embodiments of this application, the short-term preference computed at the last resource of the pair is used to represent the short-term preference of the whole training interaction sequence pair.
[0146] As formulas (1)-(6) show, short-term preference modeling for a single training interaction sequence pair is a cumulative calculation. Specifically, the short-term preference of each resource is calculated in turn, following the order of the resources in the training interaction sequence pair; except for the first resource, the short-term preference of each resource depends on the short-term preference result of the previous resource. For example, as shown in Figure 7, suppose the training interaction sequence pair is (historical resource 1, historical resource 2, training resource). First, the short-term preference of historical resource 1 (for example, $v_{i,1}$) is calculated according to formulas (1)-(6), with the parameters of the previous resource in formulas (1)-(6) set to default values. Next, the short-term preference of historical resource 2 is calculated according to formulas (1)-(6) using the short-term preference of historical resource 1. Then, the short-term preference of the training resource is calculated according to formulas (1)-(6) using the short-term preference of historical resource 2. Finally, the short-term preference of the training resource, which is the last resource in the pair, is used as the short-term encoding feature of the entire training interaction sequence pair. The short-term encoding feature represents the degree of preference (or degree of attention), in the short term, for the sequential dependency among the resources in the pair (historical resource 1, historical resource 2, and the training resource).
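By way of non-limiting illustration, the cumulative calculation of the short-term preference can be sketched in plain Python as follows. The gates follow the row-vector form of formulas (1), (2), (5) and (6); the candidate cell state and the cell-state update are written in the standard LSTM form, which is an assumption of this sketch. The randomly initialized parameters, the zero default values for the first resource, and all function names are likewise hypothetical.

```python
import math
import random

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def affine(x, h_prev, gate):
    """Row-vector form x W1 + h_prev W2 + b used by the gate formulas."""
    w1, w2, b = gate
    d = len(b)
    return [sum(x[r] * w1[r][c] + h_prev[r] * w2[r][c] for r in range(d)) + b[c]
            for c in range(d)]

def lstm_step(x, h_prev, c_prev, params):
    f = [sigmoid(z) for z in affine(x, h_prev, params["f"])]        # forget gate
    i = [sigmoid(z) for z in affine(x, h_prev, params["i"])]        # update gate
    cand = [math.tanh(z) for z in affine(x, h_prev, params["c"])]   # candidate (assumed form)
    c = [f[j] * c_prev[j] + i[j] * cand[j] for j in range(len(f))]  # cell state (assumed form)
    o = [sigmoid(z) for z in affine(x, h_prev, params["o"])]        # output gate
    h = [o[j] * math.tanh(c[j]) for j in range(len(o))]             # short-term preference
    return h, c

def short_term_feature(embeddings, params):
    """Run the recurrence over the pair and return the last hidden state."""
    d = len(embeddings[0])
    h, c = [0.0] * d, [0.0] * d  # assumed default values for the first resource
    for x in embeddings:
        h, c = lstm_step(x, h, c, params)
    return h

def random_params(d, seed=0):
    """Hypothetical parameters: random D x D weights and zero biases."""
    rng = random.Random(seed)
    def matrix():
        return [[rng.uniform(-0.5, 0.5) for _ in range(d)] for _ in range(d)]
    return {name: (matrix(), matrix(), [0.0] * d) for name in ("f", "i", "c", "o")}

params = random_params(3)
# Embeddings of (historical resource 1, historical resource 2, training resource).
pair = [[0.1, -0.2, 0.3], [0.0, 0.5, -0.1], [0.4, 0.4, 0.2]]
short_term = short_term_feature(pair, params)
```

Because each step consumes the hidden state and cell state of the previous step, the value returned for the last resource depends on every resource that precedes it in the pair.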
[0147] For each of the multiple training interaction sequence pairs contained in the target sample pair, performing the above short-term preference modeling can yield the short-term encoding features of each training interaction sequence pair. By performing short-term preference modeling on the training interaction sequence pairs, the short-term preferences of the object can be captured better, which is beneficial for recommending resources of interest to the object in the short term.
[0148] 2) Long-term preference modeling.
[0149] Specifically, this application supports long-term preference modeling for each of the multiple training interaction sequence pairs contained in a target sample pair, obtaining the long-term encoded features of each training interaction sequence pair. In other words, it supports long-term preference modeling for each training interaction sequence pair to capture the long-term preferences of the objects that generated the historical interaction sequences contained in each training interaction sequence pair. This application does not limit the specific modeling method for long-term preference modeling; exemplarily, this application supports adding self-learning weights to each resource (historical resources and training resources) in the training interaction sequence pair through an attention mechanism to obtain the long-term encoded features of the training interaction sequence pair. The attention mechanism is an algorithm that selectively focuses on certain parts while ignoring other information.
[0150] Based on the attention mechanism, the specific formula for determining the weight of each resource in the training interaction sequence pair is as follows:
[0151]
[0152]
[0153]
[0154] In the formulas, $W_v$ denotes a learnable parameter of the sequence model (i.e., the model parameters mentioned above), with $W_v \in \mathbb{R}^{D \times D}$, where $\mathbb{R}^{D \times D}$ denotes the D-order square matrices whose entries are real numbers; $b_v$ denotes a bias vector, with $b_v \in \mathbb{R}^{D}$ and $\tau \in \mathbb{R}^{D}$, where $\mathbb{R}^{D}$ denotes the D-dimensional vectors whose components are real numbers; $a_k$ denotes the weight of the k-th resource in the training interaction sequence pair. The long-term encoding feature represents the long-term preference for the training interaction sequence pair; specifically, it is the sum, over the resources in the pair, of each resource multiplied by its weight. The long-term encoding feature represents the degree of preference, over the long term, for the sequential dependency among the resources in the training interaction sequence pair.
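By way of non-limiting illustration, one common way to realize attention weights with a parameter matrix, a bias vector and a vector τ is sketched below. The exact scoring function is not reproduced in this text, so the form used here (a tanh projection scored against τ, followed by softmax normalization) is an assumption; only the final weighted sum of resources follows directly from the description.

```python
import math

def attention_weights(embeddings, w_v, b_v, tau):
    """Assumed form: score_k = tau . tanh(x_k W_v + b_v), a = softmax(score)."""
    d = len(b_v)
    scores = []
    for x in embeddings:
        hidden = [math.tanh(sum(x[r] * w_v[r][c] for r in range(d)) + b_v[c])
                  for c in range(d)]
        scores.append(sum(tau[c] * hidden[c] for c in range(d)))
    top = max(scores)  # subtract the maximum for numerical stability
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def long_term_feature(embeddings, weights):
    """Sum of each resource embedding multiplied by its weight a_k."""
    d = len(embeddings[0])
    return [sum(a * x[c] for a, x in zip(weights, embeddings)) for c in range(d)]

# Hypothetical parameters with D = 2 and a pair of three resources.
w_v = [[0.3, -0.1], [0.2, 0.4]]
b_v = [0.0, 0.1]
tau = [0.5, -0.5]
pair = [[0.2, -0.1], [0.9, 0.3], [-0.4, 0.8]]
weights = attention_weights(pair, w_v, b_v, tau)
long_term = long_term_feature(pair, weights)
```

Unlike the recurrent short-term calculation, every resource contributes directly to the result here, which is why this kind of weighting is suited to capturing preferences that span the whole sequence.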
[0155] Based on implementation methods 1) and 2), after determining the short-term and long-term encoding features of each training interaction sequence pair, this embodiment further connects the short-term and long-term encoding features of the training interaction sequence pair to obtain the encoding features of the training interaction sequence pair. Considering that the preference representation of the training interaction sequence pair may have different emphases at different times (e.g., long-term and short-term), such as objects preferring long-term representations over short-term representations, this embodiment uses an attention mechanism to connect (or fuse) the short-term and long-term encoding features. The formula used to connect the long-term and short-term encoding features of a single training interaction sequence pair based on the attention mechanism is as follows:
[0156]
[0157]
[0158] In the formula, σ is the sigmoid function; α is the weight of the short-term coding features of the training interaction sequence pair; and (1-α) is the weight of the long-term coding features of the training interaction sequence pair.
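By way of non-limiting illustration, the weighted connection of the two features can be sketched as follows. The description states only that σ is the sigmoid function and that α and (1-α) weight the short-term and long-term encoding features; how the input to the sigmoid is computed is not reproduced in this text, so the learned linear gate used here (`w_short`, `w_long`, `bias`) is a hypothetical choice.

```python
import math

def fuse(p_short, p_long, w_short, w_long, bias=0.0):
    """Return (fused feature, alpha), with
    fused = alpha * p_short + (1 - alpha) * p_long.

    Assumption: alpha = sigmoid(w_short . p_short + w_long . p_long + bias).
    """
    z = (sum(w * s for w, s in zip(w_short, p_short))
         + sum(w * l for w, l in zip(w_long, p_long))
         + bias)
    alpha = 1.0 / (1.0 + math.exp(-z))
    fused = [alpha * s + (1.0 - alpha) * l for s, l in zip(p_short, p_long)]
    return fused, alpha

# With zero gate parameters the two features are weighted equally.
fused_equal, alpha_equal = fuse([1.0, 0.0], [0.0, 1.0], [0.0, 0.0], [0.0, 0.0])
# A gate that responds strongly to the short-term feature shifts the result.
fused_short, alpha_short = fuse([1.0, 0.0], [0.0, 1.0], [5.0, 0.0], [0.0, 0.0])
```

Because α lies strictly between 0 and 1, each component of the fused feature always lies between the corresponding components of the short-term and long-term features.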
[0159] In summary, based on the specific implementation process of short-term preference modeling, long-term preference modeling, and feature connection described above, the encoding features of each training interaction sequence pair in the target sample pair can be determined, so as to determine the sequential dependency relationship between each historical resource and each training resource.
[0160] s12: Aggregate the encoded features of each training interaction sequence pair to obtain the sequence features corresponding to the target training resource.
[0161] Based on step s11, the encoded features of each training interaction sequence pair in the target sample pair are obtained; these encoded features can be represented as a set $P_i$ of K elements, where K is the total number of training interaction sequence pairs in the target sample pair.
[0162] Next, the encoded features of the training interaction sequence pairs contained in the same sample pair need to be aggregated to obtain the sequence features of the corresponding training resource (such as the target training resource of the target sample pair). In other words, the encoded features of multiple training interaction sequence pairs that share the same next-click resource are aggregated to determine the sequence features of that training resource. In a specific implementation, an aggregation function can be used to aggregate the encoded features of the training interaction sequence pairs in the target sample pair. The aggregation function may be, for example, mean pooling, max pooling, stochastic pooling, or global average pooling. Mean pooling averages multiple features (such as encoded features) within a neighborhood; max pooling takes the maximum of multiple features within a neighborhood; stochastic pooling randomly selects one of the features within a neighborhood according to their probability values; and global average pooling averages all features, after which the averaged features are input into an activation function (such as the sigmoid function) to obtain a score for each category.
[0163] This application does not limit which aggregation function is used to aggregate the encoded features of the multiple training interaction sequence pairs. For example, mean pooling can be used to aggregate the encoded features of the training interaction sequence pairs to obtain the pooled feature $a_i = \mathrm{pooling}(P_i)$; the pooled feature represents the sequence features of the support set of the target training resource.
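By way of non-limiting illustration, aggregating the K encoded features of one sample pair by pooling can be sketched as follows. Each encoded feature is assumed to be a list of D real numbers, and the function names are chosen for this sketch only.

```python
def mean_pooling(features):
    """Element-wise average of the K encoded features."""
    k = len(features)
    return [sum(column) / k for column in zip(*features)]

def max_pooling(features):
    """Element-wise maximum of the K encoded features."""
    return [max(column) for column in zip(*features)]

# Hypothetical set P_i with K = 2 encoded features of dimension D = 2.
p_i = [[1.0, 2.0], [3.0, 6.0]]
a_i = mean_pooling(p_i)       # [2.0, 4.0]
a_i_max = max_pooling(p_i)    # [3.0, 6.0]
```

Either function maps a variable number of encoded features to a single vector of the same dimension, so the sequence features of different training resources remain comparable even when K differs between them.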
[0164] Furthermore, the training task $T_i$ contains multiple training resources. After steps s11-s12 above have been executed for the multiple training interaction sequence pairs in the sample pair of each training resource, the sequence features of each training resource are obtained, and these sequence features form the sequence feature set of the training task $T_i$. For an exemplary overall flowchart of generating the sequence feature set of the training task $T_i$, see Figure 8.
[0165] (2) The process of encoding the test interaction sequence pairs may include steps s21-s22:
[0166] s21: Perform short-term encoding on a test interaction sequence pair contained in the target sample pair to obtain the short-term encoding features of the test interaction sequence pair; and perform long-term encoding on the test interaction sequence pair to obtain the long-term encoding features of the test interaction sequence pair.
[0167] It should be noted that the specific implementation process of short-term encoding and long-term encoding of test interaction sequence pairs in this application embodiment is similar to the specific implementation process of short-term encoding and long-term encoding of training interaction sequence pairs; for details, please refer to the relevant description of the specific implementation process of short-term encoding and long-term encoding of training interaction sequence pairs, which will not be repeated here.
[0168] It should also be noted that the training objective of the sequence model in this embodiment is for the trained sequence model to accurately predict the next-click resource of an interaction sequence pair. Therefore, when the test interaction sequence pair in the target sample pair is encoded, only the historical interaction sequence, without the target training resource, is encoded to obtain the encoded features of the target training resource. That is, the target training resource $v_{i,n+1}$ in the test interaction sequence pair $(\zeta_s, v_{i,n+1})$ does not participate in the encoding; only the sequence $(\zeta_s, ?)$ does. The test interaction sequence pair $(\zeta_s, v_{i,n+1})$ in the target sample pair serves as the ground-truth label, so that it can later be compared with the prediction result for the sequence $(\zeta_s, ?)$. The sequence model is adjusted continuously based on the comparison result, so that the adjusted sequence model predicts the target training resource in the test interaction sequence pair more accurately.
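By way of non-limiting illustration, separating a test interaction sequence pair into the part that is encoded and the part that is held back as the ground-truth label can be sketched as follows. The function name and the list representation of a pair are assumptions of this sketch.

```python
def split_test_pair(test_pair):
    """Split a test pair into (historical sequence to encode, hidden label).

    The last element is the next-click resource; it is withheld from the
    encoder and kept only as the ground-truth label.
    """
    if len(test_pair) < 2:
        raise ValueError("a test pair needs at least one historical resource")
    *history, label = test_pair
    return history, label

history, label = split_test_pair(["v3", "v6", "v7"])
# history == ["v3", "v6"] is encoded; label == "v7" is only used for comparison.
```

Keeping the label out of the encoder input is what makes the later comparison meaningful: the model has to infer the next-click resource from the history alone.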
[0169] s22: The short-term and long-term coding features of the test interaction sequence pair are fused to obtain the coding features of the test interaction sequence pair. The coding features of the test interaction sequence pair are directly used as the coding features corresponding to the target training resource.
[0170] It is easy to understand that, as described above, the target sample pair contains only one test interaction sequence pair. Therefore, it is only necessary to perform short-term encoding and long-term encoding on this single test interaction sequence pair. Furthermore, by fusing the short-term encoded features obtained from the short-term encoding and the long-term encoded features obtained from the long-term encoding, the encoded features corresponding to the target training resource can be directly obtained. This eliminates the need for the encoding feature aggregation operations performed as in calculating the sequence features corresponding to the target training resource. The specific implementation process for connecting (or fusing) the short-term and long-term encoded features of the test interaction sequence pair can be found in the aforementioned description of the specific implementation process for connecting the short-term and long-term encoded features of the training interaction sequence pair, and will not be repeated here.
[0171] Furthermore, the training task $T_i$ contains multiple training resources. After steps s21-s22 above have been executed for the test interaction sequence pair in the sample pair of each training resource, the encoded features of each training resource are obtained, and these encoded features form the encoded feature set of the training task $T_i$. For an exemplary overall flowchart of generating the encoded feature set of the training task $T_i$, see Figure 9.
[0172] In summary, based on the foregoing descriptions, a sequence feature set and an encoding feature set corresponding to a training task can be determined. The sequence feature set contains the sequence features of each training resource among the multiple training resources included in the training task. Similarly, the encoding feature set contains the encoding features of each training resource among the multiple training resources included in the training task.
[0173] S303: Calculate loss information based on the sequence feature set, the encoded feature set, and the multiple test interaction sequence pairs of the training task $T_i$; and update the model parameters of the sequence model in the direction that decreases the loss information, to obtain the updated sequence model.
[0174] As described above, the training objective of a sequence model is to predict the next potentially clicked resource based on historical interaction sequences. Therefore, after obtaining the set of encoded features (or test feature set) corresponding to the target training resource, it is necessary to perform similarity calculations (or operations) between each encoded feature in this set and all sequence features in the sequence feature set. This allows for the assessment of the sequence model's prediction performance based on the similarity calculation results. Specifically, if the similarity calculation result indicates that the encoded features of the test interaction sequence pair in the same sample pair have a high similarity to the sequence features of multiple training interaction sequence pairs, the current sequence model's prediction performance is good. Conversely, if the similarity calculation result indicates that the encoded features of the test interaction sequence pair in sample pair 1 have a high similarity to the sequence features of multiple training interaction sequence pairs in other sample pairs, but a low similarity to multiple training interaction sequence pairs in the same sample pair, the current sequence model's prediction performance is poor.
[0175] In a specific implementation, ① a similarity calculation is performed between each encoded feature in the encoded feature set and each sequence feature in the sequence feature set, to obtain multiple similarity results. For example, the following similarity formula can be used:
[0176]
[0177] Here, $B_b$ represents the b-th encoded feature in the encoded feature set; $A_a$ represents the a-th sequence feature in the sequence feature set; and $\mathrm{sim}(B_b, A_a)$ represents the similarity result between the b-th encoded feature in the encoded feature set and the a-th sequence feature in the sequence feature set. $\mathrm{sim}(B_b, A_a)$ can also be denoted $c_{ij}$, that is, $c_{ij} = \mathrm{sim}(B_b, A_a)$.
[0178] Therefore, each encoded feature in the encoded feature set has multiple similarity calculation results; that is, there is one similarity calculation result between an encoded feature and each sequence feature in the sequence feature set. To facilitate the calculation of loss information, the multiple similarity calculation results for each encoded feature can also be normalized. The normalized result takes the form of a vector, where n is the total number of sequence features in the sequence feature set.
[0179] ② Further, the training resource contained in the test interaction sequence pair corresponding to each encoded feature in the encoded feature set is compared with the training resources contained in the interaction sequence pairs corresponding to the $T_i$-th training task, yielding multiple comparison results. For example, the actual next-click resource in the test interaction sequence pair corresponding to the b-th encoded feature in the encoded feature set is compared with the next-click resource in the training interaction sequence pair corresponding to the a-th sequence feature in the sequence feature set, yielding one comparison result. This comparison result indicates either that the two resources are the same or that they are different.
[0180] ③ Based on the aforementioned similarity calculation results and corresponding comparison results, as well as the loss function, loss information is obtained. The loss function can be expressed as:
[0181]
[0182] Here, $y_{ba}$ represents the comparison result between the training resource in the test interaction sequence pair corresponding to the b-th encoded feature in the encoded feature set and the training resource in the training interaction sequence pair corresponding to the a-th sequence feature in the sequence feature set. If the comparison result indicates that the actual next-click resource in the test interaction sequence pair corresponding to the b-th encoded feature is the same as the training resource contained in the compared interaction sequence pair of the $T_i$-th training task, then $y_{ba} = 1$. If the comparison result indicates that the two resources are different, then $y_{ba} = 0$.
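Steps ① to ③ can be illustrated with a short sketch. The similarity formula and the loss function are not reproduced in this text, so the sketch assumes cosine similarity, softmax normalization, and a cross-entropy loss; these choices, together with all function names and sample values, are illustrative assumptions and not the formulas of this application.

```python
import numpy as np

def cosine_similarity_matrix(encoded, sequence):
    """c[b, a] = similarity of the b-th encoded feature and a-th sequence feature.
    Cosine similarity is an assumption; the source formula is not shown."""
    enc = encoded / np.linalg.norm(encoded, axis=1, keepdims=True)
    seq = sequence / np.linalg.norm(sequence, axis=1, keepdims=True)
    return enc @ seq.T

def softmax_rows(scores):
    """Normalize the n similarity results of each encoded feature into a vector."""
    shifted = scores - scores.max(axis=1, keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=1, keepdims=True)

def comparison_labels(test_resources, train_resources):
    """y[b, a] = 1 if the b-th test pair's next-click resource equals the
    a-th training resource, otherwise 0."""
    test = np.asarray(test_resources)[:, None]
    train = np.asarray(train_resources)[None, :]
    return (test == train).astype(float)

def cross_entropy_loss(probs, labels, eps=1e-12):
    """Assumed loss: mean cross-entropy between labels and normalized similarities."""
    return float(-(labels * np.log(probs + eps)).sum(axis=1).mean())

# Hypothetical toy data: two sequence features A_1, A_2 and two encoded features B_1, B_2.
sequence_features = np.array([[1.0, 0.0], [0.0, 1.0]])
encoded_features = np.array([[0.9, 0.1], [0.2, 0.8]])

probs = softmax_rows(cosine_similarity_matrix(encoded_features, sequence_features))
labels = comparison_labels(["r1", "r2"], ["r1", "r2"])
loss = cross_entropy_loss(probs, labels)
```

Under these assumptions, the loss is small when each encoded feature is most similar to the sequence feature of its own training resource, which matches the qualitative criterion for good prediction performance described above.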
[0183] After the sequence model has been trained with the $T_i$-th training task and the loss information has been obtained based on steps ①②③ above, it can be further determined whether the loss information meets the training termination condition. If it does, the sequence model obtained in this round of training is taken as the trained sequence model. If it does not, the model parameters of the sequence model can be optimized based on the loss information to obtain an updated sequence model, and step S304 is executed to continue training the updated sequence model. The training termination condition may include: all training tasks in the training task set have been executed; or the difference between the loss information of adjacent training iterations approaches 0; or the loss information obtained in this round of training approaches a preset value (such as 1 or 0).
[0184] S304. Select the $T_{i+1}$-th training task from the training task set, and use the $T_{i+1}$-th training task to iteratively train the updated sequence model until the sequence model tends to stabilize.
[0185] It can be understood that, after the $T_{i+1}$-th training task is selected from the training task set, the specific implementation process of training the updated sequence model with the $T_{i+1}$-th training task is the same as the specific implementation process of training the sequence model with the $T_i$-th training task; for details, please refer to the relevant descriptions of the specific implementation process shown in steps S302-S303 above, which will not be repeated here.
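The task-by-task iteration and the termination conditions described above can be sketched as follows. The callable `train_step`, the parameter names `tolerance` and `target_loss`, and the sample loss values are hypothetical and serve only to illustrate the control flow.

```python
def train_over_tasks(tasks, train_step, tolerance=1e-4, target_loss=0.0):
    """Train on T_1, T_2, ... until a termination condition holds.

    `train_step(task)` is a hypothetical callable that trains the sequence
    model on one task, updates its parameters, and returns the scalar loss.
    """
    previous_loss = None
    history = []
    for task in tasks:  # condition: all training tasks have been executed
        loss = train_step(task)
        history.append(loss)
        # Condition: the loss approaches a preset value.
        if abs(loss - target_loss) <= tolerance:
            break
        # Condition: the loss difference between adjacent iterations approaches 0.
        if previous_loss is not None and abs(previous_loss - loss) <= tolerance:
            break
        previous_loss = loss
    return history

# Hypothetical usage with pre-recorded loss values standing in for real training.
fake_losses = iter([1.0, 0.5, 0.25, 0.25, 0.1])
history = train_over_tasks(range(5), lambda task: next(fake_losses))
```

In this toy run the loop stops at the fourth task, because the loss no longer changes between adjacent iterations.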
[0186] Based on the specific implementation process shown in steps S301-S304 above, the sequence model can be trained using meta-learning methods and related technologies such as Long Short-Term Memory networks to obtain a trained sequence model. This trained sequence model can effectively predict the next clicked resource for a target object based on its historical interaction sequence, achieving accurate resource recommendation. The general process of calling the trained model for resource recommendation includes: obtaining a set of candidate resources to be recommended, containing M candidate resources and the sequence features of each candidate resource; then, obtaining the encoding features of the target object's historical interaction sequence; finally, determining, from the candidate resource set, the target resources whose sequence features match the encoding features of the target object's historical interaction sequence, and recommending the target resources to the target object. The model application process is described in detail in the subsequent embodiments and is not elaborated upon here.
[0187] In this application embodiment, on the one hand, training interaction sequence pairs and test interaction sequence pairs can be constructed according to the meta-learning approach to train the sequence model. This makes it possible to exploit meta-learning's ability to learn quickly and perform well even with limited training resources in resource recommendation scenarios (such as cold start scenarios), so that a high-performance sequence model can be trained with limited training resources. On the other hand, long-term preference modeling can be performed on an object's historical interaction sequence to capture the object's long-term interests (such as the sequential dependencies between multiple resources browsed within a month), and short-term preference modeling can be performed on an object's historical interaction sequence to capture the object's short-term interests (such as the sequential dependencies between multiple resources browsed within a day). Furthermore, an attention mechanism is used to balance the emphasis on long-term and short-term preferences (for example, giving greater weight to long-term preferences during resource recommendation) to capture the object's interaction preferences more accurately, thereby enabling the trained sequence model to accurately push resources of interest to the target object.
[0188] The embodiment shown in Figure 3 above mainly describes the specific implementation process of the model training part of the resource recommendation method. The following describes, with reference to Figure 10, the specific implementation process of the model application part of the resource recommendation method. Figure 10 shows a flowchart of a resource recommendation method provided by an exemplary embodiment of this application; the resource recommendation method can be executed by the aforementioned computer device; the method may include, but is not limited to, steps S1001-S1003, wherein:
[0189] S1001. Obtain the set of candidate resources to be recommended.
[0190] The candidate resource set contains M candidate resources to be distributed, and the sequence features of each candidate resource. Specifically: ① In a cold start scenario, these M candidate resources can be cold start resources on a resource recommendation platform (such as newly uploaded short videos); of course, all M candidate resources can be cold start resources or only some of them can be cold start resources, without limitation. ② Taking the j-th candidate resource among the M candidate resources as an example, this j-th candidate resource corresponds to $Q_j$ historical interaction sequences, and the sequence features of the j-th candidate resource are obtained by encoding, based on meta-learning and an attention mechanism, the interaction sequence pairs formed from the $Q_j$ historical interaction sequences and the j-th candidate resource; specifically, they are obtained by calling the sequence model trained in the previous steps and performing calculations on the j-th candidate resource. The sequence features of the j-th candidate resource can be used to characterize the sequential dependency between the j-th candidate resource and the historical resources contained in the $Q_j$ historical interaction sequences, which were triggered in order; M, j, and $Q_j$ are all positive integers, and j≤M.
[0191] In practical implementation, when there is a need to distribute resources (such as a large number of cold-start resources in a resource recommendation platform), M candidate resources to be recommended can be obtained from the resource recommendation platform. Furthermore, multiple historical interaction sequences are collected for each candidate resource. These multiple historical interaction sequences can be generated by the same or different objects, and any historical interaction sequence is an interaction sequence obtained by the same object sequentially triggering operations on multiple historical resources within a historical time period. Then, based on the multiple historical interaction sequences corresponding to each candidate resource, each candidate resource is encoded to obtain the sequence features of each candidate resource. Finally, the M candidate resources to be recommended and the sequence features of each candidate resource are combined to form a set of candidate resources to be recommended.
[0192] It should be noted that the process of determining the sequence features of each of the M candidate resources in the above-described implementation of constructing the candidate resource set to be recommended is the same as the process of determining the sequence features of each training resource mentioned in the aforementioned model training. The following only provides a general outline of the process for determining the sequence features of candidate resources; for the detailed implementation, please refer to the relevant descriptions of the embodiment shown in Figure 3 above. The process of determining the sequence features of candidate resources generally includes:
[0193] (1) According to the filling rules of meta-learning, each candidate resource is filled into multiple corresponding historical interaction sequences to obtain multiple historical interaction sequence pairs for each candidate resource; a historical interaction sequence pair includes a historical interaction sequence and a candidate resource. In other words, according to the learning method of meta-learning, historical interaction sequence pairs can be constructed based on historical interaction sequences and candidate resources to be recommended; and subsequent operations can be performed based on historical interaction sequence pairs. Among them, the filling rules of meta-learning may include: placing the candidate resource after the last historical resource in the historical interaction sequence; for example, any historical interaction sequence in the multiple historical interaction sequences collected for the j-th candidate resource can be represented as (historical resource 1, historical resource 2, historical resource 3), indicating that the same object triggered historical resource 1→historical resource 2→historical resource 3 in sequence during the historical time. Then, according to the filling rules of meta-learning, the filled historical interaction sequence pair is represented as (historical resource 1, historical resource 2, historical resource 3, j-th candidate resource).
[0194] (2) Encode multiple historical interaction sequence pairs for each candidate resource to obtain the sequence features of each candidate resource. The process of determining the sequence features of each candidate resource is the same; the following uses the j-th candidate resource as an example to illustrate the general implementation process of determining the sequence features of the candidate resource, wherein:
[0195] ① Encode each historical interaction sequence pair among the multiple historical interaction sequence pairs of the j-th candidate resource to obtain the encoding features of each historical interaction sequence pair. Specifically, short-term preference modeling is performed on each historical interaction sequence pair among the multiple historical interaction sequence pairs of the j-th candidate resource to obtain the short-term encoding features of each historical interaction sequence pair; and long-term preference modeling is performed on each historical interaction sequence pair among the multiple historical interaction sequence pairs of the j-th candidate resource to obtain the long-term encoding features of each historical interaction sequence pair. For the specific implementation process of short-term and long-term preference modeling of historical interaction sequence pairs, refer to the specific implementation process of short-term and long-term preference modeling of training interaction sequence pairs in the model training shown in Figure 3 above, which is not detailed here. Then, based on an attention mechanism, the short-term encoding features and the corresponding long-term encoding features of each historical interaction sequence pair are fused to obtain the encoding features of each historical interaction sequence pair; specifically, the short-term and long-term encoding features of the same historical interaction sequence pair are concatenated to obtain the encoding features of that historical interaction sequence pair. For the specific implementation process of fusing the short-term and long-term encoding features of the same historical interaction sequence pair based on the attention mechanism, refer to the specific implementation process of fusing the short-term and long-term encoding features of training interaction sequence pairs in the model training shown in Figure 3 above, which is not described in detail here.
[0196] ② Aggregate the encoded features of each historical interaction sequence pair to obtain the sequence features of the j-th candidate resource. It should be noted that, for the specific implementation process of the aggregation, refer to the specific implementation process of aggregating the encoded features of each training interaction sequence pair in the embodiment shown in Figure 3 above, which will not be elaborated here. For example, if the aggregation function mean-pooling is used, the specific implementation process may include: performing mean pooling on the encoded features of multiple historical interaction sequence pairs of the j-th candidate resource to obtain pooled features; and then using the pooled features as the sequence features of the j-th candidate resource. Of course, if other functions are used for aggregation, the specific implementation process of aggregation can be adapted accordingly, which is not elaborated here.
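Steps (1) and (2) can be sketched as follows: the filling rule appends the candidate resource after the last historical resource, and mean pooling aggregates the per-pair encodings. The function `encode_pair` is a hypothetical stand-in (a plain average of item embeddings) for the trained sequence model, and the embedding table and resource names are invented for illustration.

```python
import numpy as np

def fill_sequences(history_sequences, candidate):
    """Filling rule: place the candidate after the last historical resource."""
    return [list(seq) + [candidate] for seq in history_sequences]

def encode_pair(pair, embeddings):
    """Hypothetical stand-in encoder: average the embeddings of the items.
    The real encoder is the trained sequence model, which is not shown here."""
    return np.mean([embeddings[item] for item in pair], axis=0)

def sequence_feature(history_sequences, candidate, embeddings):
    """Encode every filled pair, then mean-pool into one sequence feature."""
    pairs = fill_sequences(history_sequences, candidate)
    encoded = np.stack([encode_pair(pair, embeddings) for pair in pairs])
    return encoded.mean(axis=0)

# Hypothetical embedding table and history for one candidate resource.
embeddings = {
    "h1": np.array([1.0, 0.0]),
    "h2": np.array([0.0, 1.0]),
    "h3": np.array([1.0, 1.0]),
    "candidate_j": np.array([2.0, 2.0]),
}
histories = [["h1", "h2", "h3"], ["h3"]]

pairs = fill_sequences(histories, "candidate_j")
feature = sequence_feature(histories, "candidate_j", embeddings)
```

Because the sequence features depend only on the candidate resource and its collected histories, they can be computed ahead of any resource distribution request.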
[0197] Based on the above description, a set of candidate resources to be recommended can be determined. It's worth noting that this set of candidate resources is automatically generated by the backend (e.g., a server). After the candidate resource set is generated, when a resource distribution request (such as a distribution request sent by any object to request resource distribution) is subsequently received, the target resource to be distributed can be directly determined from the generated candidate resource set. By pre-generating the candidate resource set, it is beneficial to respond quickly to resource distribution requests and promptly recommend the target resource to the object, improving both the speed of resource recommendation and the user experience.
[0198] It is easy to understand that the resources in a resource recommendation platform are constantly changing; for example, new resources may be added at any time, and resources may be deleted by the uploader at any time. Therefore, this application embodiment also supports automatically and dynamically updating the candidate resource set to be recommended, specifically updating the candidate resources and their sequence features. This ensures that each time a resource distribution request is received, resource distribution can be performed based on the latest candidate resource set, meeting the uploader's need for rapid resource release. For example, when the resource recommendation platform receives a newly added resource X, it can call a trained sequence model to encode the resource X and obtain its sequence features. Then, the resource X and its sequence features are added to the candidate resource set to be recommended, thus updating the candidate resource set. Of course, considering that the interaction preferences of objects may change at different times, this application embodiment also supports automatically updating the candidate resources to be recommended based on the dynamic changes of objects (such as changes in the number of objects or changes in the interaction sequences generated by objects), ensuring that the resource recommendation platform can accurately grasp the interaction preferences of objects and recommend resources of interest to them.
[0199] S1002. Obtain the encoding features of the target historical interaction sequence of the target object to be recommended.
[0200] When a target object has a need to acquire resources, the resource recommendation platform can receive a resource distribution request sent by the target object. In response to this request, the platform can capture the target object's interaction preferences to provide personalized resource recommendations. Optionally, the resource distribution request may be automatically generated when the target object starts the platform; alternatively, it may be generated when the target object performs a refresh operation on the platform's resource display interface. This embodiment does not limit the specific method of generating the resource distribution request.
[0201] In the specific implementation, in response to the acquired resource distribution request, the target historical interaction sequence of the target object to be recommended can be obtained. It should be noted that, in order to distribute the recently interested resources to the target object, the acquired target historical interaction sequence is often generated by the target object in the most recent time period. For example, if the target object generates a resource distribution request by performing a refresh operation in the resource display interface, then the target historical interaction sequence can include: each historical resource triggered sequentially by the target object in the resource display interface within a period of time before this refresh operation. Then, the target historical interaction sequence is encoded to obtain its encoded features. The specific encoding process can include: performing short-term preference modeling on the target historical interaction sequence to obtain its short-term encoded features, and performing long-term preference modeling on the target historical interaction sequence to obtain its long-term encoded features. Then, based on an attention mechanism, the short-term and long-term encoded features of the target historical interaction sequence are fused to obtain the final encoded features of the target historical interaction sequence. The above only gives a general implementation flow of the encoding process; the specific implementation details of the encoding process can be found in the aforementioned descriptions and will not be repeated here.
[0202] The encoding features of the target historical interaction sequence can be used to characterize the degree of attention a target object pays to the sequential dependencies between multiple historical resources contained in the sequence. In other words, the encoding features of the target historical interaction sequence can be used to represent that the target object has a preference for the sequential dependencies between multiple historical resources contained in the sequence. For example, if the target historical interaction sequence includes (flight tickets, hotel resources), then the encoding features of this sequence can be used to represent that the target object is interested in the sequential dependency between flight tickets and hotel resources. This suggests that the target object may have a need for business or leisure travel, and based on this sequential dependency, it can be inferred that the next resource clicked by the target object may be a tourist attraction. By capturing the target object's interaction preferences, not only can accurate resource recommendations be achieved based on these preferences, but the resource distribution needs of the target object can also be met, thus improving the target object's experience to a certain extent.
[0203] S1003. Determine, from the candidate resource set, the target resources whose sequence features match the encoding features of the target historical interaction sequence of the target object, and recommend the target resources to the target object.
[0204] After the interaction preferences of the target object have been captured based on the aforementioned steps, in order to find the resource that the target object is most likely to click next, similarity matching can be performed between the encoding features of the target historical interaction sequence and the sequence features of each candidate resource in the candidate resource set. The sequence features in the candidate resource set that are similar to the interaction preferences of the target object are thereby selected, and the candidate resources corresponding to those sequence features are recommended to the target object as target resources. In this way, resources of interest are recommended to the target object and the interactivity of the target resources is improved.
[0205] In the specific implementation, the encoded features of the target historical interaction sequence are compared with the sequence features of each candidate resource in the candidate resource set to obtain M similarity calculation results (the candidate resource set contains M candidate resources). Then, from the M similarity calculation results, one or more target similarity calculation results that are greater than a similarity threshold are determined. Subsequently, the sequence features corresponding to each target similarity calculation result are determined as target sequence features that match the encoded features of the target historical interaction sequence. Finally, the candidate resources corresponding to the target sequence features are taken as target resources. An exemplary diagram illustrating the determination of target resources that match the interaction preferences of the target object from the candidate resource set can be found in Figure 11.
[0206] It is understood that the number of candidate resources identified as target resources varies depending on the value of the similarity threshold; this application embodiment does not limit the specific value of the similarity threshold. For example, it is possible to determine, from the M similarity calculation results, the target similarity calculation result that indicates the highest similarity to the encoded features of the target historical interaction sequence; to determine the sequence features associated with that target similarity calculation result as the target sequence features that match the encoded features of the target historical interaction sequence; and to use the candidate resource corresponding to the target sequence features as the target resource.
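The two selection strategies described above (similarity threshold and highest similarity) can be sketched as follows. Cosine similarity is assumed because the similarity measure is not fixed here, and the function name, resource identifiers, and feature values are hypothetical.

```python
import numpy as np

def match_candidates(target_encoding, candidate_features, candidate_ids, threshold=None):
    """Return the ids of candidates whose sequence features match the target encoding.

    With `threshold=None`, only the single most similar candidate is returned;
    otherwise every candidate whose similarity exceeds the threshold is returned.
    Cosine similarity is an assumption made for this sketch.
    """
    target = target_encoding / np.linalg.norm(target_encoding)
    feats = candidate_features / np.linalg.norm(candidate_features, axis=1, keepdims=True)
    scores = feats @ target  # M similarity calculation results
    if threshold is None:
        return [candidate_ids[int(np.argmax(scores))]]
    return [cid for cid, score in zip(candidate_ids, scores) if score > threshold]

# Hypothetical pre-computed sequence features of M = 3 candidate resources.
candidate_ids = ["video_a", "video_b", "video_c"]
candidate_features = np.array([[1.0, 0.0], [0.8, 0.6], [0.0, 1.0]])
target_encoding = np.array([1.0, 0.1])  # encoding of the target historical interaction sequence

best = match_candidates(target_encoding, candidate_features, candidate_ids)
above = match_candidates(target_encoding, candidate_features, candidate_ids, threshold=0.8)
```

A lower threshold returns more target resources, while the highest-similarity variant always returns exactly one.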
[0207] This application supports determining sequence features for each candidate resource in a set of candidate resources to be recommended based on meta-learning, attention mechanisms, and historical interaction sequences (containing multiple historical resources triggered sequentially within a historical time period). The sequence features of any candidate resource can be used to characterize the sequential dependency relationship between that candidate resource and each historical resource in the historical interaction sequence, such as the triggering of resource 2 (bait) depending on the triggering of resource 1 (hook). This allows the triggering sequence dependency relationship between each candidate resource and the historical resources to be analyzed in advance. During resource recommendation, after the target historical interaction sequence of the target object to be recommended is obtained, the target object's degree of attention to (or preference for) a certain sequential dependency relationship can be represented by the encoding features of the target historical interaction sequence. The target resource with the highest similarity can then be matched from the sequence features of the M candidate resources, where M is a positive integer. In other words, the target resource that best matches the target object's interaction preferences is found from the M candidate resources. This facilitates recommending potentially interesting target resources to the target object and improves the click-through rate of the target resources.
In summary, this application provides a novel resource recommendation scheme that supports the analysis of the sequential dependencies between resources (including candidate resources to be recommended) and, based on the interaction preferences reflected in the target object's historical interaction sequence, determines the target resource that best matches the target object's interaction preferences from multiple candidate resources to be recommended. This enables more accurate delivery of resources of interest to the target object and improves the target object's resource recommendation experience.
[0208] The methods of the embodiments of this application have been described in detail above. In order to facilitate better implementation of the above solutions of the embodiments of this application, the apparatus of the embodiments of this application is provided below.
[0209] Figure 12 shows a schematic structural diagram of a resource recommendation device provided in an exemplary embodiment of this application; the resource recommendation device can be a computer program (including program code) running in a computer device; the resource recommendation device can be used to execute some or all of the steps in the method embodiments shown in Figure 3 and Figure 10. Referring to Figure 12, the resource recommendation device includes the following units:
[0210] The acquisition unit 1201 is used to acquire a set of candidate resources to be recommended. The set of candidate resources contains M candidate resources and the sequence features of each candidate resource; the j-th candidate resource corresponds to $Q_j$ historical interaction sequences, and the sequence features of the j-th candidate resource are obtained by encoding, based on meta-learning and an attention mechanism, the interaction sequence pairs formed from the $Q_j$ historical interaction sequences and the j-th candidate resource; the sequence features of the j-th candidate resource are used to characterize the sequential dependency between the j-th candidate resource and the historical resources contained in the $Q_j$ historical interaction sequences, which were triggered in order; M, j, and $Q_j$ are all positive integers, and j≤M;
[0211] The processing unit 1202 is used to obtain the encoding features of the target historical interaction sequence of the target object to be recommended. The encoding features of the target historical interaction sequence are used to characterize the degree of attention the target object pays to the sequential dependency relationship between multiple historical resources contained in the target historical interaction sequence.
[0212] The processing unit 1202 is further configured to determine, from the candidate resource set, the target resources whose sequence features match the encoding features of the target historical interaction sequence of the target object, and to recommend the target resources to the target object.
[0213] In one implementation, the acquisition unit 1201, when acquiring the set of candidate resources to be recommended, is specifically used for:
[0214] Obtain M candidate resources to be recommended, and collect multiple historical interaction sequences for each candidate resource; any historical interaction sequence is obtained based on the operation of multiple historical resources triggered by the same object in a historical time period.
[0215] Based on multiple historical interaction sequences corresponding to each candidate resource, each candidate resource is encoded to obtain the sequence features of each candidate resource;
[0216] The M candidate resources to be recommended and the sequence features of each candidate resource are combined to form a set of candidate resources to be recommended.
[0217] In one implementation, the processing unit 1202, when encoding each candidate resource based on multiple historical interaction sequences corresponding to each candidate resource to obtain the sequence features of each candidate resource, is specifically used to:
[0218] According to the filling rules of meta-learning, each candidate resource is filled into multiple corresponding historical interaction sequences, resulting in multiple historical interaction sequence pairs corresponding to each candidate resource; a historical interaction sequence pair includes a historical interaction sequence and a candidate resource.
[0219] Encode multiple historical interaction sequence pairs corresponding to each candidate resource to obtain the sequence features of each candidate resource.
[0220] In one implementation, the processing unit 1202, when encoding multiple historical interaction sequence pairs corresponding to the j-th candidate resource to obtain the sequence features of the j-th candidate resource, specifically performs the following:
[0221] For the j-th candidate resource, each historical interaction sequence pair is encoded to obtain the encoded features of each historical interaction sequence pair.
[0222] The encoded features of each historical interaction sequence pair are aggregated to obtain the sequence features of the j-th candidate resource.
[0223] In one implementation, the processing unit 1202, when encoding each historical interaction sequence pair among multiple historical interaction sequence pairs of the j-th candidate resource to obtain the encoded features of each historical interaction sequence pair, specifically performs the following:
[0224] For multiple historical interaction sequence pairs of the j-th candidate resource, short-term preference modeling is performed on each historical interaction sequence pair to obtain the short-term encoding features of each historical interaction sequence pair.
[0225] For multiple historical interaction sequence pairs of the j-th candidate resource, long-term preference modeling is performed on each historical interaction sequence pair to obtain the long-term encoding features of each historical interaction sequence pair.
[0226] Based on the attention mechanism, the short-term encoding features and the corresponding long-term encoding features of each historical interaction sequence pair are fused to obtain the encoded features of each historical interaction sequence pair.
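As a rough sketch of the fusion step, the short-term and long-term features can be weighted by attention scores and summed. The scoring function is not specified in this passage, so the sketch assumes dot-product scoring against a hypothetical learned query vector `query`; the function names are likewise illustrative.

```python
import math
from typing import List


def softmax(scores: List[float]) -> List[float]:
    """Numerically stable softmax over a short list of scores."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def dot(a: List[float], b: List[float]) -> float:
    return sum(x * y for x, y in zip(a, b))


def fuse(short_feat: List[float], long_feat: List[float],
         query: List[float]) -> List[float]:
    """Fuse two feature vectors with attention weights (assumed dot-product scoring)."""
    weights = softmax([dot(query, short_feat), dot(query, long_feat)])
    return [weights[0] * s + weights[1] * l for s, l in zip(short_feat, long_feat)]


# With an all-zero query both scores are equal, so the weights are 0.5 each.
fused = fuse([1.0, 0.0], [0.0, 1.0], [0.0, 0.0])
```

In a trained model the query (or an equivalent scoring network) would be learned, so the balance between short-term and long-term preference could differ per sequence.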
[0227] In one implementation, the processing unit 1202, when aggregating the encoded features of each historical interaction sequence pair to obtain the sequence features of the j-th candidate resource, specifically performs the following:
[0228] The encoded features of multiple historical interaction sequence pairs corresponding to the j-th candidate resource are subjected to mean pooling to obtain pooled features.
[0229] The pooling feature is used as the sequence feature of the j-th candidate resource.
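Mean pooling averages the encoded features dimension by dimension. A minimal sketch, assuming each encoded feature is a fixed-length list of floats (the function name `mean_pool` is hypothetical):

```python
from typing import List


def mean_pool(features: List[List[float]]) -> List[float]:
    """Average a non-empty list of equal-length feature vectors element-wise."""
    if not features:
        raise ValueError("at least one encoded feature is required")
    dim = len(features[0])
    if any(len(f) != dim for f in features):
        raise ValueError("all encoded features must have the same length")
    count = len(features)
    return [sum(f[d] for f in features) / count for d in range(dim)]


# Two encoded sequence-pair features for one candidate resource (toy values).
sequence_feature = mean_pool([[1.0, 2.0], [3.0, 4.0]])
```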
[0230] In one implementation, when processing unit 1202 obtains the encoded features of the target historical interaction sequence of the target object to be recommended, it is specifically used for:
[0231] Obtain the target historical interaction sequence of the target object to be recommended;
[0232] Short-term preference modeling is performed on the target historical interaction sequence to obtain the short-term encoding features of the target historical interaction sequence; and long-term preference modeling is performed on the target historical interaction sequence to obtain the long-term encoding features of the target historical interaction sequence.
[0233] Based on the attention mechanism, the short-term and long-term encoding features of the target historical interaction sequence are fused to obtain the encoded features of the target historical interaction sequence.
[0234] In one implementation, the processing unit 1202, when determining a target resource whose sequence features match the encoded features of the target object's target historical interaction sequence from the candidate resource set, specifically performs the following:
[0235] Similarity is calculated between the encoded features of the target historical interaction sequence and the sequence features of each candidate resource in the candidate resource set, yielding M similarity calculation results.
[0236] From the M similarity calculation results, identify one or more target similarity calculation results that are greater than the similarity threshold;
[0237] The sequence features corresponding to each target similarity calculation result are determined to be target sequence features that match the encoded features of the target historical interaction sequence;
[0238] The candidate resources corresponding to the target sequence features are used as the target resources.
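The matching steps above can be illustrated with a short sketch. The similarity measure is not named in this passage, so cosine similarity is used here purely as an assumption; `match_targets` and the resource IDs are hypothetical.

```python
import math
from typing import Dict, List


def cosine(a: List[float], b: List[float]) -> float:
    """Cosine similarity; returns 0.0 if either vector has zero norm."""
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return sum(x * y for x, y in zip(a, b)) / (norm_a * norm_b)


def match_targets(target_feature: List[float],
                  candidate_features: Dict[str, List[float]],
                  threshold: float) -> List[str]:
    """Return IDs of candidates whose similarity to the target exceeds the threshold."""
    return [rid for rid, feat in candidate_features.items()
            if cosine(target_feature, feat) > threshold]


# M = 3 candidate resources with toy sequence features.
candidates = {"r1": [1.0, 0.0], "r2": [0.0, 1.0], "r3": [1.0, 1.0]}
targets = match_targets([1.0, 0.0], candidates, threshold=0.5)
```

Because the candidate sequence features can be computed ahead of time, only the M similarity calculations need to run when a recommendation request arrives.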
[0239] In one implementation, the resource recommendation method is executed by calling a trained sequence model. The training process of the sequence model includes:
[0240] Obtain a training task set, which contains multiple training tasks; each training task contains the sample pairs of multiple training resources, and the sample pair of each training resource contains multiple training interaction sequence pairs and one test interaction sequence pair; each training interaction sequence pair and each test interaction sequence pair contains a historical interaction sequence and the corresponding training resource.
[0241] Select the T_i-th training task from the training task set, where i is a positive integer;
[0242] Call the sequence model to encode the sample pair of each training resource contained in the T_i-th training task, obtaining a sequence feature set and an encoded feature set for the T_i-th training task; the sequence feature set contains the sequence features of each training resource in the T_i-th training task, and the encoded feature set contains the encoded features of each training resource in the T_i-th training task;
[0243] Based on the sequence feature set, the encoded feature set, and the multiple test interaction sequence pairs corresponding to the T_i-th training task, calculate loss information; and update the model parameters of the sequence model in the direction of decreasing loss information to obtain the updated sequence model.
[0244] Reselect the T_{i+1}-th training task from the training task set, and use the T_{i+1}-th training task to iteratively train the updated sequence model until the sequence model tends to stabilize.
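The task-by-task training loop can be sketched generically. The text does not define "tends to stabilize", so this sketch assumes training stops once the summed loss over one pass through the tasks changes by less than a tolerance `tol`. The toy quadratic loss stands in for the sequence model's real loss, and all names are hypothetical.

```python
from typing import Callable, List, Sequence, Tuple

LossAndGrad = Callable[[List[float], float], Tuple[float, List[float]]]


def train_until_stable(tasks: Sequence[float], params: List[float],
                       loss_and_grad: LossAndGrad, lr: float = 0.1,
                       tol: float = 1e-6, max_epochs: int = 1000) -> List[float]:
    """Visit tasks one after another, stepping parameters against the gradient."""
    previous = None
    for _ in range(max_epochs):
        epoch_loss = 0.0
        for task in tasks:
            loss, grad = loss_and_grad(params, task)
            # Move each parameter in the direction of decreasing loss.
            params = [p - lr * g for p, g in zip(params, grad)]
            epoch_loss += loss
        # Assumed stopping rule: the per-pass loss has stopped changing.
        if previous is not None and abs(previous - epoch_loss) < tol:
            break
        previous = epoch_loss
    return params


def toy_loss(params: List[float], task: float) -> Tuple[float, List[float]]:
    """Stand-in loss (w - task)^2 with its analytic gradient."""
    diff = params[0] - task
    return diff * diff, [2.0 * diff]


trained = train_until_stable([1.0, 3.0], [0.0], toy_loss)
```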
[0245] In one implementation, the process of constructing the T_i-th training task includes:
[0246] For the T_i-th training task, collect multiple training resources, and collect K+1 historical interaction sequences for each training resource; the K+1 historical interaction sequences belong to different objects.
[0247] According to the filling rules of meta-learning, each training resource is filled into the corresponding K+1 historical interaction sequences to obtain K+1 historical interaction sequence pairs for each training resource;
[0248] From the K+1 historical interaction sequence pairs of each training resource, select one historical interaction sequence pair as the test interaction sequence pair of the corresponding training resource, and take the K historical interaction sequence pairs other than the test interaction sequence pair from the K+1 historical interaction sequence pairs as the training interaction sequence pairs of the corresponding training resource.
[0249] The K training interaction sequence pairs and the one test interaction sequence pair of each training resource together constitute the T_i-th training task.
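The construction steps above amount to a K-shot split per training resource. The following sketch assumes the test pair is chosen uniformly at random (the text only says "select one") and that filling attaches the training resource to each sequence; `build_task` and the data layout are hypothetical.

```python
import random
from typing import Dict, List, Tuple

SequencePair = Tuple[List[str], str]  # (historical sequence, training resource)


def build_task(resource_histories: Dict[str, List[List[str]]], k: int,
               rng: random.Random) -> Dict[str, Dict[str, object]]:
    """Split each resource's K+1 sequence pairs into K training pairs and 1 test pair."""
    task: Dict[str, Dict[str, object]] = {}
    for resource, histories in resource_histories.items():
        if len(histories) != k + 1:
            raise ValueError(f"{resource} needs exactly {k + 1} histories")
        pairs: List[SequencePair] = [(list(h), resource) for h in histories]
        test_index = rng.randrange(k + 1)  # assumed: uniform random choice
        task[resource] = {
            "train": pairs[:test_index] + pairs[test_index + 1:],
            "test": pairs[test_index],
        }
    return task


# Toy data: K = 2, so each training resource has 3 histories from different objects.
data = {
    "res_a": [["x1", "x2"], ["x3"], ["x4", "x5"]],
    "res_b": [["y1"], ["y2", "y3"], ["y4"]],
}
task = build_task(data, k=2, rng=random.Random(0))
```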
[0250] In one implementation, any training resource contained in the T_i-th training task is denoted as a target training resource, and the sample pair corresponding to the target training resource is denoted as a target sample pair;
[0251] The processing unit 1202, when calling the sequence model to encode the target sample pair to obtain the sequence features and encoded features of the target training resource, is specifically used for:
[0252] Encode each training interaction sequence pair among multiple training interaction sequence pairs contained in the target sample pair to obtain the encoded features of each training interaction sequence pair; then aggregate the encoded features of each training interaction sequence pair to obtain the sequence features of the target training resource; and,
[0253] The historical interaction sequences, excluding the target training resources, in the test interaction sequence pairs contained in the target sample pair are encoded to obtain the encoded features of the target training resources.
[0254] In one implementation, the processing unit 1202, when calculating loss information based on the sequence feature set, the encoded feature set, and the multiple test interaction sequence pairs corresponding to the T_i-th training task, is specifically used for:
[0255] Each encoded feature in the encoded feature set is compared with each sequence feature in the sequence feature set to calculate similarity, resulting in multiple similarity calculation results; and,
[0256] The training resource contained in the test interaction sequence pair corresponding to each encoded feature in the encoded feature set is compared with the training resource contained in each test interaction sequence pair corresponding to the T_i-th training task, obtaining multiple comparison results.
[0257] Based on the loss function, multiple similarity calculation results, and corresponding comparison results, loss information is obtained.
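One way to read the loss steps above is as a classification over training resources: each encoded feature is scored against every sequence feature, and the comparison result marks which score belongs to the same training resource. The loss function is not named here, so the sketch below assumes dot-product similarity with softmax cross-entropy; `episode_loss` and the dictionary layout are hypothetical.

```python
import math
from typing import Dict, List


def dot(a: List[float], b: List[float]) -> float:
    return sum(x * y for x, y in zip(a, b))


def episode_loss(encoded: Dict[str, List[float]],
                 sequence: Dict[str, List[float]]) -> float:
    """Mean softmax cross-entropy of encoded features against sequence features."""
    resources = list(sequence)
    total = 0.0
    for resource, query in encoded.items():
        sims = [dot(query, sequence[r]) for r in resources]
        # Comparison result: 1.0 where the training resource is the same, else 0.0.
        labels = [1.0 if r == resource else 0.0 for r in resources]
        peak = max(sims)
        log_norm = peak + math.log(sum(math.exp(s - peak) for s in sims))
        total += -sum(y * (s - log_norm) for y, s in zip(labels, sims))
    return total / len(encoded)


encoded = {"a": [2.0, 0.0], "b": [0.0, 2.0]}
aligned = episode_loss(encoded, {"a": [1.0, 0.0], "b": [0.0, 1.0]})
swapped = episode_loss(encoded, {"a": [0.0, 1.0], "b": [1.0, 0.0]})
```

In this toy case the loss is lower when each encoded feature lies closest to the sequence feature of its own training resource, which is the behavior the parameter update would reinforce.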
[0258] In one implementation, when the resource recommendation method is applied to a cold start scenario, both training resources and candidate resources are cold start resources; cold start resources refer to resources in the resource recommendation platform whose data volume of resource interaction data is less than the data volume threshold.
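The cold start criterion reduces to a threshold filter. A minimal sketch, assuming the data volume of resource interaction data is measured as an interaction count per resource (the unit is not stated in the text, and `cold_start_resources` is a hypothetical name):

```python
from typing import Dict, List


def cold_start_resources(interaction_counts: Dict[str, int],
                         volume_threshold: int) -> List[str]:
    """Return IDs of resources whose interaction data volume is below the threshold."""
    return [rid for rid, count in interaction_counts.items()
            if count < volume_threshold]


# Toy counts: only resources with fewer than 10 interactions count as cold start.
counts = {"r1": 3, "r2": 250, "r3": 0, "r4": 10}
cold = cold_start_resources(counts, volume_threshold=10)
```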
[0259] According to one embodiment of this application, the units of the resource recommendation device shown in Figure 12 can be separately or entirely combined into one or several other units, or one or more of the units can be further divided into multiple functionally smaller units. This can achieve the same operation without affecting the technical effect of the embodiments of this application. The above units are divided based on logical functions. In practical applications, the function of one unit can also be implemented by multiple units, or the functions of multiple units can be implemented by one unit. In other embodiments of this application, the resource recommendation device may also include other units; in practical applications, these functions can also be implemented with the assistance of other units, and can be implemented by multiple units working together. According to another embodiment of this application, a computer program (including program code) capable of executing each step involved in the corresponding methods shown in Figure 3 and Figure 10 can be run on a general-purpose computing device, such as a computer, that includes processing elements and storage elements such as a central processing unit (CPU), random access memory (RAM), and read-only memory (ROM), so as to construct the resource recommendation device shown in Figure 12 and to implement the resource recommendation method of the embodiments of this application. The computer program may be recorded on, for example, a computer-readable recording medium, loaded onto the aforementioned computing device via the computer-readable recording medium, and executed therein.
[0260] This embodiment supports determining, based on meta-learning, the attention mechanism, and historical interaction sequences (each containing multiple historical resources triggered sequentially within a historical time period), the sequence features of each candidate resource in the set of candidate resources to be recommended. The sequence features of any candidate resource characterize the sequential dependency between that candidate resource and each historical resource contained in the historical interaction sequences; for example, the triggering of resource 2 (bait) depends on the triggering of resource 1 (hook). Because the triggering-order dependency between each candidate resource and the historical resources is analyzed in advance, when a need for resource distribution (or recommendation) arises (such as when a resource distribution request initiated by the target object to be recommended is received), the target resource that meets the target object's interaction preference can be quickly matched from the sequence features of the M candidate resources, where M is a positive integer, based on the encoded features of the target object's target historical interaction sequence (which reflect the degree of attention, or preference, that the target object pays to a given sequential dependency, i.e., the target object's preference for resource interaction); this not only improves the accuracy of resource recommendation but also ensures its efficiency.
In summary, this application provides a novel resource recommendation scheme that supports the analysis of sequential dependencies between resources (including candidate resources to be recommended) and, based on the interaction preferences reflected in the target object's target historical interaction sequence, determines the target resource that matches the target object's interaction preferences from multiple candidate resources to be recommended. This enables more accurate delivery of resources of interest to the target object, improving the resource recommendation experience for the target object while also increasing the click-through rate of the target resource.
[0261] Figure 13 shows a schematic diagram of the structure of a computer device provided in an exemplary embodiment of this application. Referring to Figure 13, the computer device includes a processor 1301, a communication interface 1302, and a computer-readable storage medium 1303. The processor 1301, communication interface 1302, and computer-readable storage medium 1303 can be connected via a bus or other means. The communication interface 1302 is used to receive and send data. The computer-readable storage medium 1303 can be stored in the computer device's memory and is used to store computer programs, including program instructions. The processor 1301 is used to execute the program instructions stored in the computer-readable storage medium 1303. The processor 1301 (or central processing unit (CPU)) is the computing and control core of the computer device; it is suitable for implementing one or more instructions, and specifically for loading and executing one or more instructions to achieve the corresponding method flows or functions.
[0262] This application embodiment also provides a computer-readable storage medium (Memory), which is a memory device in a computer device used to store programs and data. It is understood that the computer-readable storage medium here can include both the built-in storage medium in the computer device and extended storage media supported by the computer device. The computer-readable storage medium provides storage space that stores the processing system of the computer device. Furthermore, the storage space also stores one or more instructions suitable for loading and execution by the processor 1301, which may be one or more computer programs (including program code). It should be noted that the computer-readable storage medium here can be high-speed RAM memory or non-volatile memory, such as at least one disk storage device; optionally, it can also be at least one computer-readable storage medium located remotely from the aforementioned processor.
[0263] In one embodiment, the computer-readable storage medium stores one or more instructions; the processor 1301 loads and executes one or more instructions stored in the computer-readable storage medium to implement the corresponding steps in the above-described resource recommendation method embodiment; specifically, the one or more instructions in the computer-readable storage medium are loaded by the processor 1301 and executed as follows:
[0264] Obtain a set of candidate resources to be recommended. The set contains M candidate resources and the sequence features of each candidate resource; the j-th candidate resource corresponds to Q_j historical interaction sequences, and the sequence features of the j-th candidate resource are obtained by encoding, based on meta-learning and the attention mechanism, the interaction sequence pairs formed by the Q_j historical interaction sequences and the j-th candidate resource; the sequence features of the j-th candidate resource are used to characterize the sequential dependency, in triggering order, between the j-th candidate resource and the historical resources contained in the Q_j historical interaction sequences; M, j, and Q_j are all positive integers, and j ≤ M;
[0265] Obtain the encoding features of the target historical interaction sequence of the target object to be recommended. The encoding features of the target historical interaction sequence are used to characterize: the degree of attention the target object pays to the sequential dependencies between multiple historical resources contained in the target historical interaction sequence;
[0266] From the candidate resource set, determine the target resource whose sequence features match the encoded features of the target object's target historical interaction sequence, and recommend the target resource to the target object.
[0267] In one implementation, one or more instructions in the computer-readable storage medium are loaded by the processor 1301 and, when executing to obtain a set of candidate resources to be recommended, specifically perform the following steps:
[0268] Obtain M candidate resources to be recommended, and collect multiple historical interaction sequences for each candidate resource; any historical interaction sequence is obtained based on the operation of multiple historical resources triggered by the same object in a historical time period.
[0269] Based on multiple historical interaction sequences corresponding to each candidate resource, each candidate resource is encoded to obtain the sequence features of each candidate resource;
[0270] The M candidate resources to be recommended and the sequence features of each candidate resource are combined to form a set of candidate resources to be recommended.
[0271] In one implementation, when one or more instructions in the computer-readable storage medium are loaded by the processor 1301 and executed to encode each candidate resource based on multiple historical interaction sequences corresponding to each candidate resource, and to obtain the sequence features of each candidate resource, the following steps are specifically performed:
[0272] According to the filling rules of meta-learning, each candidate resource is filled into multiple corresponding historical interaction sequences, resulting in multiple historical interaction sequence pairs corresponding to each candidate resource; a historical interaction sequence pair includes a historical interaction sequence and a candidate resource.
[0273] Encode multiple historical interaction sequence pairs corresponding to each candidate resource to obtain the sequence features of each candidate resource.
[0274] In one implementation, when one or more instructions in a computer-readable storage medium are loaded by processor 1301 and executed to encode multiple historical interaction sequence pairs corresponding to the j-th candidate resource to obtain the sequence features of the j-th candidate resource, the following steps are specifically performed:
[0275] For the j-th candidate resource, each historical interaction sequence pair is encoded to obtain the encoded features of each historical interaction sequence pair.
[0276] The encoded features of each historical interaction sequence pair are aggregated to obtain the sequence features of the j-th candidate resource.
[0277] In one implementation, when one or more instructions in a computer-readable storage medium are loaded by processor 1301 and executed, in the process of encoding each historical interaction sequence pair for the j-th candidate resource to obtain the encoded features of each historical interaction sequence pair, the following steps are specifically performed:
[0278] For multiple historical interaction sequence pairs of the j-th candidate resource, short-term preference modeling is performed on each historical interaction sequence pair to obtain the short-term encoding features of each historical interaction sequence pair.
[0279] For multiple historical interaction sequence pairs of the j-th candidate resource, long-term preference modeling is performed on each historical interaction sequence pair to obtain the long-term encoding features of each historical interaction sequence pair.
[0280] Based on the attention mechanism, the short-term encoding features and the corresponding long-term encoding features of each historical interaction sequence pair are fused to obtain the encoded features of each historical interaction sequence pair.
[0281] In one implementation, when one or more instructions in the computer-readable storage medium are loaded by the processor 1301 and executed to aggregate the encoded features of each historical interaction sequence pair to obtain the sequence features of the j-th candidate resource, the following steps are specifically performed:
[0282] The encoded features of multiple historical interaction sequence pairs corresponding to the j-th candidate resource are subjected to mean pooling to obtain pooled features.
[0283] The pooling feature is used as the sequence feature of the j-th candidate resource.
[0284] In one implementation, when one or more instructions in the computer-readable storage medium are loaded by the processor 1301 and executed to obtain the encoded features of the target historical interaction sequence of the target object to be recommended, the following steps are specifically performed:
[0285] Obtain the target historical interaction sequence of the target object to be recommended;
[0286] Short-term preference modeling is performed on the target historical interaction sequence to obtain the short-term encoding features of the target historical interaction sequence; and long-term preference modeling is performed on the target historical interaction sequence to obtain the long-term encoding features of the target historical interaction sequence.
[0287] Based on the attention mechanism, the short-term and long-term encoding features of the target historical interaction sequence are fused to obtain the encoded features of the target historical interaction sequence.
[0288] In one implementation, when one or more instructions in a computer-readable storage medium are loaded by processor 1301 and executed to determine a target resource whose sequence characteristics match the encoded characteristics of the target object's target historical interaction sequence from the candidate resource set, the following steps are specifically performed:
[0289] Similarity is calculated between the encoded features of the target historical interaction sequence and the sequence features of each candidate resource in the candidate resource set, yielding M similarity calculation results.
[0290] From the M similarity calculation results, identify one or more target similarity calculation results that are greater than the similarity threshold;
[0291] The sequence features corresponding to each target similarity calculation result are determined to be target sequence features that match the encoded features of the target historical interaction sequence;
[0292] The candidate resources corresponding to the target sequence features are used as the target resources.
[0293] In one implementation, the resource recommendation method is executed by calling a trained sequence model. The training process of the sequence model includes:
[0294] Obtain a training task set, which contains multiple training tasks; each training task contains the sample pairs of multiple training resources, and the sample pair of each training resource contains multiple training interaction sequence pairs and one test interaction sequence pair; each training interaction sequence pair and each test interaction sequence pair contains a historical interaction sequence and the corresponding training resource.
[0295] Select the T_i-th training task from the training task set, where i is a positive integer;
[0296] Call the sequence model to encode the sample pair of each training resource contained in the T_i-th training task, obtaining a sequence feature set and an encoded feature set for the T_i-th training task; the sequence feature set contains the sequence features of each training resource in the T_i-th training task, and the encoded feature set contains the encoded features of each training resource in the T_i-th training task;
[0297] Based on the sequence feature set, the encoded feature set, and the multiple test interaction sequence pairs corresponding to the T_i-th training task, calculate loss information; and update the model parameters of the sequence model in the direction of decreasing loss information to obtain the updated sequence model.
[0298] Reselect the T_{i+1}-th training task from the training task set, and use the T_{i+1}-th training task to iteratively train the updated sequence model until the sequence model tends to stabilize.
[0299] In one implementation, the process of constructing the T_i-th training task includes:
[0300] For the T_i-th training task, collect multiple training resources, and collect K+1 historical interaction sequences for each training resource; the K+1 historical interaction sequences belong to different objects.
[0301] According to the filling rules of meta-learning, each training resource is filled into the corresponding K+1 historical interaction sequences to obtain K+1 historical interaction sequence pairs for each training resource;
[0302] From the K+1 historical interaction sequence pairs of each training resource, select one historical interaction sequence pair as the test interaction sequence pair of the corresponding training resource, and take the K historical interaction sequence pairs other than the test interaction sequence pair from the K+1 historical interaction sequence pairs as the training interaction sequence pairs of the corresponding training resource.
[0303] The K training interaction sequence pairs and the one test interaction sequence pair of each training resource together constitute the T_i-th training task.
[0304] In one implementation, any training resource contained in the T_i-th training task is denoted as a target training resource, and the sample pair corresponding to the target training resource is denoted as a target sample pair;
[0305] When one or more instructions in a computer-readable storage medium are loaded by processor 1301 and executed to encode the target sample pair using a sequence model to obtain the sequence features and encoded features of the target training resource, the following steps are specifically performed:
[0306] Encode each training interaction sequence pair among multiple training interaction sequence pairs contained in the target sample pair to obtain the encoded features of each training interaction sequence pair; then aggregate the encoded features of each training interaction sequence pair to obtain the sequence features of the target training resource; and,
[0307] The historical interaction sequences, excluding the target training resources, in the test interaction sequence pairs contained in the target sample pair are encoded to obtain the encoded features of the target training resources.
[0308] In one implementation, when one or more instructions in the computer-readable storage medium are loaded by the processor 1301 and executed to calculate loss information based on the sequence feature set, the encoded feature set, and the multiple test interaction sequence pairs corresponding to the T_i-th training task, the following steps are specifically performed:
[0309] Each encoded feature in the encoded feature set is compared with each sequence feature in the sequence feature set to calculate similarity, resulting in multiple similarity calculation results; and,
[0310] The training resource contained in the test interaction sequence pair corresponding to each encoded feature in the encoded feature set is compared with the training resource contained in each test interaction sequence pair corresponding to the T_i-th training task, obtaining multiple comparison results.
[0311] Based on the loss function, multiple similarity calculation results, and corresponding comparison results, loss information is obtained.
[0312] In one implementation, when the resource recommendation method is applied to a cold start scenario, both training resources and candidate resources are cold start resources; cold start resources refer to resources in the resource recommendation platform whose data volume of resource interaction data is less than the data volume threshold.
[0313] Based on the same inventive concept, the principle and beneficial effects of the computer device provided in the embodiments of this application in solving the problem are similar to the principle and beneficial effects of the resource recommendation method in the embodiments of this application in solving the problem. Please refer to the principle and beneficial effects of the implementation of the method. For the sake of brevity, they will not be repeated here.
[0314] This application also provides a computer program product or computer program that includes computer instructions stored in a computer-readable storage medium. A processor of a computer device reads the computer instructions from the computer-readable storage medium and executes the computer instructions, causing the computer device to perform the resource recommendation method described above.
[0315] Those skilled in the art will recognize that the units and algorithm steps of the various examples described in conjunction with the embodiments disclosed in this application can be implemented in electronic hardware, or a combination of computer software and electronic hardware. Whether these functions are implemented in hardware or software depends on the specific application and design constraints of the technical solution. Those skilled in the art can use different methods to implement the described functions for each specific application, but such implementation should not be considered beyond the scope of this application.
[0316] In the above embodiments, implementation can be achieved, in whole or in part, through software, hardware, firmware, or any combination thereof. When implemented in software, it can be implemented, in whole or in part, as a computer program product. A computer program product includes one or more computer instructions. When the computer program instructions are loaded and executed on a computer, all or part of the processes or functions described in the embodiments of the present invention are generated. The computer can be a general-purpose computer, a special-purpose computer, a computer network, or other programmable device. The computer instructions can be stored in or transmitted through a computer-readable storage medium. The computer instructions can be transmitted from one website, computer, server, or data center to another website, computer, server, or data center via wired (e.g., coaxial cable, fiber optic, digital subscriber line (DSL)) or wireless (e.g., infrared, wireless, microwave, etc.). The computer-readable storage medium can be any available medium that a computer can access or a data storage device such as a server or data center that integrates one or more available media. The available medium can be a magnetic medium (e.g., floppy disk, hard disk, magnetic tape), an optical medium (e.g., DVD), or a semiconductor medium (e.g., solid-state disk (SSD)).
[0317] The above description is merely a specific embodiment of this application, but the scope of protection of this application is not limited thereto. Any variations or substitutions that can be easily conceived by those skilled in the art within the technical scope disclosed in this invention should be included within the scope of protection of this application. Therefore, the scope of protection of this application should be determined by the scope of the claims.
Claims
1. A resource recommendation method, characterized by comprising: obtaining a set of candidate resources to be recommended, the set of candidate resources containing M candidate resources and the sequence features of each candidate resource; the j-th candidate resource corresponds to Q_j historical interaction sequences, wherein the sequence features of the j-th candidate resource are obtained by encoding, based on meta-learning and an attention mechanism, the Q_j historical interaction sequences and the j-th candidate resource; the sequence features of the j-th candidate resource are used to characterize the sequential dependencies, in order of triggering, between the j-th candidate resource and the historical resources contained in the Q_j historical interaction sequences; M, j, and Q_j are all positive integers, and j≤M; obtaining the target historical interaction sequence of the target object to which resources are to be recommended, and performing short-term preference modeling on the target historical interaction sequence to obtain the short-term encoding features of the target historical interaction sequence; performing long-term preference modeling on the target historical interaction sequence to obtain the long-term encoding features of the target historical interaction sequence; fusing, based on the attention mechanism, the short-term encoding features and the long-term encoding features of the target historical interaction sequence to obtain the encoding features of the target historical interaction sequence, wherein the encoding features of the target historical interaction sequence are used to characterize the degree of attention the target object pays to the sequential dependencies between the multiple historical resources contained in the target historical interaction sequence; and determining, from the candidate resource set, target resources whose sequence features match the encoding features of the target historical interaction sequence of the target object, and recommending the target resources to the target object.
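By way of non-limiting illustration, the attention-based fusion step of claim 1 might be sketched as follows. The claim does not fix a particular attention formula, so the dot-product scoring, the softmax weighting, and the `query` vector (a stand-in for a learned parameter) are assumptions made here for illustration only; `attention_fuse` is a hypothetical name.

```python
import math


def attention_fuse(short_term, long_term, query):
    """Fuse short-term and long-term encoding features with softmax attention.

    Assumption: each feature is scored by its dot product with a query
    vector (hypothetical; in practice it would be a learned parameter).
    Returns the fused feature and the two attention weights.
    """
    if not (len(short_term) == len(long_term) == len(query)):
        raise ValueError("all vectors must have the same dimension")

    # One attention score per feature vector.
    scores = [
        sum(q * x for q, x in zip(query, vec))
        for vec in (short_term, long_term)
    ]

    # Numerically stable softmax over the two scores.
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    w_short, w_long = exps[0] / total, exps[1] / total

    # Weighted sum of the two encoding features.
    fused = [w_short * s + w_long * l for s, l in zip(short_term, long_term)]
    return fused, (w_short, w_long)
```

Under these assumptions, a query aligned with the short-term feature shifts nearly all the weight onto short-term preference, while a zero query weights both features equally.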
2. The method of claim 1, wherein obtaining the set of candidate resources to be recommended includes: obtaining M candidate resources to be recommended, and collecting multiple historical interaction sequences for each candidate resource, where any historical interaction sequence is obtained from the operations by which the same object triggered multiple historical resources within a historical time period; encoding each candidate resource based on the multiple historical interaction sequences corresponding to that candidate resource to obtain the sequence features of each candidate resource; and combining the M candidate resources to be recommended and the sequence features of each candidate resource to form the set of candidate resources to be recommended.
3. The method of claim 2, wherein encoding each candidate resource based on the multiple historical interaction sequences corresponding to each candidate resource to obtain the sequence features of each candidate resource includes: filling, according to the filling rules of the meta-learning, each candidate resource into its corresponding multiple historical interaction sequences to obtain multiple historical interaction sequence pairs corresponding to each candidate resource, where a historical interaction sequence pair includes one historical interaction sequence and one candidate resource; and encoding the multiple historical interaction sequence pairs corresponding to each candidate resource to obtain the sequence features of each candidate resource.
4. The method of claim 3, wherein encoding the multiple historical interaction sequence pairs corresponding to the j-th candidate resource to obtain the sequence features of the j-th candidate resource includes: encoding each historical interaction sequence pair among the multiple historical interaction sequence pairs of the j-th candidate resource to obtain the encoding features of each historical interaction sequence pair; and aggregating the encoding features of each historical interaction sequence pair to obtain the sequence features of the j-th candidate resource.
5. The method of claim 4, wherein encoding each historical interaction sequence pair among the multiple historical interaction sequence pairs of the j-th candidate resource to obtain the encoding features of each historical interaction sequence pair includes: performing short-term preference modeling on each historical interaction sequence pair of the j-th candidate resource to obtain the short-term encoding features of each historical interaction sequence pair; performing long-term preference modeling on each historical interaction sequence pair of the j-th candidate resource to obtain the long-term encoding features of each historical interaction sequence pair; and fusing, based on the attention mechanism, the short-term encoding features and the corresponding long-term encoding features of each historical interaction sequence pair to obtain the encoding features of each historical interaction sequence pair.
6. The method of claim 4, wherein aggregating the encoding features of each historical interaction sequence pair to obtain the sequence features of the j-th candidate resource includes: performing mean pooling on the encoding features of the multiple historical interaction sequence pairs corresponding to the j-th candidate resource to obtain pooled features; and using the pooled features as the sequence features of the j-th candidate resource.
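For illustration only, the mean pooling of claim 6 can be sketched as an element-wise average over the encoding features of the historical interaction sequence pairs of one candidate resource. The function name `mean_pool` and the plain-list vector representation are assumptions of this sketch, not part of the claimed method.

```python
def mean_pool(pair_encodings):
    """Average a list of equal-length encoding vectors element-wise.

    pair_encodings: one vector per historical interaction sequence pair
    of a single candidate resource (Q_j vectors in the claim's notation).
    Returns the pooled vector used as that resource's sequence feature.
    """
    if not pair_encodings:
        raise ValueError("at least one encoding feature is required")
    dim = len(pair_encodings[0])
    if any(len(vec) != dim for vec in pair_encodings):
        raise ValueError("all encoding features must share one dimension")

    count = len(pair_encodings)
    # zip(*...) walks the vectors column by column.
    return [sum(column) / count for column in zip(*pair_encodings)]
```

For example, `mean_pool([[1.0, 2.0], [3.0, 4.0]])` returns `[2.0, 3.0]`.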
7. The method of claim 1, wherein determining, from the candidate resource set, the target resources whose sequence features match the encoding features of the target historical interaction sequence of the target object includes: performing similarity calculation between the encoding features of the target historical interaction sequence and the sequence features of each candidate resource in the candidate resource set to obtain M similarity calculation results; determining, from the M similarity calculation results, one or more target similarity calculation results that are greater than a similarity threshold; determining the sequence features corresponding to each target similarity calculation result as target sequence features that match the encoding features of the target historical interaction sequence; and using the candidate resources corresponding to the target sequence features as the target resources.
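The matching procedure of claim 7 may be sketched, purely as an example, as follows. The claim does not name a similarity measure, so cosine similarity is an assumption here; `select_target_resources` and the dictionary layout (resource identifier mapped to sequence feature) are likewise hypothetical.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity of two equal-length vectors (0.0 if either is zero)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def select_target_resources(target_encoding, candidate_features, threshold):
    """Return the ids of candidates whose similarity exceeds the threshold.

    target_encoding: encoding features of the target historical
        interaction sequence.
    candidate_features: dict mapping resource id -> sequence features
        (M entries, one per candidate resource).
    threshold: similarity threshold; only strictly greater values match.
    """
    # One similarity calculation result per candidate resource.
    results = {
        resource_id: cosine_similarity(target_encoding, features)
        for resource_id, features in candidate_features.items()
    }
    return [rid for rid, sim in results.items() if sim > threshold]
```

With a target encoding of `[1.0, 0.0]`, candidates `{"a": [1.0, 0.0], "b": [0.0, 1.0], "c": [1.0, 1.0]}` and a threshold of 0.5, this sketch selects `["a", "c"]`.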
8. The method according to any one of claims 1 to 7, wherein the method is executed by calling a trained sequence model, the training process of which includes: obtaining a training task set containing multiple training tasks, where each training task contains multiple training resource sample pairs, each training resource sample pair contains multiple training interaction sequence pairs and one test interaction sequence pair, and each training interaction sequence pair and each test interaction sequence pair contains a historical interaction sequence and a corresponding training resource; selecting the T_i-th training task from the training task set, where i is a positive integer; calling the sequence model to encode the sample pair of each training resource included in the T_i-th training task, to obtain a sequence feature set and an encoding feature set of the T_i-th training task, where the sequence feature set includes the sequence features of each training resource in the T_i-th training task and the encoding feature set includes the encoding features of each training resource in the T_i-th training task; calculating loss information based on the sequence feature set, the encoding feature set, and the multiple test interaction sequence pairs corresponding to the T_i-th training task, and updating the model parameters of the sequence model in the direction that reduces the loss information to obtain an updated sequence model; and reselecting the T_(i+1)-th training task from the training task set, and iteratively training the updated sequence model using the T_(i+1)-th training task until the sequence model stabilizes.
9. The method of claim 8, wherein the construction process of the T_i-th training task includes: collecting multiple training resources for the T_i-th training task, and collecting K+1 historical interaction sequences for each training resource, where the K+1 historical interaction sequences belong to different objects; filling, according to the filling rules of the meta-learning, each training resource into its corresponding K+1 historical interaction sequences to obtain K+1 historical interaction sequence pairs for each training resource; selecting, from the K+1 historical interaction sequence pairs of each training resource, one historical interaction sequence pair as the test interaction sequence pair of the corresponding training resource, and taking the K historical interaction sequence pairs other than the test interaction sequence pair as the training interaction sequence pairs of the corresponding training resource; and forming the T_i-th training task from the K training interaction sequence pairs and the one test interaction sequence pair of each training resource.
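The task construction of claim 9 might be sketched as below, offered only as an example. Each pair is represented as a `(historical_sequence, resource_id)` tuple, following the definition of a historical interaction sequence pair in claim 3. The random choice of the test pair, the seeded generator, and the names `build_training_task`, `"train"` and `"test"` are assumptions of this sketch; the claim only requires that one of the K+1 pairs be held out.

```python
import random


def build_training_task(resource_sequences, rng=None):
    """Split each resource's K+1 sequence pairs into K training pairs and 1 test pair.

    resource_sequences: dict mapping training resource id -> list of K+1
        historical interaction sequences (each a list of resource ids,
        each collected from a different object).
    rng: optional random.Random used to pick the held-out test pair.
    """
    rng = rng or random.Random(0)
    task = {}
    for resource_id, sequences in resource_sequences.items():
        if len(sequences) < 2:
            raise ValueError("need K+1 >= 2 sequences per training resource")

        # Pair every historical sequence with the training resource.
        pairs = [(list(seq), resource_id) for seq in sequences]

        # Hold out one pair as the test interaction sequence pair.
        test_index = rng.randrange(len(pairs))
        test_pair = pairs[test_index]
        train_pairs = pairs[:test_index] + pairs[test_index + 1:]

        task[resource_id] = {"train": train_pairs, "test": test_pair}
    return task
```

Holding out one pair per resource mirrors the support/query split common in few-shot meta-learning: the K training pairs produce the sequence features of the resource, and the held-out pair is used to evaluate them.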
10. The method of claim 8, wherein any training resource included in the T_i-th training task is denoted as a target training resource, and the sample pair corresponding to the target training resource is denoted as a target sample pair; the process of encoding the target sample pair using the sequence model to obtain the sequence features and the encoding features of the target training resource includes: encoding each training interaction sequence pair among the multiple training interaction sequence pairs contained in the target sample pair to obtain the encoding features of each training interaction sequence pair, and aggregating the encoding features of each training interaction sequence pair to obtain the sequence features of the target training resource; and encoding the historical interaction sequence, excluding the target training resource, in the test interaction sequence pair contained in the target sample pair to obtain the encoding features of the target training resource.
11. The method of claim 8, wherein calculating the loss information based on the sequence feature set, the encoding feature set, and the multiple test interaction sequence pairs corresponding to the T_i-th training task includes: performing similarity calculation between each encoding feature in the encoding feature set and each sequence feature in the sequence feature set to obtain multiple similarity calculation results; comparing the training resource contained in the test interaction sequence pair corresponding to each encoding feature in the encoding feature set with the training resource contained in each test interaction sequence pair corresponding to the T_i-th training task to obtain multiple comparison results; and computing, based on a loss function, the loss information from the multiple similarity calculation results and the corresponding comparison results.
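As one possible reading of claim 11, the loss calculation could be sketched as follows. The claim does not specify the similarity measure or the loss function, so the dot-product similarity and the softmax cross-entropy used here are assumptions for illustration; `task_loss` is a hypothetical name. The comparison result is taken to be 1 when a sequence feature belongs to the same training resource as the test pair behind the encoding feature, and 0 otherwise.

```python
import math


def task_loss(encoding_features, sequence_features):
    """Average cross-entropy between similarity scores and comparison labels.

    encoding_features: dict resource id -> encoding feature obtained from
        that resource's test interaction sequence pair.
    sequence_features: dict resource id -> sequence feature aggregated
        from that resource's training interaction sequence pairs.
    """
    candidate_ids = list(sequence_features)
    total = 0.0
    for resource_id, encoding in encoding_features.items():
        # Similarity of this encoding feature to every sequence feature.
        sims = [
            sum(e * s for e, s in zip(encoding, sequence_features[cid]))
            for cid in candidate_ids
        ]
        # Log of the softmax normaliser, computed stably.
        top = max(sims)
        log_norm = top + math.log(sum(math.exp(s - top) for s in sims))

        # Comparison results: 1 for the matching training resource, else 0.
        labels = [1.0 if cid == resource_id else 0.0 for cid in candidate_ids]

        total -= sum(y * (s - log_norm) for y, s in zip(labels, sims))
    return total / len(encoding_features)
```

Under these assumptions the loss is small when each encoding feature is most similar to the sequence feature of its own training resource, which is the direction in which claim 8 updates the model parameters.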
12. The method of claim 1, wherein, when the method is applied to a cold start scenario, both the training resources and the candidate resources are cold start resources; a cold start resource is a resource in the resource recommendation platform whose volume of resource interaction data is less than a data volume threshold.
13. A resource recommendation apparatus, characterized by comprising: an acquisition unit, configured to acquire a set of candidate resources to be recommended, wherein the set of candidate resources includes M candidate resources and the sequence features of each candidate resource; the j-th candidate resource corresponds to Q_j historical interaction sequences, wherein the sequence features of the j-th candidate resource are obtained by encoding, based on meta-learning and an attention mechanism, the Q_j historical interaction sequences and the j-th candidate resource; the sequence features of the j-th candidate resource are used to characterize the sequential dependencies, in order of triggering, between the j-th candidate resource and the historical resources contained in the Q_j historical interaction sequences; M, j, and Q_j are all positive integers, and j≤M; and a processing unit, configured to acquire the target historical interaction sequence of the target object to which resources are to be recommended, to perform short-term preference modeling on the target historical interaction sequence to obtain the short-term encoding features of the target historical interaction sequence, and to perform long-term preference modeling on the target historical interaction sequence to obtain the long-term encoding features of the target historical interaction sequence; the processing unit is further configured to fuse, based on the attention mechanism, the short-term encoding features and the long-term encoding features of the target historical interaction sequence to obtain the encoding features of the target historical interaction sequence, wherein the encoding features of the target historical interaction sequence are used to characterize the degree of attention the target object pays to the sequential dependencies between the multiple historical resources contained in the target historical interaction sequence; the processing unit is further configured to determine, from the candidate resource set, target resources whose sequence features match the encoding features of the target historical interaction sequence of the target object, and to recommend the target resources to the target object.
14. A computer device, comprising: a processor, adapted to execute a computer program; and a computer-readable storage medium storing a computer program that, when executed by the processor, implements the resource recommendation method as described in any one of claims 1-12.
15. A computer-readable storage medium, characterized in that the computer-readable storage medium stores a computer program adapted to be loaded by a processor to execute the resource recommendation method as described in any one of claims 1-12.
16. A computer program product, characterized in that the computer program product includes computer instructions that, when executed by a processor, implement the resource recommendation method as described in any one of claims 1-12.