A sequential recommendation method, device, and storage medium based on long user behavior

By improving the product content text modeling and pre-trained language model, and combining attention sparsity and pruning techniques to optimize the vector space, the cold start and popularity bias problems of the sequential recommendation system are solved, and accurate product recommendation is achieved in both high-frequency and low-frequency scenarios.

CN117390074B · Active · Publication Date: 2026-06-23 · NORTHEASTERN UNIV CHINA

Patent Information

Authority / Receiving Office
CN · China
Patent Type
Patents(China)
Current Assignee / Owner
NORTHEASTERN UNIV CHINA
Filing Date
2023-08-18
Publication Date
2026-06-23

AI Technical Summary

Technical Problem

Existing sequential recommendation systems have shortcomings in cold start and popularity bias problems, making it difficult to make accurate product recommendations in both high-frequency and low-frequency scenarios.

Method used

We adopt a text modeling approach based on product content, utilize a pre-trained language model, optimize the vector space by improving the encoder and decoder structure, and combine attention sparsification and pruning techniques. We then use contrastive learning to train the language model to improve vector representation.

Benefits of technology

It can effectively predict products in both high-frequency and low-frequency scenarios, significantly improve the cold start and popularity bias problems, and enhance recommendation accuracy.


Abstract

The application belongs to the field of sequential recommendation systems and proposes a sequential recommendation method, device, and storage medium based on long user behavior. An existing language model is improved. The user interaction history sequence is divided into multiple user interaction history subsequences; the user interaction history and its corresponding next-moment real interaction product are input into the encoder and decoder to obtain their vector representations. Negative sample products are selected and represented in the same way, and together these vector representations form a vector space. The relevance of the user interaction history to the positive sample product and to the negative sample products is calculated to obtain the loss value, and the language model parameters are trained by contrastive learning with the cross-entropy loss function. The trained language model is then used to predict the next recommended product. The proposed sequential recommendation method achieves state-of-the-art results in product recommendation and predicts effectively in both high-frequency and low-frequency scenarios.

Description

Technical Field

[0001] This invention relates to the field of sequential recommendation systems, and more particularly to a sequential recommendation method, device, and storage medium based on long user behavior.

Background Technology

[0002] Sequential recommendation systems dynamically recommend the next product a user is likely to interact with based on their historical behavior, and play a crucial role in many web applications such as e-commerce shopping and short video recommendation. These systems learn the dependencies between products in a user's interaction history by modeling user behavior with different neural network architectures. Existing recommendation systems typically represent products by identifiers, using randomly initialized identifier embeddings as the product's vector representation and optimizing these representations with user-product interaction signals. However, real-world products follow a long-tailed distribution, which causes a cold start problem in existing methods and a tendency to recommend popular products. Product information can provide textual matching signals of user-product relevance, helping to alleviate the cold start problem and popularity bias. Therefore, modeling products based on product content is crucial for building more comprehensive recommendation systems.

[0003] In the paper "Wang-Cheng Kang, and Julian McAuley. 2018. Self-Attentive Sequential Recommendation. In 2018 IEEE International Conference on Data Mining (ICDM)," SASRec is a model that uses a self-attention mechanism to learn the dependencies between items in a user's interaction history. Prior to this, Markov chain-based methods performed well in sparse scenarios but could not learn longer-term user behaviors. RNN modeling could solve this problem but required a large amount of training data. SASRec addresses these limitations with a self-attention mechanism that assigns weights to identify which items in the user's history are more relevant, and uses these weights to predict the next item. The model consists of an embedding layer, a self-attention layer, and a prediction layer. The embedding layer models items as randomly initialized vector representations corresponding to identifiers, using position vectors to represent their order in the user's behavior. The self-attention layer adaptively assigns weights to items in the user's interaction history. Finally, the learned user behavior representations are used to predict the item for the next interaction. A drawback of SASRec is that the quality of the randomly initialized item vector representations depends on sufficient training data; this modeling approach fails for new users and less popular items.

[0004] In the paper "Yueqi Xie, Peilin Zhou, and Sunghun Kim. Decoupled side information fusion for sequential recommendation. Proceedings of the 45th International ACM SIGIR Conference on Research and Development in Information Retrieval. 2022," DIF-SR considers enriching product representations with auxiliary information rather than using product identifiers as the sole representation of products. Effectively integrating auxiliary information into the recommendation process is a challenging problem. DIF-SR shifts the fusion process from the embedding layer to the self-attention layer, independently calculating attention and decoupling self-attention computation by fusing attention matrices. However, DIF-SR's drawback lies in the lack of direct interaction between auxiliary information and the products themselves during modeling. Furthermore, identifiers and auxiliary information are still based on randomly initialized vectors, making accurate recommendations difficult in cold-start scenarios.

[0005] In the paper "Zheng Yuan, Fajie Yuan, Yu Song, Youhua Li, Junchen Fu, Fei Yang, Yunzhu Pan, and Yongxin Ni. 2023. Where to Go Next for Recommender Systems? ID- vs. Modality-based Recommender Models Revisited", MoRec further explores the potential of using text and image modalities as product representations to replace product identifiers. It represents products by their title text or images instead of identifiers, encoding product representation vectors with pre-trained language models or visual encoders. This modeling approach significantly improves recommendation accuracy for low-frequency products by leveraging the internal knowledge of pre-trained models. However, MoRec's accuracy is lower than that of identifier-based modeling in high-frequency product recommendation scenarios, so it does not provide a model that is effective in all scenarios. In addition, although identifiers and other modal representations can be complementary, MoRec did not succeed in combining them; it only explored, to a certain extent, the feasibility of representing products with other modalities.

[0006] The technical problem to be solved by this invention is how to design a new sequential recommendation system that effectively alleviates the cold start problem and the popularity bias problem, learns reliable vector representations of products, and achieves high accuracy in product recommendation in both high-frequency and low-frequency scenarios.

Summary of the Invention

[0007] The purpose of this invention is to propose a sequential recommendation method based on long user behavior, which can improve the cold start problem and popularity bias problem, and make accurate product recommendations in both high-frequency and low-frequency scenarios. To learn reliable user interaction history and product vector representations even in low-frequency scenarios, the method models user interaction history and products based on product content text, leveraging the potential capabilities of current pre-trained language models to improve the insufficient training of vector representations in low-frequency scenarios. Simultaneously, to encode long user behavior sequences, the language model's encoder is improved by employing attention sparsity and pruning on the self-attention method. The language model is trained through contrastive learning to optimize the vector space, giving it user preference awareness capabilities, making it more suitable for sequential recommendation tasks.

[0008] The technical solution of this invention is as follows: A sequential recommendation method based on long user behavior, which improves upon an existing language model to obtain a new language model; the new language model consists of an encoder and a decoder; both the encoder and decoder are based on improvements to the standard Transformer structure, and their parameters are initialized using the weights of a pre-trained existing language model; the encoder applies attention sparsification and pruning to its self-attention mechanism; during encoding, the encoder independently encodes different segments of the input to obtain segment encoding vectors; during decoding, the decoder reassigns weights to the segment encoding vectors and fuses them;

[0009] After dividing the user interaction history sequence into multiple user interaction history subsequences, the subsequences are input into the encoder to obtain the encoded vector representation of the user interaction history. After decoding, the vector representation of the user interaction history is obtained. The actual interactive product corresponding to the user interaction history at the next moment is processed by the encoder and decoder to obtain the product vector representation.

[0010] Negative sample items are selected, and their vector representations, together with the user interaction history encoded vector representation and the real interaction item vector representation, form a vector space. The relevance between the user interaction history and the positive and negative sample items is calculated. Based on the relevance, the loss value between the item predicted by the language model and the real interaction item is obtained. The parameters of the language model are trained by contrastive learning with the cross-entropy loss function, which optimizes the vector space of the language model by pulling the user interaction history and the positive sample item vector representation closer together and pushing the user interaction history and the negative sample item vector representations apart. Finally, the trained language model is obtained and used to predict the next recommended item.

[0011] The user interaction history and the corresponding next-moment real interaction product are obtained as follows: user consumption records are collected as user product interaction records. Each user product interaction record includes a user and product interaction sequence. Each product in the product interaction sequence includes product attributes and the time when the user interacted with the product. The user product interaction records are sorted according to the user's interaction time. The user interaction products at the first t-1 moments are used as the user interaction history, and the product at moment t is used as the corresponding next-moment real interaction product. The negative sample products are negative samples within the batch or randomly sampled products that the user has not interacted with.

[0012] The encoder's inputs are the product text and the user interaction history text;

[0013] The product text is a textual representation of the product based on its attributes; for each product, the following text representation is used:

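[0014] $X(v) = \text{id}: v(\text{id}),\ \text{name}: v(\text{name}),\ \ldots,\ \langle\text{att}\rangle_k: v(\langle\text{att}\rangle_k)$

[0015] where $v$ represents the product, $v(\text{id})$ is the product serial number, $v(\text{name})$ is the product name, and $v(\langle\text{att}\rangle_i)$ is the $i$-th product attribute, $i \in [1, k]$;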

[0016] The user interaction history text is a textual representation of the user interaction history sequence; for the user interaction history $\{v_1, v_2, \ldots, v_{t-1}\}$, each product in the sequence is textualized and the resulting product texts are concatenated to obtain the user interaction history text.

[0017] The process of obtaining the product vector representation is as follows:

[0018] The product vector representation is obtained based on prompt learning; the product text representation is encoded into the product vector representation $h_v$:

[0019]

[0020] where Encoder is the encoder, Decoder is the decoder, and the remaining symbol in the formula is the vector representation of the first token input to the decoder.

[0021] The process of obtaining the user interaction history encoded vector representation is as follows:

[0022] The user interaction history sequence is divided into $n$ subsequences based on time periods: $\{\{v_1, v_2, \ldots, v_m\}, \ldots, \{v_{t-m-1}, \ldots, v_{t-1}\}\}$. Each user interaction history subsequence reflects user preferences over a period of time. The user interaction history text is correspondingly transformed into a text representation of each user interaction history subsequence.

[0023] Each textualized user interaction history subsequence is independently encoded as a vector representation, yielding the user interaction history encoded vector representation.

[0024] The process of obtaining the vector representation of the user interaction history is as follows: the encoded vector representation of the user interaction history is input to the decoder. The decoder learns to recalculate the self-attention score and assign weights to each user interaction history subsequence through a cross-attention mechanism. Finally, each user interaction subsequence is weighted to obtain the vector representation of the user interaction history.

[0025]

[0026] The relevance calculation is as follows: the dot product of the user interaction history vector representation with each of the positive and negative sample product vector representations is calculated, and the final relevance score is obtained through a normalized exponential (softmax) function:

[0027] The loss value is calculated as follows: the cross-entropy loss function is used to calculate the loss value between the predicted item and the actual user interaction item $v^*$ at time $t$:

[0028] A sequential recommendation device based on long user behavior includes:

[0029] A memory, used to store the new language model, the user interaction history sequences, and their corresponding next-moment real interaction products;

[0030] The new language model, based on an improvement of an existing language model, includes an encoder, a decoder, and a loss function. Both the encoder and decoder are based on improvements to the standard Transformer structure, and their initial parameters are initialized using the weights of a pre-trained existing language model. The encoder employs attention sparsity and pruning for its self-attention mechanism. During the encoding process, the encoder independently encodes the input value into segments to obtain segment encoding vectors. During the decoding process, the decoder redistributes weights to each segment encoding vector and fuses them.

[0031] A processor is configured to execute a computer program stored in the memory, wherein, when the computer program is executed, the processor is configured to:

[0032] After dividing the user interaction history sequence into multiple user interaction history subsequences, the subsequences are input into the encoder to obtain the encoded vector representation of the user interaction history. After decoding, the vector representation of the user interaction history is obtained. The actual interactive product corresponding to the user interaction history at the next moment is processed by the encoder and decoder to obtain the product vector representation.

[0033] The vector space is composed of the user interaction history encoded vector representation, the real interaction product vector representation, and the negative sample product vector representations. The relevance between the user interaction history and the positive sample product and the negative sample products is calculated respectively. Based on the relevance, the loss value between the product predicted by the language model and the real interaction product is obtained. The parameters of the language model are trained by contrastive learning with the cross-entropy loss function, which optimizes the vector space of the language model by pulling the user interaction history and the positive sample product vector representation closer together and pushing the user interaction history and the negative sample product vector representations apart. Finally, the trained language model is obtained and used to predict the next recommended product.

[0034] A storage medium storing a computer program that can be executed by a processor to implement the sequential recommendation method.

[0035] The beneficial effects of this invention: The sequential recommendation method proposed in this invention achieves state-of-the-art results in product recommendation, predicting effectively in both high-frequency and low-frequency scenarios. Experimental results show that the model performs exceptionally well on multiple real-world recommendation datasets, including Yelp, Amazon Beauty, Amazon Sports, and Amazon Toys, surpassing existing recommendation models. Furthermore, the model significantly improves upon other modeling approaches with respect to the cold start and popularity bias problems.

Description of the Drawings

[0036] Figure 1(a) is a schematic diagram of user interaction history and actual interaction product data;

[0037] Figure 1(b) shows the recommendation results of a recommendation system based on product identifier modeling (T5-ID);

[0038] Figure 1(c) shows the recommendation results of the method of the present invention.

[0039] Figure 2 is a schematic diagram of a sequential recommendation method based on long user behavior.

[0040] Figure 3 is a schematic diagram of a sequential recommendation device based on long user behavior.

Detailed Implementation

[0041] The invention will be further described with reference to the accompanying drawings.

[0042] This invention proposes a text-matching approach to modeling sequential recommendation tasks, textualizing products and user behavior using templates. This integrates identifiers and auxiliary information simply and efficiently. Furthermore, it introduces an attention sparsity method to overcome the maximum encoding length limit of pre-trained language models, enabling the modeling of longer user interaction histories to meet real-world needs. Leveraging the capabilities of pre-trained language models, better vector representations of low-frequency products can be learned, effectively mitigating the cold start and popularity bias problems.

[0043] Figure 1 illustrates a common bias problem in current recommendation systems. Mainstream recommendation models are based on randomly initialized product identifier vectors. This modeling approach relies on sufficient training data, but when data is insufficient, it cannot learn useful product representations and cannot accurately recommend less popular products. As shown in Figure 1, product id 11322 is a less popular product, and a product identifier-based recommendation system model cannot accurately recommend it. This invention, however, uses textual modeling of products and user behavior, leveraging the internal knowledge of a pre-trained language model to learn better low-frequency product vector representations, effectively alleviating this problem and making accurate recommendations.

[0044] With reference to Figure 2, a sequential recommendation method based on long user behavior includes improvements to an existing language model, collection of user interaction history and product data, textual representation of user interaction history and products, and learning of vector representations of products and user interaction history.

[0045] The process of establishing the language model structure, training the model parameters by contrastive learning, and optimizing the language model's vector space includes:

[0046] Step 1.1: The language model initializes its parameters from an existing pre-trained language model. The reason for adopting a pre-trained language model is that the knowledge learned during pre-training can be utilized. When training data is insufficient, the language model can learn the vector representation of a product from the product content alone. In low-frequency scenarios, this modeling method is far better than randomly initializing vectors based on product identifiers.

[0047] Step 1.2: The language model is based on an encoder-decoder architecture. The purpose is to decouple the vector modeling process of long user behavior. At the encoder end, each user interaction history subsequence is independently encoded. At the decoder end, weights are reassigned to the encoded vectors to fuse them and obtain the final representation of the user interaction history.

[0048] Step 1.3: Improve the encoder's self-attention calculation. During encoding, different segments of the input are encoded independently to obtain an encoding vector for each segment. This sparsifies and prunes the attention, breaking the encoding length limit of the language model. Attention pruning reduces computation and memory overhead and reduces the time complexity from O(n^2) to O(n).
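The effect of segment-wise encoding on attention cost can be illustrated with a small sketch. It assumes a fixed segment length (the text does not state how segment size is chosen) and models the pruning as a block-diagonal attention mask, where a token attends only to tokens in its own segment. The function names are hypothetical. With segment length m, the number of attended pairs is about n·m rather than n², which is linear in n for fixed m.

```python
import numpy as np

def block_diagonal_mask(seq_len: int, segment_len: int) -> np.ndarray:
    """Boolean mask: entry (i, j) is True when tokens i and j share a segment.

    Assumption: segments are consecutive blocks of `segment_len` tokens.
    """
    segment_ids = np.arange(seq_len) // segment_len
    return segment_ids[:, None] == segment_ids[None, :]

def attended_pairs(seq_len: int, segment_len: int) -> int:
    """Count the query-key pairs that remain after pruning cross-segment attention."""
    return int(block_diagonal_mask(seq_len, segment_len).sum())

# Full self-attention over 8 tokens scores 8 * 8 = 64 pairs;
# two independent segments of 4 tokens score 2 * (4 * 4) = 32 pairs.
full_pairs = 8 * 8
sparse_pairs = attended_pairs(8, 4)
```

Doubling the sequence length with the same segment length doubles `attended_pairs`, whereas the full attention count quadruples.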

[0049] Step 1.4: The vector space of the language model is formed by the user interaction history encoded vector representation, the real interaction product vector representation, and the negative sample product vector representations. The relevance scores between the user interaction history and the positive and negative sample products are calculated, and the loss value between the predicted product and the real interaction product is obtained from these scores. The language model parameters are trained by contrastive learning with the cross-entropy loss function to optimize the vector space of the language model, pulling the user interaction history and the positive sample product vector representation closer together in the vector space and pushing the user interaction history and the negative sample product vector representations apart. Finally, the trained language model is obtained and used to predict the next recommended product.
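A minimal numerical sketch of the relevance and loss computation described in this step, using NumPy. It follows the stated recipe (dot product, softmax, cross-entropy against the real interaction product); the function name, argument names, and the convention that the positive sample is at index 0 are assumptions made for illustration, not details given in the text.

```python
import numpy as np

def contrastive_loss(h_user: np.ndarray, h_items: np.ndarray, positive_index: int = 0):
    """Softmax relevance over candidate items and cross-entropy loss.

    h_user:  (d,) user interaction history vector.
    h_items: (num_candidates, d) item vectors, one positive plus negatives.
    positive_index: row of the real interaction item (assumed to be 0 here).
    """
    scores = h_items @ h_user            # dot-product relevance per candidate
    scores = scores - scores.max()       # shift for numerical stability
    probs = np.exp(scores) / np.exp(scores).sum()
    loss = -np.log(probs[positive_index])
    return probs, float(loss)

h_user = np.array([1.0, 0.0])
h_items = np.array([[1.0, 0.0],    # positive sample, aligned with the user vector
                    [0.0, 1.0],    # negative sample
                    [-1.0, 0.0]])  # negative sample
probs, loss = contrastive_loss(h_user, h_items)
```

Minimizing this loss raises the dot product with the positive sample relative to the negatives, which corresponds to pulling the positive closer and pushing the negatives away in the vector space.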

[0050] Data acquisition for training the language model involves collecting user interaction records with products to construct the necessary training data. The specific process includes:

[0051] Step 2.1: Collect user consumption records as user product interaction records. Each user product interaction record includes a sequence of user and product interactions. Each product in the product interaction sequence includes product attributes and the time when the user interacted with the product.

[0052] Step 2.2: Sort the user product interaction records according to the time of user interaction with the products, and use the sequence of products the user interacted with at the first t-1 moments as the user interaction history;

[0053] Step 2.3: The product that the user interacts with at time t is taken as the next-moment real interaction product and is regarded as the positive sample during training. Among the remaining products that the user has not interacted with, negative samples are selected by two negative sampling methods: in-batch negative sampling and random negative sampling. These negative samples are used for contrastive learning in step 1.4 to optimize the vector space of the language model.
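Steps 2.1 to 2.3 can be sketched as follows. The sketch covers only random negative sampling (in-batch negatives depend on how batches are formed and are not shown), and the record layout `(item_id, timestamp)`, the function name, and the parameter names are assumptions for illustration.

```python
import random
from typing import List, Sequence, Tuple

def build_example(
    interactions: Sequence[Tuple[str, int]],
    all_items: Sequence[str],
    num_negatives: int,
    rng: random.Random,
) -> Tuple[List[str], str, List[str]]:
    """Return (history, positive, negatives) for one user.

    interactions: (item_id, timestamp) pairs in arbitrary order (assumed layout).
    """
    # Step 2.2: sort by interaction time.
    ordered = [item for item, _ in sorted(interactions, key=lambda pair: pair[1])]
    # First t-1 items form the history; the item at time t is the positive sample.
    history, positive = ordered[:-1], ordered[-1]
    # Step 2.3 (random variant): sample from items the user never interacted with.
    candidates = sorted(set(all_items) - set(ordered))
    negatives = rng.sample(candidates, min(num_negatives, len(candidates)))
    return history, positive, negatives

history, positive, negatives = build_example(
    interactions=[("b", 2), ("a", 1), ("c", 3)],
    all_items=["a", "b", "c", "d", "e", "f"],
    num_negatives=2,
    rng=random.Random(0),
)
```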

[0054] Based on the acquired data, product representation learning and user interaction history representation learning are carried out with the language model.

[0055] The process of textualizing products based on their attributes and learning product vector representations specifically includes the following steps:

[0056] Step 3.1: Using product identifiers and their attributes, represent the product in textual form. Based on prompt learning, provide a language description for each attribute to help the language model better understand the product and the meaning of its attributes.

[0057] $X(v) = \text{id}: v(\text{id}),\ \text{name}: v(\text{name}),\ \ldots,\ \langle\text{att}\rangle_k: v(\langle\text{att}\rangle_k)$

[0058] where $v$ represents the product, $v(\text{id})$ is the product serial number, $v(\text{name})$ is the product name, and $v(\langle\text{att}\rangle_i)$ is the $i$-th product attribute, $i \in [1, k]$;
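The template above can be illustrated with a short sketch that turns a product record into text. The dictionary layout, the separator characters, and the example name and brand values are assumptions for illustration; only the "attribute name: attribute value" pattern comes from the template.

```python
def verbalize_item(item: dict) -> str:
    """Render a product as 'id: ..., name: ..., <att>: ...' text.

    Assumption: `item` always has 'id' and 'name' keys; any other keys are
    treated as additional attributes and appended in insertion order.
    """
    parts = [f"id: {item['id']}", f"name: {item['name']}"]
    for key, value in item.items():
        if key not in ("id", "name"):
            parts.append(f"{key}: {value}")
    return ", ".join(parts)

# Hypothetical product record; the name and brand values are made up.
text = verbalize_item({"id": 11322, "name": "Hand Cream", "brand": "Acme"})
```

Because every attribute value is preceded by its attribute name, the language model sees what each field means rather than a bare list of values.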

[0059] Step 3.2: Product vector representation; the product text representation is encoded into the product vector representation $h_v$:

[0060]

[0061] where Encoder is the encoder, Decoder is the decoder, and the remaining symbol in the formula is the vector representation of the first token input to the decoder. Unlike methods that use randomly initialized product identifier vector representations, product vector representations learned from product attribute text can contain rich product information even in low-frequency scenarios, allowing the model to make relevant recommendations.

[0062] User interaction history representation learning; the user interaction history is textualized, and the user interaction history sequence is divided into time periods. Each sub-segment is independently encoded, and weights are redistributed and fused during the decoding process. The specific process includes the following:

[0063] Step 4.1: Textual representation of the user-product interaction sequence; for the user interaction history $\{v_1, v_2, \ldots, v_{t-1}\}$, the product texts obtained in step 3.1 are concatenated to obtain the user interaction history text. Natural language instructions are added before the text is input to the language model for encoding, which stimulates the language model's ability to understand the recommendation task and the user interaction history;
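A minimal sketch of this step, assuming the product texts have already been produced as in step 3.1. The wording of the instruction and the "; " separator are hypothetical; the text only states that product texts are concatenated and that a natural language instruction is placed in front.

```python
from typing import Sequence

# Hypothetical instruction wording; the source does not give the actual prompt.
DEFAULT_INSTRUCTION = "Here is the interaction history of the user:"

def verbalize_history(item_texts: Sequence[str],
                      instruction: str = DEFAULT_INSTRUCTION) -> str:
    """Concatenate product texts in time order and prepend an instruction."""
    return instruction + " " + "; ".join(item_texts)

history_text = verbalize_history(
    ["id: 1, name: A", "id: 2, name: B"],
    instruction="History:",
)
```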

[0064] Step 4.2: Divide the user interaction history sequence based on time periods; the user interaction history sequence is divided into $n$ user interaction history subsequences according to time periods: $\{\{v_1, v_2, \ldots, v_m\}, \ldots, \{v_{t-m-1}, \ldots, v_{t-1}\}\}$. Each user interaction history subsequence reflects user preferences over a period of time, and the user interaction history text is transformed into a text representation of each user interaction history subsequence. The advantage of segmenting the user interaction history is that user behavior is time-dependent, and interaction behavior within a given period has short-term, transient characteristics. Segmented independent encoding therefore not only reduces resource consumption but also, to a certain extent, reduces the interference of interactions separated by a large time span with the current recommendation decision.

[0065] Each textualized user interaction history subsequence is independently encoded as a vector representation, yielding the user interaction history encoded vector representation.
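The division into subsequences can be sketched as a simple chunking of the time-ordered history. The sketch assumes that a "time period" corresponds to a fixed number m of consecutive interactions, as the subsequence notation suggests; splitting by wall-clock intervals would be an equally plausible reading. The function name is hypothetical.

```python
from typing import List, Sequence, TypeVar

T = TypeVar("T")

def split_history(history: Sequence[T], segment_len: int) -> List[List[T]]:
    """Split a time-ordered history into consecutive chunks of `segment_len` items.

    The last chunk may be shorter when the history length is not a multiple
    of `segment_len`.
    """
    if segment_len <= 0:
        raise ValueError("segment_len must be positive")
    return [list(history[i:i + segment_len])
            for i in range(0, len(history), segment_len)]

segments = split_history(["v1", "v2", "v3", "v4", "v5"], segment_len=2)
```

Each chunk would then be textualized and passed through the encoder on its own, so no attention is computed between products in different chunks.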

[0066] Step 4.3: The encoded vector representation of the user interaction history is input to the decoder. The decoder learns to recalculate the attention scores and assign weights to each user interaction history subsequence through a cross-attention mechanism. Finally, the user interaction subsequences are combined according to these weights to obtain a vector representation of the user interaction history. The advantage of this reweighting is that the modeling process takes long-term user preferences into account while focusing on recent short-term preferences.
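The fusion in this step can be illustrated with a simplified single-head cross-attention in NumPy. This is a sketch under strong assumptions: one decoder query vector, no learned projection matrices, and scaled dot-product scoring. The real decoder has multiple layers and heads; the function and variable names are hypothetical.

```python
import numpy as np
from typing import List, Tuple

def fuse_segments(query: np.ndarray,
                  segment_encodings: List[np.ndarray]) -> Tuple[np.ndarray, np.ndarray]:
    """Attend from one decoder query over all independently encoded segments.

    query: (d,) decoder-side vector.
    segment_encodings: list of (len_i, d) arrays, one per history subsequence.
    Returns the fused (d,) vector and the total attention weight per segment.
    """
    keys = np.concatenate(segment_encodings, axis=0)   # (total_len, d)
    scores = keys @ query / np.sqrt(query.shape[0])    # scaled dot product
    scores = scores - scores.max()                     # numerical stability
    weights = np.exp(scores) / np.exp(scores).sum()    # softmax over all tokens
    fused = weights @ keys                             # weighted sum of token vectors
    bounds = np.cumsum([len(s) for s in segment_encodings])[:-1]
    segment_weights = np.array([w.sum() for w in np.split(weights, bounds)])
    return fused, segment_weights

query = np.array([1.0, 0.0])
older = np.array([[1.0, 0.0], [0.0, 1.0]])   # toy encoding of an older segment
recent = np.array([[2.0, 0.0]])              # toy encoding of a recent segment
fused, segment_weights = fuse_segments(query, [older, recent])
```

Although the segments were encoded without seeing each other, the softmax runs over all of them jointly, so the decoder can shift weight toward whichever subsequence matters most for the next recommendation.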

[0067]

[0068] Table 1 compares the performance of this invention with existing sequential recommendation models. Recall is the recall rate, and NDCG is the normalized discounted cumulative gain. Both are metrics used to measure the ranking quality of recommended items. Table 1 shows that, compared to models based on item identifiers such as Bert4Rec, this invention achieves almost twice the performance, and it also surpasses the previous best model, DIF-SR. This invention achieves the best recommendation performance to date, exceeding previous mainstream recommendation models.
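For reference, the two metrics can be sketched for the next-item setting, where each test case has exactly one correct item. Under that assumption Recall@K is 1 if the target appears in the top K and 0 otherwise, and NDCG@K reduces to 1 / log2(rank + 1) for a target at 1-based position `rank` within the top K. The cutoff values used for Table 1 are not stated here, and the function names are hypothetical.

```python
import math
from typing import Sequence

def recall_at_k(ranked: Sequence[str], target: str, k: int) -> float:
    """1.0 if the single ground-truth item is within the top k, else 0.0."""
    return 1.0 if target in ranked[:k] else 0.0

def ndcg_at_k(ranked: Sequence[str], target: str, k: int) -> float:
    """NDCG with one relevant item: discounted gain at its rank, ideal DCG is 1."""
    top_k = list(ranked[:k])
    if target not in top_k:
        return 0.0
    rank = top_k.index(target) + 1      # 1-based position in the ranking
    return 1.0 / math.log2(rank + 1)

ranked = ["a", "b", "c"]
hit = recall_at_k(ranked, "b", k=2)
gain = ndcg_at_k(ranked, "b", k=2)
```

Both values are averaged over all test users; NDCG additionally rewards placing the correct item nearer the top of the list.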

[0069] Table 1 Comparison of Model Performance

[0070]

[0071]

[0072] This application provides a sequential recommendation device based on long user behavior, as shown in Figure 3, which includes:

[0073] A memory, used to store the new language model, the user interaction history sequences, and their corresponding next-moment real interaction products;

[0074] The new language model, based on an improvement of an existing language model, includes an encoder, a decoder, and a loss function. Both the encoder and decoder are based on improvements to the standard Transformer structure, and their initial parameters are initialized using the weights of a pre-trained existing language model. The encoder employs attention sparsity and pruning for its self-attention mechanism. During the encoding process, the encoder independently encodes the input value into segments to obtain segment encoding vectors. During the decoding process, the decoder redistributes weights to each segment encoding vector and fuses them.

[0075] A processor is configured to execute a computer program stored in the memory, wherein, when the computer program is executed, the processor is configured to:

[0076] After dividing the user interaction history sequence into multiple user interaction history subsequences, the subsequences are input into the encoder to obtain the encoded vector representation of the user interaction history. After decoding, the vector representation of the user interaction history is obtained. The actual interactive product corresponding to the user interaction history at the next moment is processed by the encoder and decoder to obtain the product vector representation.

[0077] The vector space is composed of the user interaction history encoded vector representation, the real interaction product vector representation, and the negative sample product vector representations. The relevance between the user interaction history and the positive sample product and the negative sample products is calculated respectively. Based on the relevance, the loss value between the product predicted by the language model and the real interaction product is obtained. The parameters of the language model are trained by contrastive learning with the cross-entropy loss function, which optimizes the vector space of the language model by pulling the user interaction history and the positive sample product vector representation closer together and pushing the user interaction history and the negative sample product vector representations apart. Finally, the trained language model is obtained and used to predict the next recommended product.

[0078] A computer-readable storage medium stores a computer program that can be executed by a processor to implement the sequential recommendation method. The computer-readable storage medium can be any available medium accessible to a computer, or a data storage device such as a server or data center that integrates one or more available media.

Claims

1. A serialized recommendation method based on long user behavior, characterized in that:

a new language model is obtained by improving an existing language model; the new language model includes an encoder, a decoder, and a loss function; both the encoder and the decoder are improvements on the standard Transformer structure, and their parameters are initialized with the weights of a pre-trained existing language model; the encoder applies attention sparsification and pruning to its self-attention mechanism; during encoding, the encoder independently encodes different segments of the input to obtain segment encoding vectors; during decoding, the decoder reassigns weights to the segment encoding vectors and fuses them;

the user interaction history sequence is divided into multiple user interaction history subsequences, the subsequences are input into the encoder to obtain the encoded vector representation of the user interaction history, and this is then decoded by the decoder to obtain the vector representation of the user interaction history; the real interaction product at the next moment corresponding to the user interaction history is encoded and decoded to obtain a product vector representation;

the inputs of the encoder are product text and user interaction history text;

the product text is a textual representation of the product based on its attributes; each product is given a text representation in which v denotes the product, v(id) is the product serial number, v(name) is the product name, and v(<att>_i) is the i-th product attribute, with i ∈ [1, k];

the user interaction history text is a textual representation of the user interaction history sequence: the text representations of the products in the user interaction history sequence are concatenated to obtain the user interaction history text;

the encoded vector representation of the user interaction history is obtained as follows: the user interaction history sequence is divided into n subsequences by time period, each user interaction history subsequence reflecting the user's preferences over a period of time; the user interaction history text is correspondingly converted into a text representation of each user interaction history subsequence; each textualized user interaction history subsequence is independently encoded into a vector representation; and the encoded vector representation of the user interaction history is formed from these subsequence vector representations;

the vector representation of the user interaction history is obtained as follows: the encoded vector representation of the user interaction history is input to the decoder; through a cross-attention mechanism, the decoder learns to recalculate the attention scores and assign a weight to each user interaction history subsequence; finally, the user interaction history subsequences are weighted to obtain the vector representation of the user interaction history;

negative sample products are selected, and their vector representations, together with the encoded vector representation of the user interaction history and the vector representation of the real interaction product, form a vector space; the correlation between the user interaction history and the positive sample product, and between the user interaction history and the negative sample products, is calculated; based on the correlation, the loss value between the product predicted by the language model and the real interaction product is obtained; the parameters of the language model are trained by contrastive learning with the cross-entropy loss function, which optimizes the vector space of the language model by pulling the vector representations of the user interaction history and the positive sample product closer together and pushing those of the user interaction history and the negative sample products apart; the trained language model is finally obtained and is used to predict the next recommended product.
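The textualization step of claim 1 can be illustrated as below. The exact text template is not reproduced in this text, so the "key: value" layout, the field names `id`, `name`, and `attributes`, and the separator are all assumptions made for the sketch; only the ingredients (serial number, name, attributes, concatenation over the history) come from the claim.

```python
def product_text(product):
    """Render one product as text from its serial number, name, and attributes.

    The "key: value" template is an assumed layout, not the claimed one.
    """
    parts = [f"id: {product['id']}", f"name: {product['name']}"]
    parts += [f"{key}: {value}"
              for key, value in product.get("attributes", {}).items()]
    return " ".join(parts)


def history_text(products, separator=", "):
    """Concatenate the product texts of a time-ordered interaction history."""
    return separator.join(product_text(p) for p in products)


if __name__ == "__main__":
    mug = {"id": 7, "name": "Mug", "attributes": {"color": "blue"}}
    print(product_text(mug))  # id: 7 name: Mug color: blue
```

Representing products as text rather than as opaque identifiers is what lets a pre-trained language model produce usable vectors for products with few or no interactions.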

2. The serialized recommendation method based on long user behavior according to claim 1, characterized in that the user interaction history and the corresponding real interaction product at the next moment are obtained as follows: user consumption records are collected as user-product interaction records; each user-product interaction record includes a user and a product interaction sequence; each product in the product interaction sequence includes product attributes and the time at which the user interacted with the product; the user-product interaction records are sorted by the user's interaction time; the products the user interacted with at the first t-1 moments are used as the user interaction history, and the product at moment t is used as the corresponding real interaction product at the next moment; the negative sample products are in-batch negative samples or randomly sampled products that the user has not interacted with.
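The data preparation of claim 2 can be sketched as follows. The record layout `(user, product, timestamp)` and both function names are hypothetical, and for brevity the sketch emits one example per user (all but the last product as history, the last product as target); producing one example for every moment t would follow the same pattern.

```python
import random
from collections import defaultdict


def build_examples(records):
    """Group (user, product, timestamp) records by user, sort them by time,
    and return (user, history, target) with the last product as target."""
    by_user = defaultdict(list)
    for user, product, timestamp in records:
        by_user[user].append((timestamp, product))
    examples = []
    for user, events in by_user.items():
        events.sort(key=lambda event: event[0])  # order by interaction time
        products = [product for _, product in events]
        if len(products) >= 2:  # need at least one history item and a target
            examples.append((user, products[:-1], products[-1]))
    return examples


def sample_negatives(all_products, interacted, k, rng=random):
    """Randomly sample up to k products the user has not interacted with."""
    candidates = sorted(set(all_products) - set(interacted))
    return rng.sample(candidates, min(k, len(candidates)))
```

Users with a single interaction are skipped here because no history would remain once the target is removed.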

3. The serialized recommendation method based on long user behavior according to claim 1 or 2, characterized in that the product vector representation is obtained as follows: the product vector representation is based on prompt learning; the product text representation is encoded by the encoder and decoded by the decoder into the product vector representation, which is the vector representation of the first token input to the decoder.

4. The serialized recommendation method based on long user behavior according to claim 3, characterized in that the correlation is calculated as follows: the dot product of the vector representation of the user interaction history with the vector representation of each positive and negative sample product is calculated, and the final relevance score is obtained through a normalized exponential (softmax) function.
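The formula of this claim is not reproduced in this text. One form consistent with the wording (dot product followed by a normalized exponential function) is shown below; the symbols are assumptions introduced here: \mathbf{h}_H is the vector representation of the user interaction history H, \mathbf{h}_v is the vector representation of product v, and \mathcal{C} is the candidate set containing the positive sample product and the negative sample products.

```latex
P(v \mid H) = \frac{\exp\left(\mathbf{h}_H \cdot \mathbf{h}_v\right)}
                   {\sum_{v' \in \mathcal{C}} \exp\left(\mathbf{h}_H \cdot \mathbf{h}_{v'}\right)}
```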

5. The serialized recommendation method based on long user behavior according to claim 4, characterized in that the loss value is calculated as follows: the cross-entropy loss function is used to calculate the loss value between the predicted product and the real interaction product that the user interacted with at moment t.
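The formula of this claim is likewise not reproduced in this text. With a one-hot target, the cross-entropy between the predicted distribution and the real interaction product reduces to the negative log-probability of that product. Under the assumed notation that P(v \mid H) is the softmax-normalized relevance score of product v given the user interaction history H, and v_t is the real interaction product at moment t, this can be written as:

```latex
\mathcal{L} = -\log P(v_t \mid H)
```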

6. A serialized recommendation device based on long user behavior, characterized in that it comprises:

a memory for storing a new language model, user interaction history sequences, and their corresponding real interaction products at the next moment; the new language model is obtained by improving an existing language model and includes an encoder, a decoder, and a loss function; both the encoder and the decoder are improvements on the standard Transformer structure, and their parameters are initialized with the weights of a pre-trained existing language model; the encoder applies attention sparsification and pruning to its self-attention mechanism; during encoding, the encoder independently encodes different segments of the input to obtain segment encoding vectors; during decoding, the decoder reassigns weights to the segment encoding vectors and fuses them;

a processor configured to execute a computer program stored in the memory, wherein, when the computer program is executed, the processor is configured to:

divide the user interaction history sequence into multiple user interaction history subsequences, input the subsequences into the encoder to obtain the encoded vector representation of the user interaction history, and decode it with the decoder to obtain the vector representation of the user interaction history; encode and decode the real interaction product at the next moment corresponding to the user interaction history to obtain a product vector representation;

form a vector space from the encoded vector representation of the user interaction history, the vector representation of the real interaction product, and the vector representations of the negative sample products; calculate the correlation between the user interaction history and the positive sample product, and between the user interaction history and the negative sample products, respectively; obtain, based on the correlation, the loss value between the product predicted by the language model and the real interaction product; train the parameters of the language model by contrastive learning with the cross-entropy loss function, which optimizes the vector space of the language model by pulling the vector representations of the user interaction history and the positive sample product closer together and pushing those of the user interaction history and the negative sample products apart; and finally obtain the trained language model, which is used to predict the next recommended product;

the serialized recommendation device based on long user behavior can execute the serialized recommendation method based on long user behavior as described in any one of claims 1 to 5.

7. A storage medium, characterized in that it stores a computer program that can be executed by a processor to implement the serialized recommendation method as described in any one of claims 1 to 5.