Data labeling method for surprise degree recommendation and related device and product
By preprocessing user comment text, performing preliminary annotation with a large model, and verifying the annotations through the prediction loss of a small-scale language model, the method addresses the lack of explicit surprise annotations in traditional datasets and enables the construction of an efficient and accurate surprise dataset to support the training and research of surprise recommendation models.
Patent Information
- Authority / Receiving Office
- CN · China
- Patent Type
- Applications(China)
- Current Assignee / Owner
- HUNAN UNIV
- Filing Date
- 2026-02-11
- Publication Date
- 2026-06-12
Figure CN122196262A_ABST
Abstract
Description
Technical Field
[0001] This invention relates to the field of recommendation, and more particularly to a data annotation method for surprise-oriented recommendations and a related apparatus and products.
Background Technology
[0002] The core objective of surprise-oriented recommendation is to meet users' basic preferences and core needs while presenting content that exceeds their expectations yet remains relevant and valuable, thus creating an unexpected positive experience. It emphasizes finding the optimal balance between "acceptability" and "novelty": recommendations should not stray so far from user interests that they feel abrupt, yet they should effectively encourage users to explore new content areas, broaden their horizons, and stimulate potential interests. However, current research on surprise-oriented recommendation still relies primarily on rating, behavioral log, and comment datasets from traditional recommendation systems. These datasets often lack explicit annotations of surprise experiences, making it difficult to accurately capture users' genuine surprise and novelty evaluations of the recommendation results.
Summary of the Invention
[0003] To address the technical problem of existing recommendation datasets lacking explicit annotations of surprise experiences, embodiments of the present invention provide a data annotation method and related apparatus and products for surprise-oriented recommendations.
[0004] The first aspect of this invention provides a data annotation method for surprise-oriented recommendations, comprising: preprocessing user comment text to obtain target comment text containing surprise information; annotating the target comment text with a large model to obtain preliminary annotation results, which include surprise tags and surprise aspects; training a small-scale language model on the preliminary annotation results and their corresponding target comment texts, and calculating the prediction loss of the small-scale language model for the target comment text; and, if the prediction loss is less than or equal to a preset threshold, outputting the preliminary annotation result as the target annotation result.
[0005] A second aspect of the present invention provides a data annotation device for surprise-oriented recommendations, comprising: a preprocessing unit, used to preprocess user comment text to obtain target comment text containing surprise information; an annotation unit, used to annotate the target comment text using a large model to obtain preliminary annotation results, which include surprise tags and surprise aspects; a training unit, used to train a small-scale language model using the preliminary annotation results and their corresponding target comment text, and to calculate the prediction loss of the small-scale language model for the target comment text; and a judgment unit, used to output the preliminary annotation result as the target annotation result if the prediction loss is less than or equal to a preset threshold.
[0006] A third aspect of the present invention provides a computer storage medium including instructions that, when executed on a computer, cause the computer to perform the steps of the data annotation method for surprise-oriented recommendations described in the first aspect.
[0007] A fourth aspect of the present invention provides a computer program product, including instructions that, when executed by a processor, implement the steps of the data annotation method for surprise-oriented recommendations described in the first aspect.
[0008] A fifth aspect of the present invention provides a computer device comprising at least one connected processor, a memory, and a transceiver, wherein the memory is used to store program code, and the processor is used to invoke the program code in the memory to execute the steps of the data annotation method for surprise-oriented recommendation described in the first aspect.
[0009] Compared to related technologies, the data annotation method for surprise-oriented recommendations proposed in this invention first preprocesses user comments to filter out target comment texts that may contain surprise experiences. Based on this, a large model is used to automatically perform preliminary surprise-oriented annotation on the target comment texts, significantly reducing the workload of manual annotation. To further improve annotation quality, this method introduces an active learning mechanism, using a small model to learn from the automatic annotation results of the large model. The accuracy of the annotation is verified based on prediction loss, and accurate preliminary annotation results are output as target annotation results, thereby efficiently constructing a high-quality, highly reliable surprise-oriented annotation dataset.
Brief Description of the Drawings
[0010] To more clearly illustrate the technical solutions in the embodiments of the present invention, the accompanying drawings used in the description of the embodiments are briefly introduced below. Obviously, the drawings described below show only some embodiments of the present invention. For those skilled in the art, other drawings can be obtained based on these drawings without creative effort, wherein: Figure 1 is a flowchart of a data annotation method for surprise-oriented recommendations provided in an embodiment of the present invention; Figure 2 is a schematic diagram illustrating the application of the data annotation method for surprise-oriented recommendations provided in an embodiment of the present invention; Figure 3 is a schematic diagram of instruction fine-tuning provided in an embodiment of the present invention; Figure 4 is a schematic diagram of a contextual learning template provided in an embodiment of the present invention; Figure 5 is a virtual structural diagram of the data annotation device for surprise-oriented recommendations provided in an embodiment of the present invention; Figure 6 is a schematic diagram of the structure of a server provided by the present invention.
Detailed Description
[0011] The technical solutions of the embodiments of the present invention will be clearly and completely described below with reference to the accompanying drawings. Obviously, the described embodiments are only a part of the embodiments of the present invention, and not all of them. Based on the embodiments of the present invention, all other embodiments obtained by those skilled in the art without creative effort are within the scope of protection of the present invention.
[0012] Surprise-oriented recommendation relies on comment datasets. A dataset containing explicit surprise labels makes it possible to accurately capture users' genuine surprise and novelty evaluations of the recommendation results, providing a more reliable and richer data foundation for the training, validation, and related research of surprise-oriented recommendation models. Relying on manual annotation of user comment text, however, leads to high costs and low efficiency. Furthermore, because manual annotation is limited in scale and consistency, the resulting datasets are often small and uneven in quality, making it difficult to meet the needs of deep recommendation models for large-scale, high-precision training data.
[0013] Therefore, embodiments of the present invention provide a data annotation method and related apparatus and products for surprise-oriented recommendation. By collecting comments from multiple platforms and combining keywords and sentiment analysis to filter texts that may contain surprise experiences, a large model is used for automated preliminary annotation, and verification and optimization are carried out under an active learning mechanism. This allows for the construction of a high-quality and reliable surprise-oriented annotation dataset at a low cost, which can be used for the training and research of surprise-oriented recommendation models.
[0014] The following section describes the data annotation method for surprise-oriented recommendations from the perspective of a data annotation device for surprise-oriented recommendations. This data annotation device for surprise-oriented recommendations can be a server or a service unit within a server, without any specific limitation.
[0015] Please see Figure 1 and Figure 2, in which Figure 1 is a flowchart of a data annotation method for surprise-oriented recommendations provided in an embodiment of the present invention, and Figure 2 is a schematic diagram illustrating the application of the data annotation method for surprise-oriented recommendations provided in this embodiment of the invention. The data annotation method for surprise-oriented recommendations includes: 101. Preprocess the user comment text to obtain the target comment text containing surprise information.
[0016] In this embodiment, the data labeling device for surprise-oriented recommendations can collect a large amount of user comment text from various platforms according to user instructions, or it can directly receive user-inputted user comment text. It then filters these raw user comment texts to obtain target comment texts containing surprise information. Specifically, surprise information can include information such as surprise experience, surprise emotion, surprise mood, positive emotion, or unexpected positive experience.
[0017] Understandably, users can specify the platforms for collecting user review texts based on their needs, such as Amazon, Yelp, and IMDb. These datasets contain rich user feedback information and can cover different fields and user groups, providing more diverse samples.
[0018] It's also understandable that the data annotation device for surprise-oriented recommendations preprocesses user comment texts to filter out target comment texts containing surprise information. Since objective filtering methods cannot guarantee 100% accuracy, what is actually obtained are only comment texts that may contain surprise information. The main purpose of this preprocessing is to increase the surprise density of the dataset composed of target comment texts, making it higher than the dataset composed of the original user comment texts. Of course, in practice, there is still a possibility that the final target comment text may not contain any surprise information.
[0019] In some embodiments, preprocessing user comment text to obtain target comment text containing surprise information includes: filtering target comment texts out of the user comment texts based on preset surprise keywords and sentiment analysis technology.
[0020] In this embodiment, the data labeling device for surprise-oriented recommendations can set surprise-related keywords as needed, such as "serendipity," "by chance," "unexpected," and "surprise." Based on these surprise keywords, existing filtering methods are used to select user comment texts that may contain surprise information. To further improve coverage, the device can also use existing sentiment analysis methods to analyze unselected comment texts, retaining those with positive sentiment scores, as these may implicitly contain surprise information, thereby expanding the number of target comment texts. The filtering based on surprise keywords can employ keyword matching methods, specifically keyword matching based on regular expressions, semantic expansion retrieval based on word vectors, or a classifier combining surprise-related pattern rules. Sentiment analysis methods can use dictionary-based methods (such as VADER) or deep learning-based sentiment analysis models (such as BERT, LSTM, etc.) to identify and retain comment texts with positive sentiment polarity.
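The two-stage filtering described above can be sketched as follows. This is a minimal illustration, not the embodiment's implementation: the keyword list reuses the examples from the text, while the tiny positive/negative lexicon and the function names `sentiment_score` and `filter_target_comments` are hypothetical stand-ins for a real sentiment analyzer such as VADER or a BERT-based classifier.

```python
import re

# Example keywords taken from the description; a real deployment would tune this list.
SURPRISE_KEYWORDS = ["serendipity", "by chance", "unexpected", "surprise"]
KEYWORD_PATTERN = re.compile(
    r"\b(?:" + "|".join(re.escape(k) for k in SURPRISE_KEYWORDS) + r")\w*",
    re.IGNORECASE,
)

# Hypothetical toy lexicon standing in for a real sentiment analysis model.
POSITIVE_WORDS = {"great", "love", "loved", "wonderful", "delightful", "amazing"}
NEGATIVE_WORDS = {"bad", "boring", "awful", "hate", "hated", "terrible"}


def sentiment_score(text):
    """Toy lexicon score: positive word count minus negative word count."""
    tokens = re.findall(r"[a-z']+", text.lower())
    positives = sum(t in POSITIVE_WORDS for t in tokens)
    negatives = sum(t in NEGATIVE_WORDS for t in tokens)
    return positives - negatives


def filter_target_comments(comments):
    """Keep comments that match a surprise keyword or, failing that, score positive."""
    targets = []
    for text in comments:
        if KEYWORD_PATTERN.search(text) or sentiment_score(text) > 0:
            targets.append(text)
    return targets
```

Applying the keyword match first and the sentiment check only as a fallback mirrors the order in the text: keywords select likely surprise comments, and positive sentiment widens coverage for comments the keywords missed.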
[0021] 102. Use a large model to annotate the target comment text to obtain preliminary annotation results.
[0022] In this embodiment, after obtaining the target comment text, the data annotation device for surprise-oriented recommendations can use a large model to annotate the target comment text to obtain preliminary annotation results. These preliminary annotation results include surprise tags and surprise aspects. Surprise tags are used to identify whether the target comment text contains surprise information; for example, 1 indicates surprise and 0 indicates no surprise. Surprise aspects are used to identify specific and overall aspects of the commented object in the target comment text. For example, for a movie review, specific aspects might include "plot," "actor performance," "special effects," and "soundtrack."
[0023] In this embodiment, the large model can be either of the following two models, or the two can be used in combination: one is a surprise annotation model obtained through instruction fine-tuning, and the other is a large language model that supports contextual learning. These two models are described below.
[0024] Instruction fine-tuning is a method of adjusting the parameters of a pre-trained large model using task-specific instructions, aiming to improve the model's performance on a specific task. In this embodiment, the purpose of instruction fine-tuning is to enable the large model to recognize whether user comment text contains the emotion of "surprise". Before fine-tuning, a comment dataset suitable for this task needs to be constructed, and the large model is fine-tuned using this data. Therefore, instruction fine-tuning of the large model includes: performing aspect-level preprocessing on the training comment text to obtain the training annotation results corresponding to the training comment text; constructing instruction-response pairs based on the training comment text and its corresponding training annotation results; and using the instruction-response pairs to adjust the parameters of the large model.
[0025] In this embodiment, the training annotation results also include surprise labels and surprise aspects. The training comment text can come from the public dataset SerenLens; because the SerenLens dataset already contains surprise labels, the training annotation results can be obtained quickly and accurately. Performing aspect-level preprocessing on the training comment text to obtain the corresponding training annotation results includes: based on the preset number of aspects and the initial aspect seed feature words, iteratively expanding the aspect-related words through a bootstrapping algorithm to obtain a feature word set; and matching the feature word set against the training comment text to obtain the training annotation results corresponding to the training comment text.
[0026] Users can preset the number of aspects as needed and select the most representative words for each aspect from the review text in the dataset as initial aspect seed feature words. For example, for a training review text, "The plot of this movie was somewhat unexpected. During the viewing process, the unexpected twists and turns and delicate emotional portrayals didn't deliberately create exaggerated surprises, but unexpectedly, they brightened my eyes and stirred a long-lost feeling in my heart," the number of aspects can be set to three: director, screenwriter, and plot. Initial aspect seed feature words can be assigned to each aspect: directing, director, filming; script, writing, story; twist, plot, development. During iterative expansion with the bootstrapping algorithm, the algorithm starts from the initial seed words of an aspect, such as director, and scans the entire dataset. It finds words, such as shots, that frequently appear in similar contexts and adds them to the aspect word set for director. After expansion, the feature word sets for each aspect are matched with the current training review text, and combined with the existing surprise labels in the dataset, the training annotation results corresponding to the review text can be obtained. For example, the matching result might show that the comment expresses surprise in terms of the plot, thus obtaining the training annotation result: overall: 1 (director: 0; screenwriter: 0; plot: 1).
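The aspect-level preprocessing above can be illustrated with a small sketch. It assumes a simple co-occurrence rule for the bootstrapping step (a word joins an aspect once it appears near known aspect words at least `min_count` times) and marks an aspect as surprising when the overall label is 1 and the comment mentions one of the aspect's words. Both rules, and the names `expand_aspect_words` and `annotate`, are illustrative assumptions; the description does not fix these details.

```python
import re
from collections import Counter


def tokenize(text):
    return re.findall(r"[a-z]+", text.lower())


def expand_aspect_words(corpus, seeds, stopwords, rounds=2, min_count=2, window=2):
    """Bootstrapping-style expansion: add words that repeatedly occur near known aspect words."""
    aspect_words = {aspect: set(words) for aspect, words in seeds.items()}
    docs = [tokenize(text) for text in corpus]
    for _ in range(rounds):
        # Words already assigned to any aspect are not reassigned.
        taken = set().union(*aspect_words.values())
        for aspect, words in aspect_words.items():
            counts = Counter()
            for tokens in docs:
                for i, tok in enumerate(tokens):
                    if tok in words:
                        lo, hi = max(0, i - window), i + window + 1
                        counts.update(
                            t for t in tokens[lo:hi]
                            if t not in stopwords and t not in taken
                        )
            new_words = {w for w, c in counts.items() if c >= min_count}
            words |= new_words
            taken |= new_words
    return aspect_words


def annotate(text, overall_label, aspect_words):
    """Combine the dataset's overall surprise label with aspect word matches."""
    tokens = set(tokenize(text))
    aspects = {
        aspect: int(overall_label == 1 and bool(tokens & words))
        for aspect, words in aspect_words.items()
    }
    return {"overall": overall_label, "aspects": aspects}
```

With the seed words from the example, a comment labeled as surprising that mentions "plot" but no director or screenwriter words yields the annotation overall: 1 (director: 0; screenwriter: 0; plot: 1).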
[0027] Once aspect-level preprocessing has produced the training annotation results for the training comment text, instruction-response pairs can be constructed. For example, based on the training comment text, "The plot of this movie was somewhat unexpected. During the viewing process, the unexpected twists and turns and delicate emotional portrayals didn't deliberately create exaggerated surprises, but unexpectedly, they brightened my eyes and stirred a long-lost feeling in my heart," the instruction-response pair shown in Figure 3 can be obtained.
[0028] By inputting a large number of instruction-response pairs into the large model, the model continuously compares its output with the training annotations during training and adjusts its internal parameters accordingly. This makes its output increasingly resemble the format and logic of the training annotations, thus fine-tuning the large model. In this way, the fine-tuned model can automatically annotate new user comment text and efficiently identify comments that express surprise.
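As a rough illustration of how instruction-response pairs might be assembled, the sketch below serializes each annotation as JSON. The field names (`instruction`, `input`, `output`), the instruction wording, and the function `build_pair` are assumptions made for this example; the actual format is the one shown in Figure 3, which is not reproduced here.

```python
import json

# Hypothetical instruction text; the real wording is shown in Figure 3.
INSTRUCTION = (
    "Decide whether the following review expresses surprise overall and for "
    "each listed aspect. Answer 1 for surprise and 0 for no surprise."
)


def build_pair(comment, annotation):
    """Turn one training comment and its annotation into an instruction-response pair."""
    return {
        "instruction": INSTRUCTION,
        "input": comment,
        "output": json.dumps(annotation, sort_keys=True),
    }


def build_dataset(comments, annotations):
    """Pair each training comment with its training annotation result."""
    return [build_pair(c, a) for c, a in zip(comments, annotations)]
```

Keeping the response in a fixed, machine-readable format makes it straightforward to parse the fine-tuned model's output back into surprise tags and surprise aspects.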
[0029] Contextual learning (in-context learning) is a learning paradigm that allows models to quickly adapt to new tasks using only examples or task descriptions in the context, without explicit parameter updates or fine-tuning. It is commonly used with large-scale pre-trained language models (LLMs): by providing a small number of input-output examples or a task description, the model can understand the structure and rules of a task and reason and predict on unseen data.
[0030] With the contextual learning approach, the large model annotates user comment texts at both the overall and aspect levels to determine whether they express surprise. Aspect-level preprocessing is also required, which will not be elaborated upon here. Then, several training comment texts and their corresponding annotation results are used as annotation examples for contextual learning. A contextual learning template is constructed, including task instructions, annotation examples, and the user comment text to be annotated, as shown in Figure 4. When a user enters new user comment text, the large model is invoked with the contextual learning template and automatically annotates the new user comment text.
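A contextual learning template of the kind described above, with task instructions, annotation examples, and the comment to be annotated, could be assembled roughly as follows. The layout, the `Comment:` and `Annotation:` labels, and the function name `build_icl_prompt` are hypothetical; the actual template is the one shown in Figure 4.

```python
def build_icl_prompt(task_instruction, examples, new_comment):
    """Assemble task instruction, annotated examples, and the comment to annotate."""
    parts = [task_instruction, ""]
    for comment, annotation in examples:
        parts += [f"Comment: {comment}", f"Annotation: {annotation}", ""]
    # The prompt ends with an open slot that the large model is expected to fill.
    parts += [f"Comment: {new_comment}", "Annotation:"]
    return "\n".join(parts)


if __name__ == "__main__":
    demo = build_icl_prompt(
        "Label each review with overall and aspect-level surprise (1 or 0).",
        [("The plot was unexpected.", "overall: 1 (plot: 1)")],
        "The soundtrack caught me off guard in the best way.",
    )
    print(demo)
```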
[0031] Understandably, by combining instruction fine-tuning and contextual learning, large models can effectively process new comments and accurately label surprise levels.
[0032] 103. Use the preliminary annotation results and their corresponding target comment text to train a small-scale language model, and calculate the prediction loss of the small-scale language model for the target comment text.
[0033] 104. If the predicted loss is less than or equal to the preset threshold, the preliminary labeling result is output as the target labeling result.
[0034] In this embodiment, after obtaining the target comment text, the data annotation device for surprise-oriented recommendations can train a small-scale language model (SLM) using the preliminary annotation results and their corresponding text. By comparing the annotation differences between the large model and the SLM, the reliability of the preliminary annotation results of the large model is evaluated. If the annotation of the large model is clear, correct, and conforms to the underlying rules, the small-scale language model should be able to learn the corresponding annotation rules from the data relatively easily; conversely, if the annotation is contradictory, ambiguous, or erroneous, the SLM will have difficulty establishing a stable mapping relationship from text to labels, exhibiting a high learning loss. Therefore, the loss value of the small-scale language model during the learning process can serve as an effective indicator reflecting the annotation quality of the large model.
[0035] Understandably, a small-scale language model refers to a language model with a relatively small number of parameters. Compared to large models, small-scale language models have lower computational costs, faster inference speeds, and can be trained and evaluated quickly.
[0036] In this embodiment, calculating the prediction loss of the small-scale language model for the target comment text specifically includes: calculating the cross-entropy loss corresponding to the target comment text: $\ell_{ce}(S(x_i), y_i) = -\sum_{c} y_{i,c} \log S(x_i)_c$, where $x_i$ represents the i-th target comment text, $y_i$ indicates the preliminary annotation result of the target comment text $x_i$, $S(x_i)$ indicates the predicted probability of the small-scale language model for the target comment text $x_i$, and $c$ indexes the label classes.
[0037] Understandably, the above formula measures the degree of inconsistency between the annotations of the small-scale language model and those of the large model. A larger prediction loss indicates a greater difference between the two and a higher probability that the preliminary annotation result of the large model is incorrect; conversely, a smaller prediction loss indicates a lower probability that the preliminary annotation result of the large model is incorrect. Therefore, a preset threshold can be set. When the prediction loss is less than or equal to this threshold, the data annotation device for surprise-oriented recommendations determines the corresponding preliminary annotation result to be reliable and outputs it as the target annotation result.
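The loss-based check can be sketched as below. The sketch assumes the small-scale language model is exposed as a callable `predict_proba` that returns a list of class probabilities for a text, and that labels are integer class indices; these names and the interface are illustrative only.

```python
import math


def cross_entropy(probs, label, eps=1e-12):
    """Cross-entropy of a predicted distribution against one integer class label."""
    return -math.log(max(probs[label], eps))


def split_by_loss(samples, predict_proba, threshold):
    """Separate annotated samples by whether the prediction loss exceeds the threshold."""
    reliable, unreliable = [], []
    for text, label in samples:
        loss = cross_entropy(predict_proba(text), label)
        bucket = reliable if loss <= threshold else unreliable
        bucket.append((text, label, loss))
    return reliable, unreliable
```

For instance, if the small model assigns probability 0.9 to the large model's label, the loss is about 0.105 and the annotation passes a threshold of 0.5; a probability of 0.2 gives a loss of about 1.609 and the annotation is flagged.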
[0038] In a further embodiment, the data annotation method for surprise-oriented recommendations also includes: performing clustering on the clean sample set to construct a high-quality demonstration sample set, where the clean sample set is the collection of target annotation results and their corresponding target comment texts, and the high-quality demonstration sample set is the collection of the center samples of each cluster; and optimizing the large model using the high-quality demonstration sample set.
[0039] Understandably, target annotation results validated by the small-scale language model exhibit high annotation accuracy, and the clean sample set constructed from them can effectively support the optimization of the large model. The data annotation device for surprise-oriented recommendations can classify samples in the clean sample set according to surprise labels. For example, let j represent the category index, with j=0 indicating surprise and j=1 indicating no surprise. The high-quality demonstration sample set can be represented as $D_{demo} = \bigcup_{j \in C} D_j$, where $C$ is the set of all categories and $D_j$ indicates the set of center samples under category $j$.
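One simple way to pick center samples is sketched below. It is a deliberate simplification: each surprise category is treated as a single cluster, texts are compared by Jaccard distance over word sets, and the medoid (the sample with the smallest total distance to the others) serves as the center sample. A real system would more likely cluster sentence embeddings with an algorithm such as k-means; the function names here are hypothetical.

```python
import re
from collections import defaultdict


def word_set(text):
    return set(re.findall(r"[a-z]+", text.lower()))


def jaccard_distance(a, b):
    union = a | b
    return 1.0 - len(a & b) / len(union) if union else 0.0


def select_demonstrations(clean_samples):
    """Return one medoid text per surprise category from (text, label) pairs."""
    groups = defaultdict(list)
    for text, label in clean_samples:
        groups[label].append(text)
    demos = {}
    for label, texts in groups.items():
        bags = [word_set(t) for t in texts]
        # Total distance from each sample to every sample in the same category.
        totals = [sum(jaccard_distance(a, b) for b in bags) for a in bags]
        demos[label] = texts[totals.index(min(totals))]
    return demos
```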
[0040] In some embodiments, the data annotation method for surprise-oriented recommendations also includes: If the predicted loss is greater than the preset threshold, the target comment text corresponding to the preliminary annotation result is determined to be a noise sample.
[0041] Understandably, for preliminary annotation results where the predicted loss is less than or equal to the preset threshold, the annotation results are considered reliable; for preliminary annotation results where the predicted loss is greater than the preset threshold, the annotation results are considered unreliable and are classified as noise samples.
[0042] For noise samples, to make training of the small-scale language model more stable, the data annotation device for surprise-oriented recommendations adds a consistency loss so that the small-scale language model keeps its predictions stable even on noisy data. Therefore, the data annotation method for surprise-oriented recommendations also includes: using data augmentation techniques to perturb each target comment text $x_i$ and generate a corresponding variant $\tilde{x}_i$; and training the small-scale language model using a training loss function, which is: $L = L_{ce} + \alpha L_{c} + \beta L_{n}$, $L_{ce} = \sum_{x_i \in D_c} \ell_{ce}(S(x_i), y_i)$, $L_{c} = \sum_{x_i \in D_c} \ell_{ce}(S(\tilde{x}_i), y_i)$, $L_{n} = \sum_{x_i \in D_n} \ell_{kl}(S(x_i) \,\|\, S(\tilde{x}_i))$, where $D_c$ is the clean sample set, $D_n$ is the noise sample set, $L_{ce}$ is the cross-entropy loss of the small-scale language model on $D_c$, $L_{c}$ is the cross-entropy loss on the perturbed clean samples, $L_{n}$ is the KL divergence on the perturbed noise samples, $y_i$ is the preliminary annotation result of the target comment text $x_i$, $S(\cdot)$ represents the predicted probability of the small-scale language model, $\ell_{ce}$ is the cross-entropy loss function, $\ell_{kl}$ is the KL divergence, and $\alpha$ and $\beta$ are the weighting coefficients of the two auxiliary losses.
[0043] Specifically, the data annotation device for surprise-oriented recommendations uses data augmentation techniques to perturb each target comment text $x_i$ and generate a corresponding variant $\tilde{x}_i$. Different loss calculation strategies are employed for clean and noise samples. For clean samples, the cross-entropy loss between the prediction on the variant $\tilde{x}_i$ and the preliminary annotation result $y_i$ is calculated, so that the model still outputs predictions consistent with the reliable preliminary annotation result when given the augmented variant, which makes the fitting of accurate labels more robust. For noise samples, because their preliminary annotation results $y_i$ have low reliability, the model is not forced to fit the label; instead, the KL divergence between the small-scale language model's predictions on the target comment text $x_i$ and on its variant $\tilde{x}_i$ is calculated, constraining the model to keep its predicted distribution consistent before and after the perturbation.
[0044] Finally, the two losses mentioned above are combined with the cross-entropy loss on the clean sample set to form the overall training objective. This method enhances the model's ability to identify reliable annotations and its stability against noisy annotations, thereby improving overall annotation quality and generalization performance.
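The combined objective can be illustrated numerically with the sketch below. It assumes the small-scale language model's predicted distributions for each text and its perturbed variant are already available as lists of probabilities, that per-sample losses are summed rather than averaged, and that the two auxiliary weights are named `alpha` and `beta`; these choices are assumptions for illustration.

```python
import math


def cross_entropy(probs, label, eps=1e-12):
    return -math.log(max(probs[label], eps))


def kl_divergence(p, q, eps=1e-12):
    """KL(p || q) for two discrete distributions given as equal-length lists."""
    return sum(
        pi * math.log(max(pi, eps) / max(qi, eps))
        for pi, qi in zip(p, q)
        if pi > 0
    )


def training_loss(clean, noisy, alpha=1.0, beta=1.0):
    """Cross-entropy on clean samples plus two weighted auxiliary losses.

    clean: list of (probs_original, probs_variant, label)
    noisy: list of (probs_original, probs_variant)
    """
    loss_ce = sum(cross_entropy(p, y) for p, _, y in clean)
    # Clean samples: the perturbed variant should still fit the reliable label.
    loss_clean_aug = sum(cross_entropy(pv, y) for _, pv, y in clean)
    # Noise samples: ignore the label and only require consistent predictions.
    loss_noise_kl = sum(kl_divergence(p, pv) for p, pv in noisy)
    return loss_ce + alpha * loss_clean_aug + beta * loss_noise_kl
```

Because the noise term never looks at the label, an unreliable annotation cannot pull the small model toward a wrong class; it only penalizes predictions that change under perturbation.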
[0045] In a further embodiment, optimizing the large model using the high-quality demonstration sample set includes: performing instruction fine-tuning of the large model using the high-quality demonstration sample set; and/or using the high-quality demonstration sample set as annotation examples for contextual learning of the large model.
[0046] It is understandable that, as described in step 102, the large model can be either of the following two models, or the two can be used in combination: one is a surprise annotation model obtained through instruction fine-tuning, and the other is a large language model that supports contextual learning. After obtaining the high-quality demonstration sample set, the data annotation device for surprise-oriented recommendations can use this sample set to optimize the large model through instruction fine-tuning, or it can use this sample set as annotation examples to construct a contextual learning template, thereby optimizing the large model. The relevant implementation details have been explained in step 102 and will not be repeated here.
[0047] In a further embodiment, the data annotation method for surprise-oriented recommendations also includes: using the optimized large model to annotate the noise samples to obtain new preliminary annotation results.
[0048] Understandably, the preliminary annotation results that the large model produced before optimization may be unreliable for these noise samples. Therefore, the optimized large model can be used to re-annotate them to obtain new preliminary annotation results, and the steps of calculating the prediction loss and making judgments based on the preset threshold can be performed again. This process can be repeated until a preset stopping condition is met. The preset stopping condition can be that reliable annotation results are obtained or that the required number of iterations is reached; those skilled in the art can set it according to actual needs, and there is no specific limitation here.
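The iterative re-annotation loop can be outlined as follows. All three callables are hypothetical placeholders: `annotate` stands for the large model, `loss_of` for training the small-scale language model and returning one prediction loss per sample, and `optimize` for fine-tuning or rebuilding the contextual learning template from the clean samples. The stopping condition used here is either no remaining noise samples or a maximum number of rounds.

```python
def iterative_annotation(texts, annotate, loss_of, optimize, threshold, max_rounds=3):
    """Annotate, verify by prediction loss, and re-annotate noise samples until done."""
    clean, pending = [], list(texts)
    for _ in range(max_rounds):
        if not pending:
            break
        labeled = [(text, annotate(text)) for text in pending]
        losses = loss_of(labeled)
        # Annotations at or below the threshold are accepted as target results.
        clean.extend(pair for pair, loss in zip(labeled, losses) if loss <= threshold)
        # The rest are noise samples and wait for the next round.
        pending = [pair[0] for pair, loss in zip(labeled, losses) if loss > threshold]
        annotate = optimize(annotate, clean)
    return clean, pending
```

Samples still in `pending` when the loop ends never received a reliable annotation and can be dropped or routed to manual review, depending on the application.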
[0049] Compared to related technologies, this invention proposes a data annotation method for surprise-oriented recommendation. This method first collects user reviews from multiple platforms, and then filters target review texts that may contain surprise experiences by constructing keyword rules and combining them with sentiment analysis technology. Based on this, it utilizes the instruction fine-tuning and context learning capabilities of a large model to automatically perform preliminary surprise-oriented annotation on the target review texts, significantly reducing the workload of manual annotation. To further improve annotation quality, this method introduces an active learning mechanism, using a small model to learn from and iteratively optimize the automatic annotation results of the large model, thereby ensuring the accuracy and consistency of the annotation. Ultimately, this method can efficiently construct a high-quality, highly reliable surprise-oriented annotation dataset, providing a reliable data foundation for the training, evaluation, and related research of surprise-oriented recommendation models.
[0050] The present invention has been described above from the perspective of a data annotation method for surprise-oriented recommendations. The present invention will now be described below from the perspective of a data annotation device for surprise-oriented recommendations.
[0051] Please see Figure 5, which is a virtual structural diagram of a data annotation device for surprise-oriented recommendations provided in an embodiment of the present invention. The data annotation device 200 for surprise-oriented recommendations includes: a preprocessing unit 201, used to preprocess user comment text to obtain target comment text containing surprise information; an annotation unit 202, used to annotate the target comment text using a large model to obtain preliminary annotation results, which include surprise tags and surprise aspects; a training unit 203, used to train a small-scale language model using the preliminary annotation results and their corresponding target comment text, and to calculate the prediction loss of the small-scale language model for the target comment text; and a judgment unit 204, used to output the preliminary annotation result as the target annotation result if the prediction loss is less than or equal to a preset threshold.
[0052] In one possible design, the preprocessing unit 201 is specifically used for: Based on preset surprise keywords and sentiment analysis technology, target comment texts are filtered out from user comment texts.
[0053] One possible design also includes a fine-tuning unit 205, which is used for: performing aspect-level preprocessing on the training comment text to obtain the training annotation results corresponding to the training comment text; constructing instruction-response pairs based on the training comment text and its corresponding training annotation results; and using the instruction-response pairs to adjust the parameters of the large model.
[0054] One possible design also includes a context learning unit 206, which is used for: based on the preset number of aspects and the initial aspect seed feature words, iteratively expanding the aspect-related words through a bootstrapping algorithm to obtain a feature word set; and matching the feature word set against the training comment text to obtain the training annotation results corresponding to the training comment text.
[0055] In one possible design, training unit 203 is specifically used for: calculating the cross-entropy loss corresponding to the target comment text: $\ell_{ce}(S(x_i), y_i) = -\sum_{c} y_{i,c} \log S(x_i)_c$, where $x_i$ represents the i-th target comment text, $y_i$ indicates the preliminary annotation result of the target comment text $x_i$, $S(x_i)$ indicates the predicted probability of the small-scale language model for the target comment text $x_i$, and $c$ indexes the label classes.
[0056] One possible design also includes an optimization unit 207, which is used for: clustering the samples in the clean sample set to construct a high-quality demonstration sample set, where the clean sample set is the set of target annotation results and their corresponding target comment texts, and the high-quality demonstration sample set is the set of the center samples of each cluster; and optimizing the large model using the high-quality demonstration sample set.
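As a non-limiting illustration, the selection of cluster center samples can be sketched as follows. The embodiment does not name a clustering algorithm or a text representation; plain k-means over hypothetical two-dimensional embedding vectors is assumed here, and the center sample of a cluster is taken to be the sample nearest its centroid.

```python
import math


def kmeans(points, k, iters=20):
    """Plain k-means; centroids start at the first k points for determinism."""
    centroids = [list(p) for p in points[:k]]
    assign = [0] * len(points)
    for _ in range(iters):
        assign = [
            min(range(k), key=lambda c: math.dist(p, centroids[c])) for p in points
        ]
        for c in range(k):
            members = [p for p, a in zip(points, assign) if a == c]
            if members:  # keep the old centroid if a cluster empties
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return centroids, assign


def center_samples(samples, vectors, k):
    """Pick, for each cluster, the sample whose vector is nearest the centroid."""
    centroids, assign = kmeans(vectors, k)
    chosen = []
    for c in range(k):
        idx = [i for i, a in enumerate(assign) if a == c]
        if idx:
            best = min(idx, key=lambda i: math.dist(vectors[i], centroids[c]))
            chosen.append(samples[best])
    return chosen


# Hypothetical clean samples and their embedding vectors.
samples = ["s0", "s1", "s2", "s3", "s4", "s5"]
vectors = [(0.0, 0.0), (0.2, 0.0), (0.1, 0.1), (5.0, 5.0), (5.2, 5.0), (5.1, 5.1)]
demos = center_samples(samples, vectors, k=2)
```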
[0057] In one possible design, the judgment unit 204 is also used for: if the prediction loss is greater than the preset threshold, determining the target comment text corresponding to the preliminary annotation result to be a noise sample.
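As a non-limiting illustration, the decision made by the judgment unit 204 can be sketched as a partition of the samples by prediction loss. The threshold value, the loss values, and the sample contents below are hypothetical.

```python
def split_by_loss(samples, losses, threshold):
    """Partition (text, annotation) samples into clean and noise sets by loss."""
    clean, noise = [], []
    for sample, loss in zip(samples, losses):
        # Loss <= threshold: the preliminary annotation becomes the target annotation.
        (clean if loss <= threshold else noise).append(sample)
    return clean, noise


# Hypothetical samples paired with their prediction losses.
samples = [
    ("The battery life was unexpectedly great.", {"surprise": True, "aspect": "battery"}),
    ("Works as described.", {"surprise": True, "aspect": "price"}),
    ("Surprisingly sharp screen.", {"surprise": True, "aspect": "display"}),
]
losses = [0.1, 2.3, 0.5]
clean, noise = split_by_loss(samples, losses, threshold=0.5)
```

A loss exactly equal to the threshold falls on the clean side, matching the "less than or equal to" condition of the embodiment.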
[0058] In one possible design, the training unit 203 is also used for: perturbing the target comment text $x_i$ using data augmentation techniques to generate a corresponding variant $\tilde{x}_i$; and training the small-scale language model using a training loss function, which is: $L = L_{c} + \alpha \left( L_{c}^{aug} + L_{n}^{aug} \right)$, $L_{c} = \sum_{(x_i, y_i) \in D_c} \ell_{ce}(S(x_i), y_i)$, $L_{c}^{aug} = \sum_{(x_i, y_i) \in D_c} \ell_{ce}(S(\tilde{x}_i), y_i)$, $L_{n}^{aug} = \sum_{x_i \in D_n} \ell_{kl}(S(x_i) \,\|\, S(\tilde{x}_i))$,
[0059] where $D_c$ is the clean sample set, $D_n$ is the noise sample set, $L_{c}$ is the cross-entropy loss of the small-scale language model on $D_c$, $L_{c}^{aug}$ is the cross-entropy loss on the perturbed clean samples, $L_{n}^{aug}$ is the KL divergence on the perturbed noise samples, $y_i$ is the preliminary annotation result of the target comment text $x_i$, $S(\cdot)$ represents the predicted probability of the small-scale language model, $\ell_{ce}$ is the cross-entropy loss function, $\ell_{kl}$ is the KL divergence, and $\alpha$ is the weighting coefficient of the two auxiliary losses.
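As a non-limiting illustration, the three-part training loss can be evaluated as follows. The sketch assumes a single coefficient `alpha` weighting both auxiliary terms, a KL divergence taken from the prediction on the original text to the prediction on its variant, and sums rather than averages over each set. `toy_model` and `toy_perturb` are hypothetical stand-ins for the small-scale language model and the data augmentation step.

```python
import math


def cross_entropy(probs, label, eps=1e-12):
    """Negative log of the probability assigned to the preliminary label."""
    return -math.log(max(probs[label], eps))


def kl_divergence(p, q, eps=1e-12):
    """KL(p || q) between two distributions given as dicts over the same classes."""
    return sum(pv * math.log(max(pv, eps) / max(q[c], eps)) for c, pv in p.items())


def training_loss(model, clean, noise, perturb, alpha):
    """Clean CE plus alpha times (perturbed-clean CE + noisy consistency KL)."""
    l_clean = sum(cross_entropy(model(x), y) for x, y in clean)
    l_clean_aug = sum(cross_entropy(model(perturb(x)), y) for x, y in clean)
    # Noisy labels are ignored; only agreement between x and its variant is scored.
    l_noise_aug = sum(kl_divergence(model(x), model(perturb(x))) for x, _ in noise)
    return l_clean + alpha * (l_clean_aug + l_noise_aug)


def toy_model(text):
    """Stand-in for the small-scale language model's predicted probabilities."""
    p = 0.8 if "unexpected" in text.lower() else 0.3
    return {"surprise": p, "no_surprise": 1.0 - p}


def toy_perturb(text):
    """Stand-in augmentation: drop the last word of the comment."""
    return " ".join(text.split()[:-1])


clean = [("Unexpected bonus accessories included", "surprise")]
noise = [("Quality was unexpected", "no_surprise")]
loss = training_loss(toy_model, clean, noise, toy_perturb, alpha=0.5)
```

Because the noise term never reads the preliminary label, unreliable annotations still contribute a consistency signal without being fitted directly.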
[0060] In one possible design, the optimization unit 207 is also used for: fine-tuning the large model using the high-quality demonstration sample set; and/or using the high-quality demonstration sample set as labeled examples for in-context learning of the large model.
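As a non-limiting illustration, the use of demonstration samples as labeled examples for in-context learning can be sketched as prompt construction. The prompt wording and the `Comment` / `Surprise` / `Aspect` layout are assumptions for this sketch; the embodiment does not specify a prompt format.

```python
def build_prompt(demonstrations, comment):
    """Prepend labeled demonstrations to the comment that needs annotation."""
    lines = ["Label each comment with a surprise tag and a surprise aspect.", ""]
    for text, tag, aspect in demonstrations:
        lines += [f"Comment: {text}", f"Surprise: {tag}", f"Aspect: {aspect}", ""]
    # The final entry is left open for the large model to complete.
    lines += [f"Comment: {comment}", "Surprise:"]
    return "\n".join(lines)


# Hypothetical high-quality demonstration samples (cluster center samples).
demonstrations = [
    ("The battery life was unexpectedly great.", "yes", "battery"),
    ("Arrived on time, works as described.", "no", "none"),
]
prompt = build_prompt(demonstrations, "I did not expect the screen to be this sharp.")
```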
[0061] In one possible design, the annotation unit 202 is also used for: annotating the noise samples using the optimized large model to obtain preliminary annotation results.
[0062] This invention also provides a server. As shown in Figure 6, the server 300 in this embodiment includes at least one processor 301, at least one network interface 304 or other user interface 303, a memory 305, and at least one communication bus 302. The server 300 may optionally include a display, a keyboard, or a pointing device. The memory 305 may include high-speed RAM or non-volatile memory, such as at least one disk drive. The memory 305 stores execution instructions. When the server 300 is running, the processor 301 communicates with the memory 305 and calls the instructions stored in the memory 305 to execute the aforementioned data annotation method for surprise-oriented recommendations. The operating system 306 contains various programs for implementing various basic services and handling hardware-based tasks.
[0063] The server provided in this embodiment of the invention can execute the technical solution of the above-described embodiment of the data annotation method for surprise-oriented recommendations. Its implementation principle and technical effect are similar, and will not be repeated here.
[0064] This invention also provides a computer-readable storage medium storing a computer program that, when executed by a computer, implements the method flow related to the data annotation device for surprise-oriented recommendations in any of the above method embodiments. Correspondingly, the computer can be the aforementioned data annotation device for surprise-oriented recommendations.
[0065] This invention also provides a computer program or a computer program product including a computer program, which, when executed on a computer, causes the computer to implement the method flow related to the data annotation device for surprise-oriented recommendations in any of the above method embodiments. Correspondingly, the computer can be the aforementioned data annotation device for surprise-oriented recommendations.
[0066] In the above embodiments, implementation can be achieved, in whole or in part, through software, hardware, firmware, or any combination thereof. When implemented in software, it can be implemented, in whole or in part, as a computer program product.
[0067] The above description is merely an embodiment of the present invention and does not limit the patent scope of the present invention. Any equivalent structural or procedural transformations made based on the content of the present invention specification and drawings, or direct or indirect applications in other related technical fields, are similarly included within the patent protection scope of the present invention.
Claims
1. A data annotation method for surprise-oriented recommendation, characterized in that the method includes: preprocessing user comment text to obtain target comment text containing surprise information; annotating the target comment text using a large model to obtain preliminary annotation results, which include surprise tags and surprise aspects; training a small-scale language model using the preliminary annotation results and their corresponding target comment texts, and calculating the prediction loss of the small-scale language model for the target comment text; and, if the prediction loss is less than or equal to a preset threshold, outputting the preliminary annotation result as the target annotation result.
2. The method according to claim 1, characterized in that the preprocessing of user comment text to obtain target comment text containing surprise information includes: filtering the target comment text out of the user comment text based on preset surprise keywords and sentiment analysis technology.
3. The method according to claim 1, characterized in that the method further includes: clustering the samples in the clean sample set to construct a high-quality demonstration sample set, wherein the clean sample set is the set of the target annotation results and the corresponding target comment text, and the high-quality demonstration sample set is the set of the center samples of each cluster; and optimizing the large model using the high-quality demonstration sample set.
4. The method according to claim 3, characterized in that the optimization of the large model using the high-quality demonstration sample set includes: fine-tuning the large model using the high-quality demonstration sample set; and/or using the high-quality demonstration sample set as labeled examples for in-context learning of the large model.
5. The method according to claim 1, characterized in that the calculation of the prediction loss of the small-scale language model for the target comment text includes: calculating the cross-entropy loss corresponding to the target comment text using the following formula: $\ell_{ce}(S(x_i), y_i) = -\sum_{c} y_{i,c} \log S(x_i)_c$, where $x_i$ represents the i-th target comment text, $y_i$ represents the preliminary annotation result of the target comment text $x_i$, $S(x_i)$ represents the probability predicted by the small-scale language model for the target comment text $x_i$, and $c$ ranges over the candidate labels.
6. The method according to claim 1, characterized in that the method further includes: if the prediction loss is greater than the preset threshold, determining the target comment text corresponding to the preliminary annotation result to be a noise sample.
7. The method according to claim 6, characterized in that training the small-scale language model using the preliminary annotation results and their corresponding target comment text includes: perturbing the target comment text $x_i$ using data augmentation techniques to generate a corresponding variant $\tilde{x}_i$; and training the small-scale language model using a training loss function, which is: $L = L_{c} + \alpha \left( L_{c}^{aug} + L_{n}^{aug} \right)$, $L_{c} = \sum_{(x_i, y_i) \in D_c} \ell_{ce}(S(x_i), y_i)$, $L_{c}^{aug} = \sum_{(x_i, y_i) \in D_c} \ell_{ce}(S(\tilde{x}_i), y_i)$, $L_{n}^{aug} = \sum_{x_i \in D_n} \ell_{kl}(S(x_i) \,\|\, S(\tilde{x}_i))$, where $D_c$ is the clean sample set, $D_n$ is the noise sample set, $L_{c}$ is the cross-entropy loss of the small-scale language model on $D_c$, $L_{c}^{aug}$ is the cross-entropy loss on the perturbed clean samples, $L_{n}^{aug}$ is the KL divergence on the perturbed noise samples, $y_i$ is the preliminary annotation result of the target comment text $x_i$, $S(\cdot)$ represents the predicted probability of the small-scale language model, $\ell_{ce}$ is the cross-entropy loss function, $\ell_{kl}$ is the KL divergence, and $\alpha$ is the weighting coefficient.
8. A data annotation device for surprise-oriented recommendations, characterized in that the device includes: a preprocessing unit, used to preprocess user comment text to obtain target comment text containing surprise information; an annotation unit, used to annotate the target comment text using a large model to obtain preliminary annotation results, which include surprise tags and surprise aspects; a training unit, used to train a small-scale language model using the preliminary annotation results and their corresponding target comment text, and to calculate the prediction loss of the small-scale language model for the target comment text; and a judgment unit, used to output the preliminary annotation result as the target annotation result if the prediction loss is less than or equal to a preset threshold.
9. A storage medium, characterized in that the storage medium stores computer-executable instructions, which are executed by a processing unit to implement the data annotation method for surprise-oriented recommendations according to any one of claims 1-7.
10. A computer program product, characterized in that it includes computer-executable instructions, which, when executed by a processing unit, implement the data annotation method for surprise-oriented recommendations as described in any one of claims 1-7.