Model training method and apparatus

A generative adversarial network model is trained in which an adversarial generator produces summary semantic vectors and perturbed summary semantic vectors that differ as much as possible, and a discriminator classifies them. This addresses the reliability and fluency problems of summary generation in existing technologies, and yields summaries that quickly capture the important content in a large amount of information.

CN115409172B · Active · Publication Date: 2026-06-26 · VIVO MOBILE COMM CO LTD

Patent Information

Authority / Receiving Office
CN · China
Patent Type
Patents(China)
Current Assignee / Owner
VIVO MOBILE COMM CO LTD
Filing Date
2022-09-26
Publication Date
2026-06-26

AI Technical Summary

Technical Problem

Existing technologies struggle to generate summaries that are both reliable and fluent, especially when the goal is to quickly capture the important content in a large amount of information.

Method used

A generative adversarial network model is trained. Its adversarial generator produces summary semantic vectors and perturbed summary semantic vectors that differ as much as possible, and its discriminator then classifies them. The model is additionally optimized with a cross-entropy loss so that the generated summary information remains stable when there are small changes in the text data.

Benefits of technology

The generated summary information can produce different summaries based on different comments, and does not fluctuate significantly when the comments change slightly, ensuring the reliability and fluency of the summaries.



Abstract

The application discloses a model training method and apparatus, and belongs to the technical field of artificial intelligence. The method comprises the following steps: acquiring multiple groups of training samples, wherein each training sample comprises N training sub-samples; inputting the training sub-samples in the training samples into an adversarial generator in a preset generative adversarial network model, and outputting a summary semantic vector and a perturbed summary semantic vector; inputting the summary semantic vector and the perturbed summary semantic vector into a discriminator in the preset generative adversarial network model, and outputting positive sample data and negative sample data; outputting a summary information sample based on the perturbed summary semantic vector in the positive sample data; and adjusting the preset generative adversarial network model according to a generator loss, a discriminator loss and a cross-entropy loss until the preset generative adversarial network model meets a preset training condition, so as to obtain a summary generation model.

Description

Technical Field

[0001] This application belongs to the field of artificial intelligence technology, specifically relating to a model training method and apparatus.

Background Technology

[0002] With the continuous development of information technology, people receive a huge amount of information every day, making it difficult for users to effectively extract the main content. Summaries can help users quickly capture important content and reduce reading costs.

[0003] Therefore, how to effectively generate summaries has become an urgent problem to be solved in the industry.

Summary of the Invention

[0004] The purpose of this application is to provide a model training method and apparatus that can solve the problem of how to effectively generate summaries.

[0005] In a first aspect, embodiments of this application provide a model training method, the method comprising:

[0006] Multiple sets of training samples are obtained. Each training sample contains N training sub-samples. Each training sub-sample includes: a text data sample, a text comment data sample corresponding to the text data sample, and a summary label. The text data samples in the N training sub-samples are the same.

[0007] The training sub-samples in the training samples are input into the adversarial generator in the preset generative adversarial network model, and the output is a summary semantic vector and a perturbation summary semantic vector.

[0008] The summary semantic vectors and perturbation summary semantic vectors corresponding to the N training sub-samples are input into the discriminator in the preset generative adversarial network model, and positive sample data and negative sample data are output. The positive sample data includes a summary semantic vector and a perturbation summary semantic vector generated by one training sub-sample, and the negative sample data includes M summary semantic vectors and perturbation summary semantic vectors generated by the training sub-samples, where N is a positive integer and M is a positive integer greater than 2.

[0009] Based on the perturbed summary semantic vector in the positive sample data, output summary information samples;

[0010] The preset generative adversarial network model is adjusted based on the generator loss, discriminator loss and cross-entropy loss until the preset generative adversarial network model meets the preset training conditions, thus obtaining the summary generation model.

[0011] The generator loss is determined based on the generator loss function of the adversarial generator, the summary semantic vector, and the perturbed summary semantic vector. The discriminator loss is determined based on the discriminator loss function of the discriminator, the positive sample data, and the negative sample data. The cross-entropy loss is determined based on the cross-entropy loss function, the summary information samples, and the summary labels of the training sub-samples. The summary generation model is used to output summary information based on text data and one piece of comment data corresponding to the text data.

[0012] In a second aspect, embodiments of this application provide a model training apparatus, including:

[0013] The acquisition module is used to acquire multiple sets of training samples. Each training sample contains N training sub-samples. Each training sub-sample includes: a text data sample, a text comment data sample corresponding to the text data sample, and a summary label. The text data samples in the N training sub-samples are the same.

[0014] The first output module is used to input the training sub-samples in the training samples into the adversarial generator in the preset generative adversarial network model, and output the summary semantic vector and the perturbation summary semantic vector.

[0015] The second output module is used to input the summary semantic vector and perturbation summary semantic vector corresponding to the N training sub-samples into the discriminator in the preset generative adversarial network model, and output positive sample data and negative sample data. The positive sample data includes a summary semantic vector and a perturbation summary semantic vector generated by one training sub-sample, and the negative sample data includes M summary semantic vectors and perturbation summary semantic vectors generated by the training sub-samples, where N is a positive integer and M is a positive integer greater than 2.

[0016] The third output module is used to output a summary information sample based on the perturbed summary semantic vector in the positive sample data;

[0017] An adjustment module is used to adjust the preset generative adversarial network model based on the generator loss, discriminator loss and cross-entropy loss until the preset generative adversarial network model meets the preset training conditions, thereby obtaining a summary generation model.

[0018] The generator loss is determined based on the generator loss function of the adversarial generator, the summary semantic vector, and the perturbed summary semantic vector. The discriminator loss is determined based on the discriminator loss function of the discriminator, the positive sample data, and the negative sample data. The cross-entropy loss is determined based on the cross-entropy loss function, the summary information samples, and the summary labels of the training sub-samples. The summary generation model is used to output summary information based on text data and one piece of comment data corresponding to the text data.

[0019] In a third aspect, embodiments of this application provide an electronic device including a processor and a memory, wherein the memory stores programs or instructions executable on the processor, and the programs or instructions, when executed by the processor, implement the steps of the method described in the first aspect.

[0020] In a fourth aspect, embodiments of this application provide a readable storage medium on which a program or instructions are stored, which, when executed by a processor, implement the steps of the method described in the first aspect.

[0021] In a fifth aspect, embodiments of this application provide a chip, the chip including a processor and a communication interface, the communication interface being coupled to the processor, the processor being used to run programs or instructions to implement the method as described in the first aspect.

[0022] In a sixth aspect, embodiments of this application provide a computer program product stored in a storage medium, which is executed by at least one processor to implement the method described in the first aspect.

[0023] In this embodiment, a preset generative adversarial network (GAN) model is trained using training sub-samples that each consist of a main text data sample and a corresponding main text comment data sample. Based on the training sub-samples, the generator in the model generates summary semantic vectors and perturbed summary semantic vectors that differ as much as possible. The discriminator in the model tries to group the summary semantic vector and perturbed summary semantic vector generated from the same training sub-sample together as positive sample data, and treats vectors generated from different training sub-samples as negative sample data. This allows the generator and discriminator to optimize against each other. The model is further adjusted by comparing the summary labels in the training sub-samples with the summary information samples generated from the perturbed summary semantic vectors in the positive sample data. The trained summary generation model can generate different summary information for different main text comment data corresponding to the same main text data, and the generated summary information does not fluctuate significantly due to small changes in the main text data, which ensures the reliability of the generated summary information.

Description of the Drawings

[0024] Figure 1 is a schematic diagram of the model training method provided in the embodiments of this application;

[0025] Figure 2 is a schematic diagram of an adversarial learning generator provided in an embodiment of this application;

[0026] Figure 3 is a schematic diagram of the adversarial learning framework provided in the embodiments of this application;

[0027] Figure 4 is a schematic diagram illustrating the changes in the decoder structure provided in the embodiments of this application;

[0028] Figure 5 is a schematic diagram of the summary generation process provided in the embodiments of this application;

[0029] Figure 6 is a schematic diagram of the model training apparatus structure provided in the embodiments of this application;

[0030] Figure 7 is a schematic diagram of the electronic device structure provided in the embodiments of this application;

[0031] Figure 8 is a schematic diagram of the hardware structure of an electronic device for implementing an embodiment of this application.

Detailed Implementation

[0032] The technical solutions of the embodiments of this application will be clearly described below with reference to the accompanying drawings. Obviously, the described embodiments are only some, not all, of the embodiments of this application. All other embodiments obtained by those skilled in the art based on the embodiments of this application are within the scope of protection of this application.

[0033] The terms "first," "second," etc., used in the specification and claims of this application are used to distinguish similar objects and not to describe a specific order or sequence. It should be understood that terms used in this way are interchangeable where appropriate, so that embodiments of this application can be implemented in orders other than those illustrated or described herein. The objects distinguished by "first," "second," etc., are generally of the same class, and the number of objects is not limited; for example, a first object can be one or more. Furthermore, in the specification and claims, "and / or" indicates at least one of the connected objects, and the character " / " generally indicates that the preceding and following objects are in an "or" relationship.

[0034] The model training method and apparatus provided in this application will be described in detail below with reference to the accompanying drawings, through specific embodiments and application scenarios.

[0035] Figure 1 is a schematic diagram of the model training method provided in the embodiments of this application. As shown in Figure 1, the method includes:

[0036] Step 110: Obtain multiple sets of training samples. Each training sample contains N training sub-samples. Each training sub-sample includes: a text data sample, a text comment data sample corresponding to the text data sample, and a summary label. The text data samples in the N training sub-samples are the same.

[0037] Specifically, the text data sample described in the embodiments of this application refers to a sample of main body text, such as the body text of a news article or of a paper.

[0038] The text comment data samples described in the embodiments of this application refer to samples of comments posted on that text, such as the comments on a news article or the reviews of a paper.

[0039] The training samples described in this application embodiment can be obtained from a pre-collected dataset containing text comment data samples and text data samples, such as news data containing comments.

[0040] In this embodiment of the application, each text data sample corresponds to N text comment data samples. A text data sample and a text comment data sample are used as a training subsample, and the training subsample also includes the corresponding summary label.

[0041] In this embodiment of the application, each training sample contains N training sub-samples. The text data samples of all training sub-samples in the same training sample are the same, while the text comment data samples of each training sub-sample can be different.

[0042] For example, the obtained sample dataset includes: text data sample A, which corresponds to text comment data sample A, text comment data sample B, and text comment data sample C. In this case, the training samples corresponding to this sample dataset are: text data sample A - text comment data sample A - summary label A, text data sample A - text comment data sample B - summary label B, and text data sample A - text comment data sample C - summary label C.
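The grouping in the example above can be sketched as a small data structure. This is an illustrative sketch only: the names `TrainingSubSample` and `build_training_sample` are hypothetical and do not appear in the application, and the strings stand in for real text, comments, and labels.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class TrainingSubSample:
    """One (text, comment, summary label) triple. Hypothetical name."""
    text: str           # main text data sample, shared within a training sample
    comment: str        # one text comment data sample for that text
    summary_label: str  # summary label for this (text, comment) pair


def build_training_sample(text: str, comments: List[str],
                          labels: List[str]) -> List[TrainingSubSample]:
    """Pair one text with each of its N comments and the matching label."""
    if len(comments) != len(labels):
        raise ValueError("each comment needs exactly one summary label")
    return [TrainingSubSample(text, c, l) for c, l in zip(comments, labels)]


# Mirrors the example: text A with comments A, B, C and labels A, B, C.
sample = build_training_sample(
    "text A",
    ["comment A", "comment B", "comment C"],
    ["summary label A", "summary label B", "summary label C"],
)
```

Here N = 3, and every sub-sample in `sample` carries the same text with a different comment and label.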

[0043] In this embodiment of the application, the summary label of each training subsample may be different. The summary label may be manually labeled, automatically labeled by an unsupervised learning algorithm, or a hybrid labeling algorithm that combines unsupervised learning algorithm and manual labeling.

[0044] Step 120: Input the training sub-samples in the training samples into the adversarial generator in the preset generative adversarial network model, and output the summary semantic vector and the perturbation summary semantic vector;

[0045] In this embodiment, the pre-defined generative adversarial network model includes an adversarial generator and a discriminator. Through adversarial learning, the adversarial generator and discriminator can effectively ensure that the same text generates different summaries based on different comments, and that the generated summaries will not fluctuate significantly due to minor changes in comments. At the same time, since the unsupervised summaries are derived from the text, the generated summaries can further enhance the retention of key information from the original text.

[0046] Specifically, for each training sub-sample, the text data sample and the text comment data sample are converted into an ID sequence according to the vocabulary. If the sequence is longer than a fixed length (e.g., length = 512), it is truncated; otherwise, it is padded, so that every sequence has exactly that length. The ID sequence is then converted into token-embedding vectors, and a position encoding (position-embedding) is added. The resulting vectors are input into the preset generative adversarial network model for training.
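The truncate-or-pad step can be sketched as follows. This is a minimal sketch under stated assumptions: the function name `encode_to_ids`, the padding ID 0, and the use of an unknown-token ID for out-of-vocabulary tokens are illustrative choices, not details given in the application.

```python
from typing import Dict, List

PAD_ID = 0  # assumed padding ID (not specified in the source)
UNK_ID = 1  # assumed ID for tokens missing from the vocabulary


def encode_to_ids(tokens: List[str], vocab: Dict[str, int],
                  length: int = 512) -> List[int]:
    """Map tokens to vocabulary IDs, then truncate or pad to `length`."""
    ids = [vocab.get(tok, UNK_ID) for tok in tokens]
    ids = ids[:length]                        # truncate if too long
    ids += [PAD_ID] * (length - len(ids))     # pad if too short
    return ids


vocab = {"[PAD]": 0, "[UNK]": 1, "news": 2, "comment": 3}
short_ids = encode_to_ids(["news", "comment", "other"], vocab, length=5)
long_ids = encode_to_ids(["news"] * 8, vocab, length=5)
```

Both results have exactly 5 entries, so batches of different texts and comments share one shape before the embedding lookup.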

[0047] Figure 2 is a schematic diagram of an adversarial learning generator provided in an embodiment of this application. As shown in Figure 2, the text data samples and text comment data samples in the training sub-samples are first encoded by a transformer encoder to obtain the summary semantic vector D(feat) and the perturbed summary semantic vector D(feat′). The training objective of the adversarial generator in the preset generative adversarial network model in this embodiment is to maximize the difference between D(feat) and D(feat′). An L1 loss is used to measure the difference between the vectors feat′ and feat, and the adversarial generator is optimized through backpropagation to maximize that difference, which completes the generator training. The generator loss function is calculated as follows:

[0048]
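The generator loss formula is not reproduced in this text. As a rough sketch of the idea described in [0047], one could take the loss as the negative mean absolute (L1) difference between feat and feat′, so that minimizing the loss maximizes the difference. The sign convention, the averaging, and the function names are assumptions for illustration, not the application's formula.

```python
from typing import Sequence


def l1_difference(feat: Sequence[float],
                  feat_perturbed: Sequence[float]) -> float:
    """Mean absolute difference between two equally sized vectors."""
    if not feat or len(feat) != len(feat_perturbed):
        raise ValueError("vectors must be non-empty and the same size")
    return sum(abs(a - b) for a, b in zip(feat, feat_perturbed)) / len(feat)


def generator_loss(feat: Sequence[float],
                   feat_perturbed: Sequence[float]) -> float:
    """Assumed form: negate the L1 difference so gradient descent widens it."""
    return -l1_difference(feat, feat_perturbed)


feat = [0.2, -0.5, 1.0]
feat_perturbed = [0.0, 0.5, 1.0]
g_loss = generator_loss(feat, feat_perturbed)
```

A larger gap between the two vectors gives a more negative `g_loss`, which is the direction the generator is trained toward.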

[0049] In the above embodiment, each training sub-sample generates two summary outputs: a normal summary semantic vector feat and a summary semantic vector feat′ that has been perturbed by the generator. N training sub-samples correspond to N sets of summary semantic vectors and perturbed summary semantic vectors. These are then fed as a whole into the discriminator in the generative adversarial network model.

[0050] Step 130: Input the summary semantic vectors and perturbed summary semantic vectors corresponding to the N training sub-samples into the discriminator in the preset generative adversarial network model, and output positive sample data and negative sample data. The positive sample data includes a summary semantic vector and a perturbed summary semantic vector generated by one training sub-sample, and the negative sample data includes M summary semantic vectors and perturbed summary semantic vectors generated by the training sub-samples. N is a positive integer and M is a positive integer greater than 2.

[0051] Specifically, after the N sets of summary semantic vectors and perturbation summary semantic vectors corresponding to the N training subsamples are input into the discriminator, the discriminator can be further used to identify which summary semantic vectors and perturbation summary semantic vectors are generated based on the same training subsample and treat them as positive samples.

[0052] Summary semantic vectors and perturbed summary semantic vectors that come from M (M>2) training sub-samples were generated from different training sub-samples, and are therefore regarded as negative samples.

[0053] In this embodiment, the discriminator for adversarial learning can be optimized by backpropagating an InfoNCE loss; that is, the discriminator loss function is as follows:

[0054]

[0055] Where $k_+$ represents the positive sample data, $k_i$ represents the set of positive and negative sample data, and τ is the adjustment coefficient.
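The discriminator loss formula is not reproduced in this text. The sketch below uses the commonly cited InfoNCE form, the negative log of exp(q·k+/τ) divided by the sum of exp(q·k_i/τ) over the positive and negative keys. Treating the summary semantic vector as the query q, the perturbed vectors as keys, and the dot product as the similarity are assumptions made for illustration.

```python
import math
from typing import List, Sequence


def dot(a: Sequence[float], b: Sequence[float]) -> float:
    return sum(x * y for x, y in zip(a, b))


def info_nce(query: Sequence[float], positive_key: Sequence[float],
             negative_keys: List[Sequence[float]], tau: float = 0.1) -> float:
    """InfoNCE loss; the key set k_i is the positive plus all negatives."""
    logits = [dot(query, positive_key) / tau]
    logits += [dot(query, k) / tau for k in negative_keys]
    # Log-sum-exp with the max subtracted for numerical stability.
    peak = max(logits)
    log_denominator = peak + math.log(sum(math.exp(v - peak) for v in logits))
    return log_denominator - logits[0]


# Toy 2-D vectors: the positive key either matches the query or does not.
loss_aligned = info_nce([1.0, 0.0], [1.0, 0.0],
                        [[0.0, 1.0], [-1.0, 0.0]], tau=1.0)
loss_misaligned = info_nce([1.0, 0.0], [0.0, 1.0],
                           [[1.0, 0.0], [-1.0, 0.0]], tau=1.0)
```

The loss is smaller when the positive pair is the most similar pair in the set, which is the behavior the discriminator is trained toward. A smaller τ sharpens the contrast between positive and negative pairs.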

[0056] In this embodiment, the purpose of the discriminator is to effectively identify positive and negative sample data. Figure 3 is a schematic diagram of the adversarial learning framework provided in the embodiments of this application. As shown in Figure 3, after the text comment data sample is processed by the adversarial generator, a summary semantic vector and a perturbed summary semantic vector that differ as much as possible are generated. The discriminator, acting as the adversary, distinguishes which summary semantic vectors and perturbed summary semantic vectors were generated from the same training sub-sample and which were generated from different training sub-samples, thereby achieving adversarial learning.

[0057] Step 140: Based on the perturbed summary semantic vector in the positive sample data, output summary information samples;

[0058] Specifically, in the embodiments of this application, after the discriminator outputs positive sample data, the perturbed summary semantic vector in the positive sample data can be passed through a fully connected layer and then through a softmax layer to complete the output of the summary. The summary output by the model is optimized by relying on cross entropy so that its output is as close as possible to the summary label generated in unsupervised mode.
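The output path described above (fully connected layer, softmax, cross-entropy against the label) can be sketched for a single output position. The tiny weights, the 3-entry vocabulary, and the function names are made up for illustration; a real model would produce one distribution per summary token.

```python
import math
from typing import List, Sequence


def linear(x: Sequence[float], weights: List[Sequence[float]],
           bias: Sequence[float]) -> List[float]:
    """Fully connected layer: one output logit per vocabulary entry."""
    return [sum(w * xi for w, xi in zip(row, x)) + b
            for row, b in zip(weights, bias)]


def softmax(logits: Sequence[float]) -> List[float]:
    peak = max(logits)
    exps = [math.exp(v - peak) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def cross_entropy(probs: Sequence[float], target_id: int) -> float:
    """Negative log-probability assigned to the labelled token."""
    return -math.log(probs[target_id])


feat_perturbed = [0.5, -1.0]                    # toy perturbed summary vector
weights = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]  # hypothetical 3-token vocabulary
bias = [0.0, 0.0, 0.0]

probs = softmax(linear(feat_perturbed, weights, bias))
ce_loss = cross_entropy(probs, target_id=0)     # label says token 0 is correct
```

Minimizing `ce_loss` pushes probability mass toward the token given by the summary label.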

[0059] Step 150: Adjust the preset generative adversarial network model according to the generator loss, discriminator loss and cross-entropy loss until the preset generative adversarial network model meets the preset training conditions to obtain the summary generation model;

[0060] The generator loss is determined based on the generator loss function of the adversarial generator, the summary semantic vector, and the perturbed summary semantic vector. The discriminator loss is determined based on the discriminator loss function of the discriminator, the positive sample data, and the negative sample data. The cross-entropy loss is determined based on the cross-entropy loss function, the summary information samples, and the summary labels of the training sub-samples. The summary generation model is used to output summary information based on text data and one piece of comment data corresponding to the text data.

[0061] Specifically, the preset training conditions described in the embodiments of this application may be satisfying a preset number of training times, or satisfying a preset training time, or the sum of generator loss, discriminator loss and cross-entropy loss may be lower than a preset loss value.

[0062] In this embodiment of the application, the sum of the generator loss, the discriminator loss, and the cross-entropy loss can specifically be:

[0063]

[0064] Where L(G) is the generator loss, L(D) is the discriminator loss, and L(S) is the cross-entropy loss.

[0065] In this embodiment of the application, the preset loss value can be pre-set. If the sum of the generator loss, the discriminator loss and the cross-entropy loss is less than the preset loss value, it indicates that the model has converged and has achieved a good training effect. Training can then be stopped to obtain the summary generation model.
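The stopping conditions in [0061] and [0065] can be sketched as below. The source lists the three conditions as alternatives; checking all three together with a logical OR, along with the function and parameter names, is an assumption made for illustration.

```python
def total_loss(generator_loss: float, discriminator_loss: float,
               cross_entropy_loss: float) -> float:
    """Sum of the three losses, as described in [0062]."""
    return generator_loss + discriminator_loss + cross_entropy_loss


def should_stop(step: int, elapsed_seconds: float, loss: float,
                max_steps: int, max_seconds: float,
                loss_threshold: float) -> bool:
    """Stop on a step budget, a time budget, or a low enough total loss."""
    return (step >= max_steps
            or elapsed_seconds >= max_seconds
            or loss < loss_threshold)


current = total_loss(0.2, 0.3, 0.4)
stop_now = should_stop(step=10, elapsed_seconds=5.0, loss=current,
                       max_steps=1000, max_seconds=3600.0, loss_threshold=1.0)
keep_going = should_stop(step=10, elapsed_seconds=5.0, loss=2.5,
                         max_steps=1000, max_seconds=3600.0, loss_threshold=1.0)
```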

[0066] In other embodiments, adversarial learning primarily ensures that the same text generates different summaries based on different comments, and that the generated summaries do not fluctuate significantly due to minor changes in the comments. Furthermore, because the unsupervised summaries originate from the text, they further enhance the retention of key information from the original text. However, a model trained without supervision cannot fully guarantee that its summaries are concise and fluent. After adversarial learning, the model can therefore be fine-tuned using a small amount of labeled summary data to make the generated summaries more natural and further improve user satisfaction. Specifically, after adversarial learning, a batch of training samples with manually labeled summaries can be obtained to retrain the model, ultimately resulting in a better-performing summary generation model.

[0067] In this embodiment of the application, after the summary generation model is obtained, main text data and one piece of corresponding main text comment data can be input into the summary generation model, which outputs summary information. Even for the same main text data, different summary information can be output when it is input into the model together with different main text comment data.

[0068] In this embodiment, a preset generative adversarial network (GAN) model is trained using training sub-samples that each consist of a main text data sample and a corresponding main text comment data sample. Based on the training sub-samples, the generator in the model generates summary semantic vectors and perturbed summary semantic vectors that differ as much as possible. The discriminator in the model tries to group the summary semantic vector and perturbed summary semantic vector generated from the same training sub-sample together as positive sample data, and treats vectors generated from different training sub-samples as negative sample data. This allows the generator and discriminator to optimize against each other. The model is further adjusted by comparing the summary labels in the training sub-samples with the summary information samples generated from the perturbed summary semantic vectors in the positive sample data. The trained summary generation model can generate different summary information for different main text comment data corresponding to the same main text data, and the generated summary information does not fluctuate significantly due to small changes in the main text data, which ensures the reliability of the generated summary information.

[0069] Optionally, the step of inputting the training sub-samples from the training samples into the adversarial generator in a preset generative adversarial network model, and outputting a summary semantic vector and a perturbed summary semantic vector, includes:

[0070] The text data samples and the text comment data samples are encoded by an encoder to obtain the hidden state of the text data samples and the hidden state of the text comment data samples;

[0071] The hidden state of the text comment data sample is perturbed to obtain a perturbed comment sequence;

[0072] The hidden states of the main text data samples and the hidden states of the main text comment data samples are input into the first decoder to generate a summary semantic vector. The hidden states of the main text data samples and the perturbation comment sequence are input into the second decoder to generate a perturbation summary semantic vector.

[0073] Specifically, the first decoder described in the embodiments of this application is a decoder specifically used to receive the hidden state of the main text data sample and the hidden state of the main text comment data sample, and the second decoder is a decoder specifically used to receive the hidden state of the main text data sample and the perturbed comment sequence. The first decoder and the second decoder are different decoders.

[0074] Specifically, the generative adversarial network model described in this embodiment adopts the mainstream transformer architecture. To better integrate comment information, a 6-layer transformer encoder is first used to extract the semantic information of the main text and the comments, and a transformer decoder is then used to generate the summary. Figure 4 is a schematic diagram illustrating the changes in the decoder structure provided in the embodiments of this application. As shown in Figure 4, the modified decoder is structured as follows. Each layer of the original decoder consists of three sub-layers arranged sequentially: Masked MHA, Cross MHA, and Feed Forward. In Cross MHA, the query comes from the output of the previous decoder layer, and the key and value come from the encoder output. In the 6-layer decoder, the key and value of Cross MHA in layers 1, 3, and 5 still come from the encoder output, namely the hidden state of the text data sample and the hidden state of the text comment data sample. The key and value of Cross MHA in layers 2, 4, and 6 come from the hidden state of the text data sample and the perturbed comment sequence. The 6-layer decoder is only an example; the number of layers is not limited in this embodiment of the application.
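The layer-by-layer choice of key/value source can be sketched as a small selector. This is a sketch only: the function name is hypothetical, the strings stand in for hidden-state vectors, and joining text and comment states by concatenation along the sequence axis is an assumption that the source does not state.

```python
from typing import List


def cross_attention_memory(layer_index: int,
                           text_states: List[str],
                           comment_states: List[str],
                           perturbed_comment_states: List[str]) -> List[str]:
    """Return the key/value source for Cross MHA in a 1-based decoder layer.

    Odd layers (1, 3, 5) attend to text + original comment states;
    even layers (2, 4, 6) attend to text + perturbed comment states.
    """
    if layer_index < 1:
        raise ValueError("decoder layers are numbered from 1")
    if layer_index % 2 == 1:
        comments = comment_states
    else:
        comments = perturbed_comment_states
    return text_states + comments  # assumed: concatenate along the sequence


text = ["t1", "t2"]
comment = ["c1", "c2"]
perturbed = ["c1", "0"]  # second comment position masked out

memories = {layer: cross_attention_memory(layer, text, comment, perturbed)
            for layer in range(1, 7)}
```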

[0075] In this embodiment, by using a cross-attention mechanism to decode perturbed and non-perturbed data respectively for the first decoder and the second decoder, the comment information is explicitly used to guide the generation of the summary, so that the generated summary can better reflect the comment's viewpoint.

[0076] Optionally, the step of perturbing the hidden state of the text comment data samples to obtain a perturbed comment sequence includes:

[0077] The hidden states of the main text comment data samples are processed using a long short-term memory network to obtain comment sequence vectors;

[0078] The target element in the comment sequence vector is masked to obtain a mask sequence, wherein the target element is the element in the comment sequence vector whose sorting rank exceeds a preset rank;

[0079] Based on the mask sequence and the hidden state of the text comment data samples, a perturbation comment sequence is obtained.

[0080] Specifically, in this embodiment of the application, the ranking refers to the ranking of the elements in the comment sequence vector from largest to smallest, and the ranking can be preset.

[0081] More specifically, the perturbation processing in the embodiments of this application is as follows. The hidden state of the text comment data sample is $e_{comment} = [e_1, e_2, e_3, \ldots, e_{length}]$, of size length × 768, where length is the sequence length and 768 is the vector dimension. After $e_{comment}$ is processed by a long short-term memory network, $[h_1, h_2, h_3, \ldots, h_{length}]$ is obtained, of size length × 128. The last sequence state $h_{length}$ is passed through a fully connected layer and transformed into a vector with length entries, which can be regarded as a prediction of the importance of each position in the comment sequence. The elements of this vector are sorted, and the positions of the top 3 are set to 0 to obtain a mask sequence, such as [1,0,1,0,…,1]. Multiplying this mask sequence by the hidden state $e_{comment}$ gives $e'_{comment} = [e_1, 0, e_3, 0, \ldots, e_{length}]$. This yields the final perturbed comment sequence $e'_{comment}$ alongside the original comment sequence $e_{comment}$. Specifically, $e'_{comment}$ can be expressed by the following formula:

[0082]

[0083] $e'_{comment} = mask * e_{comment}$
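The masking step can be sketched as follows. The importance scores are supplied directly here as made-up numbers; in the described method they come from the LSTM and the fully connected layer, which this sketch does not model. The function names are hypothetical.

```python
from typing import List, Sequence


def top_k_mask(importance: Sequence[float], k: int = 3) -> List[int]:
    """Build a 0/1 mask that zeroes the k highest-scoring positions."""
    ranked = sorted(range(len(importance)),
                    key=lambda i: importance[i], reverse=True)
    masked = set(ranked[:k])
    return [0 if i in masked else 1 for i in range(len(importance))]


def perturb(hidden_states: List[Sequence[float]],
            mask: Sequence[int]) -> List[List[float]]:
    """Multiply each position's hidden state by its mask entry."""
    return [[m * v for v in state] for state, m in zip(hidden_states, mask)]


# Toy input: 6 positions, 2-dimensional hidden states instead of 768.
importance = [0.1, 0.9, 0.3, 0.8, 0.2, 0.7]
e_comment = [[1.0, 2.0] for _ in importance]

mask = top_k_mask(importance, k=3)
e_comment_perturbed = perturb(e_comment, mask)
```

Positions 1, 3, and 5 hold the three largest scores, so their hidden states are zeroed while the others pass through unchanged.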

[0084] In this embodiment of the application, perturbation processing can effectively obtain diverse summary semantic vectors, which facilitates subsequent adversarial learning.

[0085] Optionally, obtaining multiple sets of training samples includes:

[0086] The first training sample is preprocessed to obtain the second training sample, wherein each second training sample includes N groups of text comment data samples, and each group of text comment data samples includes: a text data sample and a text comment data sample corresponding to the text data sample;

[0087] Unsupervised summary analysis is performed on the N main text comment data sample groups to obtain the summary tags corresponding to each main text comment data sample group;

[0088] Each group of text comment data samples and the corresponding summary label of the text comment data sample group are used as a training sub-sample to obtain N training sub-samples;

[0089] Multiple training samples are obtained based on multiple sets of N training sub-samples.
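
One way to picture the resulting data layout is the following sketch. The class and function names are hypothetical and not taken from the application; the only constraints encoded are the ones stated in this description, namely that each training sub-sample holds a text data sample, a corresponding comment data sample and a summary label, and that the N sub-samples of one training sample share the same text data sample.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class TrainingSubSample:
    text: str           # main text data sample
    comment: str        # comment data sample corresponding to the text
    summary_label: str  # label produced by the unsupervised summary analysis


@dataclass
class TrainingSample:
    sub_samples: List[TrainingSubSample]

    def __post_init__(self) -> None:
        # All N sub-samples of one training sample share the same main text.
        if len({s.text for s in self.sub_samples}) > 1:
            raise ValueError("all sub-samples must share the same main text")


def build_training_sample(text: str, comments: List[str], labels: List[str]) -> TrainingSample:
    """Pair each comment with its summary label to form N sub-samples."""
    if len(comments) != len(labels):
        raise ValueError("each comment needs exactly one summary label")
    return TrainingSample([TrainingSubSample(text, c, l) for c, l in zip(comments, labels)])


sample = build_training_sample("main text", ["comment 1", "comment 2"], ["label 1", "label 2"])
```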

[0090] Specifically, in this embodiment of the application, a batch of text data containing comments, such as news data containing comments, can be collected in advance.

[0091] After obtaining the text data containing comments, in order to avoid low-quality data with too little text or too few comments affecting the subsequent training of the model, we can further pre-filter it based on the number of words in the text data and the number of comments, and only retain the data whose text data has more than a preset number of words and whose corresponding comment data has more than a preset number of comments.
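
A minimal sketch of this pre-filtering step is shown below. The record layout (a dictionary with `text` and `comments` keys), the function name `prefilter` and the default thresholds are assumptions made for illustration; the application only states that preset values are used. Words are counted by whitespace splitting, which would need to be replaced by a suitable tokenizer for languages without word delimiters.

```python
from typing import Dict, List


def prefilter(records: List[Dict], min_words: int = 100, min_comments: int = 5) -> List[Dict]:
    """Keep records whose text and comment counts both exceed the preset values."""
    kept = []
    for record in records:
        word_count = len(record["text"].split())  # whitespace tokenization (assumption)
        comment_count = len(record["comments"])
        if word_count > min_words and comment_count > min_comments:
            kept.append(record)
    return kept


records = [
    {"text": "one two three four", "comments": ["a", "b", "c"]},  # kept
    {"text": "one two", "comments": ["a", "b", "c"]},             # too few words
    {"text": "one two three four", "comments": ["a"]},            # too few comments
]
filtered = prefilter(records, min_words=3, min_comments=2)
```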

[0092] More specifically, in some embodiments, the above-mentioned data, including the text data and the comment data, can be further compressed. For text data samples whose text and comment word counts exceed a preset length (e.g., 512), the text is compressed using TextRank to obtain the compressed text data and the comment data corresponding to the text data. The text data is then used as the text data sample, and the comment data corresponding to the text data is used as the M initial comment data samples corresponding to the text data sample, thereby obtaining the first training samples.
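
For readers unfamiliar with TextRank, the following is a simplified sketch of extractive compression in that style. It is not the implementation used in the application: sentence similarity is approximated here by Jaccard overlap of whitespace-separated words rather than the original TextRank similarity measure, the input is assumed to be already split into sentences, and the name `textrank_compress` and its parameters are hypothetical.

```python
from typing import List, Set


def jaccard(a: Set[str], b: Set[str]) -> float:
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def textrank_compress(sentences: List[str], keep: int,
                      damping: float = 0.85, iterations: int = 50) -> List[str]:
    """Rank sentences with a PageRank-style iteration and keep the top `keep`."""
    words = [set(s.lower().split()) for s in sentences]
    n = len(sentences)
    # Edge weights: similarity between every pair of distinct sentences.
    weights = [[jaccard(words[i], words[j]) if i != j else 0.0 for j in range(n)]
               for i in range(n)]
    out_sum = [sum(row) for row in weights]
    scores = [1.0] * n
    for _ in range(iterations):
        scores = [
            (1 - damping) + damping * sum(
                weights[j][i] / out_sum[j] * scores[j]
                for j in range(n) if out_sum[j] > 0
            )
            for i in range(n)
        ]
    # Keep the highest-scoring sentences, restored to their original order.
    top = sorted(sorted(range(n), key=lambda i: scores[i], reverse=True)[:keep])
    return [sentences[i] for i in top]


sentences = [
    "the team won the match",
    "weather is sunny today",
    "the team lost the match",
]
compressed = textrank_compress(sentences, keep=2)
```

In this toy input the second sentence shares no words with the others, so it receives the lowest score and is dropped.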

[0093] In this embodiment of the application, since each text data sample corresponds to M initial text comment data samples, and many of these comments may express the same topic, in order to effectively reduce the amount of data for model analysis, a preprocessing method can be used to perform topic analysis on the initial text comment data samples, and only the content corresponding to the main topic can be retained to obtain the second training sample.

[0094] The second training sample described in this embodiment includes N groups of text comment data samples, each group of text comment data samples including: a text data sample and a text comment data sample corresponding to the text data sample.

[0095] The unsupervised summary analysis described in this application embodiment selects sentences from the main text comment data sample group to serve as summaries. It does so by comparing the similarity of each sentence in the main text data sample with the other sentences, and by comparing each sentence with the corresponding main text comment data sample. This provides an unsupervised method of generating summary tags, which can effectively reduce the workload of manual tagging.

[0096] More specifically, in this embodiment of the application, a corresponding summary label is configured for each group of text comment data samples, and then it is used as a training subsample. Since there are N groups of text comment data samples, N training subsamples can be obtained accordingly.

[0097] In this embodiment of the application, in order to ensure sufficient training data and achieve effective training of the model, multiple second training samples can be obtained, and each second training sample can be processed in the manner described above to obtain N training sub-samples corresponding to each second training sample, thereby obtaining multiple sets of training samples based on multiple second training samples.

[0098] In this embodiment of the application, a generative adversarial network is trained using multiple sets of training samples to obtain a trained summary generation model, which can effectively output summary information.

[0099] In this embodiment of the application, the training samples can be effectively screened and expanded after preprocessing, avoiding the impact of low-quality data on model training. Furthermore, through unsupervised summary analysis, summary labels can be effectively generated in an unsupervised manner, avoiding the tedious workload of manual labeling and effectively improving labeling efficiency.

[0100] Optionally, the preprocessing of the first training samples to obtain the second training samples includes:

[0101] Obtain the first training sample, wherein the first training sample includes: a text data sample and K initial text comment data samples corresponding to the text data sample;

[0102] The topic analysis is performed on the K initial text comment data samples to obtain at least one comment topic;

[0103] Based on the number of initial text comment data samples corresponding to each comment topic, determine the text comment data sample corresponding to the first training sample from the K initial text comment data samples;

[0104] Based on the text comment data sample corresponding to the first training sample and the text data sample, the second training data is obtained, where K is a positive integer.

[0105] Specifically, in this embodiment, each first training sample can refer to a text data sample and K initial text comment data samples corresponding to the text data sample. Since there may be many low-quality comments in these K initial text comment data samples that do not greatly help the model training, in order to avoid invalid training, they can be further filtered in this embodiment.

[0106] In this embodiment of the application, topic analysis can be performed on the K initial text comment data samples of each first training sample. For example, the topic analysis can be performed through a document topic generation model to extract the comment topic of each initial comment data sample. Since these K initial text comment data samples are all comments on the same text data sample, many of them will inevitably share the same comment topic.

[0107] If the number of initial comment data samples corresponding to a comment topic is high, it indicates that the comment topic can better reflect the thoughts of most users, and it also indicates that these initial text comment data samples can often better reflect the main content of the article. Therefore, in this embodiment of the application, the comment data sample corresponding to the first training sample will be determined from the K initial text comment data samples based on the number of initial text comment data samples corresponding to each comment topic.

[0108] Specifically, comment topics whose number of initial text comment data samples exceeds a preset number can be selected, and then the initial text comment data samples corresponding to the comment topics can be determined as the text comment data samples corresponding to the first training samples.

[0109] In this embodiment of the application, only the text data sample and the text comment data sample in each first training sample are retained, and then a data pair consisting of the text data sample and the text comment data sample is obtained, and finally the second training data is obtained.

[0110] In this embodiment of the application, by performing topic analysis on the initial text comment data samples, and then effectively filtering out text comment data samples with a high degree of consensus based on the number of initial comment data samples, it is possible to better filter out more valuable training samples and reduce the impact of low-quality samples on the model.
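
The count-based selection can be sketched as follows. The topic of each comment is assumed to have been assigned already (for instance by a document topic generation model, which is outside this sketch); the function name `select_comments_by_topic` and the example topic strings are hypothetical.

```python
from collections import Counter
from typing import List


def select_comments_by_topic(comments: List[str], topics: List[str], min_count: int) -> List[str]:
    """Keep comments whose topic is shared by more than min_count comments."""
    counts = Counter(topics)
    kept_topics = {topic for topic, count in counts.items() if count > min_count}
    return [c for c, t in zip(comments, topics) if t in kept_topics]


comments = ["c1", "c2", "c3", "c4"]
topics = ["defense", "defense", "weather", "defense"]  # hypothetical topic assignments
selected = select_comments_by_topic(comments, topics, min_count=2)
```

Only the comments on the topic shared by more than two comments are retained; the single off-topic comment is filtered out.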

[0111] Optionally, unsupervised summary analysis is performed on the N text comment data samples to obtain summary tags corresponding to each text comment data sample, including:

[0112] Obtain the first similarity between each statement in the main text data sample and other statements, and obtain the second similarity between each statement and the main text comment data sample, wherein the other statements are all statements in the main text data sample except for the stated statement;

[0113] The target statement in each of the statements is used as the summary tag corresponding to the main text comment data sample, wherein the target statement is the statement whose sum of the first similarity and the second similarity is greater than a first preset threshold.

[0114] Specifically, the statement can be a complete statement in the text data sample, which can be determined by a period, or the statement can be segmented according to a specific model. This application embodiment does not limit this.

[0115] In this embodiment of the application, each statement in the main text data sample will be analyzed. Other statements refer to all other statements in the main text data sample except for the statement being analyzed.

[0116] In this embodiment, the statement is analyzed for similarity with all other statements to obtain the first similarity between each statement and other statements, and the second similarity between each statement and the text comment data sample is obtained. Based on the sum of the first similarity and the second similarity, the statement with the largest sum of similarity in the text data sample is effectively selected as the target statement, and the target statement is used as the summary tag corresponding to the text comment data sample.

[0117] In the embodiments of this application, the calculation of the first similarity and the second similarity can be based on the repetition of characters in the text, or it can be determined by analyzing the text topic and then judging based on the topic similarity.

[0118] In this embodiment of the application, by using the first similarity between each sentence in the main text data sample and other sentences, and the second similarity between each sentence and the comment data sample, the target sentences with the highest similarity to other content in the main text data sample can be effectively screened out. The higher frequency of occurrence indicates that the target sentence is likely to be the key content of this main text data sample. Therefore, the target sentence can be used as a summary tag, which can effectively realize automatic tag generation and improve tagging efficiency.
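
The label selection can be illustrated with the sketch below. Several choices here are assumptions rather than details from the application: similarity is measured as Jaccard overlap of whitespace-separated words (one possible form of the repetition-based similarity mentioned above), the first similarity is aggregated as the mean over the other sentences, and the function names and threshold value are hypothetical.

```python
from typing import List


def jaccard(a: str, b: str) -> float:
    sa, sb = set(a.lower().split()), set(b.lower().split())
    union = sa | sb
    return len(sa & sb) / len(union) if union else 0.0


def score_sentences(sentences: List[str], comment: str) -> List[float]:
    """Return first similarity + second similarity for each sentence."""
    scores = []
    for i, sentence in enumerate(sentences):
        others = [s for j, s in enumerate(sentences) if j != i]
        # First similarity: mean similarity to the other sentences (assumption).
        first = sum(jaccard(sentence, s) for s in others) / len(others) if others else 0.0
        # Second similarity: similarity to the comment data sample.
        second = jaccard(sentence, comment)
        scores.append(first + second)
    return scores


def select_summary_labels(sentences: List[str], comment: str, threshold: float) -> List[str]:
    """Keep sentences whose summed similarity exceeds the preset threshold."""
    scores = score_sentences(sentences, comment)
    return [s for s, score in zip(sentences, scores) if score > threshold]


sentences = ["team b beat team a", "team a lacked coordination", "the weather was sunny"]
comment = "team b played freely and beat team a"
labels = select_summary_labels(sentences, comment, threshold=0.5)
```

With these toy inputs only the first sentence exceeds the threshold, since it overlaps most with both the other sentences and the comment.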

[0119] Figure 5 is a schematic diagram of the summary generation process provided in the embodiments of this application. As shown in Figure 5, the process includes: Step 510, collecting and organizing news data containing comments (this news data can also be other types of data); Step 520, compressing the main text and comments; Step 530, analyzing comment viewpoints to generate main text-comment pairs; Step 540, extracting summaries using unsupervised methods to form main text-comment-summary pairs; Step 550, building the model; Step 560, training the model using adversarial learning; Step 570, fine-tuning the model; Step 580, model prediction and analysis of results.

[0120] Optionally, after the model training is complete, it can be applied in real life. Taking the news article "Looking at the shortcomings of the women's volleyball team's defeat, you can't win a game with just spirit" as an example, the text reads: "In the final match of the second week of the volleyball league, the two traditional teams A and B met for the first time in the new competition cycle. Both teams had new coaches and brand-new lineups, but faced different situations: one was the only team that had maintained a perfect record and was in high spirits, while the other had just suffered a crushing defeat against team C and was in poor form. In this tough battle, the A women's volleyball team, which lost 1-3, was not without a chance, but the one who laughed last was still the B women's volleyball team..."; the commentary reads: "Team B is also a brand-new team. There are not many players over 1.80 meters tall, and the average height is shorter than that of A. Why did they win all eight games? They dared to play, had a good mental state, and played freely..."; the generated summary reads: "The B women's volleyball team defeated the A women's volleyball team 3-1. Compared with the B women's volleyball team, which plays as a team, this A women's volleyball team had obvious shortcomings in technical and tactical coordination, insufficient ability to score key points, most players lacked basic skills, were timid on the court, and their fighting spirit needs to be strengthened."

[0121] The model training method provided in this application can be executed by a model training device. This application uses an example of a model training device executing the model training method to illustrate the model training device provided in this application.

[0122] Figure 6 is a schematic diagram of the structure of the model training device provided in the embodiments of this application. As shown in Figure 6, the device includes:

[0123] The acquisition module 610 is used to acquire multiple sets of training samples. The training samples include N training sub-samples. Each training sub-sample includes: a text data sample, a text comment data sample corresponding to the text data sample, and a summary tag. The text data samples in the N training sub-samples are the same.

[0124] The first output module 620 is used to input the training sub-samples in the training samples into the adversarial generator in the preset generative adversarial network model, and output the summary semantic vector and the perturbation summary semantic vector.

[0125] The second output module 630 is used to input the summary semantic vectors and perturbation summary semantic vectors corresponding to the N training sub-samples into the discriminator in the preset generative adversarial network model, and output positive sample data and negative sample data. The positive sample data includes a summary semantic vector and a perturbation summary semantic vector generated by the training sub-sample, and the negative sample data includes M summary semantic vectors and perturbation summary semantic vectors generated by the training sub-samples, where N is a positive integer and M is a positive integer greater than 2.

[0126] The third output module 640 is used to output a summary information sample based on the perturbation summary semantic vector in the positive sample data;

[0127] The adjustment module 650 is used to adjust the preset generative adversarial network model according to the generator loss, discriminator loss and cross-entropy loss until the preset generative adversarial network model meets the preset training conditions, thereby obtaining the summary generation model.

[0128] The generator loss is determined based on the generator loss function of the adversarial generator and the summary semantic vector and the perturbed summary semantic vector. The discriminator loss is determined based on the discriminator loss function of the discriminator and the positive sample data and the negative sample data. The cross-entropy loss is determined based on the cross-entropy loss function and the summary labels of the summary information samples and the training sub-samples. The summary generation model is used to output summary information based on the text data and a comment data corresponding to the text data.
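
As an illustration of the cross-entropy term only, the sketch below computes the mean negative log-likelihood of the summary label tokens under the predicted token distributions. The representation is an assumption: the summary information sample is modeled as one probability distribution over a toy vocabulary per position, and the summary label as a list of token indices. The generator and discriminator loss functions, and how the three losses are combined, are not specified in this section and are not shown.

```python
import math
from typing import List, Sequence


def cross_entropy(pred_dists: Sequence[Sequence[float]], label_ids: Sequence[int]) -> float:
    """Mean negative log-likelihood of the label tokens."""
    if len(pred_dists) != len(label_ids):
        raise ValueError("one predicted distribution is needed per label token")
    total = 0.0
    for dist, label in zip(pred_dists, label_ids):
        # Clamp to avoid log(0) when the model assigns zero probability.
        total += -math.log(max(dist[label], 1e-12))
    return total / len(label_ids)


# Toy vocabulary of size 2 and a two-token summary label (hypothetical values).
pred_dists: List[List[float]] = [[0.5, 0.5], [0.25, 0.75]]
label_ids = [0, 1]
loss = cross_entropy(pred_dists, label_ids)
```

The loss is zero when every label token is predicted with probability 1 and grows as the predicted probability of the label tokens falls.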

[0129] Optionally, the first output module is specifically used for:

[0130] The text data samples and the text comment data samples are encoded by an encoder to obtain the hidden state of the text data samples and the hidden state of the text comment data samples;

[0131] The hidden state of the text comment data sample is perturbed to obtain a perturbed comment sequence;

[0132] The hidden states of the main text data samples and the hidden states of the main text comment data samples are input into the first decoder to generate a summary semantic vector. The hidden states of the main text data samples and the perturbation comment sequence are input into the second decoder to generate a perturbation summary semantic vector.

[0133] Optionally, the first output module is specifically used for:

[0134] The hidden states of the main text comment data samples are processed using a long short-term memory network to obtain comment sequence vectors;

[0135] The target element in the comment sequence vector is masked to obtain a mask sequence, wherein the target element is the element in the comment sequence vector whose sorting rank exceeds a preset rank;

[0136] Based on the mask sequence and the hidden state of the text comment data samples, a perturbation comment sequence is obtained.

[0137] Optionally, the acquisition module is specifically used for:

[0138] The first training sample is preprocessed to obtain the second training sample, wherein each second training sample includes N text comment data samples, and each text comment data sample includes: a text data sample and a text comment data sample corresponding to the text data sample;

[0139] Unsupervised summary analysis is performed on the N text comment data samples to obtain the summary tag corresponding to each text comment data sample;

[0140] Each of the main text comment data samples and the corresponding summary label of the main text comment data sample is used as a training sub-sample to obtain N training sub-samples;

[0141] Multiple training samples are obtained based on multiple sets of N training sub-samples.

[0142] Optionally, the acquisition module is specifically used for:

[0143] Obtain the first training sample, wherein the first training sample includes: a text data sample and K initial text comment data samples corresponding to the text data sample;

[0144] The topic analysis is performed on the K initial text comment data samples to obtain at least one comment topic;

[0145] Based on the number of initial comment data samples corresponding to each comment topic, determine the text comment data sample corresponding to the first training sample from the K initial text comment data samples;

[0146] Based on the text comment data sample corresponding to the first training sample and the text data sample, the second training data is obtained, where K is a positive integer.

[0147] Optionally, the acquisition module is specifically used for:

[0148] Obtain the first similarity between each statement in the main text data sample and other statements, and obtain the second similarity between each statement and the main text comment data sample, wherein the other statements are all statements in the main text data sample except for the stated statement;

[0149] The target statement in each of the statements is used as the summary tag corresponding to the main text comment data sample, wherein the target statement is the statement whose sum of the first similarity and the second similarity is greater than a first preset threshold.

[0150] In this embodiment, a pre-defined generative adversarial network (GAN) model is trained using training sub-samples consisting of main text sample data and corresponding main text comment data samples. The generator in the model generates summary semantic vectors and perturbation summary semantic vectors with as large a difference as possible based on the training sub-samples. The discriminator in the model tries to group the summary semantic vectors and perturbation summary semantic vectors generated from the same training sub-sample together as positive sample data, and treats the data generated from different training samples as negative sample data. This allows the generator and discriminator to optimize against each other. Furthermore, the model is further adjusted by comparing the summary labels in the training sub-samples with the summary information samples generated from the summary semantic vectors in the positive sample data. The trained summary generation model can generate different summary information according to different main text comment data corresponding to the main text data, and the generated summary information will not fluctuate significantly due to small changes in the main text data, ensuring the reliability of the generated summary information.
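
The grouping that the discriminator is trained to recover can be sketched as index pairs. This is an assumption-laden illustration: the function name `build_pairs` is hypothetical, vectors are referred to only by the index of the training sub-sample that produced them, and every cross-sub-sample pairing is listed as a negative, whereas the application only states that the negative sample data contains M vectors.

```python
from typing import List, Tuple

Pair = Tuple[int, int]  # (index of summary vector, index of perturbed summary vector)


def build_pairs(n_sub_samples: int) -> Tuple[List[Pair], List[Pair]]:
    """Return (positive pairs, negative pairs) over N training sub-samples."""
    # Positive: both vectors were generated from the same training sub-sample.
    positives = [(i, i) for i in range(n_sub_samples)]
    # Negative: the two vectors were generated from different sub-samples.
    negatives = [(i, j) for i in range(n_sub_samples)
                 for j in range(n_sub_samples) if i != j]
    return positives, negatives


positives, negatives = build_pairs(3)
```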

[0151] The summary generation device in this application embodiment can be an electronic device or a component within an electronic device, such as an integrated circuit or a chip. The electronic device can be a terminal or a device other than a terminal. For example, the electronic device can be a mobile phone, tablet computer, laptop computer, in-vehicle electronic device, mobile internet device (MID), augmented reality (AR) / virtual reality (VR) device, robot, wearable device, ultra-mobile personal computer (UMPC), netbook, or personal digital assistant (PDA), etc. It can also be a server, network attached storage (NAS), personal computer (PC), television set (TV), ATM, or self-service machine, etc. This application embodiment does not specifically limit the device.

[0152] The summary generation device in this application embodiment can be a device with an operating system. This operating system can be Android, iOS, or another possible operating system; this application embodiment does not specifically limit it.

[0153] The summary generation device provided in this application embodiment can implement the various processes implemented in the method embodiments of Figures 1 to 5. To avoid repetition, they will not be described again here.

[0154] Optionally, Figure 7 is a schematic diagram of the structure of the electronic device provided in the embodiments of this application. As shown in Figure 7, this application embodiment also provides an electronic device 700, including a processor 701 and a memory 702. The memory 702 stores a program or instructions that can run on the processor 701. When the program or instructions are executed by the processor 701, they implement the various steps of the above-described summary generation method embodiment and can achieve the same technical effect. To avoid repetition, they will not be described again here.

[0155] It should be noted that the electronic devices in the embodiments of this application include the mobile electronic devices and non-mobile electronic devices described above.

[0156] Figure 8 is a schematic diagram of the hardware structure of an electronic device that implements an embodiment of this application.

[0157] The electronic device 800 includes, but is not limited to, components such as: radio frequency unit 801, network module 802, audio output unit 803, input unit 804, sensor 805, display unit 806, user input unit 807, interface unit 808, memory 809, and processor 810.

[0158] Those skilled in the art will understand that the electronic device 800 may also include a power supply (such as a battery) for supplying power to various components. The power supply may be logically connected to the processor 810 through a power management system, thereby enabling functions such as managing charging, discharging, and power consumption through the power management system. The electronic device structure shown in Figure 8 does not constitute a limitation on the electronic device. The electronic device may include more or fewer components than shown, or combine certain components, or have different component arrangements, which will not be elaborated here.

[0159] The processor 810 is used to acquire multiple sets of training samples, each training sample containing N training sub-samples. Each training sub-sample includes: a text data sample, a text comment data sample corresponding to the text data sample, and a summary tag. The text data samples in the N training sub-samples are the same.

[0160] The training sub-samples in the training samples are input into the adversarial generator in the preset generative adversarial network model, and the output is a summary semantic vector and a perturbation summary semantic vector.

[0161] The summary semantic vectors and perturbation summary semantic vectors corresponding to the N training sub-samples are input into the discriminator in the preset generative adversarial network model, and positive sample data and negative sample data are output. The positive sample data includes a summary semantic vector and a perturbation summary semantic vector generated by one training sub-sample, and the negative sample data includes M summary semantic vectors and perturbation summary semantic vectors generated by the training sub-samples, where N is a positive integer and M is a positive integer greater than 2.

[0162] Based on the perturbed summary semantic vector in the positive sample data, output summary information samples;

[0163] The preset generative adversarial network model is adjusted based on the generator loss, discriminator loss and cross-entropy loss until the preset generative adversarial network model meets the preset training conditions, thus obtaining the summary generation model.

[0164] The generator loss is determined based on the generator loss function of the adversarial generator and the summary semantic vector and the perturbed summary semantic vector. The discriminator loss is determined based on the discriminator loss function of the discriminator and the positive sample data and the negative sample data. The cross-entropy loss is determined based on the cross-entropy loss function and the summary labels of the summary information samples and the training sub-samples. The summary generation model is used to output summary information based on the text data and a comment data corresponding to the text data.

[0165] The processor 810 is used to encode the text data sample and the text comment data sample through an encoder to obtain the hidden state of the text data sample and the hidden state of the text comment data sample;

[0166] The hidden state of the text comment data sample is perturbed to obtain a perturbed comment sequence;

[0167] The hidden states of the main text data samples and the hidden states of the main text comment data samples are input into the first decoder to generate a summary semantic vector. The hidden states of the main text data samples and the perturbation comment sequence are input into the second decoder to generate a perturbation summary semantic vector.

[0168] The processor 810 is used to process the hidden state of the text comment data sample based on a long short-term memory network to obtain a comment sequence vector;

[0169] The target element in the comment sequence vector is masked to obtain a mask sequence, wherein the target element is the element in the comment sequence vector whose sorting rank exceeds a preset rank;

[0170] Based on the mask sequence and the hidden state of the text comment data samples, a perturbation comment sequence is obtained.

[0171] The processor 810 is used to preprocess the first training sample to obtain the second training sample. Each second training sample includes N groups of text comment data samples. Each group of text comment data samples includes: a text data sample and a text comment data sample corresponding to the text data sample.

[0172] Unsupervised summary analysis is performed on the N main text comment data sample groups to obtain the summary tags corresponding to each main text comment data sample group;

[0173] Each group of text comment data samples and the corresponding summary label of the text comment data sample group are used as a training sub-sample to obtain N training sub-samples;

[0174] Multiple training samples are obtained based on multiple sets of N training sub-samples.

[0175] The processor 810 is used to acquire a first training sample, wherein the first training sample includes: a text data sample and K initial text comment data samples corresponding to the text data sample;

[0176] The topic analysis is performed on the K initial text comment data samples to obtain at least one comment topic;

[0177] Based on the number of initial text comment data samples corresponding to each comment topic, determine the text comment data sample corresponding to the first training sample from the K initial text comment data samples;

[0178] Based on the text comment data sample corresponding to the first training sample and the text data sample, the second training data is obtained, where K is a positive integer.

[0179] The processor 810 is used to obtain a first similarity between each statement and other statements in the main text data sample, and to obtain a second similarity between each statement and the main text comment data sample, wherein the other statements are all statements in the main text data sample except for the stated statement;

[0180] The target statement in each of the statements is used as the summary tag corresponding to the main text comment data sample, wherein the target statement is the statement whose sum of the first similarity and the second similarity is greater than a first preset threshold.

[0181] In this embodiment, a pre-defined generative adversarial network (GAN) model is trained using training sub-samples consisting of main text sample data and corresponding main text comment data samples. The generator in the model generates summary semantic vectors and perturbation summary semantic vectors with as large a difference as possible based on the training sub-samples. The discriminator in the model tries to group the summary semantic vectors and perturbation summary semantic vectors generated from the same training sub-sample together as positive sample data, and treats the data generated from different training samples as negative sample data. This allows the generator and discriminator to optimize against each other. Furthermore, the model is further adjusted by comparing the summary labels in the training sub-samples with the summary information samples generated from the summary semantic vectors in the positive sample data. The trained summary generation model can generate different summary information according to different main text comment data corresponding to the main text data, and the generated summary information will not fluctuate significantly due to small changes in the main text data, ensuring the reliability of the generated summary information.

[0182] It should be understood that, in this embodiment, the input unit 804 may include a graphics processing unit (GPU) 8041 and a microphone 8042. The GPU 8041 processes image data of still images or videos obtained by an image capture device (such as a camera) in video capture mode or image capture mode. The display unit 806 may include a display panel 8061, which may be configured in the form of a liquid crystal display, an organic light-emitting diode, or the like. The user input unit 807 includes at least one of a touch panel 8071 and other input devices 8072. The touch panel 8071 is also called a touch screen. The touch panel 8071 may include a touch detection device and a touch controller. Other input devices 8072 may include, but are not limited to, physical keyboards, function keys (such as volume control buttons, power buttons, etc.), trackballs, mice, and joysticks, which will not be described in detail here.

[0183] The memory 809 can be used to store software programs and various data. The memory 809 may primarily include a first storage area for storing programs or instructions and a second storage area for storing data. The first storage area may store the operating system and the application programs or instructions required for at least one function (such as sound playback, image playback, etc.). Furthermore, the memory 809 may include volatile memory or non-volatile memory, or both. The non-volatile memory may be read-only memory (ROM), programmable read-only memory (PROM), erasable programmable read-only memory (EPROM), electrically erasable programmable read-only memory (EEPROM), or flash memory. The volatile memory may be random access memory (RAM), static random access memory (SRAM), dynamic random access memory (DRAM), synchronous dynamic random access memory (SDRAM), double data rate synchronous dynamic random access memory (DDR SDRAM), enhanced synchronous dynamic random access memory (ESDRAM), Synchlink dynamic random access memory (SLDRAM), or direct Rambus random access memory (DRRAM). The memory 809 in the embodiments of this application includes, but is not limited to, these and any other suitable types of memory.

[0184] The processor 810 may include one or more processing units. Optionally, the processor 810 integrates an application processor and a modem processor, wherein the application processor mainly handles operations involving the operating system, user interface, and applications, and the modem processor, such as a baseband processor, mainly handles wireless communication signals. It can be understood that the modem processor may alternatively not be integrated into the processor 810.

[0185] This application also provides a readable storage medium storing a program or instructions. When the program or instructions are executed by a processor, they implement the various processes of the above-described model training method embodiments and achieve the same technical effect. To avoid repetition, they will not be described again here.

[0186] The processor is the processor in the electronic device described in the above embodiments. The readable storage medium includes a computer-readable storage medium, such as a read-only memory (ROM), a random access memory (RAM), a magnetic disk, or an optical disk.

[0187] An embodiment of this application also provides a chip, which includes a processor and a communication interface. The communication interface is coupled to the processor. The processor is used to run programs or instructions to implement the various processes of the above-described model training method embodiments and can achieve the same technical effect. To avoid repetition, it will not be described again here.

[0188] It should be understood that the chip mentioned in the embodiments of this application may also be referred to as a system-level chip, a system chip, a chip system, or a system-on-chip.

[0189] This application provides a computer program product, which is stored in a storage medium and executed by at least one processor to implement the various processes of the model training method embodiments described above, and can achieve the same technical effect. To avoid repetition, it will not be described again here.

[0190] It should be noted that, in this document, the terms "comprising," "including," or any other variations thereof are intended to cover non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements includes not only those elements but also other elements not expressly listed, or elements inherent to such a process, method, article, or apparatus. Without further limitation, an element defined by the phrase "comprising a ..." does not exclude the presence of other identical elements in the process, method, article, or apparatus that includes that element. Furthermore, it should be noted that the scope of the methods and apparatuses in the embodiments of this application is not limited to performing functions in the order shown or discussed, but may also include performing functions substantially simultaneously or in the reverse order, depending on the functions involved. For example, the described methods may be performed in a different order than described, and various steps may be added, omitted, or combined. Additionally, features described with reference to certain examples may be combined in other examples.

[0191] Through the above description of the embodiments, those skilled in the art can clearly understand that the methods of the above embodiments can be implemented by means of software plus necessary general-purpose hardware platforms. Of course, they can also be implemented by hardware, but in many cases the former is a better implementation method. Based on this understanding, the technical solution of this application, in essence, or the part that contributes to the prior art, can be embodied in the form of a computer software product. This computer software product is stored in a storage medium (such as ROM / RAM, magnetic disk, optical disk) and includes several instructions to cause a terminal (which may be a mobile phone, computer, server, or network device, etc.) to execute the methods described in the various embodiments of this application.

[0192] The embodiments of this application have been described above with reference to the accompanying drawings. However, this application is not limited to the specific embodiments described above. The specific embodiments described above are merely illustrative and not restrictive. Those skilled in the art can make many other forms under the guidance of this application without departing from the spirit and scope of the claims, and all of these forms are within the protection scope of this application.

Claims

1. A model training method, characterized in that it comprises:
acquiring multiple groups of training samples, wherein each training sample comprises N training sub-samples, each training sub-sample comprises a text data sample, a text comment data sample corresponding to the text data sample, and a summary label, and the text data samples in the N training sub-samples are the same;
inputting the training sub-samples in the training sample into an adversarial generator in a preset generative adversarial network model, and outputting a summary semantic vector and a perturbed summary semantic vector;
inputting the summary semantic vector and the perturbed summary semantic vector into a discriminator in the preset generative adversarial network model, and outputting positive sample data and negative sample data, wherein the positive sample data comprises a summary semantic vector and a perturbed summary semantic vector generated from one training sub-sample, the negative sample data comprises M summary semantic vectors and perturbed summary semantic vectors generated from the training sub-samples, N is a positive integer, and M is a positive integer greater than 2;
outputting a summary information sample based on the perturbed summary semantic vector in the positive sample data; and
adjusting the preset generative adversarial network model according to a generator loss, a discriminator loss, and a cross-entropy loss until the preset generative adversarial network model meets a preset training condition, so as to obtain a summary generation model;
wherein the generator loss is determined based on a generator loss function of the adversarial generator, the summary semantic vector, and the perturbed summary semantic vector; the discriminator loss is determined based on a discriminator loss function of the discriminator, the positive sample data, and the negative sample data; the cross-entropy loss is determined based on a cross-entropy loss function, the summary information sample, and the summary label of the training sub-sample; and the summary generation model is used to output summary information based on text data and comment data corresponding to the text data;
wherein inputting the training sub-samples in the training sample into the adversarial generator in the preset generative adversarial network model and outputting the summary semantic vector and the perturbed summary semantic vector comprises:
encoding the text data sample and the text comment data sample with an encoder to obtain a hidden state of the text data sample and a hidden state of the text comment data sample;
perturbing the hidden state of the text comment data sample to obtain a perturbed comment sequence;
inputting the hidden state of the text data sample and the hidden state of the text comment data sample into a first decoder to generate the summary semantic vector; and
inputting the hidden state of the text data sample and the perturbed comment sequence into a second decoder to generate the perturbed summary semantic vector;
and wherein perturbing the hidden state of the text comment data sample to obtain the perturbed comment sequence comprises:
processing the hidden state of the text comment data sample with a long short-term memory network to obtain a comment sequence vector;
masking a target element in the comment sequence vector to obtain a mask sequence, wherein the target element is an element in the comment sequence vector whose sorting rank exceeds a preset rank; and
obtaining the perturbed comment sequence based on the mask sequence and the hidden state of the text comment data sample.
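
The masking-based perturbation recited at the end of claim 1 can be illustrated with a minimal Python sketch. It is not the patented implementation. The long short-term memory network is omitted and its output is represented by a plain list of per-position scores. The descending sort order, the 0/1 mask, and the element-wise multiplication of the mask with the hidden states are all assumptions made for illustration, and the names `build_mask` and `perturb` are hypothetical.

```python
from typing import List


def build_mask(scores: List[float], preset_rank: int) -> List[int]:
    """Return 1 for positions ranked within preset_rank, else 0.

    Assumption: positions are ranked by descending score, and an element
    whose rank exceeds preset_rank is the "target element" to be masked.
    """
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    mask = [0] * len(scores)
    for rank, idx in enumerate(order, start=1):
        if rank <= preset_rank:
            mask[idx] = 1
    return mask


def perturb(hidden_states: List[List[float]], mask: List[int]) -> List[List[float]]:
    """Zero out the hidden states of masked positions.

    Assumption: the mask sequence is combined with the hidden states by
    element-wise multiplication.
    """
    return [[h * m for h in row] for row, m in zip(hidden_states, mask)]


# Hypothetical comment sequence vector (one score per comment position)
scores = [0.9, 0.1, 0.5, 0.3]
# Hypothetical comment hidden states (one 2-dimensional vector per position)
hidden = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0], [7.0, 8.0]]

mask = build_mask(scores, preset_rank=2)
perturbed = perturb(hidden, mask)
print(mask)
print(perturbed)
```

In this sketch only the two highest-scoring positions survive, so the perturbed comment sequence differs only slightly from the original hidden states, which matches the stated goal of keeping the summary stable under small changes to the comments.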

2. The model training method according to claim 1, characterized in that acquiring the multiple groups of training samples comprises:
preprocessing a first training sample to obtain a second training sample, wherein each second training sample comprises N text comment data sample groups, and each text comment data sample group comprises a text data sample and a text comment data sample corresponding to the text data sample;
performing unsupervised summary analysis on the N text comment data sample groups to obtain a summary label corresponding to each text comment data sample group;
using each text comment data sample group and the summary label corresponding to that group as one training sub-sample, so as to obtain N training sub-samples; and
obtaining the multiple groups of training samples based on multiple groups of N training sub-samples.
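
The sample layout described in claims 1 and 2 can be sketched as a small data structure. This is an illustrative sketch only: the class name `TrainingSubSample`, the function `build_training_sample`, and the example strings are hypothetical and do not come from the application.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class TrainingSubSample:
    text: str           # text data sample (shared by all N sub-samples)
    comment: str        # text comment data sample corresponding to the text
    summary_label: str  # summary label from the unsupervised summary analysis


def build_training_sample(
    text: str, comments: List[str], summary_labels: List[str]
) -> List[TrainingSubSample]:
    """Pair one text with N comments and N summary labels to form one training sample."""
    if len(comments) != len(summary_labels):
        raise ValueError("each comment needs exactly one summary label")
    return [
        TrainingSubSample(text, comment, label)
        for comment, label in zip(comments, summary_labels)
    ]


# Hypothetical example with N = 2
sample = build_training_sample(
    "The phone has a large battery. The screen is bright.",
    ["battery lasts two days", "screen is easy to read outdoors"],
    ["The phone has a large battery.", "The screen is bright."],
)
print(len(sample), len({s.text for s in sample}))
```

Every sub-sample in one training sample carries the same text, so the N sub-samples differ only in their comment and summary label.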

3. The model training method according to claim 2, characterized in that preprocessing the first training sample to obtain the second training sample comprises:
obtaining the first training sample, wherein the first training sample comprises a text data sample and K initial text comment data samples corresponding to the text data sample;
performing topic analysis on the K initial text comment data samples to obtain at least one comment topic;
determining, from the K initial text comment data samples and based on the number of initial text comment data samples corresponding to each comment topic, the text comment data sample corresponding to the first training sample; and
obtaining the second training sample based on the text data sample and the text comment data sample corresponding to the first training sample, wherein K is a positive integer.
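
One possible reading of the count-based selection in claim 3 is sketched below in Python. The claim does not state how the per-topic counts determine the selected comments, so keeping the comments of the most frequent topic is an assumption. The topic analysis itself is out of scope: the sketch takes a ready-made topic assignment per comment, and the function name `select_comments` and the example data are hypothetical.

```python
from collections import Counter
from typing import List, Tuple


def select_comments(comments: List[str], topics: List[str]) -> Tuple[str, List[str]]:
    """Keep the comments that belong to the most frequent comment topic.

    Assumption: topics[i] is the topic assigned to comments[i] by some
    upstream topic analysis, and the dominant topic is the one to keep.
    """
    counts = Counter(topics)
    dominant, _ = counts.most_common(1)[0]
    kept = [c for c, t in zip(comments, topics) if t == dominant]
    return dominant, kept


# Hypothetical K = 4 initial comments with their assigned topics
comments = [
    "battery lasts long",
    "screen is bright",
    "battery drains fast",
    "battery charges quickly",
]
topics = ["battery", "screen", "battery", "battery"]

dominant, kept = select_comments(comments, topics)
print(dominant, kept)
```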

4. The model training method according to claim 2, characterized in that performing unsupervised summary analysis on the N text comment data sample groups to obtain the summary label corresponding to each text comment data sample group comprises:
obtaining a first similarity between each statement in the text data sample and other statements, and obtaining a second similarity between each statement and the text comment data sample, wherein the other statements are all statements in the text data sample other than that statement; and
using a target statement among the statements as the summary label corresponding to the text comment data sample group, wherein the target statement is a statement for which the sum of the first similarity and the second similarity is greater than a first preset threshold.
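
The threshold rule in claim 4 can be sketched as follows. The claim does not name a similarity measure in this section, so Jaccard word overlap is used here purely as a stand-in, and averaging the first similarity over the other statements is likewise an assumption. The function names and the example text are hypothetical.

```python
from typing import List


def jaccard(a: str, b: str) -> float:
    """Word-level Jaccard similarity (a stand-in for the unspecified measure)."""
    sa, sb = set(a.lower().split()), set(b.lower().split())
    union = sa | sb
    return len(sa & sb) / len(union) if union else 0.0


def select_summary_labels(
    statements: List[str], comment: str, threshold: float
) -> List[str]:
    """Return statements whose first + second similarity exceeds the threshold."""
    labels = []
    for i, statement in enumerate(statements):
        others = [s for j, s in enumerate(statements) if j != i]
        # First similarity: mean similarity to the other statements (assumption)
        first = sum(jaccard(statement, o) for o in others) / len(others) if others else 0.0
        # Second similarity: similarity to the comment
        second = jaccard(statement, comment)
        if first + second > threshold:
            labels.append(statement)
    return labels


statements = [
    "the phone has a large battery",
    "the battery supports fast charging",
    "the box includes a case",
]
comment = "battery charging is fast"

labels = select_summary_labels(statements, comment, threshold=0.5)
print(labels)
```

With this example the second statement overlaps most with the comment, so it is the only one whose combined similarity exceeds the threshold of 0.5 and it becomes the summary label for that text and comment pair.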

5. A model training apparatus, characterized in that it comprises:
an acquisition module, used to acquire multiple groups of training samples, wherein each training sample comprises N training sub-samples, each training sub-sample comprises a text data sample, a text comment data sample corresponding to the text data sample, and a summary label, and the text data samples in the N training sub-samples are the same;
a first output module, used to input the training sub-samples in the training sample into an adversarial generator in a preset generative adversarial network model, and output a summary semantic vector and a perturbed summary semantic vector;
a second output module, used to input the summary semantic vectors and perturbed summary semantic vectors corresponding to the N training sub-samples into a discriminator in the preset generative adversarial network model, and output positive sample data and negative sample data, wherein the positive sample data comprises a summary semantic vector and a perturbed summary semantic vector generated from one training sub-sample, the negative sample data comprises M summary semantic vectors and perturbed summary semantic vectors generated from the training sub-samples, N is a positive integer, and M is a positive integer greater than 2;
a third output module, used to output a summary information sample based on the perturbed summary semantic vector in the positive sample data; and
an adjustment module, used to adjust the preset generative adversarial network model according to a generator loss, a discriminator loss, and a cross-entropy loss until the preset generative adversarial network model meets a preset training condition, so as to obtain a summary generation model;
wherein the generator loss is determined based on a generator loss function of the adversarial generator, the summary semantic vector, and the perturbed summary semantic vector; the discriminator loss is determined based on a discriminator loss function of the discriminator, the positive sample data, and the negative sample data; the cross-entropy loss is determined based on a cross-entropy loss function, the summary information sample, and the summary label of the training sub-sample; and the summary generation model is used to output summary information based on text data and comment data corresponding to the text data;
wherein the first output module is specifically used to:
encode the text data sample and the text comment data sample with an encoder to obtain a hidden state of the text data sample and a hidden state of the text comment data sample;
perturb the hidden state of the text comment data sample to obtain a perturbed comment sequence;
input the hidden state of the text data sample and the hidden state of the text comment data sample into a first decoder to generate the summary semantic vector; and
input the hidden state of the text data sample and the perturbed comment sequence into a second decoder to generate the perturbed summary semantic vector;
and wherein the first output module is specifically used to:
process the hidden state of the text comment data sample with a long short-term memory network to obtain a comment sequence vector;
mask a target element in the comment sequence vector to obtain a mask sequence, wherein the target element is an element in the comment sequence vector whose sorting rank exceeds a preset rank; and
obtain the perturbed comment sequence based on the mask sequence and the hidden state of the text comment data sample.

6. The model training apparatus according to claim 5, characterized in that the acquisition module is specifically used to:
preprocess a first training sample to obtain a second training sample, wherein each second training sample comprises N text comment data sample groups, and each text comment data sample group comprises a text data sample and a text comment data sample corresponding to the text data sample;
perform unsupervised summary analysis on the N text comment data sample groups to obtain a summary label corresponding to each text comment data sample group;
use each text comment data sample group and the summary label corresponding to that group as one training sub-sample, so as to obtain N training sub-samples; and
obtain the multiple groups of training samples based on multiple groups of N training sub-samples.

7. The model training apparatus according to claim 6, characterized in that the acquisition module is specifically used to:
obtain the first training sample, wherein the first training sample comprises a text data sample and K initial text comment data samples corresponding to the text data sample;
perform topic analysis on the K initial text comment data samples to obtain at least one comment topic;
determine, from the K initial text comment data samples and based on the number of initial text comment data samples corresponding to each comment topic, the text comment data sample corresponding to the first training sample; and
obtain the second training sample based on the text data sample and the text comment data sample corresponding to the first training sample, wherein K is a positive integer.

8. The model training apparatus according to claim 6, characterized in that the acquisition module is specifically used to:
obtain a first similarity between each statement in the text data sample and other statements, and obtain a second similarity between each statement and the text comment data sample, wherein the other statements are all statements in the text data sample other than that statement; and
use a target statement among the statements as the summary label corresponding to the text comment data sample group, wherein the target statement is a statement for which the sum of the first similarity and the second similarity is greater than a first preset threshold.