An attention mechanism-based multi-task review text classification method and system
Patent Information
- Authority / Receiving Office
- CN · China
- Patent Type
- Applications(China)
- Current Assignee / Owner
- DALIAN NEUSOFT UNIV OF INFORMATION
- Filing Date
- 2026-02-13
- Publication Date
- 2026-06-19
Figure CN122240834A_ABST
Abstract
Description
Technical Field
[0001] This invention relates to the field of natural language processing technology, and in particular to an adversarial multi-task comment text classification method and system based on an attention mechanism.
Background Technology
[0002] With the rapid development of internet technology, comment text data has exploded, making efficient and accurate classification of comment text an important research direction in the field of natural language processing. Traditional text classification methods mainly rely on manually formulated rules or the use of single-task learning models, which suffer from problems such as high human error rates, low computational efficiency, and severe data sparsity.
[0003] Multi-task learning, as an effective learning method that utilizes shared information among multiple related tasks, has been widely applied in the field of comment text classification. However, existing multi-task comment text classification models still have two key problems: first, they cannot effectively extract comment text features, especially when dealing with complex and diverse comment text data; second, they tend to overlook the importance of input information and cannot distinguish between words that are key to the classification task and irrelevant words in the comment text.
[0004] In existing technologies, the shared-private model provides two feature spaces for any learning task: a shared feature space storing shared features and a task-specific space storing task-specific features. However, in this model, the shared feature space may contain some task-specific features, while the task-specific feature space may contain some features that should have been shared, leading to a chaotic situation in feature learning. Furthermore, traditional multi-task learning models cannot effectively focus on key information in comment text, resulting in limited classification accuracy.
Summary of the Invention
[0005] This invention discloses an adversarial multi-task comment text classification method and system based on an attention mechanism to overcome the above-mentioned technical problems.
[0006] To achieve the above objectives, the technical solution of the present invention is as follows:
[0007] An adversarial multi-task comment text classification method based on an attention mechanism includes the following steps:
S1: Obtain multiple pieces of comment text data, which correspond to multiple tasks in different fields;
S2: Preprocess the comment text data, converting each comment text into a word vector representation sequence;
S3: Based on the word vector representation sequence of the comment text data, obtain a weighted text feature representation using an attention mechanism;
S4: Using an adversarial multi-task learning model and the weighted text feature representation, obtain the private feature representation and the shared feature representation of the comment text data, fuse them into a fused feature representation, and from it obtain the probability of the category to which the comment text belongs; train the model based on the classification loss function and the adversarial loss function to obtain the trained adversarial multi-task learning model;
S5: Based on the trained adversarial multi-task learning model, obtain the probability of the category to which the comment text to be classified belongs, so as to classify the comment text to be classified.
[0008] Furthermore, S3 includes:
S31: Based on the word vector representations of the comment text data, obtain the hidden state representation of the comment text sample using the following formula:
$$u_{it} = \tanh\left(W x_{it} + b\right)$$
[0009] In the formula: $u_{it}$ is the hidden state representation of the $t$-th word vector representation in the word vector representation sequence of the $i$-th comment text data; $\tanh$ is the hyperbolic tangent activation function; $W$ is the trainable weight matrix; $x_{it}$ is the $t$-th word vector representation in the word vector representation sequence of the $i$-th comment text data; $i$ is the index of the comment text sample; $t$ is the index of the word vector representation within the sequence; $b$ is the bias vector;
S32: Obtain the attention weights based on the hidden state representation of the comment text sample, using the following formula:
$$\alpha_{it} = \frac{\exp\left(u_{it}^{\top} u_w\right)}{\sum_{t'=1}^{L} \exp\left(u_{it'}^{\top} u_w\right)}$$
[0010] In the formula: $\alpha_{it}$ is the attention weight of the $t$-th word vector representation in the word vector representation sequence of the $i$-th comment text data; $\exp$ is the exponential function; $\top$ denotes the transpose operation; $u_w$ is the trainable context vector; $L$ is the total number of word vector representations in the word vector representation sequence of the $i$-th comment text data;
S33: Based on the attention weights, obtain the weighted text feature representation using the following formula:
$$c_i = \sum_{t=1}^{L} \alpha_{it} x_{it}$$
[0011] In the formula: $c_i$ is the weighted text feature representation.
[0012] Furthermore, the classification loss function is expressed as follows:
$$L_{cls} = -\sum_{k=1}^{K} \sum_{i=1}^{N_k} y_i^{(k)} \log \hat{y}_i^{(k)}$$
[0013] In the formula: $L_{cls}$ is the classification loss value; $k$ is the task index; $K$ is the total number of tasks; $N_k$ is the number of samples of the $k$-th task, i.e., the total number of comment texts contained in the training data of the $k$-th task; $y_i^{(k)}$ is the true sentiment category label of the $i$-th comment text sample of the $k$-th task, used as a supervisory signal to optimize the classification loss; $\hat{y}_i^{(k)}$ is the predicted probability for the $i$-th comment text sample of the $k$-th task.
[0014] Furthermore, the adversarial loss function is expressed as follows:
$$L_{adv} = \min_{\theta_s} \left( \lambda \max_{\theta_D} \left( \sum_{k=1}^{K} \sum_{i=1}^{N_k} d_i^{(k)} \log\left[ D\left( E\left( x_i^{(k)} \right) \right) \right] \right) \right)$$
[0015] In the formula: $L_{adv}$ is the adversarial loss value; $\theta_s$ denotes the parameters of the shared feature extraction module; $\lambda$ is the balance parameter; $\theta_D$ denotes the parameters of the task discriminator; $k$ is the task index; $K$ is the total number of tasks; $d_i^{(k)}$ is the real task source label of the $i$-th comment text sample of the $k$-th task, used as a supervisory signal to drive the task discriminator; $D$ is the task discriminator; $E$ is the feature extractor; $x_i^{(k)}$ is the $i$-th comment text sample of the $k$-th task; $N_k$ is the number of samples of the $k$-th task; $i$ is the index of the comment text sample; $\min_{\theta_s}$ denotes minimization over the parameters of the shared feature extractor; $\max_{\theta_D}$ denotes maximization over the parameters of the task discriminator.
[0016] Furthermore, S2 includes:
S21: Based on the comment text data, use the Stanford word segmentation tool to obtain the word sequence;
S22: Use a pre-trained Word2Vec model to convert each word in the word sequence into a fixed-dimensional word vector representation, obtaining the word vector representation sequence of the comment text data;
S23: Based on the word vector representation sequence of the comment text data, obtain a word vector representation sequence of a set length, so that all word vector representation sequences in a batch have the same length.
[0017] Furthermore, the comment text data includes comment text data corresponding to the sentiment classification task for product reviews and comment text data corresponding to the sentiment classification task for movie reviews.
[0018] Furthermore, after S22, the method further includes establishing a vocabulary list consisting of a set of non-repeating words based on the word sequence.
[0019] Furthermore, S5 includes:
S51: Obtain the word sequence corresponding to the comment text to be classified;
S52: If a word in that word sequence exists in the vocabulary, directly obtain its word vector representation; if a word in that word sequence does not exist in the vocabulary, randomly initialize its word vector representation;
S53: Obtain a word vector representation sequence of the set length for the comment text to be classified;
S54: Based on that word vector representation sequence, obtain the weighted text feature representation using the attention mechanism; then, using the trained adversarial multi-task learning model, obtain the probability of the category to which the comment text belongs, thereby classifying the comment text.
[0020] A classification system based on an attention mechanism-based adversarial multi-task comment text classification method includes: a data preprocessing module, an attention mechanism module, an adversarial multi-task learning module, a classification output module, and a parameter optimization module; The data preprocessing module is used to preprocess the comment text data to obtain the word vector representation sequence of the comment text data; The attention mechanism module is used to obtain a weighted text feature representation based on the word vector representation sequence of the comment text data; The adversarial multi-task learning module is used to obtain the private and shared feature representations of the comment text data based on the weighted text feature representation; The classification output module is used to obtain a fused feature representation based on private feature representation and shared feature representation, and then obtain the probability of the category to which the comment text belongs; The parameter optimization module is used to obtain the trained adversarial multi-task learning model based on the classification loss function and the adversarial loss function.
[0021] Beneficial Effects: This invention provides an adversarial multi-task comment text classification method and system based on an attention mechanism. Based on the attention mechanism, it obtains weighted text feature representations from the word vector representation sequence of comment text data. It then trains an adversarial multi-task learning model using a classification loss function and an adversarial loss function. The trained adversarial multi-task learning model is then used to obtain the probability of the comment text belonging to its category, thus achieving comment text classification. This invention addresses the problems of existing multi-task text classification models failing to effectively extract text features and easily overlooking the importance of input information. By combining the attention mechanism with adversarial multi-task learning, it first uses the attention mechanism to focus on important information reflecting text features in the comment text data. Then, through the adversarial learning mechanism, it can more clearly distinguish between shared and private features, thereby effectively improving the accuracy of multi-task comment text classification. It can be used for sentiment analysis and classification of comment texts in multiple fields such as product reviews and movie reviews.
Attached Figure Description
[0022] To more clearly illustrate the technical solutions in the embodiments of the present invention or the prior art, the drawings used in the description of the embodiments or the prior art will be briefly introduced below. Obviously, the drawings described below are some embodiments of the present invention. For those skilled in the art, other drawings can be obtained based on these drawings without creative effort.
[0023] Figure 1 is a flowchart of the adversarial multi-task comment text classification method based on an attention mechanism of the present invention;
Figure 2 is a system architecture diagram in an embodiment of the present invention;
Figure 3 is a schematic diagram of the attention mechanism in an embodiment of the present invention;
Figure 4 is a structural diagram of the adversarial multi-task learning model in an embodiment of the present invention.
Detailed Implementation
[0024] To make the objectives, technical solutions, and advantages of the embodiments of the present invention clearer, the technical solutions of the embodiments of the present invention will be clearly and completely described below with reference to the accompanying drawings. Obviously, the described embodiments are only some embodiments of the present invention, not all embodiments. Based on the embodiments of the present invention, all other embodiments obtained by those skilled in the art without creative effort are within the scope of protection of the present invention.
[0025] This embodiment introduces an adversarial multi-task comment text classification method based on an attention mechanism. As shown in Figure 1, the method includes the following steps:
S1: Obtain multiple pieces of comment text data, which correspond to multiple tasks in different fields;
Specifically, comment text data is acquired for multiple tasks across various domains, with each task corresponding to a different comment text classification task. In this embodiment, 16 text datasets are obtained from the raw dataset provided by Blitzer, including 14 Amazon product review datasets from different domains and 2 movie review datasets. Each task is a binary sentiment analysis task that classifies the comment text as positive or negative. Each dataset is divided into a training set and a test set by ratio. The comment text data for each task is cleaned and standardized to remove irrelevant symbols and special characters, ensuring data quality.
[0026] S2: Preprocess the comment text data, converting each comment text into a sequence of fixed-dimensional word vector representations;
S21: Based on the comment text data, use the Stanford word segmentation tool to obtain the word sequence; specifically, the tool divides the continuous text into a sequence of words;
S22: Use a pre-trained Word2Vec model to convert each word into a fixed-dimensional (300-dimensional) word vector representation, obtaining the word vector representation sequence of the comment text data;
After S22, the method further includes establishing a vocabulary consisting of the set of non-repeating words in the word sequences.
[0027] S23: Based on the word vector representation sequence of the comment text data, obtain a word vector representation sequence of the comment text data of a set length so that the length of the word vector representation sequence of the comment text data in the batch is the same.
[0028] This embodiment ensures that the text sequence length is consistent within a batch by padding or truncating the word vector representations of comment text data of different lengths.
[0029] Specifically, to ensure efficient batch training and inference for deep learning models, the word vector representations of comment text data with varying lengths need to be adjusted to a uniform fixed length. The specific steps are as follows: Based on the statistical distribution of text lengths in the training dataset, a fixed sequence length is preset. For sequences shorter than the fixed length, a specific number of word vector representations representing padding are added to the end until the fixed length is reached; for sequences longer than the fixed length, they are truncated from the end, retaining only the fixed-length word vector representations. All samples are converted into regular tensors, which can be directly input into the subsequent neural network.
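The padding and truncation rule described above can be sketched as follows. This is a minimal illustration rather than the embodiment's implementation: the function name `pad_or_truncate` is hypothetical, and the use of an all-zero vector as the padding representation is an assumption, since the text does not specify the padding value.

```python
from typing import List

Vector = List[float]

def pad_or_truncate(seq: List[Vector], fixed_len: int, dim: int) -> List[Vector]:
    """Return a copy of seq containing exactly fixed_len vectors of size dim."""
    if len(seq) >= fixed_len:
        # Longer sequences are cut at the end, keeping the first fixed_len vectors.
        return [list(v) for v in seq[:fixed_len]]
    # Shorter sequences receive padding vectors at the end (zeros are an assumption).
    padding = [[0.0] * dim for _ in range(fixed_len - len(seq))]
    return [list(v) for v in seq] + padding

# Toy batch: one sequence of 2 word vectors and one of 4, each vector 2-dimensional.
batch = [
    [[0.1, 0.2], [0.3, 0.4]],
    [[0.5, 0.6], [0.7, 0.8], [0.9, 1.0], [1.1, 1.2]],
]
fixed = [pad_or_truncate(s, fixed_len=3, dim=2) for s in batch]
```

After this step every sequence in `fixed` has the same length, so the batch can be stacked into a regular tensor.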
[0030] Specifically, through preprocessing, the original comment text is converted into a sequence of word vectors suitable for neural network processing, laying the foundation for subsequent feature extraction.
[0031] This embodiment preprocesses the acquired comment text by segmenting it into word sequences using a word segmentation tool; it establishes a vocabulary and constructs a dictionary, mapping each word to a unique integer index; it converts each word into a fixed-dimensional word vector representation; and it pads or truncates the word vector representation sequences corresponding to comment texts of different lengths to ensure consistent text sequence lengths within a batch. This completes the preprocessing of the original comment text data. The unstructured original text is systematically converted into regularized, numerical word vector representation tensors, thus providing directly processable high-dimensional input data for the subsequent neural network-based attention mechanism and adversarial multi-task learning.
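The dictionary construction step, which gives each distinct word its own integer index, might look like the sketch below. The function name `build_vocab` and the choice to number words in order of first appearance are assumptions made for illustration; the text does not state how indices are assigned.

```python
from typing import Dict, Iterable, List

def build_vocab(tokenized_texts: Iterable[List[str]]) -> Dict[str, int]:
    """Map every distinct word to a unique integer index."""
    vocab: Dict[str, int] = {}
    for tokens in tokenized_texts:
        for token in tokens:
            if token not in vocab:
                # The next free index equals the current vocabulary size.
                vocab[token] = len(vocab)
    return vocab

# Toy word sequences standing in for the output of the word segmentation step.
corpus = [["great", "phone"], ["great", "movie", "plot"]]
vocab = build_vocab(corpus)
```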
[0032] S3: Based on the word vector representation sequence of the comment text data, obtain the weighted text feature representation using an attention mechanism;
Specifically, the attention mechanism shown in Figure 3 weights the word vector representation sequence: it calculates an importance weight for each word and produces a weighted text feature representation. This enables the model to focus on the keywords that contribute to sentiment classification and improves its ability to extract text features.
[0033] Preferably, S3 includes:
S31: As shown in Figure 3, the hidden state representation of the comment text sample is obtained from the word vector representations of the comment text data, using the following formula:
$$u_{it} = \tanh\left(W x_{it} + b\right)$$
[0034] In the formula: $u_{it}$ is the hidden state representation of the $t$-th word vector representation in the word vector representation sequence of the $i$-th comment text data; $\tanh$ is the hyperbolic tangent activation function; $W$ is the trainable weight matrix; $x_{it}$ is the $t$-th word vector representation in the word vector representation sequence of the $i$-th comment text data, that is, the input vector at the $t$-th time step; $i$ is the index of the comment text sample; $t$ is the index of the word vector representation within the sequence; $b$ is the bias vector;
S32: Obtain the attention weights based on the hidden state representation of the comment text sample, using the following formula:
$$\alpha_{it} = \frac{\exp\left(u_{it}^{\top} u_w\right)}{\sum_{t'=1}^{L} \exp\left(u_{it'}^{\top} u_w\right)}$$
[0035] In the formula: $\alpha_{it}$ is the attention weight of the $t$-th word vector representation in the word vector representation sequence of the $i$-th comment text data; $\exp$ is the exponential function; $\top$ denotes the transpose operation; $u_w$ is the trainable context vector; $L$ is the total number of word vector representations in the word vector representation sequence of the $i$-th comment text data, i.e., the total length of that sequence;
S33: Based on the attention weights, obtain the weighted text feature representation using the following formula:
$$c_i = \sum_{t=1}^{L} \alpha_{it} x_{it}$$
[0036] In the formula: $c_i$ is the context vector, which is the weighted text feature representation.
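Steps S31 to S33 can be illustrated with a small pure-Python sketch: a tanh projection of each word vector, a softmax over the scores against a trainable context vector, and a weighted sum. All names (`attention_pool`, `X`, `W`, `b`, `u_w`) and the toy values are hypothetical. The sketch assumes that the weighted sum in S33 is taken over the word vectors themselves, which is one reading of the description, and it subtracts the maximum score before exponentiating purely for numerical stability.

```python
import math
from typing import List, Tuple

Vector = List[float]

def attention_pool(X: List[Vector], W: List[Vector], b: Vector,
                   u_w: Vector) -> Tuple[List[float], Vector]:
    """Return (attention weights, context vector) for one comment text."""
    # S31: hidden state of each word vector, tanh(W x + b).
    hidden = [
        [math.tanh(sum(w * xj for w, xj in zip(row, x)) + bj)
         for row, bj in zip(W, b)]
        for x in X
    ]
    # S32: score each hidden state against the context vector, then softmax.
    scores = [sum(h * c for h, c in zip(u, u_w)) for u in hidden]
    top = max(scores)  # subtracting the max does not change the softmax result
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    alphas = [e / total for e in exps]
    # S33: weighted sum of the word vectors (assumed reading, see lead-in).
    context = [sum(a * x[j] for a, x in zip(alphas, X)) for j in range(len(X[0]))]
    return alphas, context

X = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]  # three 2-dimensional word vectors
W = [[1.0, 0.0], [0.0, 1.0]]              # toy weight matrix (identity)
b = [0.0, 0.0]                            # toy bias vector
u_w = [1.0, -1.0]                         # toy context vector
alphas, context = attention_pool(X, W, b, u_w)
```

With these toy values the first word scores highest against `u_w`, so it receives the largest attention weight and dominates the context vector.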
[0037] S4: An adversarial multi-task learning model is adopted. Based on the weighted text feature representation, the private feature representation and the shared feature representation of the comment text data are obtained and fused into a fused feature representation, from which the probability of the comment text category is obtained. The model is trained based on the classification loss function and the adversarial loss function to obtain the trained adversarial multi-task learning model.
Specifically, as shown in Figure 2 and Figure 4, the context vector obtained in S3 (the weighted text feature representation) is used as the input to the adversarial multi-task learning model.
Preferably, the adversarial multi-task learning model includes a shared feature extraction module, a task-specific feature extraction module, and a task discriminator.
The shared feature extraction module extracts features shared across tasks and processes input data from all tasks. Specifically, it is implemented using a Long Short-Term Memory (LSTM) network, namely a two-layer cascaded LSTM structure with a hidden layer dimension of 100. The module is shared by all tasks, and its input is the context vector generated in step S3. During training, samples from all tasks are mixed and passed forward through the same set of LSTM parameters, with the aim of learning task-invariant shared feature representations from the mixed data.
[0038] To purify the shared features, the model introduces a task discriminator to enable adversarial training. As shown in Figure 2 and Figure 4, the input of the task discriminator is the output of the shared feature extraction module. The task discriminator is implemented as a fully connected network consisting of two fully connected layers; its input dimension is 100 and its output dimension is the number of tasks. It is used to determine which task the input features belong to. In Figure 2, the adversarial training path is indicated by the dashed arrow.
The task-specific feature extraction module provides an independent feature extraction unit for each task, used to extract task-specific features. Specifically, it provides an independent LSTM network for each task, with network parameters that are not shared between tasks. Each private LSTM network receives and processes only the input data of its corresponding task and is used exclusively to extract that task's private features.
[0039] Specifically, in this embodiment, both the shared feature extraction module and the task-specific feature extraction module are implemented using LSTM networks. As shown in Figure 2, the context vector generated by the attention mechanism is fed simultaneously to the shared feature extraction module and to the private feature extraction module corresponding to the current sample's task. The two modules operate in parallel and output the shared feature representation and the private feature representation respectively. The two representations are then concatenated, and the resulting fused feature representation is passed through a Dropout layer, a fully connected layer, and finally a Softmax layer, which outputs the probability distribution over sentiment categories, completing the classification task.
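The fusion and output path can be sketched as below. This is an inference-time illustration only: the shared and private feature vectors are given as toy inputs rather than produced by LSTM networks, the Dropout layer is omitted because dropout is inactive at inference, and the names `classify`, `W_fc`, and `b_fc` and all numeric values are hypothetical.

```python
import math
from typing import List

def softmax(logits: List[float]) -> List[float]:
    """Convert logits to a probability distribution."""
    top = max(logits)
    exps = [math.exp(v - top) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]

def classify(shared: List[float], private: List[float],
             W_fc: List[List[float]], b_fc: List[float]) -> List[float]:
    """Concatenate shared and private features, apply one fully connected layer, then softmax."""
    fused = shared + private  # vector concatenation gives the fused representation
    logits = [sum(w * f for w, f in zip(row, fused)) + bias
              for row, bias in zip(W_fc, b_fc)]
    return softmax(logits)

shared = [0.2, -0.1]                 # toy shared feature representation
private = [0.5, 0.3]                 # toy task-private feature representation
W_fc = [[1.0, 0.0, 1.0, 0.0],        # one row of weights per sentiment category
        [0.0, 1.0, 0.0, 1.0]]
b_fc = [0.0, 0.0]
probs = classify(shared, private, W_fc, b_fc)
```

The fully connected layer sees both feature spaces at once, so the classifier of each task can draw on cross-task features and its own task's features together.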
[0040] Specifically, this embodiment optimizes the shared feature extraction module and the task-specific feature extraction module through adversarial training. A classification loss function is used to optimize task classification performance, and an adversarial loss function is used to prevent the task discriminator from distinguishing the features extracted by the shared feature extraction module. An alternating optimization strategy is adopted: the task discriminator's objective is first maximized, and the adversarial loss is then minimized with respect to the shared feature extractor. This adversarial training process ensures that the features extracted by the shared feature extraction module are task-invariant and achieves a clear separation between shared features and private features.
[0041] Preferably, the classification loss function is expressed as follows:
$$L_{cls} = -\sum_{k=1}^{K} \sum_{i=1}^{N_k} y_i^{(k)} \log \hat{y}_i^{(k)}$$
[0042] In the formula: $L_{cls}$ is the classification loss value, which measures the difference between the model's prediction and the true label; the classification performance of the model is optimized by minimizing this loss; $k$ is the task index; $K$ is the total number of tasks; $N_k$ is the number of samples of the $k$-th task, i.e., the total number of comment texts contained in the training data of the $k$-th task; $y_i^{(k)}$ is the true sentiment category label of the $i$-th comment text sample of the $k$-th task, used as a supervisory signal to optimize the classification loss; $\hat{y}_i^{(k)}$ is the predicted probability for the $i$-th comment text sample of the $k$-th task;
Preferably, the adversarial loss function is expressed as follows:
$$L_{adv} = \min_{\theta_s} \left( \lambda \max_{\theta_D} \left( \sum_{k=1}^{K} \sum_{i=1}^{N_k} d_i^{(k)} \log\left[ D\left( E\left( x_i^{(k)} \right) \right) \right] \right) \right)$$
[0043] In the formula: $L_{adv}$ is the adversarial loss value; the shared features are purified by optimizing this loss; $\theta_s$ denotes the parameters of the shared feature extraction module; $\lambda$ is the balance parameter, used to adjust the relative weight of the adversarial loss and the classification loss; $\theta_D$ denotes the parameters of the task discriminator; $k$ is the task index; $K$ is the total number of tasks; $d_i^{(k)}$ is the real task source label of the $i$-th comment text sample of the $k$-th task, used as a supervisory signal to drive the task discriminator; it takes the value 1 to indicate that the sample belongs to the task and 0 to indicate that it does not; $D$ is the task discriminator; $E$ is the feature extractor; $x_i^{(k)}$ is the $i$-th comment text sample of the $k$-th task; $N_k$ is the number of samples of the $k$-th task; $i$ is the index of the comment text sample; $\min_{\theta_s}$ denotes minimization over the parameters of the shared feature extractor, used to deceive the discriminator; $\max_{\theta_D}$ denotes maximization over the parameters of the task discriminator, used to correctly identify task types.
This embodiment employs an alternating optimization strategy: first, the shared feature extractor is fixed and the task discriminator's parameters are updated to maximize its objective, so that the discriminator can accurately distinguish the task to which a feature belongs; then, the task discriminator is fixed and the shared feature extractor's parameters are updated to minimize the adversarial loss, so that the extracted shared features cannot be distinguished by the task discriminator. As shown in Figure 4, this adversarial training process ensures that the features extracted by the shared feature extraction module are task-invariant, that is, they do not contain task-specific information, thereby achieving a clear separation between shared features and private features.
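To make the two losses concrete, the sketch below evaluates them on toy numbers. It is not a training loop: it only computes a cross-entropy classification loss summed over tasks and samples, and the task log-likelihood term that the discriminator tries to raise and the shared feature extractor tries to lower. Representing labels as class indices (equivalent to one-hot labels), the function names, the value of `lam`, and every probability shown are assumptions for illustration.

```python
import math
from typing import List

def classification_loss(labels: List[List[int]],
                        probs: List[List[List[float]]]) -> float:
    """Cross-entropy summed over tasks k and samples i.

    labels[k][i] is the true class index; probs[k][i] is the predicted distribution.
    """
    loss = 0.0
    for labels_k, probs_k in zip(labels, probs):
        for y, p in zip(labels_k, probs_k):
            loss -= math.log(p[y])
    return loss

def discriminator_objective(task_ids: List[int],
                            disc_probs: List[List[float]]) -> float:
    """Sum of log-probabilities the discriminator assigns to each sample's true task."""
    return sum(math.log(p[k]) for k, p in zip(task_ids, disc_probs))

# Two tasks with binary sentiment labels (toy values).
labels = [[1, 0], [0]]
probs = [[[0.2, 0.8], [0.9, 0.1]], [[0.6, 0.4]]]
cls_loss = classification_loss(labels, probs)

# Discriminator outputs on shared features for three samples from tasks 0, 0, 1.
task_ids = [0, 0, 1]
confident = [[0.9, 0.1], [0.8, 0.2], [0.3, 0.7]]  # discriminator identifies the tasks
confused = [[0.5, 0.5], [0.5, 0.5], [0.5, 0.5]]   # discriminator cannot tell them apart
j_confident = discriminator_objective(task_ids, confident)
j_confused = discriminator_objective(task_ids, confused)

lam = 0.05  # hypothetical balance parameter
extractor_term = lam * j_confident  # quantity the shared extractor update tries to lower
```

In the alternating scheme, the discriminator step raises this objective with the extractor fixed, and the extractor step then lowers the weighted term with the discriminator fixed.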
[0044] S5: Based on the trained adversarial multi-task learning model, obtain the probability of the category to which the comment text to be classified belongs, so as to classify the comment text to be classified.
[0045] Specifically, based on the trained adversarial multi-task learning model, multi-task comment text classification is performed. The final adversarial multi-task learning model, after optimization through the adversarial training process in step S4, is used to perform sentiment classification on new comment texts.
[0046] The multi-task text classification task in this embodiment includes, but is not limited to, sentiment classification of product reviews and sentiment classification of movie reviews.
[0047] The specific classification process includes the following steps:
S51: Preprocess the comment text to be classified using the method described in S2 to obtain the corresponding word sequence;
S52: If a word in that word sequence exists in the vocabulary, directly obtain its word vector representation; if a word in that word sequence does not exist in the vocabulary, randomly initialize its word vector representation;
Specifically, the word sequence obtained by segmenting each comment text sample to be classified is converted into the corresponding word vector representation sequence based on the established vocabulary. For words already in the vocabulary, their pre-trained word vector representations are obtained directly. Words not in the vocabulary usually come from the test set, from new text entered by users in real-world applications, or from new vocabulary encountered during cross-domain transfer; for these out-of-vocabulary words, randomly initialized word vector representations are used to maintain the model's generalization ability, and fine-tuning is performed during training. At this point each comment text has a word vector representation sequence whose length varies with the text.
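The lookup rule in S52 can be sketched as follows. The function name `lookup_vectors`, the uniform range of the random initialization, and the decision to cache the random vector so that a repeated out-of-vocabulary word maps to the same vector are all assumptions; the text only states that such words are randomly initialized.

```python
import random
from typing import Dict, List

def lookup_vectors(tokens: List[str], embeddings: Dict[str, List[float]],
                   dim: int, rng: random.Random) -> List[List[float]]:
    """Return one word vector per token, randomly initializing unknown words."""
    vectors = []
    for token in tokens:
        if token not in embeddings:
            # Out-of-vocabulary word: draw a random vector (range is an assumption)
            # and store it so later occurrences reuse the same vector.
            embeddings[token] = [rng.uniform(-0.25, 0.25) for _ in range(dim)]
        vectors.append(embeddings[token])
    return vectors

rng = random.Random(0)  # fixed seed so the sketch is reproducible
embeddings = {"good": [0.1, 0.2, 0.3], "film": [0.4, 0.5, 0.6]}  # toy pre-trained vectors
vectors = lookup_vectors(["good", "unseenword", "film", "unseenword"],
                         embeddings, dim=3, rng=rng)
```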
[0048] S53: Obtain a word vector representation sequence of the set length for the comment text to be classified;
S54: Based on that word vector representation sequence, obtain the weighted text feature representation using the attention mechanism; then, using the trained adversarial multi-task learning model, obtain the probability of the category to which the comment text belongs, thereby classifying the comment text.
[0049] Specifically, the context vector of the comment text data to be classified is used as input and fed into both the optimized shared feature extraction module and the optimized task-specific feature extraction module corresponding to the task to which the comment text belongs, thereby extracting the optimized shared feature representation and the task-specific feature representation respectively. Then, the extracted shared feature representation and task-specific feature representation are combined to form the final feature representation. The final feature representation is input into the classifier of the corresponding task, and the Softmax function in the classifier calculates and outputs the probability that the comment text belongs to each sentiment category, thus completing the classification.
[0050] As shown in Figure 2, the system supports an architecture that processes multiple tasks simultaneously. The input layer receives comment text input from different tasks. After processing by the attention mechanism and the feature extraction modules, the shared feature extraction module provides common feature representations for all tasks, while each task is processed by its own private feature extraction module and classification path. Finally, the results are output by the respective task classifiers, ensuring cross-task knowledge sharing while preserving task specificity.
[0051] This embodiment also discloses a classification system for an adversarial multi-task comment text classification method based on an attention mechanism, including: a data preprocessing module, an attention mechanism module, an adversarial multi-task learning module, a classification output module, and a parameter optimization module; The data preprocessing module is used to preprocess the comment text data and obtain the word vector representation sequence of the comment text data. Specifically, it includes word segmentation of the input comment text data, dictionary construction, and word vector representation. The attention mechanism module is used to obtain a weighted text feature representation based on the word vector representation sequence of the comment text data, so as to calculate the importance weight of each word in the comment text; The adversarial multi-task learning module includes a shared feature extraction unit, a task-specific feature extraction unit, and a task discrimination unit; it is used to obtain the private feature representation and shared feature representation of the comment text data based on the weighted text feature representation. The classification output module is used to obtain a fused feature representation based on private feature representation and shared feature representation, and then obtain the probability of the category to which the comment text belongs; and to make multi-task classification decisions based on the extracted features. The parameter optimization module is used to obtain the trained adversarial multi-task learning model based on the classification loss function and the adversarial loss function, so as to optimize the model parameters through adversarial training.
[0052] This embodiment constructs a high-precision, highly generalizable automatic sentiment classification method for comment text by integrating attention mechanisms and adversarial multi-task learning. This addresses the core challenges of traditional models' insufficient capture of key text information and chaotic feature sharing across multiple tasks. It has broad application value in real-world production and daily life. For example, in e-commerce platforms and service evaluation systems, it can analyze massive amounts of product reviews and user feedback in real time, helping businesses accurately understand consumer needs and quickly locate product problems, thereby guiding product optimization and marketing decisions. In the cultural and entertainment field, it can summarize the sentiment of movie and book reviews, providing reference for public consumption choices, assisting platforms in achieving personalized recommendations, and improving user experience.
[0053] This embodiment first represents the input comment text using word vectors. Then, it calculates the weights of each word through an attention mechanism and sums them to obtain a context vector. This vector is then input into an adversarial multi-task learning model, which includes a shared feature extraction module, a task-specific feature extraction module, and a task discriminator. Through adversarial training, the features extracted by the shared feature extraction module cannot be distinguished by the task discriminator. Finally, the optimized features are used for multi-task comment text classification. This intelligent processing of text information liberates humans from tedious text review, significantly improving the efficiency of social information processing. It also promotes a virtuous cycle of "user feedback - product improvement," driving continuous optimization of consumer goods and services quality. Furthermore, it makes high-performance text analysis technology more easily applied to organizations and industries of different sizes, lowering the threshold for artificial intelligence applications and facilitating the digital transformation of society. Therefore, this embodiment is not only an algorithm improvement but also a practical technology that can effectively promote the development of business intelligence, improve human-computer interaction, and facilitate data-driven scientific decision-making, possessing positive social and economic significance.
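The attention step summarized above can be sketched in NumPy as follows. The sketch assumes a single tanh layer for the hidden states and a softmax over their dot products with a trainable context vector. Whether the final weighted sum runs over the word vectors or over the hidden states is not stated in this passage, so the choice to weight the word vectors is an assumption, as are the function name `attention_pool` and all dimensions.

```python
import numpy as np

def attention_pool(x, W, b, u):
    """Attention pooling over one comment.

    x: (L, d) word vectors; W: (d, a) weight matrix (row-vector convention);
    b: (a,) bias; u: (a,) context vector.
    Returns the attention weights (L,) and the pooled feature vector (d,).
    """
    h = np.tanh(x @ W + b)              # hidden state per word, shape (L, a)
    scores = h @ u                      # relevance score per word, shape (L,)
    scores = scores - scores.max()      # stabilise the exponentials
    alpha = np.exp(scores) / np.exp(scores).sum()  # softmax over the L words
    pooled = alpha @ x                  # weighted sum of word vectors (assumed summand)
    return alpha, pooled

rng = np.random.default_rng(0)
L, d, a = 5, 8, 6                       # sequence length, word vector size, hidden size
x = rng.normal(size=(L, d))
W = rng.normal(scale=0.1, size=(d, a))
b = np.zeros(a)
u = rng.normal(size=a)

alpha, s = attention_pool(x, W, b, u)
print(alpha.shape, s.shape)  # (5,) (8,)
```

With a zero context vector every word receives the same score, so the weights become uniform and the pooled vector reduces to the plain average of the word vectors; a trained context vector shifts weight toward the words that matter for the classification task.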
[0054] Finally, it should be noted that the above embodiments are only used to illustrate the technical solutions of the present invention, and not to limit them; although the present invention has been described in detail with reference to the foregoing embodiments, those skilled in the art should understand that modifications can still be made to the technical solutions described in the foregoing embodiments, or equivalent substitutions can be made to some or all of the technical features; and these modifications or substitutions do not cause the essence of the corresponding technical solutions to deviate from the scope of the technical solutions of the embodiments of the present invention.
Claims
1. An attention mechanism-based multi-task review text classification method, characterized in that it includes the following steps: S1: Obtain multiple comment text data, the multiple comment text data being comment text data corresponding to multiple tasks in different fields; S2: Preprocess the comment text data to obtain the word vector representation sequence of the comment text data; S3: Based on the word vector representation sequence of the comment text data, obtain the weighted text feature representation using an attention mechanism; S4: Using an adversarial multi-task learning model and based on the weighted text feature representation, obtain the private feature representation and the shared feature representation of the comment text data, obtain the fused feature representation from them, and then obtain the probability of the category to which the comment text belongs; based on the classification loss function and the adversarial loss function, obtain the trained adversarial multi-task learning model; S5: Based on the trained adversarial multi-task learning model, obtain the probability of the category to which the comment text to be classified belongs, so as to classify the comment text to be classified.
2. The attention mechanism-based multi-task review text classification method according to claim 1, wherein S3 includes:
S31: Based on the word vector representation sequence of the comment text data, obtain the hidden state representation of the comment text sample using the following formula:
$$h_{it} = \tanh\left(W x_{it} + b\right)$$
where $h_{it}$ is the hidden state representation for the $t$-th word vector representation in the word vector representation sequence of the $i$-th comment text data; $\tanh$ is the hyperbolic tangent activation function; $W$ is a trainable weight matrix; $x_{it}$ is the $t$-th word vector representation in the word vector representation sequence of the $i$-th comment text data; $i$ is the index of the comment text sample; $t$ is the index of the word vector representation within the sequence; $b$ is a bias vector;
S32: Based on the hidden state representation of the comment text sample, obtain the attention weights using the following formula:
$$\alpha_{it} = \frac{\exp\left(h_{it}^{\top} u\right)}{\sum_{t'=1}^{L} \exp\left(h_{it'}^{\top} u\right)}$$
where $\alpha_{it}$ is the attention weight of the $t$-th word vector representation in the word vector representation sequence of the $i$-th comment text data; $\exp$ is the exponential function; $\top$ denotes the transpose operation; $u$ is the trainable context vector; $L$ is the total number of word vector representations in the word vector representation sequence of the $i$-th comment text data;
S33: Based on the attention weights, obtain the weighted text feature representation $s_i$ of the $i$-th comment text data as the attention-weighted sum over the $L$ positions of the sequence.
3. The adversarial multi-task comment text classification method based on attention mechanism according to claim 1, characterized in that the classification loss function is expressed as follows:
$$L_{cls} = -\sum_{k=1}^{K} \sum_{i=1}^{N_k} y_i^{k} \log \hat{y}_i^{k}$$
where $L_{cls}$ is the classification loss value; $k$ is the task index; $K$ is the total number of tasks; $N_k$ is the number of samples of the $k$-th task, i.e., the total number of comment texts contained in the training data of that task; $y_i^{k}$ is the true sentiment category label of the $i$-th comment text sample of the $k$-th task, used as a supervisory signal to optimize the classification loss; $\hat{y}_i^{k}$ is the predicted probability for the $i$-th comment text sample of the $k$-th task.
4. The adversarial multi-task comment text classification method based on attention mechanism according to claim 1, characterized in that the adversarial loss function is expressed as follows:
$$L_{adv} = \min_{\theta_s} \left( \lambda \max_{\theta_D} \left( \sum_{k=1}^{K} \sum_{i=1}^{N_k} d_i^{k} \log \left[ D\left( E\left( x_i^{k} \right) \right) \right] \right) \right)$$
where $L_{adv}$ is the adversarial loss value; $\theta_s$ denotes the parameters of the shared feature extraction module; $\lambda$ is the balance parameter; $\theta_D$ denotes the parameters of the task discriminator; $k$ is the task index; $K$ is the total number of tasks; $d_i^{k}$ is the real task source label of the $i$-th comment text sample of the $k$-th task, used as a supervisory signal to train the task discriminator $D$; $D$ is the task discriminator; $E$ is the feature extractor; $N_k$ is the number of samples of the $k$-th task; $i$ is the index of the comment text sample $x_i^{k}$; $\min_{\theta_s}$ denotes minimization over the parameters of the shared feature extractor; $\max_{\theta_D}$ denotes maximization over the parameters of the task discriminator.
5. The adversarial multi-task comment text classification method based on attention mechanism according to claim 1, characterized in that, S2 includes: S21: Based on the comment text data, use the Stanford word segmentation tool to obtain the word sequence; S22: Use a pre-trained Word2Vec model to convert each word in the word sequence into a fixed-dimensional word vector representation to obtain a sequence of word vector representations for the comment text data; S23: Based on the word vector representation sequence of the comment text data, obtain a word vector representation sequence of the comment text data of a set length so that the length of the word vector representation sequence of the comment text data in the batch is the same.
6. The adversarial multi-task comment text classification method based on attention mechanism according to claim 1, characterized in that, The comment text data includes comment text data corresponding to the sentiment classification task for product reviews and comment text data corresponding to the sentiment classification task for movie reviews.
7. The adversarial multi-task comment text classification method based on attention mechanism according to claim 5, characterized in that, Following S22, the method further includes establishing a vocabulary list consisting of a set of non-repeating words based on the word sequence.
8. The adversarial multi-task comment text classification method based on attention mechanism according to claim 7, characterized in that S5 includes: S51: Obtain the word sequence corresponding to the comment text to be classified; S52: If a word in the word sequence corresponding to the comment text to be classified exists in the vocabulary, directly obtain the word vector representation of that word; if a word in the word sequence corresponding to the comment text to be classified does not exist in the vocabulary, randomly initialize the word vector representation of that word; S53: Obtain the word vector representation sequence of a set length for the comment text data to be classified; S54: Based on the word vector representation sequence of the comment text data to be classified, obtain the weighted text feature representation using an attention mechanism; then, using the trained adversarial multi-task learning model, obtain the probability of the category to which the comment text belongs, thereby classifying the comment text.
9. The classification system of the adversarial multi-task comment text classification method based on attention mechanism according to any one of claims 1-8, characterized in that it includes: a data preprocessing module, an attention mechanism module, an adversarial multi-task learning module, a classification output module, and a parameter optimization module; the data preprocessing module is used to preprocess the comment text data to obtain the word vector representation sequence of the comment text data; the attention mechanism module is used to obtain a weighted text feature representation based on the word vector representation sequence of the comment text data; the adversarial multi-task learning module is used to obtain the private and shared feature representations of the comment text data based on the weighted text feature representation; the classification output module is used to obtain a fused feature representation based on the private feature representation and the shared feature representation, and then obtain the probability of the category to which the comment text belongs; the parameter optimization module is used to obtain the trained adversarial multi-task learning model based on the classification loss function and the adversarial loss function.
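The preprocessing and inference-time lookup recited in claims 5, 7 and 8 can be illustrated with the following sketch. It is an illustration only: whitespace splitting stands in for the Stanford word segmentation tool, a small dictionary of random vectors stands in for a pre-trained Word2Vec model, and the zero padding, the uniform range used for out-of-vocabulary words, and all function names are assumptions introduced for the example.

```python
import numpy as np

def build_vocab(token_lists):
    """Vocabulary as the set of distinct words seen in the training word sequences."""
    return {tok for tokens in token_lists for tok in tokens}

def embed(tokens, vocab, vectors, dim, rng):
    """Look up each word; words outside the vocabulary get a randomly initialised vector."""
    rows = []
    for tok in tokens:
        if tok in vocab and tok in vectors:
            rows.append(vectors[tok])
        else:
            rows.append(rng.uniform(-0.25, 0.25, size=dim))  # assumed initialisation range
    return np.stack(rows) if rows else np.zeros((0, dim))

def pad_or_truncate(seq, length):
    """Bring an (n, dim) sequence to exactly (length, dim): cut the tail or zero-pad it."""
    seq = seq[:length]
    pad = np.zeros((length - seq.shape[0], seq.shape[1]))
    return np.vstack([seq, pad])

rng = np.random.default_rng(0)
dim = 4
train_tokens = [s.split() for s in ["great phone", "boring movie plot"]]
vocab = build_vocab(train_tokens)
# Toy stand-in for pre-trained word vectors, one per vocabulary word.
vectors = {w: rng.normal(size=dim) for w in sorted(vocab)}

new_tokens = "great plot twist".split()   # "twist" is not in the vocabulary
seq = pad_or_truncate(embed(new_tokens, vocab, vectors, dim, rng), length=5)
print(seq.shape)  # (5, 4)
```

Fixing the sequence length this way lets comments of different lengths be stacked into one batch, and the random initialisation gives unseen words a usable vector without retraining the embedding model.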