Text sentiment analysis model training method, sentiment analysis method, device and medium

By introducing a cross-attention mechanism into the sentiment analysis model, the comment text and individualized information are integrated to form label weights, which solves the problem of existing systems ignoring individual bias and improves the accuracy of sentiment analysis.

CN116244435B · Active · Publication Date: 2026-06-16 · EAST CHINA UNIV OF SCI & TECH

Patent Information

Authority / Receiving Office
CN · China
Patent Type
Patents(China)
Current Assignee / Owner
EAST CHINA UNIV OF SCI & TECH
Filing Date
2023-01-18
Publication Date
2026-06-16

AI Technical Summary

Technical Problem

Existing sentiment analysis systems ignore individual bias information, which prevents the models from accurately analyzing the sentiment categories contained in the text.

Method used

A cross-attention mechanism is used to fuse comment text with individualized information to form label weights, and the predicted sentiment labels are weighted to improve the accuracy of the model.

Benefits of technology

By introducing a cross-attention mechanism, the generated label weights can better reflect the differences in individualized information, thus improving the accuracy of text sentiment category analysis.

Generated by Eureka AI based on patent content.

Abstract

The embodiment of the application relates to the field of text analysis, and discloses a text sentiment analysis model training method, a sentiment analysis method, equipment and a medium. In model training: the review text and at least one individualized information of the review text are respectively encoded to obtain a text vector and at least one individual vector; the text vector is input into a sentiment prediction model to obtain a predicted sentiment label of the text vector under multiple sentiment categories; a cross-attention model is used to fuse the text vector with each individual vector and then splice to obtain a label weight; a weight calculation network is used to perform weighted calculation on each predicted sentiment label by using the label weight, and linear regression is performed on the calculation result to obtain a sentiment category probability under the multiple sentiment categories. Due to the introduction of the cross-attention mechanism, the generated label weight has expression differences on individualized information of the review text, so that the accuracy of text sentiment category analysis of the model is improved.
Description

Technical Field

[0001] This invention relates to the field of text analysis technology, and in particular to a text sentiment analysis model training method, sentiment analysis method, device, and medium.

Background Technology

[0002] As societal attitudes become more open, the pursuit of individuality amplifies individual biases. Sentiment analysis of textual commentary from different individuals should not be confined to a single, simplistic standard.

[0003] Current sentiment analysis systems often neglect individual bias information, causing models to focus only on the comment text itself. Even when considering relevant information beyond the comment text, they merely perform simple concatenation or cascading operations on the information, failing to truly integrate individual bias information into the word vectors. Consequently, the trained models cannot accurately analyze the sentiment categories contained in the text.

Summary of the Invention

[0004] The purpose of this application is to provide a text sentiment analysis model training method, sentiment analysis method, device and medium. By adopting a cross-attention mechanism, the differences in sentiment expression of comment text under individual bias are fused to form label weights, and the predicted sentiment labels of comment text are weighted using these label weights, thereby improving the accuracy of the model in analyzing the sentiment category of text.

[0005] To address the aforementioned technical problems, this application provides a text sentiment analysis model training method, comprising: encoding comment text and at least one individualized piece of information of the comment text to obtain a text vector and an individual vector of the at least one individual; inputting the text vector into a sentiment prediction model to obtain predicted sentiment labels for the text vector under multiple sentiment categories; using a cross-attention model to fuse the text vector with each individual vector and then concatenate them to obtain label weights; using a weight calculation network to perform weighted calculation on each predicted sentiment label using the label weights, and performing linear regression on the calculation results to obtain the sentiment category probabilities under the multiple sentiment categories; and using a classification loss function to train the sentiment analysis model composed of the sentiment prediction model, the cross-attention model, and the weight calculation network.

[0006] The embodiments of this application also provide a text sentiment analysis method, comprising: encoding the comment text to be analyzed and at least one individualized information of the comment text to be analyzed to obtain a text vector to be analyzed and an individual vector of the at least one individual; inputting the text vector to be analyzed and the individual vector of the at least one individual into a trained sentiment analysis model to obtain the sentiment category to which the comment text to be analyzed belongs; wherein, the sentiment analysis model is obtained by the text sentiment analysis model training method described above.

[0007] Embodiments of this application also provide an electronic device, including: at least one processor; and a memory communicatively connected to the at least one processor; wherein the memory stores instructions executable by the at least one processor, the instructions being executed by the at least one processor to enable the at least one processor to perform the text sentiment analysis model training method mentioned in the above embodiments, or to perform the text sentiment analysis method mentioned in the above embodiments.

[0008] The embodiments of this application also provide a computer-readable storage medium storing a computer program, which, when executed by a processor, implements the text sentiment analysis model training method mentioned in the above embodiments, or is capable of executing the text sentiment analysis method mentioned in the above embodiments.

[0009] The text sentiment analysis model training method provided in this application encodes the comment text and at least one individualized information of the comment text to obtain a text vector and at least one individual vector. The text vector is then input into a sentiment prediction model to obtain predicted sentiment labels for the text vector under multiple sentiment categories. A cross-attention model is used to fuse the text vector with each individual vector and then concatenate the results to obtain label weights. A weight calculation network is used to perform weighted calculation on each predicted sentiment label using the label weights, and the calculation results are subjected to linear regression to obtain sentiment category probabilities under the multiple sentiment categories. A classification loss function is used to train the sentiment analysis model composed of the sentiment prediction model, the cross-attention model, and the weight calculation network. The cross-attention mechanism allows the generated label weights to reflect the individualized information of the comment text, and weighting the initially obtained predicted sentiment labels with these individualized label weights makes the final predicted sentiment category probabilities more consistent with the true sentiment of different individuals, thereby improving the accuracy of the model in analyzing text sentiment categories.

Attached Figure Description

[0010] One or more embodiments are illustrated by way of example with reference numerals in the accompanying drawings. These illustrations do not constitute a limitation on the embodiments. Elements with the same reference numerals in the drawings are denoted as similar elements. Unless otherwise stated, the figures in the drawings are not to be limited by scale.

[0011] Figure 1 is a flowchart of the text sentiment analysis model training method provided in the embodiments of this application;

[0012] Figure 2 is a schematic diagram of the structure of the sentiment analysis model provided in the embodiments of this application;

[0013] Figure 3 is a flowchart of the text sentiment analysis method provided in the embodiments of this application;

[0014] Figure 4 is a schematic diagram of the structure of an electronic device provided in an embodiment of this application.

Detailed Implementation

[0015] To make the objectives, technical solutions, and advantages of the embodiments of this application clearer, the various embodiments of this application will be described in detail below with reference to the accompanying drawings. Those skilled in the art will understand that many technical details are presented in the various embodiments to help the reader better understand this application. However, the technical solutions claimed in this application can be implemented even without these technical details, and with various changes and modifications based on the following embodiments.

[0016] One embodiment of this application relates to a method for training a text sentiment analysis model. As shown in Figure 1, the method includes:

[0017] Step 101: Encode the comment text and at least one individualized information of the comment text to obtain a text vector and at least one individual vector.

[0018] In this embodiment, the comment text used can be comments made by users on social media platforms, websites, and mass media regarding objects such as products, events, and news. These comments typically reflect the user's emotional category, such as happiness, anger, sadness, disappointment, or approval. The individualized information of the comment text can be individual information related to the comment text, such as the information of the commenting entity (which could be a user or organization) and / or the information of the object being commented on (which could be a product, event, or news). Depending on the analysis requirements, the individualized information of the obtained comment text can be classified at different levels to obtain at least one type of individualized information. For example, for comment text about products, the corresponding individualized information can be roughly divided into two categories: user information and product information. User information can include an individual's age, gender, personality, and preferences, or an organization's nature, size, and social influence; product information can include the product's name, price, purpose, and manufacturer.

[0019] Specifically, after obtaining the comment text and at least one type of individualized information of the comment text, the comment text and each type of individualized information (here, "information" can also be presented in "text" form) can be encoded separately, converting the text into vectors to obtain the text vector corresponding to the comment text and the individual vector corresponding to each type of individualized information. The text can be written in natural language, for example in Chinese characters or letters. This embodiment does not limit the specific encoding method for converting the text into vectors; for example, natural language processing models such as BERT and RoBERTa can be used for encoding.

[0020] To reduce the size of the encoded vector, this embodiment uses word embedding encoding to encode the comment text and personalized information separately, as follows:

[0021] Step 1: Perform word embedding encoding on the comment text, and use the resulting word embedding matrix as the text vector;

[0022] Step 2: Perform word embedding encoding on each of the at least one individualized information separately, and take each of the resulting word embedding matrices as an individual vector.

[0023] Specifically, an input layer, also known as a word embedding layer, can be set at the input end of the sentiment analysis model to be trained. This layer maps the comment text and its individualized information into an n-dimensional vector space, generating n-dimensional vectors that represent the semantic relationships of the text in a form the model can process. For example, for product reviews, the comment text that needs sentiment analysis can be defined as $W = \{W_1, W_2, \ldots, W_m\}$, where m is the number of words in the review text; user information is denoted as user_id, and product information as product_id (user information and product information are two types of individualized information for product reviews). The product review text, user information, and product information are passed in turn through the WordEmbedding layer, and the resulting word embedding matrices $E_w$, $E_u$, and $E_p$ are used as the text vector, the individual vector corresponding to the individual user, and the individual vector corresponding to the individual product, respectively.
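
The word embedding step can be illustrated with a minimal sketch. It assumes a plain lookup table per input type, filled with random values in place of trained weights; the toy vocabulary, the ids 3 and 7, and the helper names `make_embedding_table` and `embed` are hypothetical and do not come from the source.

```python
import random

def make_embedding_table(num_rows, dim, seed):
    """Return a num_rows x dim table of small random floats (stand-in for trained weights)."""
    rng = random.Random(seed)
    return [[rng.uniform(-0.1, 0.1) for _ in range(dim)] for _ in range(num_rows)]

def embed(ids, table):
    """Look up one embedding row per id; the result is a len(ids) x dim matrix."""
    return [list(table[i]) for i in ids]

# Hypothetical toy vocabulary and table sizes (assumptions for illustration only).
vocab = {"great": 0, "phone": 1, "battery": 2, "poor": 3}
dim = 4
word_table = make_embedding_table(len(vocab), dim, seed=0)
user_table = make_embedding_table(10, dim, seed=1)
product_table = make_embedding_table(10, dim, seed=2)

comment = ["great", "phone", "poor", "battery"]
E_w = embed([vocab[w] for w in comment], word_table)  # text vector, m x n
E_u = embed([3], user_table)                          # individual vector for user_id = 3
E_p = embed([7], product_table)                       # individual vector for product_id = 7
```

In this sketch each word, user, and product id selects one row of its own table, so the text vector has one row per word while each individual vector has a single row.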

[0024] Step 102: Input the text vector into the sentiment prediction model to obtain the predicted sentiment labels of the text vector under multiple sentiment categories.

[0025] As shown in Figure 2, the sentiment analysis model to be trained in this embodiment includes three parts: a sentiment prediction model, a cross-attention model, and a weight calculation network.

[0026] The sentiment prediction model is used to perform preliminary sentiment category prediction on the text vector generated in step 101 from a semantic perspective, obtaining the predicted sentiment labels (in vector form) of the text vector under multiple preset sentiment categories. Taking a product review scenario as an example, the review text $W = \{W_1, W_2, \ldots, W_m\}$ and the two types of individualized information, user information and product information, can be encoded by the encoding module to obtain the comment vector $E_W$, the user vector $E_U$, and the product vector $E_P$. The comment vector $E_W$ can be viewed as the text vector, and the user vector $E_U$ and the product vector $E_P$ can be viewed as two individual vectors. In this embodiment, the model structure used for the sentiment prediction model is not limited; it can be a single model or a composite model.

[0027] For example, in order to take into account the advantages of sentiment prediction models in processing both long and short texts, as well as the advantages of serial and parallel processing capabilities, the TranLSTM model can be used. This model is mainly composed of a bidirectional long short-term memory (Bi-LSTM) network and a Transformers model trained sequentially. This composite model can not only solve the problem of capturing the semantics of long texts, but also capture the feature information of individualized information itself.

[0028] In a preferred implementation, the sentiment prediction model may specifically include: a bidirectional long short-term memory (Bi-LSTM) network, a Transformers model, and an additive network; correspondingly, the process of inputting text vectors into the sentiment prediction model to obtain predicted sentiment labels for the text vectors under multiple sentiment categories may include the following steps:

[0029] Step 1: Use a bidirectional long short-term memory network to extract word meaning features from the word vectors contained in the text vector to obtain word meaning feature vectors.

[0030] For example, for the product reviews mentioned above, the word embedding matrix $E_w$ produced by the input layer can be used as the text vector, and a Bi-LSTM network is preferentially used for feature extraction, as this network is well-suited for text vector classification tasks. The Bi-LSTM network consists of a forward LSTM network and a backward LSTM network, forming a bidirectional language network. In text sentiment classification tasks, the output at a given time step is related not only to the state information before that node but often also to the state information after that node. Therefore, a unidirectional LSTM network is prone to losing semantic features during semantic feature extraction, severely impacting the accuracy of the final text sentiment analysis. Using bidirectional encoding, propagating simultaneously in the forward and backward directions, allows the network to better consider the temporal information within the text and perform deeper feature extraction from the overall input context, so that the final semantic feature information is more comprehensive and accurate. In this embodiment, the bidirectional feature vectors extracted by the forward and backward LSTM networks are concatenated to obtain the word meaning feature vector corresponding to the text vector, as shown below:

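[0031] $W_i \in W$

[0032] $\overrightarrow{h_i} = \mathrm{LSTM}_f(W_i, \overrightarrow{h_{i-1}})$

[0033] $\overleftarrow{h_i} = \mathrm{LSTM}_b(W_i, \overleftarrow{h_{i+1}})$

[0034] $h_i = [\overrightarrow{h_i}; \overleftarrow{h_i}]$

[0035] The comment text is defined as $W = \{W_1, W_2, \ldots, W_m\}$, $i \in \{1, 2, \ldots, m\}$, where m is the number of words contained in the comment text. $\overrightarrow{h_i}$ and $\overrightarrow{h_{i-1}}$ are, in order, the output vectors of the i-th word $W_i$ and the (i-1)-th word $W_{i-1}$ in the forward LSTM network, and $\mathrm{LSTM}_f(\cdot)$ is the operator of the forward LSTM network; $\overleftarrow{h_i}$ and $\overleftarrow{h_{i+1}}$ are, in order, the output vectors of the i-th word $W_i$ and the (i+1)-th word $W_{i+1}$ in the backward LSTM network, and $\mathrm{LSTM}_b(\cdot)$ is the operator of the backward LSTM network; $h_i$ is the output vector of the i-th word $W_i$ in the bidirectional LSTM network.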
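
The bidirectional pass and the concatenation of its two outputs can be sketched as follows. To stay short, the sketch swaps the LSTM cell for a simplified tanh recurrent cell, so it illustrates only the forward pass, the backward pass, and the per-word concatenation, not the LSTM gating itself. All weights are random placeholders, and the names `cell_step`, `run_direction`, and `bidirectional_features` are hypothetical.

```python
import math
import random

def rand_matrix(rows, cols, rng):
    """Small random weights standing in for trained parameters."""
    return [[rng.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]

def cell_step(x, h_prev, W_x, W_h):
    """Simplified recurrent cell: h = tanh(W_x x + W_h h_prev). Not a full LSTM cell."""
    return [
        math.tanh(
            sum(wx * xi for wx, xi in zip(W_x[r], x))
            + sum(wh * hi for wh, hi in zip(W_h[r], h_prev))
        )
        for r in range(len(W_x))
    ]

def run_direction(words, W_x, W_h, reverse=False):
    """Run the cell over the words in one direction; outputs stay aligned with word order."""
    h = [0.0] * len(W_x)
    outputs = [None] * len(words)
    order = reversed(range(len(words))) if reverse else range(len(words))
    for i in order:
        h = cell_step(words[i], h, W_x, W_h)
        outputs[i] = h
    return outputs

def bidirectional_features(words, forward_params, backward_params):
    """For every word i, concatenate the forward output and the backward output."""
    forward = run_direction(words, *forward_params)
    backward = run_direction(words, *backward_params, reverse=True)
    return [f + b for f, b in zip(forward, backward)]

rng = random.Random(0)
embed_dim, hidden = 4, 3
E_w = rand_matrix(5, embed_dim, rng)  # hypothetical text vector: 5 words, 4 dimensions
forward_params = (rand_matrix(hidden, embed_dim, rng), rand_matrix(hidden, hidden, rng))
backward_params = (rand_matrix(hidden, embed_dim, rng), rand_matrix(hidden, hidden, rng))
H = bidirectional_features(E_w, forward_params, backward_params)
```

Each row of `H` has twice the hidden size: the first half depends on the words up to and including that position, and the second half depends on the words from that position to the end.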

[0036] Specifically, a bidirectional long short-term memory network is used to extract the semantic features of the comment text from each word vector contained in the text vector, resulting in the semantic feature vector of the comment text.

[0037] Step 2: Encode the word vectors contained in the text vectors using the positional encoding network in the Transformers model to obtain positional feature vectors.

[0038] Compared to traditional neural networks such as LSTM and GRU (Gated Recurrent Unit) models, which cannot compute in parallel and suffer from severe forgetting when encoding long texts, Transformers models, as Seq2Seq-type networks, perform exceptionally well in parallel tasks and thus offer better encoding results for long texts. Considering that user comments are more often presented as long texts, this embodiment combines the Bi-LSTM network with a Transformers model to capture the semantic information of long texts (specifically, the positional encoding network within the Transformers model is executed). The positional information is extracted as follows:

[0039] $PE(pos, 2t) = \sin\left(pos / 10000^{2t/d_k}\right)$

[0040] $PE(pos, 2t+1) = \cos\left(pos / 10000^{2t/d_k}\right)$

[0041] Where pos is the position of the current word in the current comment text, t is the position of the element in the current word vector, $d_k$ represents the dimension of the current word vector, and PE(*) represents the feature of position *. By concatenating the position features of all elements extracted from a word vector, and then concatenating the position features of each word vector in the text vector, we obtain the position feature vector of the text vector.
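
A minimal sketch of the positional feature computation follows. It assumes the standard sinusoidal positional encoding of the Transformer, with sine on even element indices and cosine on odd ones; the function name `positional_encoding` is hypothetical.

```python
import math

def positional_encoding(num_words, d_k):
    """Return a num_words x d_k matrix of sinusoidal position features (assumed form)."""
    pe = [[0.0] * d_k for _ in range(num_words)]
    for pos in range(num_words):
        # `even` plays the role of the element index 2t in the usual formulation.
        for even in range(0, d_k, 2):
            angle = pos / (10000 ** (even / d_k))
            pe[pos][even] = math.sin(angle)          # element 2t
            if even + 1 < d_k:
                pe[pos][even + 1] = math.cos(angle)  # element 2t + 1
    return pe

PE = positional_encoding(num_words=5, d_k=4)
```

Each row of `PE` is the position feature of one word, and stacking the rows gives the position feature vector of the whole text vector.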

[0042] Step 3: The semantic feature vector and the positional feature vector are added together using an additive network, and the predicted sentiment labels of the text vector under multiple sentiment categories are formed based on the added vector.

[0043] Specifically, the semantic feature vector and the positional feature vector extracted from the text vector are added word by word, so that the features belonging to the same word vector are summed, and the predicted sentiment labels of the text vector under multiple sentiment categories are formed based on the added vector.

[0044] In some examples, the summed vector can be directly used as the predicted sentiment label for the text vector across multiple sentiment categories. For instance, the summed vector is divided into multiple part vectors corresponding to preset sentiment categories, with each part vector serving as a predicted sentiment label for that category. In other examples, the summed vector can be further processed, such as by introducing a self-attention mechanism for weighted calculation, and the weighted vector is then used as the predicted sentiment label. This embodiment does not limit the specific processing performed on the summed vector.
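
The addition step and the first option above (dividing the summed vector into part vectors) can be sketched as follows. The source does not say how the division is performed, so cutting the flattened sum into equal contiguous parts is an assumption, as are the toy numbers and the helper names `add_features` and `split_into_labels`.

```python
def add_features(semantic, positional):
    """Add the semantic and positional features word by word, element by element."""
    return [
        [s + p for s, p in zip(s_row, p_row)]
        for s_row, p_row in zip(semantic, positional)
    ]

def split_into_labels(summed, num_categories):
    """Flatten the summed matrix and cut it into equal contiguous parts (assumed scheme)."""
    flat = [value for row in summed for value in row]
    if len(flat) % num_categories != 0:
        raise ValueError("summed vector cannot be divided evenly among the categories")
    size = len(flat) // num_categories
    return [flat[k * size:(k + 1) * size] for k in range(num_categories)]

# Toy features for 3 words with 2 dimensions each (illustrative values only).
semantic = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]
positional = [[0.5, 0.5], [0.5, 0.5], [0.5, 0.5]]
summed = add_features(semantic, positional)
labels = split_into_labels(summed, num_categories=2)
```

Here `labels[0]` and `labels[1]` would serve as the predicted sentiment labels for two preset sentiment categories.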

[0045] For example, the summed vector can be weighted using the following steps:

[0046] Step 1: Form a weight term for each individual vector, and use the self-attention model in the Transformers model to set the weights to obtain adaptive weights.

[0047] Specifically, when forming weight terms, each individual vector can be directly used as a separate weight term, or the individual vectors can be further processed and used as weight terms. This embodiment does not limit the specific method for forming weight terms. After obtaining multiple weight terms, the weights can be set based on a self-attention model to obtain adaptive weights.

[0048] Accordingly, the process of forming the predicted sentiment labels of the text vector under multiple sentiment categories based on the summed vector can include the following Step 2:

[0049] Step 2: Use the adaptive weights to weight the summed vector to obtain the predicted sentiment labels of the text vector under multiple sentiment categories.

[0050] Specifically, the adaptive weights may include a weight vector and may also include a bias vector. Because different adaptive weights lead to different weighting calculations on the summed vector, the resulting predicted sentiment labels will also differ.

[0051] For example, the predicted sentiment label y can be obtained using the following formula:

[0052]

[0053] Here, $c_m$ is the category of the m-th weight term, and $(c)$ is the sentiment category corresponding to the predicted sentiment label y. Under the sentiment category $(c)$, each weight-term category $c_m$ has its own weight, and $b^{(c)}$ is the bias vector under the sentiment category $(c)$.

[0054] In the formula, the weights of the weight-term categories can be viewed as the weight vector in the adaptive weights, and $b^{(c)}$ can be viewed as the bias vector in the adaptive weights.

[0055] Step 103: The text vector is fused with each individual vector using a cross-attention model and then concatenated to obtain the label weights.

[0056] This invention posits that individual biases among individualized information within the entire comment text influence the sentiment category expressed by the comment text in a global and bidirectional manner; that is, different types of individualized information mutually influence the comment text itself. Therefore, a cross-attention network is introduced into the sentiment analysis model trained in this application. Unlike self-attention networks, which limit the number of embedding sequences to one, cross-attention networks can asymmetrically combine two independent embedding sequences of the same dimension, achieving better interactive fusion. This embodiment utilizes the cross-attention model to fuse the text vector with each individual vector separately before concatenating them, thus using the resulting vector as the label weight for the previously predicted sentiment tag.

[0057] In some implementations, the aforementioned cross-attention model may include: a cross-attention network and a fusion network; correspondingly, the process of using the cross-attention model to fuse the text vector with each individual vector and then concatenate them to obtain the label weights may include the following steps.

[0058] Step 1: Using a cross-attention network, the first sequence formed by text vectors and the second sequence formed by each individual vector are fused using the following cross-attention algorithm to obtain the fused sequence I:

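[0059] $I = \mathrm{softmax}\left((W_Q S_1')\right)$ is not used here; the fused sequence is computed as $I = \mathrm{softmax}\left((W_Q S_2)(W_K S_1)^T\right) W_V S_1$

[0060] Where $S_1$ is the first sequence, $S_2$ is the second sequence, $W_Q$, $W_K$, and $W_V$ are all parameter matrices to be trained, and $(W_K S_1)^T$ is the transpose of $(W_K S_1)$.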

[0061] The cross-attention algorithm used in this embodiment is a well-known algorithm and will not be described in detail here. The improvement of this embodiment lies in using the cross-attention algorithm to generate a fusion sequence I corresponding to the individualized information of the comment content. In the formula for calculating the fusion sequence I above, in order to fuse the text vector and each individual vector separately, the text vector and each individual vector can be first one-dimensionalized to form corresponding one-dimensional sequences, where the sequence corresponding to the text vector is denoted as the first sequence, and the sequence corresponding to each individual vector is denoted as the second sequence.

[0062] Specifically, the first and second sequences are input into the cross-attention network. The second sequence serves as the query, and the key and value are generated from the first sequence. The final fused sequence therefore captures how the expression of the comment text differs under the individualized information.
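
The cross-attention computation can be sketched under a literal reading of the formula, in which the first and second sequences are one-dimensional vectors of equal length, so that the product of the query and the transposed key is an outer product. The row-wise softmax axis is an assumption, no scaling factor is applied because the formula above has none, and the weights are random placeholders rather than trained parameters.

```python
import math
import random

def matvec(W, x):
    """Multiply matrix W by vector x."""
    return [sum(w * xi for w, xi in zip(row, x)) for row in W]

def softmax(row):
    """Numerically stable softmax over one list of scores."""
    peak = max(row)
    exps = [math.exp(v - peak) for v in row]
    total = sum(exps)
    return [e / total for e in exps]

def cross_attention(S1, S2, W_Q, W_K, W_V):
    """Fuse S2 (query side) with S1 (key and value side) into the sequence I."""
    q = matvec(W_Q, S2)  # query from the second (individual) sequence
    k = matvec(W_K, S1)  # key from the first (text) sequence
    v = matvec(W_V, S1)  # value from the first (text) sequence
    scores = [[qi * kj for kj in k] for qi in q]  # outer product of q and k
    attention = [softmax(row) for row in scores]  # row-wise softmax (assumed axis)
    return [sum(a * vj for a, vj in zip(row, v)) for row in attention]

rng = random.Random(0)
length = 4

def rand_matrix():
    return [[rng.uniform(-1.0, 1.0) for _ in range(length)] for _ in range(length)]

S1 = [rng.uniform(-1.0, 1.0) for _ in range(length)]  # hypothetical text sequence
S2 = [rng.uniform(-1.0, 1.0) for _ in range(length)]  # hypothetical individual sequence
W_Q, W_K, W_V = rand_matrix(), rand_matrix(), rand_matrix()
I = cross_attention(S1, S2, W_Q, W_K, W_V)
```

Every element of `I` is a weighted average of the value entries derived from the text sequence, with the weights determined by the individual sequence acting as the query.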

[0063] For example, in the model structure shown in Figure 2, taking the product review mentioned above as an example, the first sequence corresponding to the review text is called the comment sequence $S_D$, and the second sequences corresponding to the two types of individualized information, user information and product information, are called the user sequence $S_U$ and the product sequence $S_P$, respectively. The formula for calculating the fusion sequence then takes the following forms.

[0064] $I_U = \mathrm{softmax}\left((W_Q S_U)(W_K S_D)^T\right) W_V S_D$

[0065] $I_P = \mathrm{softmax}\left((W_Q S_P)(W_K S_D)^T\right) W_V S_D$

[0066] Here, $I_U$ is the fusion sequence of the comment sequence $S_D$ and the user sequence $S_U$ (referred to as the "user fusion sequence"), and $I_P$ is the fusion sequence of the comment sequence $S_D$ and the product sequence $S_P$ (referred to as the "product fusion sequence").

[0067] User fusion sequences focus more on the impact of individual user biases on the information in the comment text, i.e., which words in the comment text users pay more attention to, such as the intensity of sentiment words and usage habits. User fusion sequences can better incorporate individual user biases into the comment text;

[0068] Product fusion sequences focus more on the impact of individual product bias information on the information in the review text, that is, which words in the review text best reflect the product characteristics, such as how the product's characteristics are frequently described. Product fusion sequences allow for a better integration of individual product bias information into the review text.

[0069] Step 2: The fusion sequence I corresponding to each individual vector is fused using the fusion network (⊕) to obtain the label weights.

[0070] Specifically, the fusion network concatenates and fuses the comment information that incorporates individual bias information. In Figure 2, for example, this is the comment encoding sequence that incorporates user bias information and the comment encoding sequence that incorporates product bias information, optionally together with the original comment text information. The result is the final label weight (label weight matrix G), i.e.:

[0071] $G = \alpha(S_D, I_U, I_P)$

[0072] Here, $\alpha(\cdot)$ represents any trainable neural network that can fuse three input vectors, which correspond to the comment sequence, the user fusion sequence, and the product fusion sequence, respectively.
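
Since the fusion network may be any trainable network, the following is only one possible sketch: it assumes that the three sequences are concatenated and passed through a single linear layer whose output has one entry per sentiment category. The function name `fuse`, the weights `W` and `b`, and the toy values are hypothetical.

```python
def fuse(S_D, I_U, I_P, W, b):
    """One possible fusion network: concatenate the three sequences, then apply a linear layer."""
    x = S_D + I_U + I_P  # list concatenation of the three input sequences
    return [
        sum(w * xi for w, xi in zip(row, x)) + bias
        for row, bias in zip(W, b)
    ]

# Toy inputs (illustrative values only).
S_D = [1.0, 2.0]   # comment sequence
I_U = [0.5, 0.5]   # user fusion sequence
I_P = [1.0, 0.0]   # product fusion sequence

# Hypothetical layer weights: one row per sentiment category.
W = [[1.0] * 6, [0.0] * 6]
b = [0.0, 0.25]
G = fuse(S_D, I_U, I_P, W, b)
```

With these placeholder weights `G` contains one label weight per category; in a trained model `W` and `b` would be learned together with the rest of the network.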

[0073] Step 104: A weight calculation network is used to perform weighted calculation on each predicted sentiment label using the label weights, and the calculation results are subjected to linear regression to obtain the sentiment category probabilities under multiple sentiment categories.

[0074] Specifically, the predicted sentiment labels obtained in step 102 are weighted using the label weights to obtain the final weighted predicted sentiment labels. The final predicted sentiment labels generated under each sentiment category, which together form a multidimensional vector, are then subjected to linear regression, for example using a softmax activation function, to finally obtain the sentiment category (class) of the comment text.

[0075] class = softmax(y')

[0076] Where y' is the predicted sentiment label after weighting by label weights.
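
The weighting and softmax steps can be sketched as follows. The source does not fix the form of the weighted calculation, so the sketch assumes one scalar score per sentiment category and an element-wise product with the label weights; the values of `y` and `G` are illustrative only.

```python
import math

def softmax(values):
    """Numerically stable softmax over a list of scores."""
    peak = max(values)
    exps = [math.exp(v - peak) for v in values]
    total = sum(exps)
    return [e / total for e in exps]

def weighted_probabilities(y, G):
    """Weight each predicted label by its label weight (assumed element-wise), then apply softmax."""
    y_weighted = [g * yi for g, yi in zip(G, y)]
    return softmax(y_weighted)

y = [1.0, 2.0, 0.5]  # predicted sentiment labels before weighting (toy scores)
G = [2.5, 1.0, 1.0]  # label weights derived from individualized information (toy values)

unweighted = softmax(y)
probs = weighted_probabilities(y, G)
```

In this toy case the unweighted scores favour the second category, while the label weights shift the highest probability to the first category, which is the kind of individualized correction the label weights are meant to provide.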

[0077] Step 105: The sentiment analysis model, consisting of the sentiment prediction model, the cross-attention model, and the weight calculation network, is trained using a classification loss function.

[0078] Specifically, during training, the sentiment prediction model, the cross-attention model, and the weight calculation network that make up the sentiment analysis model can be trained jointly, or they can be trained in cascaded stages. The loss function used for model training can be, but is not limited to, the classification loss function.

[0079] For example, the classification loss function is constructed using the following formula:

[0080] $L = -\sum_{i=1}^{N} \sum_{j=1}^{M} y_{ij} \log \hat{y}_{ij}$

[0081] Where N is the number of training comment texts, M is the number of preset sentiment categories, $y_{ij}$ is the true probability value of the i-th comment text belonging to the j-th sentiment category, and $\hat{y}_{ij}$ is the predicted probability value of the i-th comment text belonging to the j-th sentiment category.
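
A minimal sketch of such a classification loss follows, assuming it is the categorical cross-entropy summed over all training texts and sentiment categories. Whether the source additionally averages over N is not known, so no averaging is applied here. The clipping constant `eps` and the toy probabilities are assumptions added for illustration.

```python
import math

def cross_entropy(y_true, y_pred, eps=1e-12):
    """Sum over texts i and categories j of -y_true[i][j] * log(y_pred[i][j])."""
    total = 0.0
    for true_row, pred_row in zip(y_true, y_pred):
        for y, p in zip(true_row, pred_row):
            # Clip the prediction away from zero so the logarithm stays finite.
            total -= y * math.log(max(p, eps))
    return total

# Two training texts (N = 2) and two sentiment categories (M = 2), toy values.
y_true = [[1.0, 0.0], [0.0, 1.0]]
y_pred = [[0.5, 0.5], [0.25, 0.75]]
loss = cross_entropy(y_true, y_pred)
```

Only the predicted probability of the true category contributes for each text, so the loss shrinks as the model assigns more probability to the correct sentiment category.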

[0082] The text sentiment analysis model training method provided in this application encodes the comment text and at least one individualized information of the comment text to obtain a text vector and at least one individual vector. The text vector is then input into a sentiment prediction model to obtain predicted sentiment labels for the text vector under multiple sentiment categories. A cross-attention model is used to fuse the text vector with each individual vector and then concatenate the results to obtain label weights. A weight calculation network is used to perform weighted calculation on each predicted sentiment label using the label weights, and the calculation results are subjected to linear regression to obtain sentiment category probabilities under the multiple sentiment categories. A classification loss function is used to train the sentiment analysis model composed of the sentiment prediction model, the cross-attention model, and the weight calculation network. The cross-attention mechanism allows the generated label weights to reflect the individualized information of the comment text, and weighting the initially obtained predicted sentiment labels with these individualized label weights makes the final predicted sentiment category probabilities more consistent with the true sentiment of different individuals, thereby improving the accuracy of the model in analyzing text sentiment categories.

[0083] The embodiments of this application also provide a text sentiment analysis method. As shown in Figure 3, the method includes the following steps.

[0084] Step 201: Encode the comment text to be analyzed and at least one individualized piece of information of the comment text to be analyzed, and obtain the text vector to be analyzed and the at least one individual vector to be analyzed.

[0085] Step 202: Input the text vector to be analyzed and at least one individual vector to be analyzed into the trained sentiment analysis model to obtain the sentiment category to which the comment text to be analyzed belongs.
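
The last part of step 202 can be sketched as follows, assuming the trained model outputs one probability per preset sentiment category and the category with the highest probability is reported. The category names, the probabilities, and the function name `predict_category` are hypothetical.

```python
def predict_category(probabilities, categories):
    """Return the sentiment category with the highest predicted probability."""
    if len(probabilities) != len(categories):
        raise ValueError("one probability is required per sentiment category")
    best = max(range(len(categories)), key=lambda j: probabilities[j])
    return categories[best]

categories = ["happiness", "anger", "sadness"]  # hypothetical preset categories
probabilities = [0.1, 0.7, 0.2]                 # stand-in for the trained model's output
result = predict_category(probabilities, categories)
```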

[0086] The sentiment analysis model can be obtained using the text sentiment analysis model training method described in the above embodiments.

[0087] The text sentiment analysis method provided in this embodiment predicts the sentiment category using the sentiment analysis model trained by the model training method of the above embodiments. The model uses a cross-attention mechanism to capture how the sentiment expression of the comment text differs under individual bias and fuses this into label weights. It then uses the label weights to weight the predicted sentiment labels of the comment text when predicting the sentiment category, thereby improving the accuracy of text sentiment category analysis.
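Steps 201 and 202 can be sketched as follows. This is a hedged illustration rather than the method as implemented: the whitespace tokenizer, the tiny embedding table, the category names, and the function names `encode`, `analyze`, and `toy_model` are all hypothetical, and `toy_model` merely stands in for the trained sentiment analysis model (it ignores the individual vectors, which the real model would not).

```python
from typing import Callable, Dict, List, Sequence

Vector = List[float]
Matrix = List[Vector]


def encode(tokens: Sequence[str], embeddings: Dict[str, Vector], dim: int) -> Matrix:
    """Look up a word embedding per token; unknown tokens map to a zero vector."""
    zero = [0.0] * dim
    return [embeddings.get(tok, zero) for tok in tokens]


def analyze(text: str,
            individual_infos: Sequence[str],
            embeddings: Dict[str, Vector],
            dim: int,
            model: Callable[[Matrix, List[Matrix]], List[float]],
            categories: Sequence[str]) -> str:
    """Return the sentiment category with the highest predicted probability."""
    # Step 201: encode the text and each piece of individualized information.
    text_vec = encode(text.split(), embeddings, dim)
    individual_vecs = [encode(info.split(), embeddings, dim)
                       for info in individual_infos]
    # Step 202: run the trained model and pick the most probable category.
    probs = model(text_vec, individual_vecs)
    best = max(range(len(probs)), key=probs.__getitem__)
    return categories[best]


def toy_model(text_vec: Matrix, individual_vecs: List[Matrix]) -> List[float]:
    """Hypothetical stand-in for the trained model; ignores individual_vecs."""
    score = sum(v[0] for v in text_vec)
    return [0.9, 0.1] if score < 0 else [0.1, 0.9]


EMBEDDINGS = {
    "great": [1.0, 0.0],
    "awful": [-1.0, 0.0],
    "phone": [0.0, 1.0],
    "alice": [0.0, 0.5],
}
CATEGORIES = ["negative", "positive"]

result_pos = analyze("great phone", ["alice", "phone"], EMBEDDINGS, 2,
                     toy_model, CATEGORIES)
result_neg = analyze("awful phone", ["alice", "phone"], EMBEDDINGS, 2,
                     toy_model, CATEGORIES)
```

Passing the model in as a callable keeps the encoding step separate from the model itself, so the same wrapper could be reused with any trained model that accepts a text vector and a list of individual vectors.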

[0088] The division of the above methods into steps is only for clarity of description. In implementation, steps can be combined into one step, or a step can be split into multiple steps; as long as the same logical relationship is included, they are all within the scope of protection of this patent. Adding insignificant modifications to an algorithm or process, or introducing insignificant designs, without changing the core design of the algorithm or process, is also within the scope of protection of this patent.

[0089] The embodiments of this application relate to an electronic device which, as shown in Figure 4, includes:

[0090] at least one processor 301; and a memory 302 communicatively connected to the at least one processor 301; wherein the memory 302 stores instructions executable by the at least one processor 301, the instructions being executed by the at least one processor 301 to enable the at least one processor 301 to perform the text sentiment analysis model training method mentioned in the above embodiments, or to perform the text sentiment analysis method mentioned in the above embodiments.

[0091] The electronic device includes one or more processors 301 and a memory 302; Figure 4 takes one processor 301 as an example. The processor 301 and the memory 302 can be connected via a bus or by other means; Figure 4 takes a bus connection as an example. Memory 302, as a non-volatile computer-readable storage medium, can be used to store non-volatile software programs, non-volatile computer-executable programs, and modules. For example, in this embodiment, the algorithms corresponding to each processing strategy in the strategy space are stored in memory 302. Processor 301 executes the various functional applications and data processing of the device by running the non-volatile software programs, instructions, and modules stored in memory 302, thereby implementing the aforementioned model training method or text sentiment analysis method.

[0092] Memory 302 may include a program storage area and a data storage area. The program storage area may store the operating system and applications required for at least one function; the data storage area may store an option list, etc. Furthermore, memory 302 may include high-speed random access memory and may also include non-volatile memory, such as at least one disk storage device, flash memory device, or other non-volatile solid-state storage device. In some embodiments, memory 302 may optionally include memory remotely located relative to processor 301, and these remote memories can be connected to external devices via a network. Examples of such networks include, but are not limited to, the Internet, corporate intranets, local area networks, mobile communication networks, and combinations thereof.

[0093] One or more modules are stored in memory 302 and, when executed by one or more processors 301, perform the model training method of any of the above embodiments or the text sentiment analysis method mentioned in the above embodiments.

[0094] The above-mentioned products can perform the methods provided in the embodiments of this application, and have the corresponding functional modules and beneficial effects of performing the methods. For technical details not described in detail in this embodiment, please refer to the methods provided in the embodiments of this application.

[0095] Embodiments of this application relate to a computer-readable storage medium storing a computer program. When executed by a processor, the computer program implements the above-described method embodiments.

[0096] That is, those skilled in the art will understand that all or part of the steps in the methods of the above embodiments can be implemented by a program instructing related hardware. This program is stored in a storage medium and includes several instructions to cause a device (which may be a microcontroller, chip, etc.) or processor to execute all or part of the steps of the methods of the various embodiments of this application. The aforementioned storage medium includes various media capable of storing program code, such as a USB flash drive, a portable hard drive, a read-only memory (ROM), a random access memory (RAM), a magnetic disk, or an optical disk.

[0097] Those skilled in the art will understand that the above embodiments are specific implementations of this application, and in practical applications, various changes can be made in form and detail without departing from the spirit and scope of this application.

Claims

1. A method for training a text sentiment analysis model, characterized in that it comprises: encoding a comment text and at least one piece of individualized information of the comment text to obtain a text vector and at least one individual vector; inputting the text vector into a sentiment prediction model to obtain predicted sentiment labels of the text vector under multiple sentiment categories; using a cross-attention model to fuse the text vector with each individual vector and then concatenating the results to obtain label weights; using a weight calculation network to perform a weighted calculation on each of the predicted sentiment labels using the label weights, and performing linear regression on the calculation result to obtain sentiment category probabilities under the multiple sentiment categories; and training, with a classification loss function, a sentiment analysis model consisting of the sentiment prediction model, the cross-attention model, and the weight calculation network;
wherein the sentiment prediction model includes a bidirectional long short-term memory network, a Transformers model, and an addition network, and inputting the text vector into the sentiment prediction model to obtain the predicted sentiment labels of the text vector under multiple sentiment categories includes: extracting word meaning features from the word vectors contained in the text vector using the bidirectional long short-term memory network to obtain word meaning feature vectors; encoding the word vectors contained in the text vector using the position encoding network in the Transformers model to obtain position feature vectors; and using the addition network to add the word meaning feature vectors and the position feature vectors in a symmetrical manner, and forming the predicted sentiment labels of the text vector under multiple sentiment categories based on the summed vector;
and wherein the cross-attention model includes a cross-attention network and a fusion network, and using the cross-attention model to fuse the text vector with each individual vector and then concatenating the results to obtain the label weights includes: using the cross-attention network to fuse a first sequence formed by the text vector and a second sequence formed by each individual vector with the following cross-attention algorithm to obtain a fused sequence $I$: [formula not reproduced], where $S_1$ is the first sequence, $S_2$ is the second sequence, $W_Q$, $W_K$, and $W_V$ are parameter matrices to be trained, and $(W_K S_1)^T$ is the transpose of the matrix $(W_K S_1)$; and using the fusion network to concatenate the fused sequences $I$ corresponding to the individual vectors to obtain the label weights.

2. The text sentiment analysis model training method according to claim 1, characterized in that encoding the comment text and at least one piece of individualized information of the comment text to obtain a text vector and at least one individual vector includes: encoding the comment text using word embedding, and using the resulting word embedding matrix as the text vector; and encoding each of the at least one piece of individualized information using word embedding, and using each of the resulting word embedding matrices as an individual vector.

3. The text sentiment analysis model training method according to claim 1, characterized in that the method further includes: forming a weight term from each individual vector, and setting weights for the weight terms using the self-attention model in the Transformers model to obtain adaptive weights; and forming the predicted sentiment labels of the text vector under multiple sentiment categories based on the summed vector includes: performing a weighted calculation on the summed vector using the adaptive weights to obtain the predicted sentiment labels of the text vector under multiple sentiment categories.

4. The text sentiment analysis model training method according to claim 3, characterized in that performing a weighted calculation on the summed vector using the adaptive weights to obtain the predicted sentiment labels of the text vector under multiple sentiment categories includes: obtaining the predicted sentiment label $y$ using the following formula: [formula not reproduced], where $c_m$ is the category of the $m$-th weight term, $c$ is the sentiment category corresponding to the predicted sentiment label $y$, the weight term is the weight of weight-term category $c_m$ under sentiment category $c$, the bias term is the bias vector under sentiment category $c$, and $d$ is the summed vector.

5. The text sentiment analysis model training method according to any one of claims 1-4, characterized in that the individualized information includes comment subject information and/or comment object information of the comment text.

6. A text sentiment analysis method, characterized in that it comprises: encoding a comment text to be analyzed and at least one piece of individualized information of the comment text to be analyzed to obtain a text vector to be analyzed and at least one individual vector to be analyzed; and inputting the text vector to be analyzed and the at least one individual vector to be analyzed into a trained sentiment analysis model to obtain the sentiment category to which the comment text to be analyzed belongs; wherein the sentiment analysis model is obtained by the text sentiment analysis model training method according to any one of claims 1-5.

7. An electronic device, characterized in that it comprises: at least one processor; and a memory communicatively connected to the at least one processor; wherein the memory stores instructions executable by the at least one processor, the instructions being executed by the at least one processor to enable the at least one processor to perform the text sentiment analysis model training method according to any one of claims 1 to 5, or to perform the text sentiment analysis method according to claim 6.

8. A computer-readable storage medium storing a computer program, characterized in that, when the computer program is executed by a processor, it implements the text sentiment analysis model training method according to any one of claims 1 to 5, or implements the text sentiment analysis method according to claim 6.