A sentiment analysis method based on global training space

By constructing a TL-BERT model with a global training space, the problems of sample selection bias and informal text processing in existing technologies are solved, achieving higher recognition accuracy and generalization ability, simplifying computation, and making it suitable for sentiment analysis tasks.

CN116306671B · Active · Publication Date: 2026-06-26 · XIHUA UNIV

Patent Information

Authority / Receiving Office
CN · China
Patent Type
Patents(China)
Current Assignee / Owner
XIHUA UNIV
Filing Date
2023-03-09
Publication Date
2026-06-26

AI Technical Summary

Technical Problem

Existing technologies suffer from sample selection bias, poor generalization performance, and lack of ability to process informal text in target-oriented opinion word extraction tasks. They also struggle to effectively extract aspect terms and opinion expressions from informal texts.

Method used

A sentiment analysis method based on a global training space is adopted. The social media text data is preprocessed and trained using the TL-BERT model. The global and local training spaces are constructed using the BERT-base layer, multi-head attention layer, long short-term memory network and decoder to extract aspect terms and opinion words, and sentiment prediction is performed using a second TL-BERT model.

Benefits of technology

It improves the model's generalization ability, simplifies the computation, solves the one-to-many and many-to-one problems between target words and opinion words, enhances the ability to process informal text, and improves recognition accuracy.



Abstract

The application discloses a sentiment analysis method based on a global training space, comprising the following steps: obtaining text data from social media, processing the informal text data to obtain natural language, and dividing the natural language to obtain global and local training space data; constructing a TL-BERT model and training a first TL-BERT model with the global and local training space data; extracting aspect terms and opinion words from the natural language using the trained first TL-BERT model to form aspect-opinion pairs, and inputting the aspect-opinion pairs and the natural language into a second TL-BERT model for training; and, based on the trained first TL-BERT model, using the trained second TL-BERT model to perform sentiment prediction on a target text to obtain the final sentiment polarity. By introducing a global training space, the application achieves good generalization, simplifies computation, and improves recognition accuracy.

Description

Technical Field

[0001] This invention relates to the field of natural language processing, and specifically to a sentiment analysis method based on a global training space.

Background Technology

[0002] Text sentiment analysis, also known as opinion mining, can be categorized by analytical granularity into document-level sentiment analysis, sentence-level sentiment analysis, and aspect-level sentiment analysis.

[0003] In aspect-based sentiment analysis, Target-oriented Opinion Words Extraction (TOWE) aims to extract the opinion words that correspond to given aspect terms. Previous studies have neglected implicit aspects when modeling the TOWE task, focusing only on explicit aspects.

[0004] TOWE is a subtask of fine-grained sentiment analysis. Given specific aspect terms and their associated context, the task aims to extract the opinion terms associated with those aspects. The problem is that TOWE has two types of samples: for explicit aspects, an aspect term is associated with at least one opinion term, while for implicit aspects, an aspect term has no corresponding opinion term. Previous studies have used only explicit aspects to train and evaluate their models, which leads to sample selection bias. Specifically, previous TOWE models were trained only on explicit aspects but are then used to make inferences on both explicit and implicit aspects in the global space. As a result, generalization performance suffers.

[0005] In addition, existing technologies have further shortcomings. First, because informal texts are typically short, suitable aspect terms (i.e., targets) and opinion expressions (opinion words) cannot be effectively extracted from some sentences, so these sentences need to be cleaned. Second, informal texts contain a lot of informal language and social media tags; this noise can significantly interfere with lexical and syntactic analysis, so informal texts need to be restored to natural language. Third, since the opinion expressions in a sentence are specific to the aspect terms within that sentence, relevant similarity calculation formulas need to be defined to characterize the degree of association between words.

Summary of the Invention

[0006] In view of the above-mentioned shortcomings in the prior art, the present invention provides a sentiment analysis method based on a global training space, which solves the problems of sample selection bias, poor generalization performance and lack of processing of informal text in the prior art.

[0007] To achieve the above-mentioned objectives, the technical solution adopted by this invention is: a sentiment analysis method based on a global training space, comprising the following steps:

[0008] S1. Obtain text data from social media, process informal text data to obtain natural language, and segment the natural language to obtain global and local training space data.

[0009] S2. Construct the TL-BERT model and train the first TL-BERT model using global and local training space data;

[0010] S3. Use the trained first TL-BERT model to extract aspect terms and opinion words from natural language, and construct aspect-opinion pairs;

[0011] S4. Input aspect-opinion pairs and natural language into the second TL-BERT model to train it, and obtain the trained second TL-BERT model.

[0012] S5. Based on the trained first TL-BERT model and the trained second TL-BERT model, perform sentiment prediction on the target text to obtain the final sentiment polarity.

[0013] Furthermore, the specific implementation of step S1 is as follows:

[0014] S1-1. Obtain text data from social media, and extract informal text data from the text data, denoted as $t_1, t_2, t_3, \ldots, t_i, \ldots, t_n$;

[0015] S1-2. Remove the content added to the text by the social media platform, including username links (e.g., "@username") and reply content (e.g., "reply to @username"), from the informal text data to obtain the data after the first cleaning;

[0016] S1-3. Remove emojis, emoticons, and non-English content from the data after the first cleaning to obtain the data after the second cleaning.

[0017] S1-4. Replace the hashtags in the social media posts in the data after the second cleanup with the original text to obtain the data after the third cleanup.

[0018] S1-5. Segment the data after the third cleaning, using the sentence-ending punctuation marks ".", "!", and "?" (in both their full-width and half-width forms) as delimiters, to obtain natural language;

[0019] S1-6. Annotate the natural language to obtain training samples for explicit and implicit aspects;

[0020] S1-7. Divide the natural language training space data into global and local training space data, that is, perform global space training set division, global space validation test set division, local space training set division, and local space validation test set division.

[0021] Furthermore, the TL-BERT model includes a BERT-base layer, a multi-head attention layer, a long short-term memory network, and a decoder; the output of the BERT-base layer is connected to the input of the multi-head attention layer; the output of the multi-head attention layer is connected to the input of the long short-term memory network; and the output of the long short-term memory network is connected to the input of the decoder.

[0022] Furthermore, the specific implementation of step S2 is as follows:

[0023] S2-1. Transform each sentence S in the global and local training space data into a new sentence $S_B = \{w_0, \ldots, w_i, \ldots, w_q\}$, where $w_0$ is the token "[CLS]", and $w_i$ and $w_q$ are both the token "[SEP]";

[0024] S2-2. Generate the segment index $I_s = \{0, \ldots, 0\}$ and the position index $I_p = \{0, \ldots, q\}$ of the new sentence, where q represents the number of words in the new sentence;

[0025] S2-3. Input the new sentence $S_B$, its segment index $I_s$, and its position index $I_p$ into the first TL-BERT model, train the first TL-BERT model, and obtain the trained first TL-BERT model.
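As a rough illustration of steps S2-1 and S2-2, the sketch below builds the token sequence, segment index, and position index for one sentence. The function name `build_input` is hypothetical, and whole words are used as tokens for simplicity; an actual BERT tokenizer would typically split words into subword pieces.

```python
def build_input(sentence_tokens, aspect_tokens=()):
    """Build S_B, I_s and I_p for one sentence (word-level simplification).

    The layout "[CLS] sentence [SEP] aspect term [SEP]" follows the input
    format described in the embodiment; aspect_tokens may be empty.
    """
    tokens = ["[CLS]", *sentence_tokens, "[SEP]", *aspect_tokens, "[SEP]"]
    segment_index = [0] * len(tokens)            # I_s = {0, ..., 0}
    position_index = list(range(len(tokens)))    # I_p = {0, ..., q}
    return tokens, segment_index, position_index


tokens, seg, pos = build_input(["The", "food", "is", "great"], ["food"])
print(tokens)
print(seg)
print(pos)
```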

[0026] Furthermore, the specific implementation of step S2-3 is as follows:

[0027] S2-3-1. Input the new sentence $S_B$, its segment index $I_s$, and its position index $I_p$ into the BERT-base layer to obtain the word vector of each word in the new sentence $S_B$, and combine the word vectors into the corresponding sentence vector E;

[0028] S2-3-2. According to the formulas:

[0029] $Q = W_Q E$

[0030] $K = W_K E$

[0031] $V = W_V E$

[0032] Obtain the query vector Q and the key-value pair vectors K and V, where E is the sentence vector; $W_Q$, $W_K$, and $W_V$ are weight matrices; and $Q, K, V \in \mathbb{R}^{m \times d}$, where d is the number of hidden units in the neural network and m is the sequence length. $\mathbb{R}^{m \times d}$ denotes the set of real matrices with m rows and d columns; that is, the three "vectors" Q, K, and V are in fact matrices.

[0033] S2-3-3. Input the query vector Q and the key-value pair vectors K and V into the multi-head attention layer, and according to the formula:

[0034]

[0035] Obtain the attention vector $ma_i$ of the i-th head in the multi-head attention layer, where softmax is the activation function, which maps the output values of the attention vector $ma_i$ into [0, 1]; $Q_i$ is the query vector of the i-th head of the multi-head attention layer; and $K_i$ and $V_i$ are the key-value pair vectors of the i-th head of the multi-head attention layer;

[0036] S2-3-4. According to the formula:

[0037]

[0038] Obtain the aspect term vector A, where N is the total number of attention heads and $W_i^a$ is a weight matrix;

[0039] S2-3-5. Input the aspect term vector into the Long Short-Term Memory network, according to the formula:

[0040] $i_t = \sigma(W_{xi} A_t + W_{hi} h_{t-1} + b_i)$

[0041] $f_t = \sigma(W_{xf} A_t + W_{hf} h_{t-1} + b_f)$

[0042] $\tilde{c}_t = \tanh(W_{xc} A_t + W_{hc} h_{t-1} + b_c)$

[0043] $c_t = f_t \odot c_{t-1} + i_t \odot \tilde{c}_t$

[0044] $o_t = \sigma(W_{xo} A_t + W_{ho} h_{t-1} + b_o)$

[0045] $h_t = o_t \odot \tanh(c_t)$

[0046] Obtain the hidden state $h_t$ at the current moment, where $i_t$ is the output of the input gate, $W_{xi}$ and $W_{hi}$ are weight matrices, and $b_i$ is a bias; $f_t$ is the output of the forget gate, $W_{xf}$ and $W_{hf}$ are weight matrices, and $b_f$ is a bias; $A_t$ is the aspect term vector at the current moment; $\sigma$ is the sigmoid activation function, with a value range of [0, 1]; $\tilde{c}_t$ is the candidate cell state, $W_{xc}$ and $W_{hc}$ are weight matrices, and $b_c$ is a bias; tanh is the activation function, with a value range of [-1, 1]; $h_{t-1}$ is the hidden state at the previous moment; $c_t$ is the current cell state; $c_{t-1}$ is the cell state at the previous moment; $o_t$ is the output of the output gate, $W_{xo}$ and $W_{ho}$ are weight matrices, and $b_o$ is a bias;

[0047] S2-3-6. Input the current hidden state $h_t$ into the decoder to obtain the hidden state of the new sentence;

[0048] S2-3-7. According to the formulas:

[0049]

[0050]

[0051]

[0052] Obtain, respectively, the probability $H_1$ that the tag corresponding to the hidden state is B, the probability $H_2$ that the tag is I, and the probability $H_3$ that the tag is O, where $W_1^A$, $b_1^A$, $W_2^A$, and $W_3^A$ are learnable parameters; B indicates the beginning of a term, I indicates the inside of a term, and O indicates other;

[0053] S2-3-8. Compare $H_1$, $H_2$, and $H_3$, and take the label with the maximum predicted probability as the tag corresponding to the hidden state;

[0054] S2-3-9. According to the formula:

[0055]

[0056] Obtain the loss value $L_{ATEU}$ and back-propagate it to update the model parameters. The symbols in the formula denote, respectively: p, the index of the p-th word; the probability that the label corresponding to the p-th word is s; the label corresponding to the p-th word; the logarithm, whose base is e; and the set of new sentences transformed from the sentences in the global and local training space data;

[0057] S2-3-10. Repeat steps S2-3-1 to S2-3-9 until the predetermined number of repetitions is reached to complete model training.
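The formula for the attention vector in step S2-3-3 is not reproduced in the text. Purely as an illustration, the sketch below assumes the widely used scaled dot-product form, $ma_i = \mathrm{softmax}(Q_i K_i^{T} / \sqrt{d}) V_i$, for a single head. The scaling by $\sqrt{d}$, the function names, and the use of plain Python lists as matrices are assumptions, not details confirmed by the patent.

```python
import math


def softmax(xs):
    """Numerically stable softmax; outputs lie in [0, 1] and sum to 1."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def attention_head(Q, K, V):
    """Single attention head over row-major matrices (lists of rows).

    Assumed form: softmax(Q K^T / sqrt(d)) V, where d is the width of Q.
    """
    d = len(Q[0])
    output = []
    for q in Q:
        scores = [sum(a * b for a, b in zip(q, k)) / math.sqrt(d) for k in K]
        weights = softmax(scores)
        row = [sum(w * v[j] for w, v in zip(weights, V)) for j in range(len(V[0]))]
        output.append(row)
    return output


# With identical keys, every query attends uniformly, so the output is the mean of V.
print(attention_head([[1.0, 2.0]], [[0.0, 0.0], [0.0, 0.0]], [[1.0, 2.0], [3.0, 4.0]]))
```

How the N heads are combined into the aspect term vector A (step S2-3-4) is not shown here, since that formula is also absent from the text.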

[0058] Furthermore, the specific implementation of step S3 is as follows:

[0059] S3-1. Extract a sentence from the natural language;

[0060] S3-2. Feed the sentence into the trained first TL-BERT model to obtain the aspect terms $E = \{a_0, \ldots, a_j, \ldots, a_{m'-1}\}$ corresponding to the sentence, where m' is the total number of aspect terms;

[0061] S3-3. Append the corresponding aspect term to the end of the sentence, separated by the token [SEP], to obtain a sentence with prompts;

[0062] S3-4. For the sentence with prompts, generate the corresponding segment index and the corresponding position index, where q' is the total number of words in the sentence with prompts;

[0063] S3-5. Input the sentence with prompts, its segment index, and its position index into the trained first TL-BERT model to obtain the opinion words corresponding to aspect term $a_j$, where $l_j$ is the number of opinion words for the j-th aspect term;

[0064] S3-6. Combine aspect terms and opinion words to form aspect-opinion pairs.
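Steps S3-2 to S3-6 can be sketched as follows, assuming the model emits one B/I/O tag per word. The helper names `bio_to_spans` and `build_pairs` are hypothetical, and `tag_opinions` is a stand-in for the trained first TL-BERT model, which is not implemented here.

```python
def bio_to_spans(tokens, tags):
    """Convert per-word B/I/O tags into (start, end, text) spans, end exclusive."""
    spans, start = [], None
    for i, tag in enumerate(list(tags) + ["O"]):  # sentinel closes a trailing span
        if start is not None and tag != "I":
            spans.append((start, i, " ".join(tokens[start:i])))
            start = None
        if tag == "B":
            start = i
    return spans


def build_pairs(tokens, aspect_tags, tag_opinions):
    """Form aspect-opinion pairs for one sentence.

    tag_opinions: hypothetical callable standing in for the trained first
    TL-BERT model; it receives the prompted token list and returns one
    B/I/O tag per sentence word.
    """
    pairs = []
    for _, _, aspect in bio_to_spans(tokens, aspect_tags):
        # S3-3: append the aspect term to the sentence, separated by [SEP].
        prompted = ["[CLS]", *tokens, "[SEP]", *aspect.split(), "[SEP]"]
        opinion_tags = tag_opinions(prompted)
        for _, _, opinion in bio_to_spans(tokens, opinion_tags):
            pairs.append((aspect, opinion))
    return pairs
```

Because opinion words are extracted once per aspect prompt, one aspect term can yield several pairs and the same opinion word can appear in pairs for different aspect terms.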

[0065] Furthermore, the specific implementation of step S4 is as follows:

[0066] S4-1. Based on the aspect-opinion pairs, label the opinion words and aspect terms in the sentence;

[0067] S4-2. Input the labeled sentence, the segment index of the labeled sentence, and the position index of the labeled sentence into the second TL-BERT model to obtain the hidden states of the labeled sentence;

[0068] S4-3. Average the hidden states of the labeled sentence to obtain the averaged hidden state;

[0069] S4-4. According to the formula:

[0070]

[0071]

[0072] Obtain the probability $H'$ that the sentiment polarity corresponding to the averaged hidden state is positive and the probability $H''$ that it is negative, where $W_1^S$ and $W_2^S$ are weight matrices and the remaining symbols in the formulas are learnable parameters;

[0073] S4-5. Compare $H'$ and $H''$, and take the sentiment polarity corresponding to the larger probability as the sentiment label;

[0074] S4-6. According to the formula:

[0075]

[0076] Obtain the loss value $L_{AOPSC}$ of the second TL-BERT model and back-propagate it to update the model parameters, where k denotes the k-th opinion word of the j-th aspect term; the other symbols in the formula denote the probability that the label corresponding to the k-th opinion word of the j-th aspect term is s', and the label corresponding to the k-th opinion word of the j-th aspect term;

[0077] S4-7. Repeat steps S4-1 to S4-6 until the predetermined number of repetitions is reached, completing the training of the second TL-BERT model and obtaining the trained second TL-BERT model.
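The formulas of steps S4-4 and S4-6 are not reproduced in the text. The sketch below shows one plausible reading of steps S4-3 to S4-6, assuming a linear layer followed by softmax over the two polarities and a negative log-likelihood loss. The function names, the parameter shapes, and the label order (index 0 positive, index 1 negative) are all assumptions.

```python
import math


def mean_pool(hidden_states):
    """S4-3: average the hidden states of a sentence, one row per word."""
    n = len(hidden_states)
    return [sum(column) / n for column in zip(*hidden_states)]


def polarity_probs(pooled, W, b):
    """Assumed S4-4 form: softmax(W * pooled + b) over [positive, negative]."""
    logits = [sum(w * x for w, x in zip(row, pooled)) + bias for row, bias in zip(W, b)]
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def cross_entropy(probs, gold_index):
    """Assumed S4-6 form: negative natural log of the gold label's probability."""
    return -math.log(probs[gold_index])


pooled = mean_pool([[1.0, 0.0], [3.0, 2.0]])
probs = polarity_probs(pooled, [[1.0, 0.0], [0.0, 1.0]], [0.0, 0.0])
label = "positive" if probs[0] >= probs[1] else "negative"  # S4-5: pick the larger
print(pooled, label)
```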

[0078] Furthermore, the specific implementation of step S5 is as follows:

[0079] S5-1. Input the target text into the trained first TL-BERT model to obtain aspect terms and opinion words, which constitute the aspect-opinion pairs of the target text;

[0080] S5-2. Use the aspect-opinion pairs of the target text to label the target text, input the labeled target text into the trained second TL-BERT model, and use the output label as the sentiment polarity of the target text.

[0081] The beneficial effects of this invention are as follows: the invention introduces a global training space, making the training of the model closer to real-world conditions, giving the model better generalization ability and simplifying computation; it solves the one-to-many and many-to-one problems between target words and opinion words; and it adds processing for informal text, improving recognition accuracy.

Description of the Attached Figures

[0082] Figure 1 is a flowchart of the present invention;

[0083] Figure 2 is a diagram of the ATEU and TOWEU task framework;

[0084] Figure 3 is a diagram of the TL-BERT model structure, where Figure 3(a) is the first part and Figure 3(b) is the second part;

[0085] Figure 4 is a comparison plot of the macro-F1 scores of different models on the SemEval-14 Laptop dataset;

[0086] Figure 5 is a comparison plot of the macro-F1 scores of different models on the SemEval-14 Restaurant dataset;

[0087] Figure 6 is a comparison plot of the macro-F1 scores of different models on the SemEval-15 Restaurant dataset;

[0088] Figure 7 is a comparison plot of the macro-F1 scores of different models on the SemEval-16 Restaurant dataset.

Detailed Implementation

[0089] The specific embodiments of the present invention are described below to enable those skilled in the art to understand the present invention. However, it should be understood that the present invention is not limited to the scope of the specific embodiments. For those skilled in the art, various changes are obvious as long as they are within the spirit and scope of the present invention as defined and determined by the appended claims. All inventions utilizing the concept of the present invention are protected.

[0090] As shown in Figure 1, a sentiment analysis method based on a global training space includes the following steps:

[0091] S1. Obtain text data from social media, process informal text data to obtain natural language, and segment the natural language to obtain global and local training space data.

[0092] S2. Construct the TL-BERT model and train the first TL-BERT model using global and local training space data;

[0093] S3. Use the trained first TL-BERT model to extract aspect terms and opinion words from natural language, and construct aspect-opinion pairs;

[0094] S4. Input aspect-opinion pairs and natural language into the second TL-BERT model to train it, and obtain the trained second TL-BERT model.

[0095] S5. Based on the trained first TL-BERT model and the trained second TL-BERT model, perform sentiment prediction on the target text to obtain the final sentiment polarity.

[0096] The specific implementation method of step S1 is as follows:

[0097] S1-1. Obtain text data from social media, and extract informal text data from the text data, denoted as $t_1, t_2, t_3, \ldots, t_i, \ldots, t_n$;

[0098] S1-2: Remove the content added to the text by social media in the informal text data to obtain the data after the first cleaning;

[0099] S1-3. Remove emojis, emoticons, and non-English content from the data after the first cleaning to obtain the data after the second cleaning.

[0100] S1-4. Replace the hashtags in the social media posts in the data after the second cleanup with the original text to obtain the data after the third cleanup.

[0101] S1-5. Segment the data after the third cleaning, using the sentence-ending punctuation marks ".", "!", and "?" (in both their full-width and half-width forms) as delimiters, to obtain natural language;

[0102] S1-6. Annotate the natural language to obtain training samples for explicit and implicit aspects;

[0103] S1-7. Divide the natural language training space data into global and local training space data, that is, perform global space training set division, global space validation test set division, local space training set division, and local space validation test set division.

[0104] The specific implementation method of step S2 is as follows:

[0105] S2-1. Transform each sentence S in the global and local training space data into a new sentence $S_B = \{w_0, \ldots, w_i, \ldots, w_q\}$, where $w_0$ is the token "[CLS]", and $w_i$ and $w_q$ are both the token "[SEP]";

[0106] S2-2. Generate the segment index $I_s = \{0, \ldots, 0\}$ and the position index $I_p = \{0, \ldots, q\}$ of the new sentence, where q represents the number of words in the new sentence;

[0107] S2-3. Input the new sentence $S_B$, its segment index $I_s$, and its position index $I_p$ into the first TL-BERT model, train the first TL-BERT model, and obtain the trained first TL-BERT model.

[0108] The specific implementation method of step S2-3 is as follows:

[0109] S2-3-1. Input the new sentence $S_B$, its segment index $I_s$, and its position index $I_p$ into the BERT-base layer to obtain the word vector of each word in the new sentence $S_B$, and combine the word vectors into the corresponding sentence vector E;

[0110] S2-3-2. According to the formulas:

[0111] $Q = W_Q E$

[0112] $K = W_K E$

[0113] $V = W_V E$

[0114] Obtain the query vector Q and the key-value pair vectors K and V, where E is the sentence vector; $W_Q$, $W_K$, and $W_V$ are weight matrices; and $Q, K, V \in \mathbb{R}^{m \times d}$, where d is the number of hidden units in the neural network and m is the sequence length. $\mathbb{R}^{m \times d}$ denotes the set of real matrices with m rows and d columns; that is, the three "vectors" Q, K, and V are in fact matrices.

[0115] S2-3-3. Input the query vector Q and the key-value pair vectors K and V into the multi-head attention layer, and according to the formula:

[0116]

[0117] Obtain the attention vector $ma_i$ of the i-th head in the multi-head attention layer, where softmax is the activation function, which maps the output values of the attention vector $ma_i$ into [0, 1]; $Q_i$ is the query vector of the i-th head of the multi-head attention layer; and $K_i$ and $V_i$ are the key-value pair vectors of the i-th head of the multi-head attention layer;

[0118] S2-3-4. According to the formula:

[0119]

[0120] Obtain the aspect term vector A, where N is the total number of attention heads and $W_i^a$ is a weight matrix;

[0121] S2-3-5. Input the aspect term vector into the Long Short-Term Memory network, according to the formula:

[0122] $i_t = \sigma(W_{xi} A_t + W_{hi} h_{t-1} + b_i)$

[0123] $f_t = \sigma(W_{xf} A_t + W_{hf} h_{t-1} + b_f)$

[0124] $\tilde{c}_t = \tanh(W_{xc} A_t + W_{hc} h_{t-1} + b_c)$

[0125] $c_t = f_t \odot c_{t-1} + i_t \odot \tilde{c}_t$

[0126] $o_t = \sigma(W_{xo} A_t + W_{ho} h_{t-1} + b_o)$

[0127] $h_t = o_t \odot \tanh(c_t)$

[0128] Obtain the hidden state $h_t$ at the current moment, where $i_t$ is the output of the input gate, $W_{xi}$ and $W_{hi}$ are weight matrices, and $b_i$ is a bias; $f_t$ is the output of the forget gate, $W_{xf}$ and $W_{hf}$ are weight matrices, and $b_f$ is a bias; $A_t$ is the aspect term vector at the current moment; $\sigma$ is the sigmoid activation function, with a value range of [0, 1]; $\tilde{c}_t$ is the candidate cell state, $W_{xc}$ and $W_{hc}$ are weight matrices, and $b_c$ is a bias; tanh is the activation function, with a value range of [-1, 1]; $h_{t-1}$ is the hidden state at the previous moment; $c_t$ is the current cell state; $c_{t-1}$ is the cell state at the previous moment; $o_t$ is the output of the output gate, $W_{xo}$ and $W_{ho}$ are weight matrices, and $b_o$ is a bias;

[0129] S2-3-6. Input the current hidden state $h_t$ into the decoder to obtain the hidden state of the new sentence;

[0130] S2-3-7. According to the formulas:

[0131]

[0132]

[0133]

[0134] Obtain, respectively, the probability $H_1$ that the tag corresponding to the hidden state is B, the probability $H_2$ that the tag is I, and the probability $H_3$ that the tag is O, where $W_1^A$, $b_1^A$, $W_2^A$, and $W_3^A$ are learnable parameters; B indicates the beginning of a term, I indicates the inside of a term, and O indicates other;

[0135] S2-3-8. Compare $H_1$, $H_2$, and $H_3$, and take the label with the maximum predicted probability as the tag corresponding to the hidden state;

[0136] S2-3-9. According to the formula:

[0137]

[0138] Obtain the loss value $L_{ATEU}$ and back-propagate it to update the model parameters. The symbols in the formula denote, respectively: p, the index of the p-th word; the probability that the label corresponding to the p-th word is s; the label corresponding to the p-th word; the logarithm, whose base is e; and the set of new sentences transformed from the sentences in the global and local training space data;

[0139] S2-3-10. Repeat steps S2-3-1 to S2-3-9 until the predetermined number of repetitions is reached to complete model training.
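The gate equations of step S2-3-5 can be illustrated with a single LSTM time step. This is a minimal sketch assuming the standard LSTM forms for the candidate cell state and the cell-state update; the function names and the dictionary layout of the parameters are hypothetical.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def affine(W_x, x, W_h, h, b):
    """Compute W_x x + W_h h + b for row-major matrices (lists of rows)."""
    return [
        sum(w * v for w, v in zip(row_x, x)) + sum(w * v for w, v in zip(row_h, h)) + bias
        for row_x, row_h, bias in zip(W_x, W_h, b)
    ]


def lstm_step(A_t, h_prev, c_prev, p):
    """One LSTM step: returns (h_t, c_t) from input A_t and the previous state."""
    i_t = [sigmoid(z) for z in affine(p["W_xi"], A_t, p["W_hi"], h_prev, p["b_i"])]
    f_t = [sigmoid(z) for z in affine(p["W_xf"], A_t, p["W_hf"], h_prev, p["b_f"])]
    c_cand = [math.tanh(z) for z in affine(p["W_xc"], A_t, p["W_hc"], h_prev, p["b_c"])]
    o_t = [sigmoid(z) for z in affine(p["W_xo"], A_t, p["W_ho"], h_prev, p["b_o"])]
    # Cell state: keep part of the old state, add part of the candidate.
    c_t = [f * c + i * g for f, c, i, g in zip(f_t, c_prev, i_t, c_cand)]
    # Hidden state: output gate applied to the squashed cell state.
    h_t = [o * math.tanh(c) for o, c in zip(o_t, c_t)]
    return h_t, c_t


# With all-zero parameters every gate equals 0.5 and the candidate is 0,
# so the cell state is simply halved.
zero = {name: [[0.0]] for name in
        ("W_xi", "W_hi", "W_xf", "W_hf", "W_xc", "W_hc", "W_xo", "W_ho")}
zero.update({name: [0.0] for name in ("b_i", "b_f", "b_c", "b_o")})
print(lstm_step([1.0], [0.0], [2.0], zero))
```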

[0140] The specific implementation method of step S3 is as follows:

[0141] S3-1. Extract a sentence from the natural language;

[0142] S3-2. Feed the sentence into the trained first TL-BERT model to obtain the aspect terms $E = \{a_0, \ldots, a_j, \ldots, a_{m'-1}\}$ corresponding to the sentence, where m' is the total number of aspect terms;

[0143] S3-3. Append the corresponding aspect term to the end of the sentence, separated by the token [SEP], to obtain a sentence with prompts;

[0144] S3-4. For the sentence with prompts, generate the corresponding segment index and the corresponding position index, where q' is the total number of words in the sentence with prompts;

[0145] S3-5. Input the sentence with prompts, its segment index, and its position index into the trained first TL-BERT model to obtain the opinion words corresponding to aspect term $a_j$, where $l_j$ is the number of opinion words for the j-th aspect term;

[0146] S3-6. Combine aspect terms and opinion words to form aspect-opinion pairs.

[0147] The specific implementation method of step S4 is as follows:

[0148] S4-1. Based on the aspect-opinion pairs, label the opinion words and aspect terms in the sentence;

[0149] S4-2. Input the labeled sentence, the segment index of the labeled sentence, and the position index of the labeled sentence into the second TL-BERT model to obtain the hidden states of the labeled sentence;

[0150] S4-3. Average the hidden states of the labeled sentence to obtain the averaged hidden state;

[0151] S4-4. According to the formula:

[0152]

[0153]

[0154] Obtain the probability $H'$ that the sentiment polarity corresponding to the averaged hidden state is positive and the probability $H''$ that it is negative, where $W_1^S$ and $W_2^S$ are weight matrices and the remaining symbols in the formulas are learnable parameters;

[0155] S4-5. Compare $H'$ and $H''$, and take the sentiment polarity corresponding to the larger probability as the sentiment label;

[0156] S4-6. According to the formula:

[0157]

[0158] Obtain the loss value $L_{AOPSC}$ of the second TL-BERT model and back-propagate it to update the model parameters, where k denotes the k-th opinion word of the j-th aspect term; the other symbols in the formula denote the probability that the label corresponding to the k-th opinion word of the j-th aspect term is s', and the label corresponding to the k-th opinion word of the j-th aspect term;

[0159] S4-7. Repeat steps S4-1 to S4-6 until the predetermined number of repetitions is reached, completing the training of the second TL-BERT model and obtaining the trained second TL-BERT model.

[0160] The specific implementation method of step S5 is as follows:

[0161] S5-1. Input the target text into the trained first TL-BERT model to obtain aspect terms and opinion words, which constitute the aspect-opinion pairs of the target text;

[0162] S5-2. Use the aspect-opinion pairs of the target text to label the target text, input the labeled target text into the trained second TL-BERT model, and use the output label as the sentiment polarity of the target text.

[0163] As shown in Figure 2, this invention proposes a sentiment analysis method with a global training space, which includes two aspect-based sentiment analysis subtasks trained using the pre-trained language model BERT: Aspect Term Extraction (ATE) and Target-oriented Opinion Words Extraction (TOWE). The model is trained and evaluated in a global space containing both explicit and implicit aspects.

[0164] The global space contains two types of samples, and training on only one of them introduces a sample selection bias. Specifically, in existing approaches the training and evaluation space consists only of aspect terms associated with at least one opinion word (i.e., explicit aspects). This is only one part of the inference space, which consists of all aspect terms; the other part consists of implicit aspects. Inference takes place in real-world scenarios, where a TOWE model cannot restrict itself to extracting opinion words for explicit aspects.

[0165] The most fundamental difference between the TOWE task and other opinion word extraction tasks lies in its resolution of the one-to-many and many-to-one relationships between target terms and opinion words. That is, one aspect term may correspond to multiple opinion terms, and one opinion term may also correspond to multiple aspect terms. The TOWE task addresses these one-to-many and many-to-one problems through a multi-head attention mechanism. Each head focuses on different aspect terms through a query vector.

[0166] As shown in Figure 3(a) and Figure 3(b), the TL-BERT model includes a BERT-base layer, a multi-head attention layer, a long short-term memory network, and a decoder; the output of the BERT-base layer is connected to the input of the multi-head attention layer; the output of the multi-head attention layer is connected to the input of the long short-term memory network; and the output of the long short-term memory network is connected to the input of the decoder.

[0167] In one embodiment of the present invention, to evaluate the effectiveness of the method, extensive experiments were conducted on four SemEval benchmark datasets: SemEval-14 Restaurant, SemEval-15 Restaurant, SemEval-16 Restaurant, and SemEval-14 Laptop. The first three datasets are from SemEval Challenge 2014 Task 4, SemEval Challenge 2015 Task 12, and SemEval Challenge 2016 Task 5, respectively. The datasets provided by the SemEval challenge tasks all originate from the restaurant and laptop domains. These datasets are widely used in downstream tasks of ABSA, such as aspect category detection, opinion target extraction, opinion word extraction, and target-dependent sentiment analysis. Initially, these datasets were labeled in the form {opinion target: sentiment polarity}, while the opinion expressions for these opinion targets were provided by IOG. We further labeled the datasets by adding positional information to each aspect term and opinion term. For example, in the sentence "The food is great and the price is very reasonable," the aspect term "food" has a start position of 1 and an end position of 2, so it is labeled as "aspect term": {"start": 1, "end": 2, "term": "food"}. The opinion expression "great" has a start position of 0 and an end position of 1, so it is labeled as "opinion term": {"start": 0, "end": 1, "term": "great"}. The statistics for these datasets, including those of our new TOWE dataset, are shown in Table 2. The ratio is the proportion of implicit aspects among all instances in the dataset.

[0168] Table 2

[0169]

[0170] As in LOTN, all sentences are split into individual words with the BERT tokenizer, and "[CLS] sentence [SEP] aspect term [SEP]" is used as input. During training, Adam is used as the optimizer, with a learning rate of 2e-05 and weight decay of 0000.1. The maximum input word length is 120, the batch size is 32, the word embedding size is 300, and the position embedding size is 32. The experiment was configured for 100 optimization iterations, but the code employs an early stopping strategy: it selects the iteration with the best performance on the validation set, applies the corresponding parameters to the test set, and outputs the test set result. Rather than running all iterations and then selecting the best result on the test set, training generally does not need all 100 iterations; around 20 iterations are usually sufficient to obtain the best result.
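The early stopping strategy described above can be sketched as follows. The `patience` parameter and the callables `train_epoch` and `validate` are hypothetical; the text only states that up to 100 iterations are configured and that the iteration with the best validation performance is kept.

```python
def train_with_early_stopping(train_epoch, validate, max_epochs=100, patience=5):
    """Keep the model state with the best validation score.

    train_epoch(epoch) returns the model state after that epoch, and
    validate(state) returns a validation score where higher is better.
    Training stops once `patience` epochs pass without improvement.
    """
    best_score, best_epoch, best_state = float("-inf"), -1, None
    for epoch in range(max_epochs):
        state = train_epoch(epoch)
        score = validate(state)
        if score > best_score:
            best_score, best_epoch, best_state = score, epoch, state
        elif epoch - best_epoch >= patience:
            break
    return best_state, best_epoch, best_score


# Toy run: the "state" is just the epoch number and scores are fixed in advance.
scores = [0.1, 0.3, 0.5, 0.4, 0.4, 0.4, 0.9]
print(train_with_early_stopping(lambda e: e, lambda s: scores[s],
                                max_epochs=len(scores), patience=3))
```

In the toy run, training stops at the sixth epoch and never reaches the later score of 0.9, which shows the trade-off a small patience value makes.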

[0171] Following previous research on the TOWE task, this invention uses Precision, Recall, and Macro-F1 scores as evaluation metrics. The basic unit of experimental evaluation is the text span. For each opinion expression extracted for a target, the extracted opinion expression is considered correct only if both its start and end positions coincide with the gold positions in the test set. The strategy in this invention is to randomly select 20% of the samples from the training set as a validation set for hyperparameter search. Then, according to the early stopping strategy, the parameters of the run that performs best on the validation set are applied to the test set. The final result is the average of the results of five experimental runs.
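The exact-match criterion above can be made concrete with a small sketch that scores predicted spans against gold spans. The function name `span_prf` and the representation of spans as (start, end) tuples are assumptions; averaging across runs or classes to obtain the reported Macro-F1 is not shown.

```python
def span_prf(predicted, gold):
    """Span-level precision, recall and F1 with exact start/end matching."""
    pred_set, gold_set = set(predicted), set(gold)
    correct = len(pred_set & gold_set)  # spans whose start and end both match
    precision = correct / len(pred_set) if pred_set else 0.0
    recall = correct / len(gold_set) if gold_set else 0.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return precision, recall, f1


# One of two predictions matches exactly; (9, 10) only overlaps the gold span (8, 10).
print(span_prf([(3, 4), (9, 10)], [(3, 4), (8, 10)]))
```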

[0172] To evaluate the effectiveness of the proposed method, the present invention was compared with several baseline models; the results are shown in Figure 4, Figure 5, Figure 6, and Figure 7. For a fair comparison, greedy decoding was used on all baseline models, and opinion phrases were extracted using the method suggested in the corresponding paper. The first group of experiments in these figures shows that, on all datasets, the deep-learning neural network models consistently outperform the rule-based models. This is theoretically reasonable: opinion expressions are complex and take diverse forms, so designing rules that achieve both high precision and high recall is impractical. Owing to their generalization ability, the neural network-based models achieve higher precision and recall.

[0173] The models in the second group of experiments (see Figure 4, Figure 5, Figure 6, and Figure 7) are all neural models that take target-specific information as input and rely heavily on external information. However, over-reliance on external information has a drawback: it reduces the model's generalization ability. If transfer to another domain is required, new external knowledge must be acquired for the new experimental context. Thanks to the joint learning framework proposed by SDRN, these methods extract opinions and identify relations in different modules, which largely avoids error propagation between opinion extraction and relation identification and thereby improves model performance. State-of-the-art methods such as ONG, LOTN, and TS-GCN all incorporate external knowledge or datasets, such as document-level sentiment analysis and syntactic structure. Among them, ONG uses distance information in the syntactic structure but ignores its dependencies, while TS-GCN takes the dependencies into account. Although our model does not incorporate syntactic dependencies, it incorporates positional information and learns these hidden features through BERT pre-training.

[0174] This invention introduces a global training space to make the model training closer to the real-world context, giving the model better generalization ability and simplifying the computation; it solves the one-to-many and many-to-one problems between target words and opinion words; and it adds processing for informal text, improving recognition accuracy.

Claims

1. A sentiment analysis method based on a global training space, characterized in that it includes the following steps:
S1. Obtain text data from social media, process the informal text data to obtain natural language, and partition the natural language to obtain global and local training space data;
S2. Construct the TL-BERT model and train the first TL-BERT model using the global and local training space data;
S3. Use the trained first TL-BERT model to extract aspect terms and opinion words from the natural language, and construct aspect-opinion pairs;
S4. Input the aspect-opinion pairs and the natural language into the second TL-BERT model to train it, and obtain the trained second TL-BERT model;
S5. Based on the trained first TL-BERT model and the trained second TL-BERT model, perform sentiment prediction on the target text to obtain the final sentiment polarity.
The specific implementation method of step S1 is as follows:
S1-1. Obtain text data from social media, extract the informal text data from the text data, and denote it as ...;
S1-2. Remove the content added to the text by social media from the informal text data to obtain the data after the first cleaning;
S1-3. Remove emojis, emoticons, and non-English content from the data after the first cleaning to obtain the data after the second cleaning;
S1-4. Replace the hashtags in the social media posts in the data after the second cleaning with the original text to obtain the data after the third cleaning;
S1-5. Segment the data after the third cleaning according to the standard of ".", ".", "!", "!", "?", "?" to obtain natural language;
S1-6. Annotate training samples for explicit and implicit aspects in the natural language;
S1-7. Divide the natural language training space data into global and local training space data, that is, into the global space training set, the global space validation test set, the local space training set, and the local space validation test set.
The specific implementation method of step S3 is as follows:
S3-1. Extract sentences from the natural language;
S3-2. Feed the sentence into the trained first TL-BERT model to obtain the aspect terms corresponding to the sentence, together with the total number of aspect terms;
S3-3. Append the corresponding aspect terms to the end of the sentence, separated by the character [SEP], to obtain a sentence with prompts;
S3-4. For the sentence with prompts, generate the corresponding segment index and the corresponding position index, each covering the total number of words in the sentence with prompts;
S3-5. Input the sentence with prompts, the segment index, and the position index into the trained first TL-BERT model to obtain the opinion words corresponding to each aspect term, together with the number of opinion words of the j-th aspect term;
S3-6. Construct aspect-opinion pairs by combining the aspect terms and the opinion words.
The specific implementation method of step S4 is as follows:
S4-1. According to the aspect-opinion pairs, label the opinion words and aspect terms in the sentence;
S4-2. Input the labeled sentence, the segment index of the labeled sentence, and the position index of the labeled sentence into the second TL-BERT model to obtain the hidden state of the labeled sentence;
S4-3. Average the hidden state of the labeled sentence to obtain its mean value;
S4-4. According to the formula, obtain the probability that the corresponding sentiment polarity is positive and the probability that the corresponding sentiment polarity is negative, where the formula uses weight matrices and learnable parameters;
S4-5. Compare the two probabilities, denote the larger one as the maximum probability value, and take the sentiment polarity corresponding to the maximum probability value as the sentiment label;
S4-6. According to the formula, obtain the loss value of the second TL-BERT model and propagate the loss value back to modify the model parameters, where k denotes the k-th opinion word of the j-th aspect term, and the formula uses the label corresponding to the k-th opinion word of the j-th aspect term and the probability predicted for that label;
S4-7. Repeat steps S4-1 to S4-6 until the predetermined number of repetitions is reached, completing the training of the second TL-BERT model and obtaining the trained second TL-BERT model.
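
Steps S1-2 to S1-5 of claim 1 can be illustrated with a short cleaning sketch. It is an illustration under assumptions, not the claimed implementation: the claim does not define what counts as content added by social media, so retweet markers, URLs, and user mentions are assumed here; emojis and non-English content are approximated by non-ASCII characters; and a hashtag is restored to text simply by dropping the "#". The names `clean_post` and all regular expressions are hypothetical.

```python
import re
from typing import List

# Assumption: platform-added content means retweet markers, URLs and @mentions.
PLATFORM_NOISE = re.compile(r"\bRT\b|https?://\S+|@\w+:?")
# Assumption: a small set of Western-style emoticons such as ":)" or ";-D".
EMOTICON = re.compile(r"[:;=][\-']?[)(DPp]")
# Assumption: emojis and non-English content are approximated by non-ASCII runs.
NON_ASCII = re.compile(r"[^\x00-\x7F]+")
HASHTAG = re.compile(r"#(\w+)")
SENTENCE_END = re.compile(r"(?<=[.!?])\s+")


def clean_post(text: str) -> List[str]:
    """Clean one social media post and split it into sentences."""
    text = PLATFORM_NOISE.sub(" ", text)        # S1-2: first cleaning
    text = EMOTICON.sub(" ", text)              # S1-3: emoticons
    text = NON_ASCII.sub(" ", text)             # S1-3: emojis, non-English text
    text = HASHTAG.sub(r"\1", text)             # S1-4: "#yummy" becomes "yummy"
    text = re.sub(r"\s+", " ", text).strip()    # collapse leftover whitespace
    text = re.sub(r"\s+([.!?])", r"\1", text)   # no space before end marks
    # S1-5: split after ".", "!" or "?" followed by whitespace.
    return [s for s in SENTENCE_END.split(text) if s]
```

A call such as `clean_post("RT @bob: The food is great :) #yummy! Price was ok? https://t.co/x")` yields two sentences, each of which can then be annotated as in step S1-6.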

2. The sentiment analysis method based on a global training space according to claim 1, characterized in that the TL-BERT model consists of a BERT-base layer, a multi-head attention layer, a long short-term memory network, and a decoder. The output of the BERT-base layer is connected to the input of the multi-head attention layer. The output of the multi-head attention layer is connected to the input of the long short-term memory network. The output of the long short-term memory network is connected to the input of the decoder.

3. The sentiment analysis method based on a global training space according to claim 2, characterized in that the specific implementation method of step S2 is as follows:
S2-1. Transform each sentence S in the global and local training space data into a new sentence, in which the leading character is "[CLS]" and the separator characters are all "[SEP]";
S2-2. Generate the segment index and the position index of the new sentence, where q denotes the total number of words in the new sentence;
S2-3. Input the new sentence, the segment index of the new sentence, and the position index of the new sentence into the first TL-BERT model, train the first TL-BERT model, and obtain the trained first TL-BERT model.
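
The input construction in steps S2-1 and S2-2 can be sketched as follows, using the "[CLS] sentence [SEP] aspect term [SEP]" layout given in the description. The function name `build_input` is hypothetical, and the segment numbering (0 for the sentence part, 1 for the aspect part) follows the usual BERT sentence-pair convention, which is an assumption here. Tokens are whole words for readability; an actual BERT tokenizer would split them into subword pieces.

```python
from typing import List, Tuple


def build_input(sentence: List[str], aspect: List[str]) -> Tuple[List[str], List[int], List[int]]:
    """Build the token sequence with its segment and position indices."""
    tokens = ["[CLS]"] + sentence + ["[SEP]"] + aspect + ["[SEP]"]
    # Segment 0 covers "[CLS] sentence [SEP]"; segment 1 covers "aspect [SEP]".
    first_len = len(sentence) + 2
    segment_ids = [0] * first_len + [1] * (len(aspect) + 1)
    # One position index per token, counted from 0.
    position_ids = list(range(len(tokens)))
    return tokens, segment_ids, position_ids
```

For the sentence "The food is great" with the aspect term "food", this gives eight tokens, segment indices `[0, 0, 0, 0, 0, 0, 1, 1]`, and position indices 0 through 7.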

4. The sentiment analysis method based on a global training space according to claim 3, characterized in that the specific implementation method of step S2-3 is as follows:
S2-3-1. Input the new sentence, the segment index of the new sentence, and the position index of the new sentence into the BERT-base layer to obtain the word vector of each word in the new sentence, and combine the word vectors to form the corresponding sentence vector E;
S2-3-2. According to the formula, obtain the query vector Q, the key vector K, and the value vector V from the sentence vector E, each by means of its own weight matrix; Q, K, V ∈ R^(m×d), where d is the number of hidden units in the neural network and m is the sequence length; the notation R^(m×d) means that Q, K, and V are in fact matrices with m rows and d columns;
S2-3-3. Input the query vector Q, the key vector K, and the value vector V into the multi-head attention layer and, according to the formula, obtain the attention vector of the i-th head of the multi-head attention layer, where softmax denotes the activation function, which maps the values of the attention vector into [0,1], and the formula uses the query vector, the key vector, and the value vector of the i-th head of the multi-head attention layer;
S2-3-4. According to the formula, obtain the aspect term vector A, where N denotes the total number of attention heads and the formula uses a weight matrix;
S2-3-5. Input the aspect term vector into the long short-term memory network and, according to the formula, obtain the hidden state at the current moment, where the formula uses: the output of the output gate; the output of the forget gate; the output of the input gate; the candidate cell state; the aspect term vector at the current moment; the hidden state at the previous moment; the cell state at the current moment; the cell state at the previous moment; the sigmoid activation function, whose values lie in [0,1]; and an activation function whose values lie in [-1,1]; each gate and the candidate cell state has its own weight matrices and bias;
S2-3-6. Input the hidden state at the current moment into the decoder to obtain the hidden state of the new sentence;
S2-3-7. According to the formula, obtain respectively the probability that the tag corresponding to the hidden state is B, the probability that it is I, and the probability that it is O, where the formula uses learnable parameters; B indicates the beginning, I indicates the interior, and O indicates other;
S2-3-8. Compare the three probabilities, denote the largest as the predicted maximum probability value, and take the label corresponding to the predicted maximum probability value as the tag corresponding to the hidden state;
S2-3-9. According to the formula, obtain the loss value and propagate the loss value back to modify the model parameters, where p denotes the p-th word; the formula uses the tag corresponding to the p-th word and the probability predicted for that tag; the base of the logarithm is e; and the loss is taken over the set of sentences transformed from the sentences in the global and local training space data;
S2-3-10. Repeat steps S2-3-1 to S2-3-9 until the predetermined number of repetitions is reached to complete model training.
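
Steps S2-3-7 to S2-3-9 can be illustrated with a small sketch of B/I/O tagging. It assumes the decoder yields three scores per word and that the probabilities come from a softmax over those scores, which is a common choice and an assumption here. The loss shown is a summed negative natural-log likelihood, which agrees with the stated logarithm base e but may differ from the claimed formula in normalization. All function names are hypothetical.

```python
import math
from typing import List, Sequence, Tuple

TAGS = ("B", "I", "O")  # beginning, interior, other


def softmax(scores: Sequence[float]) -> List[float]:
    """Numerically stable softmax; outputs lie in [0, 1] and sum to 1."""
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def predict_tags(scores_per_word: Sequence[Sequence[float]]) -> List[str]:
    """Pick, for every word, the tag with the highest probability."""
    tags = []
    for scores in scores_per_word:
        probs = softmax(scores)
        tags.append(TAGS[probs.index(max(probs))])
    return tags


def tags_to_spans(tags: Sequence[str]) -> List[Tuple[int, int]]:
    """Turn a B/I/O sequence into (start, end) spans with exclusive end."""
    spans, start = [], None
    for i, tag in enumerate(tags):
        if tag in ("B", "O") and start is not None:
            spans.append((start, i))   # close the span that was open
            start = None
        if tag == "B":
            start = i                  # open a new span
        # An "I" extends an open span; an "I" with no open span is ignored.
    if start is not None:
        spans.append((start, len(tags)))
    return spans


def tagging_loss(scores_per_word: Sequence[Sequence[float]], gold: Sequence[str]) -> float:
    """Summed negative natural-log probability of the gold tags."""
    return -sum(
        math.log(softmax(scores)[TAGS.index(tag)])
        for scores, tag in zip(scores_per_word, gold)
    )
```

Because each extracted span is built from a B tag followed by any number of I tags, one sentence can yield several terms, and multi-word terms are recovered as a single span.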

5. The sentiment analysis method based on a global training space according to claim 4, characterized in that the specific implementation method of step S5 is as follows:
S5-1. Input the target text into the trained first TL-BERT model to obtain aspect terms and opinion words, which constitute the aspect-opinion pairs of the target text;
S5-2. Use the aspect-opinion pairs of the target text to label the target text, input the labeled target text into the trained second TL-BERT model, and use the output label as the sentiment polarity of the target text.
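
The output label used in step S5-2 is produced by the classification described in steps S4-3 to S4-5 of claim 1: the hidden states are averaged, two polarity probabilities are computed, and the larger one determines the label. The sketch below illustrates this under assumptions: the probabilities are taken to be a softmax over a linear layer, and the names `mean_pool`, `classify`, `weight`, and `bias` are hypothetical stand-ins for the weight matrices and learnable parameters mentioned in the claim.

```python
import math
from typing import List, Sequence, Tuple

LABELS = ("positive", "negative")


def mean_pool(hidden_states: Sequence[Sequence[float]]) -> List[float]:
    """Average the per-word hidden states into one sentence vector (S4-3)."""
    count = len(hidden_states)
    dim = len(hidden_states[0])
    return [sum(h[d] for h in hidden_states) / count for d in range(dim)]


def classify(
    hidden_states: Sequence[Sequence[float]],
    weight: Sequence[Sequence[float]],   # one row per polarity (assumed shape)
    bias: Sequence[float],
) -> Tuple[str, List[float]]:
    """Return the predicted polarity and both probabilities (S4-4, S4-5)."""
    pooled = mean_pool(hidden_states)
    scores = [
        sum(w * x for w, x in zip(row, pooled)) + b
        for row, b in zip(weight, bias)
    ]
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    probs = [e / total for e in exps]
    # The polarity with the larger probability becomes the sentiment label.
    return LABELS[probs.index(max(probs))], probs
```

In use, `hidden_states` would be the output of the trained second TL-BERT model for the labeled target text, and the returned label is reported as the sentiment polarity.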