A fine-grained sentiment analysis method based on a BERT model and an attention focusing network
By combining the BERT model with an attention-focused network, the problem of information loss in traditional sentiment analysis is solved, enabling more accurate fine-grained sentiment analysis and improving accuracy and F1 score.
Patent Information
- Authority / Receiving Office
- CN · China
- Patent Type
- Patents(China)
- Current Assignee / Owner
- LIAONING UNIVERSITY
- Filing Date
- 2022-12-07
- Publication Date
- 2026-06-16
AI Technical Summary
Traditional coarse-grained sentiment analysis of product reviews cannot accurately identify the sentiment tendencies from multiple perspectives in a single review, leading to information loss and inaccurate analysis results.
Using the BERT model and attention-focused network, key information is extracted for fine-grained sentiment analysis through preprocessing, word vector transformation, semantic information integration, attention mechanism and fully connected layer processing.
It improves the accuracy and F1 score of sentiment analysis, effectively solves the problem of missing information, and achieves more accurate aspect-level sentiment analysis.
Figure CN115730606B_ABST
Description
Technical Field
[0001] This invention relates to the fields of natural language processing, deep learning, and aspect-level sentiment analysis, and particularly to a fine-grained sentiment analysis method based on the BERT model and attention-focusing network.
Background Technology
[0002] With the rapid development of internet technology and the increasing number of netizens, emerging industries such as social media and e-commerce have experienced rapid growth. More and more people share and publish comments and analyses on various products, services, events, and news online. Through sentiment analysis, these comments can reveal a wealth of valuable information. For example, businesses can obtain consumer feedback on a product, analyze its market value and potential for improvement, and thus generate better profits. Consumers can use this information to judge a product's reputation and quality, making more rational decisions about whether to purchase it. Organizations involved in news events can use this information to understand netizens' attitudes and sentiments towards an event, allowing for better follow-up responses.
[0003] Traditional coarse-grained sentiment analysis of product reviews identifies the overall sentiment expressed in the review. However, in some cases, a single review may cover multiple perspectives, and the sentiment towards these different perspectives may differ. For example, a review might express the opinion that a store's products are of high quality, but the store environment is average and its location is poor. Therefore, coarse-grained sentiment analysis of this review cannot capture its complete sentiment and will draw inaccurate conclusions. Fine-grained sentiment analysis, on the other hand, can identify the sentiment of individual given terms within a review, thereby deriving more accurate and comprehensive sentiment analysis conclusions and avoiding information loss.
[0004] Currently, there is a wealth of research in the field of fine-grained sentiment analysis, involving various neural networks, including Long Short-Term Memory (LSTM) networks, convolutional neural networks, recurrent neural networks, and the BERT model, with good results. However, in fine-grained sentiment analysis, a text may contain sentiment-related information about several other aspects, as well as implicit sentiment expressions. This can lead to information loss during training, which significantly reduces the accuracy of the sentiment analysis results.
Summary of the Invention
[0005] This invention provides a fine-grained sentiment analysis method based on the BERT model and attention-focusing network, which alleviates the problem of information loss during training in existing technologies and more effectively solves the aspect-level sentiment analysis problem.
[0006] This invention is achieved through the following technical solution: a fine-grained sentiment analysis method based on the BERT model and attention-focusing network, comprising the following steps:
[0007] S1: Obtain the sentence to be subjected to fine-grained sentiment analysis and its corresponding aspect words. After preprocessing, obtain the word vector representation of each word in the text.
[0008] The sentence to be analyzed for fine-grained sentiment analysis and its corresponding aspect words are concatenated into an input text sequence in the form of "[CLS] + sentence to be analyzed + [SEP] + aspect words + [SEP]", where [CLS] is the text start symbol and [SEP] is the text separator and end symbol. Then, the BERT model is used to transform the input text sequence of length x into a vector to obtain the word vector representation s of the text.
[0009] S2: Input the word vector representation sequence obtained in S1 into the BERT neural network model for processing to obtain the semantic information of each word vector after integrating with the context information;
[0010] Step S2 specifically includes: inputting the word vector representation s of the text into the BERT model for processing, and obtaining the hidden state of the last layer of the BERT model as the semantic information $H^{x \times h}$ of each word vector integrated with the context information, where h is the hidden-layer size of the BERT model.
[0011] S3: The semantic information obtained in S2 is analyzed and processed using an attention-focusing network layer constructed based on the attention mechanism to extract key information;
[0012] Step S3 specifically includes: the semantic information $H^{x \times h}$ extracted by the BERT model is input into an attention-focusing network layer constructed based on the attention mechanism for information extraction. The specific structure of the attention-focusing network layer is as follows:
[0013] In the first layer of the attention-focusing network layer, $H^{x \times h}$ undergoes a linear transformation through a fully connected layer with no bias term, followed by the Sigmoid activation function, which gives the first-layer output.
[0014]
[0015] In the second layer of the attention-focusing network layer, the first-layer output undergoes a linear transformation through a fully connected layer with no bias term, followed by the Softmax activation function, which gives the second-layer output.
[0016]
[0017] In the third layer of the attention-focusing network layer, the second-layer output is matrix-multiplied with the semantic information $H^{x \times h}$ extracted by the BERT model; the product undergoes a linear transformation through a fully connected layer with no bias term, followed by the Tanh activation function, which gives the third-layer output.
[0018] Here, the weight matrices of the three fully connected layers are trainable parameters;
[0019] The dimension of size 1 is removed from the third-layer output to obtain the key information.
[0020] S4: Input the key information extracted in S3 into the fully connected layer for fine-grained sentiment prediction, and obtain the analysis results of this fine-grained sentiment analysis method based on the BERT model and attention-focusing network;
[0021] The key information obtained in step S3 is input into a fully connected layer to obtain an output vector of dimension p, where p represents the number of different sentiment polarities included in the sentiment analysis task. For example, if the task includes positive, neutral, and negative sentiment polarities, then p is 3. The specific expression is as follows:
[0022]
[0023] Here, the weight of the fully connected layer is a trainable parameter and is accompanied by a bias term; y represents the sentiment polarity predicted by the model.
[0024] The training and optimization strategy for this fine-grained sentiment analysis method based on the BERT model and attention network is as follows: the Adam optimizer is used to train the model, cross-entropy is used as the loss function during the optimization process, and L2 regularization is introduced to prevent the model from overfitting.
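The objective described above can be sketched as a cross-entropy term plus an L2 penalty. This is a minimal illustration, not the patent's implementation: the function name `loss_with_l2`, the flat parameter list, and the coefficient `weight_decay` are all hypothetical, and the text does not state the regularization strength.

```python
import math


def loss_with_l2(probs_batch, labels, params, weight_decay):
    """Mean cross-entropy over a batch plus an L2 penalty on the parameters."""
    # Cross-entropy: negative log-probability assigned to the true class.
    ce = -sum(math.log(p[y]) for p, y in zip(probs_batch, labels)) / len(labels)
    # L2 regularization: penalize large weights (coefficient is an assumption).
    l2 = weight_decay * sum(w * w for w in params)
    return ce + l2


# Two toy predictions over (negative, neutral, positive) with true labels 2 and 0.
probs = [[0.1, 0.2, 0.7], [0.8, 0.1, 0.1]]
loss = loss_with_l2(probs, [2, 0], params=[0.5, -0.5], weight_decay=0.01)
print(round(loss, 4))
```

The penalty grows with the squared magnitude of the weights, which is how L2 regularization discourages the overfitting mentioned in the text.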
[0025] The fine-grained sentiment analysis method based on the BERT model and attention-focusing network of the present invention has the following advantages: by adopting the BERT model and an attention mechanism, the model can extract deeper semantic information, and its accuracy and F1 score improve over the baseline models for fine-grained sentiment analysis, which demonstrates the effectiveness of the model.
Attached Figure Description
[0026] Figure 1 is a flowchart of the steps of the present invention.
[0027] Figure 2 is a flowchart of the present invention.
[0028] Figure 3 is a diagram of the architecture of the present invention.
Detailed Implementation
[0029] The present invention will be further described below with reference to the accompanying drawings and specific embodiments, so that those skilled in the art can better understand the present invention and be able to implement it. However, the embodiments are not intended to limit the present invention.
[0030] This invention provides a fine-grained sentiment analysis method based on the BERT model and attention-focusing network for performing fine-grained sentiment analysis tasks.
[0031] For a comment S and its related aspect term A, the emotional tendency of comment S with respect to aspect A is analyzed following the steps of the flowchart shown in Figure 1 and classified as negative (0), neutral (1), or positive (2).
[0032] S1: Obtain the comment S and its related terms A. After preprocessing, obtain the word vector representation s of each word in the text. The comment S and its related terms A to be subjected to fine-grained sentiment analysis are truncated or padded and concatenated into an input text sequence of length x in the form of "[CLS]S[SEP]A[SEP]", where [CLS] is the text start symbol and [SEP] is the text separator and end symbol. Then, the BERT model is used to perform vector transformation on the input text sequence to obtain the word vector representation s of the text.
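The construction of the input sequence in S1 can be sketched as follows. This is a simplified illustration under stated assumptions: whitespace-split tokens stand in for BERT's WordPiece tokenization, the helper name `build_input` is hypothetical, only the comment S is truncated (never the aspect A), and padding uses the `[PAD]` token on the right. The patent text does not specify these details.

```python
def build_input(sentence_tokens, aspect_tokens, max_len, pad="[PAD]"):
    """Build "[CLS] S [SEP] A [SEP]" truncated or padded to max_len tokens."""
    # Reserve room for [CLS], the two [SEP] markers and the aspect terms.
    budget = max_len - 3 - len(aspect_tokens)
    if budget < 0:
        raise ValueError("max_len is too small for the aspect terms")
    # Assumption: only the comment S is truncated, never the aspect A.
    tokens = ["[CLS]"] + sentence_tokens[:budget] + ["[SEP]"] + aspect_tokens + ["[SEP]"]
    # Pad on the right so that every sequence has the same length x.
    return tokens + [pad] * (max_len - len(tokens))


comment = "the food was great but the service was slow".split()
seq = build_input(comment, ["service"], max_len=16)
print(seq)
```

In practice a BERT tokenizer would also map each token to a vocabulary id; the sketch only shows the layout of the sequence of length x.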
[0033] S2: Input s into the BERT neural network model for processing to obtain the semantic information $H^{x \times h}$ of each word vector integrated with context information;
[0034] The word vector representation s of the text is input into the BERT model for processing, and the hidden state of the last layer of the BERT model is used as the semantic information $H^{x \times h}$ of each word vector integrated with the context information, where h is the hidden-layer size of the BERT model.
[0035] S3: $H^{x \times h}$ is analyzed and processed by an attention-focusing network layer constructed based on the attention mechanism to extract key information.
[0036] Step S3 specifically includes: the semantic information $H^{x \times h}$ extracted by the BERT model is input into an attention-focusing network layer constructed based on the attention mechanism for information extraction. The specific structure of the attention-focusing network layer is as follows:
[0037] In the first layer of the attention-focusing network layer, $H^{x \times h}$ undergoes a linear transformation through a fully connected layer with no bias term, followed by the Sigmoid activation function, which gives the first-layer output.
[0038]
[0039] In the second layer of the attention-focusing network layer, the first-layer output undergoes a linear transformation through a fully connected layer with no bias term, followed by the Softmax activation function, which gives the second-layer output.
[0040]
[0041] In the third layer of the attention-focusing network layer, the second-layer output is matrix-multiplied with the semantic information $H^{x \times h}$ extracted by the BERT model; the product undergoes a linear transformation through a fully connected layer with no bias term, followed by the Tanh activation function, which gives the third-layer output.
[0042] Here, the weight matrices of the three fully connected layers are trainable parameters;
[0043] The dimension of size 1 is removed from the third-layer output to obtain the key information.
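One possible reading of the three-layer structure in S3 is sketched below in plain Python. The formulas are not reproduced in this text, so the shapes are assumptions: `W1` is taken as h-by-d, `W2` as d-by-1 (so the Softmax yields one weight per sequence position), and `W3` as h-by-h. The names `attention_focus`, `W1`, `W2`, `W3` and the intermediate size `d` are hypothetical.

```python
import math
import random


def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(u * v for u, v in zip(row, col)) for col in zip(*b)] for row in a]


def softmax_over_positions(column):
    """Softmax across the sequence positions of an x-by-1 score column."""
    scores = [row[0] for row in column]
    peak = max(scores)  # subtract the maximum for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [[e / total] for e in exps]


def attention_focus(H, W1, W2, W3):
    # Layer 1: bias-free fully connected layer, then Sigmoid (x-by-d).
    a1 = [[1.0 / (1.0 + math.exp(-v)) for v in row] for row in matmul(H, W1)]
    # Layer 2: bias-free fully connected layer, then Softmax (x-by-1, assumed).
    a2 = softmax_over_positions(matmul(a1, W2))
    # Layer 3: multiply the weights with H (1-by-h), bias-free FC, then Tanh.
    pooled = matmul([[row[0] for row in a2]], H)
    a3 = [[math.tanh(v) for v in row] for row in matmul(pooled, W3)]
    # Remove the dimension of size 1 to obtain the key-information vector.
    return a3[0]


def rand_matrix(rows, cols):
    return [[random.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]


random.seed(0)
x, h, d = 6, 8, 4  # toy sizes; a real BERT hidden state is much larger
H = rand_matrix(x, h)
key = attention_focus(H, rand_matrix(h, d), rand_matrix(d, 1), rand_matrix(h, h))
print(len(key))  # 8, one value per hidden dimension
```

Under these assumptions the second layer acts as a learned weighting over the x token positions, and the third layer condenses the weighted sum of H into a single vector of length h.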
[0044] S4: The key information is input into a fully connected layer for fine-grained sentiment prediction, yielding the sentiment analysis result y of this fine-grained sentiment analysis method based on the BERT model and attention-focusing network:
[0045] Step S4 specifically includes: the key information obtained in step S3 is input into a fully connected layer to obtain an output vector of dimension p, where p represents the number of different sentiment polarities included in the sentiment analysis task; in this example, p is 3. The specific expression is as follows:
[0046]
[0047] Here, the weight of the fully connected layer is a trainable parameter and is accompanied by a bias term; y represents the sentiment polarity predicted by the model.
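The prediction step in S4 can be illustrated with a small sketch. It assumes that the p-dimensional output is normalized with a softmax and that the predicted polarity is the index of the largest value; the expression itself is not reproduced in this text. The names `classify`, `W` and `b` are hypothetical, and the label mapping follows paragraph [0031].

```python
import math

LABELS = {0: "negative", 1: "neutral", 2: "positive"}


def classify(key, W, b):
    """Fully connected layer over the key vector, then softmax and argmax."""
    # W is h-by-p and b has length p; each column of W scores one polarity.
    logits = [sum(k * w for k, w in zip(key, col)) + bias
              for col, bias in zip(zip(*W), b)]
    peak = max(logits)  # subtract the maximum for numerical stability
    exps = [math.exp(v - peak) for v in logits]
    total = sum(exps)
    probs = [e / total for e in exps]
    return probs.index(max(probs)), probs


key = [0.2, -0.1, 0.4]                    # toy key-information vector (h = 3)
W = [[0.5, 0.0, -0.5],
     [0.1, 0.2, 0.3],
     [-1.0, 0.0, 1.0]]                    # toy weights, h-by-p with p = 3
b = [0.0, 0.0, 0.0]
label, probs = classify(key, W, b)
print(LABELS[label])  # positive
```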
[0048] The training and optimization strategy for this fine-grained sentiment analysis method based on the BERT model and attention network is as follows: the Adam optimizer is used to train the model, cross-entropy is used as the loss function during the optimization process, and L2 regularization is introduced to prevent the model from overfitting.
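For reference, a single Adam update with an L2 term folded into the gradient can be sketched as below. The learning rate, the decay rates and the regularization coefficient `lam` are illustrative assumptions, since the text does not give hyperparameters, and the one-dimensional objective is a toy stand-in for the model's loss.

```python
import math


def adam_step(theta, grad, m, v, t, lr=0.001, beta1=0.9, beta2=0.999, eps=1e-8):
    """One Adam update for a flat list of parameters (t starts at 1)."""
    # Exponential moving averages of the gradient and of its square.
    m = [beta1 * mi + (1 - beta1) * g for mi, g in zip(m, grad)]
    v = [beta2 * vi + (1 - beta2) * g * g for vi, g in zip(v, grad)]
    # Bias correction for the zero-initialized moving averages.
    m_hat = [mi / (1 - beta1 ** t) for mi in m]
    v_hat = [vi / (1 - beta2 ** t) for vi in v]
    theta = [p - lr * mh / (math.sqrt(vh) + eps)
             for p, mh, vh in zip(theta, m_hat, v_hat)]
    return theta, m, v


# Toy objective: (theta - 3)^2 + lam * theta^2, minimized at 3 / (1 + lam).
lam = 0.01
theta, m, v = [0.0], [0.0], [0.0]
for t in range(1, 2001):
    # Gradient of the data term plus the gradient of the L2 penalty.
    grad = [2 * (theta[0] - 3) + 2 * lam * theta[0]]
    theta, m, v = adam_step(theta, grad, m, v, t, lr=0.05)
print(theta[0])
```

The L2 term pulls the solution slightly toward zero, which is the shrinkage effect that regularization contributes during training.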
[0049] Example 1:
[0050] To evaluate the rationality and effectiveness of the fine-grained sentiment analysis method based on the BERT model and attention network described in this invention, the following evaluation experiments were conducted.
[0051] This example evaluates the invention on the restaurant dataset of the public dataset SemEval-2014 Task 4. This dataset contains 3608 reviews with aspect words and covers three sentiment polarities: positive, negative, and neutral.
[0052] This example uses Feature-based SVM, MGAN, RAM, BERT-PT, AEN-BERT, and BERT-SPC models as evaluation baselines, and accuracy and F1 score as evaluation metrics. The results of this evaluation experiment are shown in Table 1 below.
[0053] Table 1: Comparison of Experimental Results for Different Models
[0054]
[0055] On the restaurant dataset, the BERT-ATT-GA fine-grained sentiment analysis method based on the BERT model and attention-focusing network described in this invention improves on the evaluation baselines. The accuracy reached 85.80% and the F1 score reached 80.95%, which are 0.85% and 3.97% higher, respectively, than the highest values among the evaluation baselines. This supports the rationality and effectiveness of the method: while taking model efficiency into account, it extracts deeper semantic information and achieves better fine-grained sentiment analysis prediction results.
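The two evaluation metrics can be computed as in the sketch below. The text only says "F1 score"; macro-averaging over the three polarities is an assumption made here, and the labels in the example are invented toy data, not results from the experiment.

```python
def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def macro_f1(y_true, y_pred, labels=(0, 1, 2)):
    """Unweighted mean of the per-class F1 scores (assumed averaging)."""
    scores = []
    for c in labels:
        tp = sum(t == c and p == c for t, p in zip(y_true, y_pred))
        fp = sum(t != c and p == c for t, p in zip(y_true, y_pred))
        fn = sum(t == c and p != c for t, p in zip(y_true, y_pred))
        denom = 2 * tp + fp + fn
        # F1 = 2TP / (2TP + FP + FN); defined as 0 when the class never occurs.
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / len(labels)


# Toy labels: 0 = negative, 1 = neutral, 2 = positive.
y_true = [2, 2, 1, 0, 0, 2]
y_pred = [2, 1, 1, 0, 2, 2]
print(accuracy(y_true, y_pred), macro_f1(y_true, y_pred))
```

Macro-F1 weights each polarity equally, so it is more sensitive than accuracy to errors on a rare class such as neutral.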
Claims
1. A fine-grained sentiment analysis method based on the BERT model and attention-focusing network, characterized in that it includes the following steps: S1: obtain the sentence to be subjected to fine-grained sentiment analysis and its corresponding aspect words and, after preprocessing, obtain the word vector representation of each word in the text; S2: input the word vector representation sequence obtained in S1 into the BERT neural network model for processing to obtain the semantic information of each word vector after integration with the context information; S3: analyze and process the semantic information obtained in S2 using an attention-focusing network layer constructed based on the attention mechanism to extract key information, specifically: the semantic information extracted by the BERT model is input into the attention-focusing network layer constructed based on the attention mechanism for information extraction, and the specific structure of the attention-focusing network layer is as follows: in the first layer, the semantic information undergoes a linear transformation through a fully connected layer with no bias term, followed by the Sigmoid activation function; in the second layer, the first-layer output undergoes a linear transformation through a fully connected layer with no bias term, followed by the Softmax activation function; in the third layer, the second-layer output is matrix-multiplied with the semantic information extracted by the BERT model, and the product undergoes a linear transformation through a fully connected layer with no bias term, followed by the Tanh activation function; the weight matrices of the three fully connected layers are trainable parameters; the dimension of size 1 is removed from the third-layer output to obtain the key information; S4: input the key information extracted in S3 into the fully connected layer for fine-grained sentiment prediction and obtain the analysis results.
2. The fine-grained sentiment analysis method based on the BERT model and attention-focusing network according to claim 1, characterized in that, in S1, the word vector representation of each word in the text is obtained as follows: the sentence to be analyzed and its corresponding aspect words are concatenated into an input text sequence in the form of "[CLS] + sentence to be analyzed + [SEP] + aspect word + [SEP]", where [CLS] is used as the text start symbol and [SEP] is used as the text separator and end symbol; the BERT model is then used to transform the input text sequence of length x into a vector representation to obtain the word vector representation s of each word in the text.
3. The fine-grained sentiment analysis method based on the BERT model and attention-focusing network according to claim 1, characterized in that, in S2, the specific method is as follows: the word vector representation s of each word in the text is input into the BERT model for processing, and the hidden state of the last layer of the BERT model is obtained as the semantic information $H^{x \times h}$ of each word vector after integration with the context information, where h is the hidden-layer size of the BERT model.
4. The fine-grained sentiment analysis method based on the BERT model and attention-focusing network according to claim 1, characterized in that, in step S4, the specific method is as follows: the key information obtained in step S3 is input into a fully connected layer to obtain an output vector of dimension p, where p represents the number of different sentiment polarities included in the sentiment analysis task; in the corresponding expression, the weight of the fully connected layer is a trainable parameter and is accompanied by a bias term, and y represents the sentiment polarity predicted by the model.
5. The fine-grained sentiment analysis method based on the BERT model and attention-focusing network according to claim 1, characterized in that the strategy for optimizing model training is to use the Adam optimizer to train the model, use cross-entropy as the loss function during the optimization process, and introduce L2 regularization to prevent the model from overfitting.