A sentiment analysis method, apparatus, electronic device, and storage medium

By calculating word co-occurrence feature sets and optimizing text information using neural network algorithms, this method solves the problems of time-consuming, labor-intensive, and low-accuracy existing sentiment analysis methods, and achieves fast and accurate sentiment analysis.

CN116226379B · Active · Publication Date: 2026-06-30 · EAST CHINA UNIV OF SCI & TECH


Patent Information

Authority / Receiving Office: CN · China
Patent Type: Patents(China)
Current Assignee / Owner: EAST CHINA UNIV OF SCI & TECH
Filing Date: 2023-01-18
Publication Date: 2026-06-30


IPC: G06F16/353; G06F16/334; G06F18/22; G06F18/2415; G06N3/0464; G06N3/045; G06N3/047

AI Tagging

Technology Topics

Feature set; Degree of similarity

Technical Efficacy Phrases

Improve similarity accuracy; High precision

AI Technical Summary

Technical Problem

Existing sentiment analysis methods are time-consuming and labor-intensive, require specialized terminology for specific domains, have poor classification results, have high data quality requirements, and data problems lead to inaccurate analysis, making it impossible to provide sentiment analysis quickly and accurately.

Method used

By calculating the co-occurrence similarity and semantic similarity of any two words, a word co-occurrence feature set is constructed. Then, by combining neural network algorithms and attention mechanisms to optimize text information, sentiment analysis is performed.

Benefits of technology

It provides rapid and accurate sentiment analysis, improves the accuracy of similarity between aspect words, forms a unified framework, and solves the sentiment analysis task of multiple entities and multiple aspects.



Abstract

This invention relates to the field of artificial intelligence and discloses a sentiment analysis method, apparatus, electronic device, and storage medium. In this invention, the co-occurrence similarity of any two words in the text to be analyzed is calculated based on their frequency of simultaneous occurrence. A word co-occurrence feature set is constructed based on the co-occurrence similarity and semantic similarity. The text information is optimized using an attention mechanism via a neural network algorithm, and sentiment analysis is performed based on the optimized text information and the word co-occurrence feature set. Through this method, all aspect words and their corresponding sentiment polarities in the text are fully extracted. Sentiment analysis targeting aspect categories and aspect words is combined, addressing the phenomenon in real-life texts where multiple entities and aspects express different emotions. This provides rapid and accurate sentiment analysis for every combination contained in the text.


Description

Technical Field

[0001] The embodiments of the present invention relate to the field of artificial intelligence, and in particular to a sentiment analysis method, apparatus, electronic device and storage medium.

Background Technology

[0002] Ma et al. proposed using a sentiment dictionary approach for sentiment analysis tasks. This involves summarizing existing text to construct a sentiment dictionary or rules, leveraging the characteristics of words to apply to the text to be analyzed, and obtaining the corresponding sentiment polarity. The general process is as follows: first, preprocess the input data (data cleaning, stop word removal, etc.); then, segment the data; next, input different types of words into the model for training; and finally, obtain the corresponding sentiment output based on the judgment rules. Machine learning-based text sentiment analysis can also effectively solve the problem of time-consuming and labor-intensive sentiment dictionary methods.

[0003] The inventors discovered at least the following problems with related technologies: using sentiment dictionaries is time-consuming and labor-intensive, requiring the addition of specialized vocabulary for specific domains; otherwise, classification results will be poor. Machine learning-based text sentiment analysis has excessively high requirements for data quality; performance can vary significantly depending on the quality of the data. Current online comment information suffers from high feature dimensionality, significant data imbalance, and severe data gaps. A comprehensive comparison of existing methods based on sentiment dictionaries and rules, machine learning, and deep learning reveals that they cannot quickly and accurately provide sentiment analysis for information with high redundancy, broad content, and diverse formats.

Summary of the Invention

[0004] The purpose of this invention is to provide a sentiment analysis method, apparatus, electronic device, and storage medium that fully extracts all aspect words and their corresponding sentiment polarities from a text, combining aspect-category sentiment analysis with aspect-word sentiment analysis to form a unified framework that simultaneously solves two aspect-level sentiment analysis tasks, providing rapid and accurate sentiment analysis.

[0005] To address the aforementioned technical problems, embodiments of the present invention provide a sentiment analysis method, comprising: acquiring a text to be analyzed; calculating the co-occurrence similarity of two words based on their frequency of simultaneous occurrence; calculating the semantic similarity of the two words; calculating the similarity weight of the two words based on the co-occurrence similarity and semantic similarity; constructing a word co-occurrence feature set based on the similarity weight; calculating text information of the text to be analyzed using a neural network algorithm; optimizing the text information using an attention mechanism; and performing sentiment analysis based on the optimized text information and the word co-occurrence feature set.

[0006] An embodiment of the present invention also provides a sentiment analysis device, comprising: a word co-occurrence feature set construction module, configured to acquire a text to be analyzed, calculate the co-occurrence similarity of two words based on the frequency of simultaneous occurrence of any two words in the text to be analyzed, calculate the semantic similarity of the two words, calculate the similarity weight of the two words based on the co-occurrence similarity and semantic similarity, and construct a word co-occurrence feature set based on the similarity weight; an attention optimization module, configured to calculate the text information of the text to be analyzed according to a neural network algorithm, and optimize the text information by an attention mechanism; and a sentiment analysis module, configured to perform sentiment analysis based on the optimized text information and the word co-occurrence feature set.

[0007] Embodiments of the present invention also provide an electronic device, comprising: at least one processor; and a memory communicatively connected to the at least one processor; wherein the memory stores instructions executable by the at least one processor, the instructions being executed by the at least one processor to enable the at least one processor to perform the aforementioned sentiment analysis method.

[0008] Embodiments of the present invention also provide a computer-readable storage medium storing a computer program which, when executed by a processor, implements the above-described sentiment analysis method.

[0009] In this embodiment of the invention, the text to be analyzed is acquired. Based on the frequency of any two words appearing simultaneously in the text, the co-occurrence similarity of the two words is calculated, and the semantic similarity of the two words is also calculated. A similarity weight is calculated based on the co-occurrence and semantic similarities, and a word co-occurrence feature set is constructed based on the similarity weight. The text information of the text to be analyzed is calculated using a neural network algorithm, and the text information is optimized using an attention mechanism. Sentiment analysis is then performed based on the optimized text information and the word co-occurrence feature set. Through this method, all aspect words and their corresponding sentiment polarities in the text are fully extracted. Sentiment analysis targeting aspect categories and sentiment analysis targeting aspect words are combined to form a unified framework that simultaneously solves two aspect-level sentiment analysis tasks. This provides rapid and accurate sentiment analysis for every combination of entities and aspects expressing different emotions in real-life texts.

[0010] Furthermore, calculating the semantic similarity between the two words includes: obtaining the semantics of the two words respectively through the Word2vec word vector model, and calculating the semantic similarity between the two words according to the cosine similarity calculation formula. This further improves the accuracy of similarity between aspect words, thereby improving the overall accuracy.

[0011] Furthermore, any first word in the text to be analyzed is selected, and the word corresponding to the largest similarity weight related to the first word is selected as the second word. A word co-occurrence feature group is constructed based on the first word and the second word, and a word co-occurrence feature set is constructed based on all word co-occurrence feature groups in the text to be analyzed. This further forms a unified framework to simultaneously solve two aspect-level sentiment analysis tasks. The word co-occurrence feature set improves data correlation.

[0012] In addition, the text information of the text to be analyzed includes: contextual semantic information and textual aspect word representation information.

[0013] In addition, the neural network algorithm includes: BiLSTM neural network algorithm.

[0014] Furthermore, the optimization of the text information using an attention mechanism includes: transforming elements in the text information using an attention mechanism, calculating attention weights for the elements, and optimizing the text information based on the transformed elements and the attention weights. The self-attention mechanism can reduce the losses caused by high-dimensional computation and obtain an optimized representation of a specific aspect.

[0015] In addition, the sentiment analysis based on the optimized text information and the word co-occurrence feature set includes: combining the optimized text information and the word co-occurrence feature set, inputting them into a convolutional neural network model, inputting the feature vector obtained by the convolutional neural network model into a Softmax classifier for classification, and performing the sentiment analysis based on the classification result.

Attached Figure Description

[0016] One or more embodiments are illustrated by way of example with reference numerals in the accompanying drawings. These illustrations do not constitute a limitation on the embodiments. Elements with the same reference numerals in the drawings are denoted as similar elements. Unless otherwise stated, the figures in the drawings are not to be limited by scale.

[0017] Figure 1 is a flowchart of a sentiment analysis method according to an embodiment of the present invention;

[0018] Figure 2 is a schematic diagram of a fine-grained sentiment analysis process provided according to an embodiment of the present invention;

[0019] Figure 3 is a schematic diagram of the process for constructing a word co-occurrence feature set according to an embodiment of the present invention;

[0020] Figure 4 is a schematic diagram of a research scheme and technical route provided according to an embodiment of the present invention;

[0021] Figure 5 is a schematic diagram of the structure of a sentiment analysis device according to another embodiment of the present invention;

[0022] Figure 6 is a schematic diagram of the structure of an electronic device according to another embodiment of the present invention.

Detailed Implementation

[0023] To make the objectives, technical solutions, and advantages of the embodiments of the present invention clearer, the various embodiments of the present invention will be described in detail below with reference to the accompanying drawings. Those skilled in the art will understand that many technical details are presented in the various embodiments to facilitate a better understanding of this application. However, the technical solutions claimed in this application can be implemented even without these technical details, and with various changes and modifications based on the following embodiments. The division of the various embodiments below is for ease of description and should not constitute any limitation on the specific implementation of the present invention. The various embodiments can be combined with and referenced by each other without contradiction.

[0024] One embodiment of the present invention relates to a sentiment analysis method applicable to terminal devices such as mobile phones and computers. In this embodiment, the method involves acquiring a text to be analyzed, calculating the co-occurrence similarity of any two words in the text based on their frequency of simultaneous occurrence, calculating the semantic similarity of the two words, calculating the similarity weight of the two words based on the co-occurrence and semantic similarities, constructing a word co-occurrence feature set based on the similarity weight, calculating the text information of the text to be analyzed using a neural network algorithm, optimizing the text information through an attention mechanism, and performing sentiment analysis based on the optimized text information and the word co-occurrence feature set. Through this method, all aspect words and their corresponding sentiment polarities in the text are fully extracted, combining sentiment analysis based on aspect categories and sentiment analysis based on aspect words to form a unified framework that simultaneously solves two aspect-level sentiment analysis tasks. This approach addresses the phenomenon in real-life texts where multiple entities and aspects express different emotions, outputting the sentiment for every combination contained in the text, providing rapid and accurate sentiment analysis. The following is a detailed description of the implementation details of the sentiment analysis method in this embodiment. The following content is only for the convenience of understanding and is not necessary for implementing this solution.

[0025] As shown in Figure 1, in step 101, the text to be analyzed is obtained so that the co-occurrence similarity of two words can be calculated in step 102. "Co-occurrence" refers to the phenomenon that certain words always appear together in the text. "Co-occurrence analysis" can uncover hidden information about the association between features and content in the text: statistics on the co-occurrence of noun pairs in the text reflect the strength of the association between these words. The fine-grained sentiment analysis process is shown in Figure 2. Take the sentence "The restaurant's environment is terrible, but the food is delicious." The first step is to extract the aspect words (restaurant and food) from the sentence. The second step is to capture the sentiment polarity of the corresponding aspect words: for the aspect category "restaurant" the sentiment polarity is negative, while for the aspect category "food" it is positive. In restaurant reviews, for example, "food" always appears alongside words like "delicious," "bad," and "average." Statistical analysis of the frequency of these co-occurring words helps the model understand the text data and improves classification performance. This application uses word frequency to measure the degree of co-occurrence association between words in a document, based on the frequency of any two words appearing simultaneously in the text to be analyzed, as shown in the following formula:

[0026]

[0027] Here, $Similarity(w_i, w_j)$ denotes the co-occurrence similarity between the words $w_i$ and $w_j$, $N_{doc}(w_i, w_j)$ denotes the number of times $w_i$ and $w_j$ occur simultaneously, and $N_{doc}(w_i)$ denotes the number of times the word $w_i$ appears in the document.

[0028] By removing stop words and performing word segmentation, the text data is transformed into a text-word matrix. Then, the number of all words, n, is counted to construct an n×n matrix. Finally, the co-occurrence similarity between any two words is calculated using the formula above.
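The counting step above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the source formula is not reproduced in the text, so the ratio $N_{doc}(w_i, w_j) / N_{doc}(w_i)$ is assumed from the symbol definitions, counts are taken per document, and the function name `cooccurrence_similarity` and the toy reviews are hypothetical.

```python
from collections import Counter
from itertools import combinations


def cooccurrence_similarity(docs):
    """Return {(w_i, w_j): similarity} for tokenized, stop-word-free documents.

    Assumption: Similarity(w_i, w_j) = N_doc(w_i, w_j) / N_doc(w_i), where both
    counts are numbers of documents. The patent text only defines the symbols.
    """
    word_docs = Counter()  # N_doc(w): documents containing w
    pair_docs = Counter()  # N_doc(w_i, w_j): documents containing both words
    for tokens in docs:
        vocab = sorted(set(tokens))
        word_docs.update(vocab)
        pair_docs.update(combinations(vocab, 2))

    sim = {}
    for (a, b), n_ab in pair_docs.items():
        # The measure is asymmetric because the denominator depends on w_i.
        sim[(a, b)] = n_ab / word_docs[a]
        sim[(b, a)] = n_ab / word_docs[b]
    return sim


# Hypothetical toy corpus of three segmented reviews.
docs = [
    ["food", "delicious"],
    ["food", "bad", "service"],
    ["food", "delicious", "service"],
]
sim = cooccurrence_similarity(docs)
print(sim[("food", "delicious")])   # 2 of the 3 documents with "food"
print(sim[("delicious", "food")])   # every document with "delicious"
```

Storing the result as a dictionary keyed by ordered word pairs is a sparse stand-in for the n×n matrix; a dense matrix would work equally well for small vocabularies.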

[0029] In step 103, the semantic similarity between the two words is calculated. In one example, this includes obtaining the semantics of the two words respectively through the Word2vec word vector model, and then calculating the semantic similarity between the two words according to the cosine similarity calculation formula. Word2vec maps high-dimensional words to a low-dimensional real number space, with each dimension corresponding to a layer of semantics of the word.

[0030] The formula for calculating cosine similarity is as follows:

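[0031] $d_{semantic}(w_i, w_j) = \dfrac{\mathbf{v}_i \cdot \mathbf{v}_j}{\lVert \mathbf{v}_i \rVert \, \lVert \mathbf{v}_j \rVert}$

[0032] Here, $d_{semantic}(w_i, w_j)$ is the cosine similarity of $w_i$ and $w_j$, which is used as the semantic similarity, and $\mathbf{v}_i$ and $\mathbf{v}_j$ denote the Word2vec vectors of $w_i$ and $w_j$.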
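As a minimal sketch, cosine similarity between two word vectors can be computed as follows. The three-dimensional vectors are hypothetical stand-ins for Word2vec embeddings, which in practice would come from a trained model; the helper name `cosine_similarity` is likewise illustrative.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0  # convention chosen here for zero vectors
    return dot / (norm_u * norm_v)


# Hypothetical toy embeddings, not real Word2vec output.
vectors = {
    "food": [0.9, 0.1, 0.3],
    "delicious": [0.8, 0.2, 0.4],
    "laptop": [-0.2, 0.9, 0.1],
}
d_food_delicious = cosine_similarity(vectors["food"], vectors["delicious"])
d_food_laptop = cosine_similarity(vectors["food"], vectors["laptop"])
```

With these toy values, "food" is closer to "delicious" than to "laptop", which is the behavior the semantic similarity term is meant to capture.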

[0033] In step 104, the similarity weight of the two words is calculated from the co-occurrence similarity and the semantic similarity. The semantic similarity is applied as a weight to the co-occurrence similarity, giving the weight $w_{ij}$ of the co-occurrence matrix, which is used to measure the degree of connection between words. The formula is as follows:

[0034] $Weight(w_{ij}) = d_{semantic}(w_i, w_j) \times Similarity(w_i, w_j)$

[0035] In step 105, a word co-occurrence feature set is constructed.

[0036] In one example, constructing a word co-occurrence feature set based on the similarity weight includes: selecting any first word in the text to be analyzed, and selecting the word corresponding to the largest similarity weight related to the first word as the second word; constructing a word co-occurrence feature group based on the first word and the second word; and constructing the word co-occurrence feature set based on all the word co-occurrence feature groups in the text to be analyzed.

[0037] Since an n×n word co-occurrence matrix has already been constructed from the dataset, for any word $w_i$ we can find, among all candidate words, the word $w_j$ with the highest co-occurrence matrix weight $w_{ij}$, and take $(w_i, w_j)$ as a word co-occurrence feature of the text. All the word co-occurrence feature groups together construct the word co-occurrence feature set.
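The weighting and selection steps can be illustrated with a short sketch. It assumes both similarities are already available as dictionaries keyed by ordered word pairs; the function name `build_feature_set`, the tie-breaking rule (first pair encountered wins), and the toy numbers are assumptions made for the example.

```python
def build_feature_set(cooc_sim, sem_sim):
    """Pick, for each word w_i, the partner w_j with the largest weight.

    Weight(w_ij) = d_semantic(w_i, w_j) * Similarity(w_i, w_j), as in the
    formula above. Missing semantic similarities are treated as 0.0, which is
    a choice made for this sketch only.
    """
    best = {}  # w_i -> (w_j, weight)
    for (wi, wj), s in cooc_sim.items():
        weight = sem_sim.get((wi, wj), 0.0) * s
        if wi not in best or weight > best[wi][1]:
            best[wi] = (wj, weight)
    # Each (w_i, w_j) pair is one word co-occurrence feature group.
    return {(wi, wj) for wi, (wj, _) in best.items()}


# Hypothetical toy inputs.
cooc = {
    ("food", "delicious"): 2 / 3,
    ("food", "bad"): 1 / 3,
    ("delicious", "food"): 1.0,
    ("bad", "food"): 1.0,
}
sem = {
    ("food", "delicious"): 0.9,
    ("food", "bad"): 0.8,
    ("delicious", "food"): 0.9,
    ("bad", "food"): 0.8,
}
features = build_feature_set(cooc, sem)
```

Here "food" is paired with "delicious" because 0.9 × 2/3 exceeds 0.8 × 1/3, while "delicious" and "bad" are each paired with "food", their only candidate.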

[0038] In one example, the process of constructing a word co-occurrence feature set is shown in Figure 3. The text sequence is processed through stop word removal and word segmentation to determine the analysis objects, such as word expansion, high-frequency words, emerging words, and topic words. When constructing the co-occurrence matrix, TF-IDF, TextRank, K-means algorithm analysis, factor analysis, and other operations are performed to calculate the co-occurrence similarity. After the semantic similarity is calculated, dependency analysis is performed to generate a text-feature set (the word co-occurrence feature set).

[0039] In step 106, the text information of the text to be analyzed is calculated according to a neural network algorithm.

[0040] In one example, the text information of the text to be analyzed includes: contextual semantic information and textual aspect word representation information.

[0041] In one example, the neural network algorithm includes: the BiLSTM neural network algorithm.

[0042] The vector matrix is input into a BiLSTM, which consists of two LSTM networks that read the input text in opposite directions. Text information is thus obtained from both directions, yielding the contextual semantic information and the word representation information of the text. A key feature of the LSTM is its three gates: the input gate, the forget gate, and the output gate. These gates allow the network to discard parts of its memory and selectively retain useful information.

[0043] The calculation of input information by neurons in LSTM is shown in the following formula:

[0044] $f_t = \sigma(W_f \cdot [h_{t-1}, x_t] + b_f)$

[0045] $i_t = \sigma(W_i \cdot [h_{t-1}, x_t] + b_i)$

[0046] $\tilde{C}_t = \tanh(W_C \cdot [h_{t-1}, x_t] + b_c)$

[0047] $C_t = f_t * C_{t-1} + i_t * \tilde{C}_t$

[0048] $o_t = \sigma(W_O \cdot [h_{t-1}, x_t] + b_o)$

[0049] $h_t = o_t * \tanh(C_t)$

[0050] Here, $x_t$ is the input at time t, $h_{t-1}$ is the hidden state at time t-1, $W_f$, $W_i$, $W_C$, $W_O$ are the weight matrices of the LSTM, and $b_f$, $b_i$, $b_c$, $b_o$ are the biases of the LSTM.
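A single LSTM step and a bidirectional pass can be sketched in plain Python as below. The candidate-state and cell-state updates follow the standard LSTM formulation, which is an assumption here because the source text does not reproduce those two formulas; the helper names (`lstm_step`, `run_lstm`, `bilstm`) and the parameter dictionary layout are illustrative only.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def affine(W, v, b):
    """Return W @ v + b for a matrix stored as a list of rows."""
    return [sum(w * x for w, x in zip(row, v)) + bi for row, bi in zip(W, b)]


def lstm_step(x_t, h_prev, c_prev, p):
    """One LSTM time step; p holds W_f, W_i, W_C, W_O and b_f, b_i, b_c, b_o."""
    z = h_prev + x_t  # list concatenation plays the role of [h_{t-1}, x_t]
    f_t = [sigmoid(a) for a in affine(p["W_f"], z, p["b_f"])]      # forget gate
    i_t = [sigmoid(a) for a in affine(p["W_i"], z, p["b_i"])]      # input gate
    c_hat = [math.tanh(a) for a in affine(p["W_C"], z, p["b_c"])]  # candidate
    o_t = [sigmoid(a) for a in affine(p["W_O"], z, p["b_o"])]      # output gate
    c_t = [f * c + i * g for f, c, i, g in zip(f_t, c_prev, i_t, c_hat)]
    h_t = [o * math.tanh(c) for o, c in zip(o_t, c_t)]
    return h_t, c_t


def run_lstm(xs, p, hidden):
    """Run the cell over a sequence, starting from zero states."""
    h, c = [0.0] * hidden, [0.0] * hidden
    outputs = []
    for x in xs:
        h, c = lstm_step(x, h, c, p)
        outputs.append(h)
    return outputs


def bilstm(xs, p_fwd, p_bwd, hidden):
    """Concatenate forward and backward hidden states per position."""
    fwd = run_lstm(xs, p_fwd, hidden)
    bwd = run_lstm(xs[::-1], p_bwd, hidden)[::-1]
    return [hf + hb for hf, hb in zip(fwd, bwd)]
```

A trained model would learn the weights; a framework implementation would normally replace this hand-written cell in practice.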

[0051] Since the text information obtained through neural network algorithms, such as semantic information of context and word representation information of text, lacks specific attention to the sentence, an attention mechanism is used to optimize the text information in step 107.

[0052] In one example, optimizing the text information using an attention mechanism includes: transforming elements in the text information using an attention mechanism, calculating the attention weight of the elements, and optimizing the text information based on the transformed elements and the attention weights.

[0053] To obtain a representation of a specific aspect, the sentence information is fed into an attention mechanism, which yields an optimized representation of that aspect. Given a specific element (Query) in the target sentence and the input sequence (Source), the attention mechanism transforms each element of the input sequence into <Key, Value> form. By calculating the similarity between the Query and each Key, the attention weight of each Key is obtained. Multiplying these weights by the Values corresponding to the respective Keys yields the output of the attention mechanism. This reduces the loss caused by high-dimensional computation. The calculation formula is shown below:

[0054]

[0055] Here, softmax() is the normalized exponential function, Q is the query vector of the word, K is the "queried" (key) vector, and V is the content (value) vector. When K = Q = V, the mechanism is a self-attention mechanism.
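The Query/Key/Value computation can be sketched as follows. Because the formula itself is not reproduced in the source text, this sketch assumes the widely used scaled dot-product form, softmax(QKᵀ/√d_k)V; the scaling factor and the function names are assumptions, not details confirmed by the patent.

```python
import math


def softmax(scores):
    """Numerically stable normalized exponential."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention(Q, K, V):
    """Scaled dot-product attention over lists of vectors (assumed form)."""
    d_k = len(K[0])
    outputs = []
    for q in Q:
        # Similarity between the query and every key.
        scores = [sum(a * b for a, b in zip(q, k)) / math.sqrt(d_k) for k in K]
        weights = softmax(scores)
        # Weighted sum of the values.
        outputs.append(
            [sum(w * v[j] for w, v in zip(weights, V)) for j in range(len(V[0]))]
        )
    return outputs


# Self-attention is the special case Q = K = V.
H = [[1.0, 0.0], [0.0, 1.0]]
self_attended = attention(H, H, H)

# A zero query scores every key equally, so the output is the mean of the values.
uniform = attention([[0.0, 0.0]], [[1.0, 0.0], [0.0, 1.0]], [[1.0, 2.0], [3.0, 4.0]])
```

A multi-head variant would run several such attentions on projected copies of the inputs and concatenate the results.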

[0056] In step 108, sentiment analysis is performed based on the optimized text information and the word co-occurrence feature set.

[0057] In one example, the sentiment analysis based on the optimized text information and the word co-occurrence feature set includes: combining the optimized text information and the word co-occurrence feature set, inputting them into a convolutional neural network (CNN) model, inputting the feature vector obtained by the CNN model into a Softmax classifier for classification, and performing the sentiment analysis based on the classification result.
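A toy version of this final stage is sketched below: per-token features are concatenated, passed through one 1-D convolution with ReLU and max pooling, and classified with Softmax. How the two inputs are combined is not specified in the text, so per-token concatenation is an assumption, as are the function names, the single convolution layer, and all weights.

```python
import math


def softmax(z):
    m = max(z)
    exps = [math.exp(a - m) for a in z]
    total = sum(exps)
    return [e / total for e in exps]


def conv1d_maxpool(seq, kernels, bias):
    """One max-pooled ReLU activation per filter.

    seq: list of feature vectors; each kernel is a list of `width` weight
    vectors. Assumes len(seq) >= width for every kernel.
    """
    pooled = []
    for kernel, b in zip(kernels, bias):
        width = len(kernel)
        acts = []
        for start in range(len(seq) - width + 1):
            window = seq[start:start + width]
            z = b + sum(
                w * x for wv, xv in zip(kernel, window) for w, x in zip(wv, xv)
            )
            acts.append(max(0.0, z))
        pooled.append(max(acts))
    return pooled


def classify(text_repr, cooc_repr, kernels, bias, W_out, b_out, labels):
    # Assumed fusion: concatenate the two feature vectors token by token.
    seq = [t + c for t, c in zip(text_repr, cooc_repr)]
    feat = conv1d_maxpool(seq, kernels, bias)
    logits = [sum(w * f for w, f in zip(row, feat)) + b for row, b in zip(W_out, b_out)]
    probs = softmax(logits)
    return labels[probs.index(max(probs))], probs


# Hypothetical toy inputs and hand-picked weights.
label, probs = classify(
    text_repr=[[1.0], [0.0], [2.0]],
    cooc_repr=[[0.0], [1.0], [1.0]],
    kernels=[[[1.0, 0.0], [0.0, 1.0]]],
    bias=[0.0],
    W_out=[[1.0], [0.0], [-1.0]],
    b_out=[0.0, 0.0, 0.0],
    labels=["positive", "neutral", "negative"],
)
```

The three labels mirror the positive, negative, and neutral polarities used in the experiments; a real model would learn the kernels and output weights from data.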

[0058] In one example, the research plan and overall technical route are shown in Figure 4. During text feature extraction, the text sequence undergoes a series of steps, including identifying the analysis object, constructing a co-occurrence matrix, and performing co-occurrence analysis, to obtain a word co-occurrence feature set. This co-occurrence feature set is then input into a modified multi-feature fusion neural network. The semantics of the two words are obtained through a Word2vec word vector model and input into a BiLSTM neural network. Contextual semantic information and textual aspect word representation information are selected, and the text information is optimized using a multi-head attention mechanism. The optimized text information and the word co-occurrence feature set are then combined and input into a CNN model. The feature vectors obtained through the CNN model are then input into a Softmax classifier for classification. Aspect-level sentiment analysis is performed based on the classification results.

[0059] To evaluate the performance of the algorithm and model in this application, tests were conducted on the publicly available dataset SemEval2014 Task 4. SemEval2014 Task 4 includes two datasets: Laptop and Restaurant. The data samples in both datasets include four sentiment polarities: positive, negative, neutral, and conflicting. This application removes conflicting samples and considers only the positive, negative, and neutral polarities. Experimental data statistics are shown in Table 1.

[0060]

[0061] Table 1. Statistical Analysis of Experimental Data

[0062] This application uses accuracy as the performance indicator of the experiment, and its calculation formula is as follows:

[0063]

[0064]

[0065]

[0066] Among them, TP (True Positive) means that the sample is positive and the prediction result is also positive, TN (True Negative) means that the sample is negative and the prediction result is negative, FN (False Negative) means that the sample is positive but the prediction result is negative, and FP (False Positive) means that the sample is negative but the prediction result is positive.
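For reference, accuracy in terms of these four counts is conventionally (TP + TN) / (TP + TN + FP + FN). The sketch below uses that standard definition, which is assumed here because the source formulas are not reproduced in the text; the multi-class helper and the sample counts are illustrative and are not results from the patent.

```python
def accuracy(tp, tn, fp, fn):
    """Fraction of correctly classified samples from confusion counts."""
    total = tp + tn + fp + fn
    if total == 0:
        raise ValueError("accuracy is undefined for zero samples")
    return (tp + tn) / total


def accuracy_from_labels(y_true, y_pred):
    """Multi-class form: share of predictions equal to the gold label."""
    if not y_true or len(y_true) != len(y_pred):
        raise ValueError("label lists must be non-empty and of equal length")
    correct = sum(1 for t, p in zip(y_true, y_pred) if t == p)
    return correct / len(y_true)


# Made-up counts and labels for illustration only.
acc = accuracy(tp=40, tn=45, fp=5, fn=10)
acc_multi = accuracy_from_labels(
    ["positive", "negative", "neutral", "positive"],
    ["positive", "negative", "positive", "positive"],
)
```

The label-based form is the one that applies directly to the three-class setting (positive, negative, neutral) described for the experiments.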

[0067] This application can also use other neural network models for the calculation. Based on existing literature, the following models are used for comparative experiments. The experimental results are shown in Table 2.

[0068] (1) LSTM: Using a single LSTM to concatenate the word vectors of the aspect words and the text word vectors to obtain the hidden layer state, and finally inputting the hidden layer vector into the classifier to obtain the classification result.

[0069] (2) CNN: Using a regular CNN model, the word vectors of the aspect words and the text word vectors are concatenated and input into the CNN. After convolution, pooling and other operations, the input is fed into the classifier to obtain the classification result.

[0070] (3) ATAE-LSTM: The word vectors of specific aspects and the word vectors of context are concatenated and input into the LSTM as a fusion vector to obtain the hidden layer state. An attention mechanism is introduced to assign greater weights to important features, and finally the result is fed into the classifier.

[0071] (4) GCAE: This model inputs aspect words and contextual information into the CNN and introduces a gating mechanism to selectively output sentiment features based on different aspect words, thereby optimizing the classification effect.

[0072] (5) IAN: Two LSTMs are used to model the aspect words and the text respectively, and two hidden layer states are obtained. The average value is calculated, and the relationship between aspect words and context is learned by using the attention mechanism. Finally, the data is input into the classifier for sentiment classification.

[0073] (6) RAM: Input the text into BiLSTM for encoding, and combine the resulting hidden layer state with the attention mechanism to capture long-range sentiment features in the text for prediction.

[0074]

[0075] Table 2 Experimental results of different models

[0076] As shown above, the model proposed in this application outperforms the other baseline models on both datasets. Furthermore, the IAN and ATAE-LSTM models, which introduce attention mechanisms on top of LSTM, outperform the simple LSTM model, demonstrating that the attention mechanism can improve the model's classification performance. However, these two models do not fully extract local features of the text and do not utilize the advantages of CNN models, thus limiting their classification capabilities. Therefore, this application analyzes and verifies the impact of the number of CNN layers on aspect-level sentiment analysis performance. The experimental results are shown in Table 3.

[0077]

[0078] Table 3. Impact of CNN Layer Number on Accuracy

[0079] As shown in Table 3, the model performs best when the CNN has 2 layers. When the CNN has 1 layer, the poor performance may be due to insufficient extraction of local feature information. When the CNN has 3 layers, the model becomes more complex with more parameters, increasing the likelihood of overfitting.

[0080] In this embodiment, the text to be analyzed is acquired. Based on the frequency of any two words appearing simultaneously in the text, the co-occurrence similarity of the two words is calculated, and the semantic similarity of the two words is also calculated. A similarity weight is calculated based on the co-occurrence and semantic similarities, and a word co-occurrence feature set is constructed based on the similarity weight. The text information of the text to be analyzed is calculated using a neural network algorithm, and the text information is optimized using an attention mechanism. Sentiment analysis is then performed based on the optimized text information and the word co-occurrence feature set. Through this method, all aspect words and their corresponding sentiment polarities in the text are fully extracted. Sentiment analysis targeting aspect categories and sentiment analysis targeting aspect words are combined to form a unified framework that simultaneously solves two aspect-level sentiment analysis tasks. This approach addresses the phenomenon in real-life texts where multiple entities and aspects express different emotions, outputting the sentiment for every combination contained in the text, providing rapid and accurate sentiment analysis.

[0081] The steps described above are for clarity only. In practice, they can be combined into one step or some steps can be broken down into multiple steps. As long as they include the same logical relationship, they are all within the scope of protection of this patent. Adding insignificant modifications or introducing insignificant designs to the algorithm or process, but without changing the core design of the algorithm and process, are also within the scope of protection of this patent.

[0082] Another embodiment of the present invention relates to a sentiment analysis device which, as shown in Figure 5, includes:

[0083] The word co-occurrence feature set construction module 501 is used to acquire the text to be analyzed, calculate the co-occurrence similarity of any two words based on the frequency with which the two words appear simultaneously in the text to be analyzed, calculate the semantic similarity of the two words, calculate the similarity weight of the two words based on the co-occurrence similarity and the semantic similarity, and construct the word co-occurrence feature set based on the similarity weight;

[0084] The attention optimization module 502 is used to calculate the text information of the text to be analyzed according to the neural network algorithm, and optimize the text information by the attention mechanism;

[0085] The sentiment analysis module 503 is used to perform sentiment analysis based on the optimized text information and the word co-occurrence feature set.

[0086] In one example, calculating the semantic similarity between the two words includes: obtaining the semantics of the two words respectively through the Word2vec word vector model, and calculating the semantic similarity between the two words according to the cosine similarity calculation formula.
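As an illustration of the cosine similarity step, the following minimal Python sketch compares two word vectors. It assumes the vectors have already been looked up from a trained Word2vec model; the function name and the zero-vector convention are choices made here for illustration, not details from the patent.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length vectors."""
    if len(u) != len(v):
        raise ValueError("vectors must have the same length")
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0  # convention chosen here for zero vectors
    return dot / (norm_u * norm_v)
```

The result lies between -1 and 1; vectors pointing in the same direction score close to 1 regardless of their lengths.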

[0087] In one example, constructing a word co-occurrence feature set based on the similarity weight includes: selecting any first word in the text to be analyzed, and selecting the word corresponding to the largest similarity weight related to the first word as the second word; constructing a word co-occurrence feature group based on the first word and the second word; and constructing the word co-occurrence feature set based on all the word co-occurrence feature groups in the text to be analyzed.
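The selection rule in this example can be sketched in Python as follows. The sketch assumes the similarity weights are already available as a dictionary keyed by unordered word pairs; the function name `build_feature_set`, the `frozenset` keys, and the tie-breaking behaviour (the first candidate in text order wins) are assumptions made for illustration.

```python
def build_feature_set(words, weight):
    """Pair each word with the partner that has the largest similarity weight.

    `weight` maps an unordered word pair (a frozenset) to its similarity weight.
    """
    feature_set = []
    vocab = list(dict.fromkeys(words))  # unique words, original order kept
    for first in vocab:
        candidates = [
            w for w in vocab
            if w != first and frozenset((first, w)) in weight
        ]
        if not candidates:
            continue  # no scored partner for this word
        second = max(candidates, key=lambda w: weight[frozenset((first, w))])
        feature_set.append((first, second))  # one word co-occurrence feature group
    return feature_set
```

Each tuple in the returned list corresponds to one word co-occurrence feature group, and the list as a whole corresponds to the word co-occurrence feature set.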

[0088] In one example, the text information of the text to be analyzed includes: contextual semantic information and textual aspect word representation information.

[0089] In one example, the neural network algorithm includes: the BiLSTM neural network algorithm.

[0090] In one example, optimizing the text information using an attention mechanism includes: transforming elements in the text information using an attention mechanism, calculating the attention weight of the elements, and optimizing the text information based on the transformed elements and the attention weights.
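One common way to realize the three steps named in this example (transform, weight, combine) is sketched below in Python. It is an assumption-laden illustration only: the element-wise `tanh` transform, the scoring vector `w`, and the weighted sum are typical attention components, but the patent text here does not specify them, and `attention_pool` is a hypothetical name.

```python
import math


def attention_pool(hidden_states, w):
    """Attention over a sequence of hidden vectors (e.g. BiLSTM outputs).

    hidden_states: list of equal-length vectors.
    w: scoring vector of the same length (hypothetical learned parameter).
    """
    # Step 1: transform each element (element-wise tanh, an assumed choice).
    transformed = [[math.tanh(x) for x in h] for h in hidden_states]
    # Step 2: attention weights = softmax over one score per element.
    scores = [sum(wi * ui for wi, ui in zip(w, u)) for u in transformed]
    top = max(scores)  # shift by the maximum for numerical stability
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    weights = [e / total for e in exps]
    # Step 3: combine the transformed elements using the attention weights.
    dim = len(transformed[0])
    context = [
        sum(a * u[i] for a, u in zip(weights, transformed))
        for i in range(dim)
    ]
    return weights, context
```

The attention weights always sum to 1, so elements with higher scores contribute more to the combined representation while no element is discarded entirely.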

[0091] In one example, the sentiment analysis based on the optimized text information and the word co-occurrence feature set includes: combining the optimized text information and the word co-occurrence feature set, inputting them into a CNN model, inputting the feature vector obtained by the CNN model into a Softmax classifier for classification, and performing the sentiment analysis based on the classification result.
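The data flow in this example (combine, convolve, classify with Softmax) can be sketched in plain Python as follows. This is a toy illustration under several assumptions: features are scalars rather than embedding vectors, "combining" is taken to mean concatenation, there is a single convolution layer with ReLU and max pooling, and `kernels` and `class_weights` stand in for trained parameters. None of these names or values come from the patent.

```python
import math


def conv1d_max(sequence, kernel):
    """Slide `kernel` over `sequence` (no padding), apply ReLU, then max-pool.

    Requires len(sequence) >= len(kernel).
    """
    k = len(kernel)
    outputs = [
        max(0.0, sum(kernel[j] * sequence[i + j] for j in range(k)))
        for i in range(len(sequence) - k + 1)
    ]
    return max(outputs)


def softmax(logits):
    """Convert raw class scores into probabilities that sum to 1."""
    top = max(logits)  # shift by the maximum for numerical stability
    exps = [math.exp(z - top) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def classify(text_features, cooc_features, kernels, class_weights, labels):
    """Concatenate both feature lists, convolve, then apply Softmax."""
    combined = list(text_features) + list(cooc_features)  # assumed: concatenation
    pooled = [conv1d_max(combined, kernel) for kernel in kernels]
    logits = [sum(w * p for w, p in zip(row, pooled)) for row in class_weights]
    probs = softmax(logits)
    best = max(range(len(labels)), key=probs.__getitem__)
    return labels[best], probs
```

In a real system the convolution and classifier parameters would be learned during training, and a deep learning framework would replace these hand-written loops.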

[0092] In this embodiment, the text to be analyzed is acquired. The co-occurrence similarity of any two words is calculated from the frequency with which the two words appear simultaneously in the text, and the semantic similarity of the two words is also calculated. A similarity weight is calculated from the co-occurrence similarity and the semantic similarity, and a word co-occurrence feature set is constructed based on the similarity weight. The text information of the text to be analyzed is calculated using a neural network algorithm, and the text information is optimized using an attention mechanism. Sentiment analysis is then performed based on the optimized text information and the word co-occurrence feature set. In this way, all aspect words in the text and their corresponding sentiment polarities are fully extracted. Sentiment analysis targeting aspect categories and sentiment analysis targeting aspect words are combined into a unified framework that solves the two aspect-level sentiment analysis tasks simultaneously. This approach handles the common situation in real-life texts where multiple entities and aspects carry different sentiments: it outputs the sentiment for every combination contained in the text, providing rapid and accurate sentiment analysis.

[0093] This embodiment is the device embodiment corresponding to the above method embodiment, and the two can be implemented in conjunction with each other. The relevant technical details mentioned in the method embodiment remain valid in this embodiment and are not repeated here. Correspondingly, the relevant technical details mentioned in this embodiment can also be applied to the method embodiment.

[0094] It is worth mentioning that all modules involved in this embodiment are logical modules. In practical applications, a logical unit can be a physical unit, a part of a physical unit, or a combination of multiple physical units. Furthermore, to highlight the innovative aspects of this invention, this embodiment does not introduce units that are not closely related to solving the technical problem proposed by this invention; however, this does not mean that other units are absent from this embodiment.

[0095] Another embodiment of the present invention relates to an electronic device which, as shown in Figure 6, includes at least one processor 601 and a memory 602 communicatively connected to the at least one processor 601; wherein the memory 602 stores instructions executable by the at least one processor 601, and the instructions are executed by the at least one processor 601 to enable the at least one processor 601 to perform the sentiment analysis method described above.

[0096] The memory 602 and processor 601 are connected via a bus, which may include any number of interconnecting buses and bridges. The bus connects various circuits of one or more processors 601 and memory 602 together. The bus can also connect various other circuits, such as peripheral devices, voltage regulators, and power management circuits, which are well known in the art and therefore will not be described further herein. A bus interface provides an interface between the bus and the transceiver. The transceiver can be a single element or multiple elements, such as multiple receivers and transmitters, providing a unit for communicating with various other devices over a transmission medium. Data processed by processor 601 is transmitted over a wireless medium via an antenna, which further receives data and transmits it to processor 601.

[0097] Processor 601 is responsible for managing the bus and general processing, and can also provide various functions, including timing, peripheral interfaces, voltage regulation, power management, and other control functions. Memory 602 can be used to store data used by the processor during operation.

[0098] Another embodiment of the present invention relates to a computer-readable storage medium storing a computer program. When executed by a processor, the computer program implements the method embodiments described above.

[0099] That is, those skilled in the art will understand that all or part of the steps in the methods of the above embodiments can be implemented by a program instructing related hardware. This program is stored in a storage medium and includes several instructions to cause a device (which may be a microcontroller, chip, etc.) or processor to execute all or part of the steps of the methods described in the various embodiments of this application. The aforementioned storage medium includes various media capable of storing program code, such as a USB flash drive, a portable hard drive, a read-only memory (ROM), a random access memory (RAM), a magnetic disk, or an optical disk.

[0100] Those skilled in the art will understand that the above embodiments are specific examples of implementing the present invention, and in practical applications, various changes in form and detail may be made without departing from the spirit and scope of the present invention.

Claims

1. A sentiment analysis method, characterized by comprising: acquiring the text to be analyzed, calculating the co-occurrence similarity of any two words based on the frequency with which the two words appear simultaneously in the text to be analyzed, calculating the semantic similarity of the two words, calculating the similarity weight of the two words based on the co-occurrence similarity and the semantic similarity, and constructing a word co-occurrence feature set based on the similarity weight; calculating the text information of the text to be analyzed according to a neural network algorithm, optimizing the text information through an attention mechanism, and performing sentiment analysis based on the optimized text information and the word co-occurrence feature set; wherein constructing the word co-occurrence feature set based on the similarity weight includes: selecting any first word in the text to be analyzed, and selecting the word corresponding to the largest similarity weight related to the first word as the second word; constructing a word co-occurrence feature group based on the first word and the second word; and constructing the word co-occurrence feature set based on all the word co-occurrence feature groups in the text to be analyzed; and wherein performing sentiment analysis based on the optimized text information and the word co-occurrence feature set includes: combining the optimized text information and the word co-occurrence feature set, inputting them into a convolutional neural network model, inputting the feature vector obtained by the convolutional neural network model into a Softmax classifier for classification, and performing the sentiment analysis based on the classification result.

2. The sentiment analysis method according to claim 1, characterized in that calculating the semantic similarity of the two words includes: obtaining the semantics of the two words respectively through the Word2vec word vector model, and calculating the semantic similarity of the two words according to the cosine similarity formula.

3. The sentiment analysis method according to claim 1, characterized in that the text information of the text to be analyzed includes: contextual semantic information and textual word representation information.

4. The sentiment analysis method according to claim 1, characterized in that the neural network algorithm includes: the BiLSTM neural network algorithm.

5. The sentiment analysis method according to claim 1, characterized in that optimizing the text information through an attention mechanism includes: transforming the elements in the text information using the attention mechanism, calculating the attention weights of the elements, and optimizing the text information based on the transformed elements and the attention weights.

6. A sentiment analysis device, characterized by comprising: a word co-occurrence feature set construction module, used to acquire the text to be analyzed, calculate the co-occurrence similarity of any two words based on the frequency with which the two words appear simultaneously in the text to be analyzed, calculate the semantic similarity of the two words, calculate the similarity weight of the two words based on the co-occurrence similarity and the semantic similarity, and construct a word co-occurrence feature set based on the similarity weight; an attention optimization module, used to calculate the text information of the text to be analyzed according to a neural network algorithm and optimize the text information through an attention mechanism; and a sentiment analysis module, used to perform sentiment analysis based on the optimized text information and the word co-occurrence feature set; wherein constructing the word co-occurrence feature set based on the similarity weight includes: selecting any first word in the text to be analyzed, and selecting the word corresponding to the largest similarity weight related to the first word as the second word; constructing a word co-occurrence feature group based on the first word and the second word; and constructing the word co-occurrence feature set based on all the word co-occurrence feature groups in the text to be analyzed; and wherein performing sentiment analysis based on the optimized text information and the word co-occurrence feature set includes: combining the optimized text information and the word co-occurrence feature set, inputting them into a convolutional neural network model, inputting the feature vector obtained by the convolutional neural network model into a Softmax classifier for classification, and performing the sentiment analysis based on the classification result.

7. An electronic device, characterized by comprising: at least one processor; and a memory communicatively connected to the at least one processor; wherein the memory stores instructions executable by the at least one processor, and the instructions are executed by the at least one processor to enable the at least one processor to perform the sentiment analysis method according to any one of claims 1 to 5.

8. A computer-readable storage medium storing a computer program, characterized in that, when the computer program is executed by a processor, it implements the sentiment analysis method according to any one of claims 1 to 5.