Emotion classifying method fusing intrinsic feature and shallow feature

A technology of emotion classification and deep features, applied in the field of emotion classification, can solve problems such as ignoring semantic relations, achieve the effect of improving classification performance and increasing accuracy

Active Publication Date: 2016-08-03
1 Cites 48 Cited by

AI-Extracted Technical Summary

Problems solved by technology

When using the TF-IDF method to represent text features, each dimension of text features represents a fixed word in the te...


The invention discloses an emotion classification method that fuses deep (intrinsic) features and shallow features. The deep features obtained with Doc2vec and the shallow features obtained with TF-IDF are fused to represent a text. The fusion addresses both the unclear representation of fixed-word features in Doc2vec and the TF-IDF method's disregard for the semantics between words, so the resulting vector represents the text more clearly. An SVM classifier is used, which gives better classification performance. Applied to emotion classification, the method can markedly improve classification accuracy.

Application Domain

Special data processing applications; Text database clustering/classification

Technology Topic

Pattern recognition; Semantics



Example Embodiment

[0025] The present invention will be further explained below in conjunction with the drawings:
[0026] As shown in Figure 1, the specific steps of the emotion classification method of the present invention, which fuses deep and shallow features, are as follows:
[0027] Step 1: Collect a sentiment text corpus from the Internet and manually label the categories; for example, positive-emotion texts are labeled 1 and negative-emotion texts are labeled 2. Remove the whitespace at the beginning and end of each text, and represent each text as a single sentence to simplify subsequent processing. Divide the corpus into a training set and a test set: the training set is used to train the emotion classification model, and the test set is used to evaluate the model's classification performance.
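Step 1 can be sketched in a few lines of Python. This is only an illustration: the function name `prepare_corpus`, the 25% test ratio, the fixed random seed, and the choice to drop empty texts are assumptions, not details given in the patent.

```python
import random

POSITIVE, NEGATIVE = 1, 2  # label convention described in Step 1


def prepare_corpus(samples, test_ratio=0.2, seed=0):
    """Strip surrounding whitespace, drop empty texts, and split the corpus.

    samples: list of (text, label) pairs with label in {1, 2}.
    Returns (training_set, test_set).
    """
    cleaned = [(text.strip(), label) for text, label in samples if text.strip()]
    rng = random.Random(seed)  # fixed seed so the split is reproducible
    rng.shuffle(cleaned)
    n_test = int(len(cleaned) * test_ratio)
    return cleaned[n_test:], cleaned[:n_test]


samples = [
    ("  很好  ", POSITIVE),
    ("太差了 ", NEGATIVE),
    (" 喜欢", POSITIVE),
    ("失望", NEGATIVE),
    ("   ", NEGATIVE),  # whitespace only, dropped by this sketch
]
train, test = prepare_corpus(samples, test_ratio=0.25)
```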
[0028] Step 2: First, collect sentiment dictionaries from the Internet. A sentiment dictionary is a basic resource for text sentiment analysis; it is in effect a collection of sentiment words. In a broad sense it refers to phrases or sentences that carry an emotional orientation; in a narrow sense it refers to a collection of emotionally inclined words. A sentiment dictionary generally contains two parts: a dictionary of positive emotion words and a dictionary of negative emotion words.
[0029] Next, Chinese word segmentation is performed on the corpus from step 1. The segmentation method combines dictionary-based reverse maximum matching with a statistical word segmentation strategy. The segmentation dictionary is constructed hierarchically and consists of two parts: a core dictionary and a temporary dictionary. Entries are counted from an authoritative corpus, and the core dictionary is stored in a two-level hash structure. The sentiment dictionary is loaded as the initial content of the temporary dictionary. Once the segmentation dictionary has been initialized, the segmentation system enters an autonomous learning stage. When an emotional text is segmented, if a newly counted word is already in the temporary dictionary, its word frequency is increased by one; otherwise the new word is added to the temporary dictionary. After the frequency is updated, it is compared with a set threshold. If the threshold is met, the word is moved to the core dictionary and its entry is removed from the temporary dictionary. The number of emotional texts learned is counted and recorded; if it exceeds a predetermined value, the temporary dictionary is cleared. The entries in the updated core dictionary are then used as the basis for segmentation, and the reverse maximum matching algorithm is used to segment the emotional text.
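The two mechanisms in this paragraph, reverse maximum matching and promotion of words from the temporary dictionary to the core dictionary, can be sketched as follows. The function names, the maximum word length of 4, the threshold of 3, and the toy dictionary are assumptions made for illustration; the statistical strategy that proposes new words is not shown.

```python
def reverse_max_match(text, dictionary, max_len=4):
    """Segment text from right to left, taking the longest dictionary match."""
    words = []
    end = len(text)
    while end > 0:
        start = max(0, end - max_len)
        # Shrink the window from the left until it matches
        # or only a single character remains.
        while start < end - 1 and text[start:end] not in dictionary:
            start += 1
        words.append(text[start:end])
        end = start
    words.reverse()
    return words


def learn_word(word, core, temp, threshold=3):
    """Count a candidate word; move it to the core dictionary at the threshold."""
    if word in core:
        return
    temp[word] = temp.get(word, 0) + 1
    if temp[word] >= threshold:
        core.add(word)
        del temp[word]


core_dictionary = {"研究", "研究生", "生命", "命", "的", "起源"}
print(reverse_max_match("研究生命的起源", core_dictionary))
# prints ['研究', '生命', '的', '起源']
```

Scanning from the end of the sentence is what lets this example avoid the split 研究生 / 命 that a left-to-right longest match would produce.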
[0030] After word segmentation, each text is a sequence of words separated by spaces. Next, collect a stop-word list, manually remove from it any words that are useful to the experiment, and then remove from the segmented corpus every word that remains in the list. Removing stop words saves storage space and improves efficiency.
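A minimal sketch of stop-word removal on a space-separated text is given below. The stop-word list and the decision to keep the negation word 不 are illustrative assumptions standing in for the manual curation described above.

```python
def remove_stop_words(segmented_text, stop_words):
    """Drop stop words from a space-separated, already segmented text."""
    return " ".join(w for w in segmented_text.split() if w not in stop_words)


stop_words = {"的", "了", "不"}
# Manual curation: negation matters for sentiment, so it is taken off the list.
stop_words.discard("不")

cleaned = remove_stop_words("这 部 电影 的 结局 不 好", stop_words)
```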
[0031] Step 3: Use regular expressions to extract tags, nouns, adverbs, adjectives and prepositions from the corpus obtained in step 2 to form a new corpus. If a text is too long, representing it as a feature vector easily leads to the curse of dimensionality. Extracting only the important words in the text represents the text better and alleviates the dimensionality problem.
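The patent does not state how part-of-speech information is attached to the words. The sketch below assumes a hypothetical `word/tag` format in which tags beginning with n, d, a and p mark nouns, adverbs, adjectives and prepositions, and uses a regular expression to keep only those words.

```python
import re

# Assumed format: "word/tag" tokens separated by spaces.
# Tags starting with n, d, a, p are treated as noun, adverb, adjective, preposition.
KEEP_PATTERN = re.compile(r"(\S+)/([ndap]\w*)(?=\s|$)")


def extract_key_words(tagged_text):
    """Return the words whose tag marks one of the kept parts of speech."""
    return [word for word, _tag in KEEP_PATTERN.findall(tagged_text)]


tagged = "电影/n 非常/d 好看/a 是/v 在/p 影院/n"
print(" ".join(extract_key_words(tagged)))
# prints: 电影 非常 好看 在 影院
```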
[0032] Step 4: Use Doc2vec to train the word vector model on the corpus from step 2 and obtain the deep feature vectors of the emotional texts. Doc2vec is a shallow model used to obtain deep features of words and texts. It takes into account both the semantic relationships between words and the order of the words, so it represents the characteristics of words and texts well. Doc2vec provides two models, PV-DBOW and PV-DM, and two training algorithms for them, Hierarchical Softmax and Negative Sampling. The present method uses the PV-DM model with the Hierarchical Softmax algorithm. The input of the PV-DM model is a variable-length paragraph (ParagraphId) and all the words (Words) in the paragraph; here, a ParagraphId represents an emotional text. The output is the word predicted from the ParagraphId and the Words.
[0033] The training process of PV-DM model:
[0034] Map each ParagraphId to a unique paragraph vector (ParagraphVector) and each word to a unique word vector (WordVector); store all ParagraphVectors as columns of matrix D and all WordVectors as columns of matrix W. The ParagraphVector and the WordVectors are accumulated or concatenated to form the input of the Softmax output layer. The output layer builds a Huffman tree with the entries in the ParagraphId as leaf nodes and the number of times each entry appears in the text corpus as its weight. The objective function is:
[0035] $$\frac{1}{T}\sum_{t=k}^{T-k}\log p(w_t \mid w_{t-k},\ldots,w_{t+k}) \tag{1}$$
[0036] where $T$ is the number of word vectors, and $w_t$, $w_{t-k}$, etc. denote the individual word vectors.
[0037] $$p(w_t \mid w_{t-k},\ldots,w_{t+k}) = \frac{e^{y_{w_t}}}{\sum_i e^{y_i}} \tag{2}$$
[0038] Each $y_i$ is the unnormalized log probability of word vector $i$. The $y_i$ are computed as:
[0039] $$y = b + U\,h(w_{t-k},\ldots,w_{t+k};\,W,\,D) \tag{3}$$
[0040] Here, U and b are the parameters of the Softmax, and h is formed by accumulating or concatenating the ParagraphVector and WordVectors extracted from the matrices D and W.
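Formulas (2) and (3) can be illustrated with a small forward pass. The sketch below uses a plain softmax over a three-word vocabulary rather than the Huffman-tree (Hierarchical Softmax) output layer the method actually uses, and it averages the vectors, which is a scaled form of accumulation. All names and numbers are illustrative assumptions.

```python
import math


def average(vectors):
    """Element-wise mean of equally sized vectors."""
    n = len(vectors)
    return [sum(column) / n for column in zip(*vectors)]


def softmax(scores):
    """Formula (2): normalize the scores y_i into probabilities."""
    m = max(scores)  # subtract the maximum for numerical stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def pv_dm_probabilities(paragraph_vec, context_vecs, U, b):
    """Formula (3): y = b + U h, followed by the softmax of formula (2)."""
    h = average([paragraph_vec] + context_vecs)
    y = [b_i + sum(u * x for u, x in zip(row, h)) for row, b_i in zip(U, b)]
    return softmax(y)


paragraph_vec = [0.2, 0.4]                 # one column of D
context_vecs = [[0.1, 0.0], [0.3, 0.2]]    # columns of W for the context words
U = [[1.0, 0.0], [0.0, 1.0], [0.0, 0.0]]   # one row per vocabulary word
b = [0.0, 0.0, 0.0]
probs = pv_dm_probabilities(paragraph_vec, context_vecs, U, b)
```

Because the ParagraphVector enters h for every context window of the same text, each prediction is conditioned on the whole text as well as on the neighbouring words.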
[0041] During training, the ParagraphId remains unchanged and all words in the text share the same ParagraphVector, which is equivalent to using the semantics of the entire text each time the probability of a word is predicted. The objective function is optimized to obtain the optimal vector representation of the words. Using stochastic gradient ascent to optimize the objective function above, the update formula for the vector $\theta_u$ of word $u$ in each iteration is:
[0042] $$\theta_u := \theta_u + \eta\left[L_x(u) - \sigma\!\left(w(\tilde{x})^{T}\theta_u\right)\right]w(\tilde{x}) \tag{4}$$
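[0043] The vector $w(\tilde{x})$ of the word $\tilde{x}$ is updated by a corresponding formula (5), which is not reproduced in this text.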
[0044] Here $\theta_u \in R^n$ is an auxiliary vector corresponding to word $u$, $L_x(u)$ is the label of word $u$, $w(\tilde{x})$ is the vector corresponding to word $\tilde{x}$, $\sigma$ is the logistic (sigmoid) function, and $\eta$ is the learning rate. In each iteration, both the vector $\theta_u$ of word $u$ and the vector $w(\tilde{x})$ of word $\tilde{x}$ are updated from their previous values. The vectors therefore express the words increasingly well as the updates proceed, and the quality of the representation improves.
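A single application of update formula (4) can be written out directly. The function names and the learning rate used here are assumptions for illustration; only the update of the auxiliary vector is shown.

```python
import math


def sigmoid(z):
    """Logistic function used in formula (4)."""
    return 1.0 / (1.0 + math.exp(-z))


def update_theta(theta_u, w_ctx, label, eta=0.025):
    """One stochastic gradient ascent step of formula (4).

    theta_u: auxiliary vector of word u
    w_ctx:   vector w(x~) of the word x~
    label:   L_x(u), either 1 or 0
    eta:     learning rate (illustrative default)
    """
    score = sigmoid(sum(w * t for w, t in zip(w_ctx, theta_u)))
    step = eta * (label - score)
    return [t + step * w for t, w in zip(theta_u, w_ctx)]


new_theta = update_theta([0.0, 0.0], [1.0, 2.0], label=1, eta=0.1)
```

With a zero starting vector the sigmoid evaluates to 0.5, so a label of 1 moves the auxiliary vector toward w(x~) and a label of 0 moves it away by the same amount.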
[0045] In the prediction phase, a new ParagraphId is assigned to the text to be predicted. The word vectors and the parameters of the Softmax output layer are kept fixed at the values obtained in the training phase, and stochastic gradient ascent is used to train on the text to be predicted. After convergence, the ParagraphVector of the text is obtained; this is the deep feature vector of the text. The deep feature vectors are then converted into a data format that the SVM can use.
[0046] Step 5: Apply TF-IDF to the corpus obtained in step 3 to obtain the shallow feature vectors of the emotional texts.
[0047] In a given sentiment text, term frequency (TF) is the frequency with which a given word occurs in the text. It is the term count normalized by text length, which prevents a bias toward longer texts. (The same word may have a higher count in a long text than in a short one, regardless of whether the word is important.) For a word $t_i$ in a particular document, its importance can be expressed as:
[0048] $$tf_{i,j} = \frac{n_{i,j}}{\sum_k n_{k,j}} \tag{6}$$
[0049] where $n_{i,j}$ is the number of occurrences of the word $t_i$ in text $d_j$, and the denominator is the total number of occurrences of all words in $d_j$.
[0050] Inverse document frequency (IDF) is a measure of the universal importance of words. The IDF of a particular word can be obtained by dividing the total number of texts by the number of texts containing the word, and then taking the logarithm of the obtained quotient:
[0051] $$idf_i = \log\frac{|D|}{|\{j : t_i \in d_j\}|} \tag{7}$$
[0052] where $|D|$ is the total number of texts in the sentiment corpus and $|\{j : t_i \in d_j\}|$ is the number of texts that contain the word $t_i$. If the word does not appear in the corpus, the divisor is zero, so $1 + |\{j : t_i \in d_j\}|$ is generally used instead. Finally, the TF-IDF value of a word is:
[0053] $$tfidf_{i,j} = tf_{i,j} \times idf_i \tag{8}$$
[0054] Compute the TF-IDF value of every word in an emotional text and write the values into a new text to obtain the shallow feature vector of that text. Then compute the shallow feature vectors of all texts in the same way.
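Formulas (6) to (8), with the 1 + document-count divisor mentioned above, can be sketched as follows. The function name, the sorted vocabulary order, and the toy corpus are assumptions made for the example.

```python
import math
from collections import Counter


def tf_idf_vectors(docs):
    """Compute TF-IDF vectors for a list of token lists.

    Returns (vocabulary, vectors); each vector has one entry per vocabulary word.
    """
    vocab = sorted({w for doc in docs for w in doc})
    n_docs = len(docs)
    # Document frequency: number of texts that contain each word.
    df = Counter(w for doc in docs for w in set(doc))
    # Formula (7) with the smoothed divisor 1 + df.
    idf = {w: math.log(n_docs / (1 + df[w])) for w in vocab}
    vectors = []
    for doc in docs:
        counts = Counter(doc)
        total = len(doc)
        # Formula (6) for tf, formula (8) for the product.
        vectors.append(
            [counts[w] / total * idf[w] if total else 0.0 for w in vocab]
        )
    return vocab, vectors


docs = [["好", "电影", "好"], ["差", "电影"], ["好", "演员"], ["无聊"]]
vocab, vectors = tf_idf_vectors(docs)
```

Each dimension of the resulting vector corresponds to one fixed vocabulary word, which is the property that complements the Doc2vec features. Note that with the smoothed divisor, a word that appears in every text receives a slightly negative IDF.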
[0055] Step 6: Write the deep feature vectors of all texts obtained in step 4 into one text, with each line representing one text vector, and likewise write the shallow feature vectors of all texts obtained in step 5 into another text, again with one text vector per line. Since the deep features from step 4 and the shallow features from step 5 are equally important for sentiment classification, the weight ratio of the two feature types is set to 1:1, and the corresponding lines of the two texts are concatenated end to end to obtain a new feature vector for each emotional text.
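With a 1:1 weight ratio, the fusion in Step 6 reduces to concatenating each deep vector with the shallow vector of the same text. The function name and the sample values below are illustrative assumptions.

```python
def fuse_features(deep_vectors, shallow_vectors):
    """Concatenate the deep and shallow vectors of each text end to end."""
    if len(deep_vectors) != len(shallow_vectors):
        raise ValueError("both feature sets must cover the same texts")
    return [list(d) + list(s) for d, s in zip(deep_vectors, shallow_vectors)]


deep = [[0.12, -0.40], [0.05, 0.33]]             # e.g. Doc2vec ParagraphVectors
shallow = [[0.0, 0.19, 0.0], [0.35, 0.0, 0.0]]   # e.g. TF-IDF vectors
fused = fuse_features(deep, shallow)
```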
[0056] Step 7: Input the feature vectors of the training-set texts obtained in step 6 into the SVM to train an emotion classification model.
[0057] Introduce a nonlinear function $\varphi(x)$ that maps the input space $R^n$ to an m-dimensional feature space, and then construct a separating hyperplane in the high-dimensional space. The hyperplane can be defined as:
[0058] $$\sum_{j=1}^{m} w_j^{*}\,\varphi(x) + b^{*} = 0 \tag{9}$$
[0059] where $w_j^{*}$ is the weight connecting the feature space to the output space, and $b^{*}$ is the offset value.
[0060] To obtain the optimal hyperplane, the weight vector and the offset value should be minimized subject to the constraints $y_i(w x_i + b) \ge 1 - \xi_i$, $i = 1, 2, \ldots, m$, where $\xi_i$ is a positive slack variable whose introduction increases fault tolerance. According to the principle of structural risk minimization, the objective function to be minimized is:
[0061] $$J(w,\xi) = \frac{1}{2}\|w\|^{2} + C\sum_{j=1}^{N}\xi_j \tag{10}$$
[0062] where C is the penalty parameter. By Lagrange's theorem, introducing the Lagrange multipliers $\alpha_i$ and the kernel function $K(x_i, x) = \varphi(x_i)\varphi(x)$, the problem can be transformed into minimizing the following objective function:
[0063] $$W(\alpha) = \frac{1}{2}\sum_{i=1}^{N}\sum_{j=1}^{N}\alpha_i\alpha_j y_i y_j K(x_i,x_j) - \sum_{i=1}^{N}\alpha_i \tag{11}$$
[0064] subject to the constraints: $$\sum_{i=1}^{N}\alpha_i y_i = 0,\qquad 0 \le \alpha_i \le C,\quad i = 1, 2, \ldots, N$$
[0065] The optimal hyperplane can be expressed as:
[0066] $$\sum_{i=1}^{N}\alpha_i^{*} y_i K(x_i,x) + b^{*} = 0 \tag{12}$$
[0067] The classification decision function can be expressed as:
[0068] $$f(x) = \operatorname{sign}\!\left(\sum_{i=1}^{N}\alpha_i^{*} y_i K(x_i,x) + b^{*}\right) \tag{13}$$
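The decision function (13) can be evaluated directly once the multipliers, support vectors and offset are known. The sketch below assumes an RBF kernel, which the patent does not specify, and uses hand-picked values instead of a trained model; the mapping from the SVM outputs +1 and -1 to the labels 1 and 2 of Step 1 is likewise an assumption.

```python
import math


def rbf_kernel(a, b, gamma=0.5):
    """Assumed kernel K(a, b) = exp(-gamma * ||a - b||^2)."""
    return math.exp(-gamma * sum((x - y) ** 2 for x, y in zip(a, b)))


def decision(x, support_vectors, alphas, labels, bias, kernel=rbf_kernel):
    """Formula (13): sign of the kernel expansion plus the offset."""
    total = bias + sum(
        alpha * y * kernel(sv, x)
        for alpha, y, sv in zip(alphas, labels, support_vectors)
    )
    return 1 if total >= 0 else -1


# Hand-picked illustrative values, not the result of training.
support_vectors = [[0.0, 0.0], [2.0, 2.0]]
alphas = [1.0, 1.0]
labels = [+1, -1]
bias = 0.0

to_corpus_label = {+1: 1, -1: 2}  # assumed mapping to the labels of Step 1
prediction = to_corpus_label[
    decision([0.1, 0.1], support_vectors, alphas, labels, bias)
]
```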
[0069] After the training is completed, save the emotion classification model.
[0070] Step 8: Input the feature vectors of the test-set texts obtained in step 6 into the SVM and classify their emotion category with the model trained in step 7. If the output label of a text equals 1, the text is judged to express positive emotion; if the output label does not equal 1 (that is, it equals 2), the text is judged to express negative emotion. The number of texts whose output label differs from the expected label is counted, and the emotion classification accuracy is calculated from it.
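The patent does not spell out the accuracy formula. The sketch below assumes the usual definition, one minus the fraction of texts whose predicted label differs from the expected label; the function name is illustrative.

```python
def accuracy(predicted, expected):
    """Share of texts whose predicted label equals the expected label."""
    if len(predicted) != len(expected) or not expected:
        raise ValueError("label lists must be non-empty and of equal length")
    # Count the texts where the actual and expected output labels differ.
    errors = sum(1 for p, e in zip(predicted, expected) if p != e)
    return 1 - errors / len(expected)


print(accuracy([1, 2, 2, 1], [1, 2, 1, 1]))  # prints 0.75
```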
[0071] The above embodiments should be understood as only used to illustrate the present invention and not to limit the protection scope of the present invention. After reading the recorded content of the present invention, technical personnel can make various changes or modifications to the present invention, and these equivalent changes and modifications also fall within the scope defined by the claims of the present invention.

