Document-level sentiment analysis method based on specific domain sentiment words

A domain-specific sentiment analysis technology, applied in semantic analysis, special data processing applications, instruments, and similar areas, that addresses the neglect of relationships between non-consecutive words, the complexity of document-level research, and poor sentiment analysis performance.

Active Publication Date: 2018-11-13
SHANDONG UNIV OF SCI & TECH

AI Technical Summary

Problems solved by technology

Compared with word-level and sentence-level analysis, document-level sentiment analysis must consider the overall structure of the document (a document is composed of sentences, and a sentence is composed of words), and extracting such complex document features makes the research complicated.
Existing research ignores the relationships between non-consecutive words, including syntactic features (phrase structures whose parts are some distance apart) and semantic features (for example, the object referred to by “its”). Some studies also do not make full use of prior knowledge such as sentiment lexicons to enrich the emotional features of documents, even though sentiment dictionaries play an important role in sentiment analysis tasks. Sentiment words are an important basis for sentiment analysis, so an accurate, high-coverage sentiment dictionary is needed. General-purpose sentiment dictionaries, however, perform poorly compared with domain-specific ones, because domain-specific sentiment terms may not appear in general dictionaries and the same term can carry a different meaning in a specific domain than in a general dictionary. It is therefore necessary to construct a domain-specific sentiment dictionary.
A review of previous studies shows that document modeling usually considers only the document representation or only the sentiment dictionary, without combining the two, so the extracted document features are relatively simple.
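As a rough illustration of combining the two feature sources discussed above, the sketch below concatenates a document vector with a lexicon-derived sentiment vector and passes the result through a single linear layer followed by a softmax. The function names, the toy vectors, and the weights are all hypothetical, and concatenation followed by one linear layer is an assumption about what a combination layer could look like, not the patented design.

```python
import math


def linear_combine(doc_vec, senti_vec, weights, bias):
    """Concatenate both feature vectors and apply one linear layer.

    Assumption: the combination is a plain concatenation followed by a
    weight matrix with one row per output class.
    """
    features = doc_vec + senti_vec  # list concatenation
    return [
        sum(w * x for w, x in zip(row, features)) + b
        for row, b in zip(weights, bias)
    ]


def softmax(scores):
    """Turn raw class scores into probabilities (numerically stable)."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


# Toy values, chosen only for illustration.
doc_vec = [0.2, -0.1, 0.4]    # hypothetical document representation
senti_vec = [1.0, 0.0]        # hypothetical lexicon-based sentiment features
weights = [
    [0.5, 0.1, -0.3, 0.8, 0.2],    # class 0
    [-0.5, 0.2, 0.3, -0.8, 0.1],   # class 1
]
bias = [0.0, 0.0]

probs = softmax(linear_combine(doc_vec, senti_vec, weights, bias))
```

In a trained model the weights would be learned jointly with the rest of the network; here they are fixed so that the effect of the sentiment features on the class scores is easy to trace by hand.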

Examples

Embodiment 1

[0128] The implementation of the present invention is described in detail using data sets from two domains, movies and restaurants, on which document-level sentiment analysis based on domain-specific sentiment words is performed.

[0129] The data sets used in this method come from the paper "Document Modeling with Gated Recurrent Neural Network for Sentiment Classification" by Tang Duyu et al., published in 2015. The data sets are shown in Table 1.

[0130] Table 1 Dataset

[0131]

[0132] The effectiveness of the present invention is evaluated on four large-scale data sets, using 80% of the data for training, 10% for validation, and the remaining 10% as a development set. The evaluation metric is classification accuracy, given by formula (18):
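The 80% / 10% / 10% partition described in [0132] could be sketched as follows. The function name `split_dataset`, the fixed seed, and the decision to shuffle before splitting are assumptions for illustration; the source does not say how documents are assigned to each part.

```python
import random


def split_dataset(docs, seed=0):
    """Split documents into 80% / 10% / 10% parts.

    Assumption: documents are shuffled once with a fixed seed so the
    split is reproducible.
    """
    shuffled = list(docs)
    random.Random(seed).shuffle(shuffled)
    n_train = len(shuffled) * 8 // 10   # integer arithmetic avoids float rounding
    n_val = len(shuffled) // 10
    train = shuffled[:n_train]
    val = shuffled[n_train:n_train + n_val]
    rest = shuffled[n_train + n_val:]   # the remaining ~10%
    return train, val, rest


train, val, rest = split_dataset(range(100))
```

Any documents left over by the integer division fall into the last part, so no document is dropped.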

[0133]

[0134] where TP is the number of positive samples predicted as positive, TN is the number of negative samples predicted as negative, FP ...
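Formula (18) itself is not reproduced in this text. Under the standard definition of classification accuracy, which the TP and TN counts above suggest, it can be computed as in the sketch below. The symbol FN (positive samples predicted as negative) and both function names are assumptions, since the source is truncated before the remaining definitions.

```python
def accuracy_from_counts(tp, tn, fp, fn):
    """Standard accuracy: correctly classified samples over all samples."""
    total = tp + tn + fp + fn
    if total == 0:
        raise ValueError("at least one sample is required")
    return (tp + tn) / total


def accuracy(y_true, y_pred):
    """Label-based accuracy; also applies when there are more than two classes."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("label lists must be non-empty and of equal length")
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)


binary_acc = accuracy_from_counts(tp=40, tn=45, fp=5, fn=10)
label_acc = accuracy([1, 0, 1, 1], [1, 0, 0, 1])
```

The label-based form is the more general of the two, because rating-style review data often has more than two classes and the four-count form only covers the binary case.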

Abstract

The invention provides a document-level sentiment analysis method based on domain-specific sentiment words. The method is implemented by the following steps: collecting a document data set; training a set of prototype words with a Skip-gram word vector model to obtain a word vector for each prototype word; recombining the word vectors with an attention mechanism to capture the relationships between non-consecutive words; synthesizing words and sentences using an asymmetric convolutional neural network and an attention-based bidirectional gated recurrent neural network, respectively, to form document vector features; generating sentiment feature vectors from a domain sentiment dictionary using the Skip-gram word vector model; and finally combining the document vector features and the sentiment feature vectors in a linear combination layer to form document features that benefit document classification. Sentiment analysis is widely applied in product analysis, commodity recommendation, stock price trend prediction, and similar tasks; the method provided by the invention can perform sentiment analysis on documents accurately and efficiently and has great commercial value.
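The attention step in the abstract, in which word vectors are recombined so that non-consecutive words can influence each other, might look roughly like the sketch below. It uses plain dot-product self-attention as a stand-in; the scoring function, the function names, and the toy vectors are assumptions, and the patent's actual attention formulation is not given in this text.

```python
import math


def softmax(scores):
    """Turn raw scores into weights that sum to 1 (numerically stable)."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention_recombine(word_vecs):
    """Replace each word vector with an attention-weighted sum of all
    word vectors in the sentence, regardless of their distance.

    Assumption: similarity is an unscaled dot product between vectors.
    """
    recombined = []
    for query in word_vecs:
        scores = [sum(a * b for a, b in zip(query, key)) for key in word_vecs]
        weights = softmax(scores)
        dim = len(query)
        recombined.append(
            [sum(w * vec[d] for w, vec in zip(weights, word_vecs)) for d in range(dim)]
        )
    return recombined


# Three toy word vectors; the first and third are identical but not adjacent.
vecs = [[1.0, 0.0], [0.0, 1.0], [1.0, 0.0]]
recombined = attention_recombine(vecs)
```

Because every word attends to every other word, the first and third vectors reinforce each other even though a different word sits between them, which is the kind of non-consecutive relationship the abstract refers to.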

Description

Technical field

[0001] The invention relates to the technical field of natural language processing, in particular to a document-level sentiment analysis method based on domain-specific sentiment words.

Background technique

[0002] Sentiment analysis, also known as opinion mining, is a fundamental task in natural language processing and statistical linguistics. It is important for understanding the opinions that users express on social networks or in product reviews, and it can provide decision support for merchants and other users; in public opinion monitoring, it helps track people's attitudes toward emergencies in a timely manner and guide public opinion trends. It has attracted extensive attention from industry and academia. Sentiment analysis is divided into word level, sentence level, and document level according to granularity. Compared with word-level and sentence-level, document-level sentiment analysis needs to consider the o...

Application Information

Patent Type & Authority: Applications (China)
IPC(8): G06F17/27; G06K9/62
CPC: G06F40/30; G06F18/24
Inventor: 田刚, 王芳, 孙承爱, 李堂军, 任艳伟
Owner: SHANDONG UNIV OF SCI & TECH