Attention dual-layer LSTM-based long text emotional tendency analysis method

A sentiment tendency analysis technology, applied to text database clustering/classification, semantic analysis, unstructured text data retrieval, and related fields. It addresses problems such as the length of long texts, avoids the resulting defects, and improves accuracy.

Inactive Publication Date: 2018-08-24
BEIJING INSTITUTE OF TECHNOLOGY

AI Technical Summary

Problems solved by technology

[0010] The purpose of the present invention is to solve the problem that long text reviews are lengthy, positive and negative emotional features are discretely distributed, and ...



Examples


Embodiment Construction

[0030] In order to better illustrate the purpose and advantages of the present invention, the implementation of the method of the present invention will be further described in detail below in conjunction with examples.

[0031] The experimental data consist of foreign long text reviews, drawn from the Internet Movie Database (IMDb) and the hotel review corpus Yelp2015. The data are split into a training set and a test set at a ratio of 4:1. The experimental data for long text sentiment analysis are shown in Table 1.
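The 4:1 split described in [0031] can be sketched as follows. This is a minimal illustration, not the procedure used in the experiments: the function name `split_train_test`, the shuffling step, and the fixed seed are all assumptions, since the text states only the ratio.

```python
import random

def split_train_test(documents, train_parts=4, test_parts=1, seed=0):
    """Shuffle documents and split them at train_parts:test_parts.

    Shuffling with a fixed seed is an assumption made for reproducibility;
    the source states only the 4:1 ratio.
    """
    shuffled = list(documents)
    random.Random(seed).shuffle(shuffled)
    cut = len(shuffled) * train_parts // (train_parts + test_parts)
    return shuffled[:cut], shuffled[cut:]

# Hypothetical stand-in for a review corpus.
reviews = [f"review {i}" for i in range(10)]
train, test = split_train_test(reviews)
print(len(train), len(test))  # 8 2
```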

[0032] Table 1. Long text sentiment analysis experimental data (number of documents)

[0033]

[0034] Among them, #s / d represents the average number of sentences in each document, and #w / d represents the average number of words in each document.

[0035] During the experiment, the word embedding vocabulary size is set to 400,000 and the word embedding dimension to 100. Each document contains at most 18 sentences, and a single sentence contains at most 100 words. Model parameters: use ...
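The document-size limits in [0035] imply that every document is brought to a fixed 18 x 100 grid of word indices before it reaches the model. The sketch below shows one way to do that. Truncating overlong input and padding with index 0 are assumptions; the text gives only the two limits, and `pad_document` is a hypothetical helper name.

```python
MAX_SENTENCES = 18  # per-document limit stated in [0035]
MAX_WORDS = 100     # per-sentence limit stated in [0035]
PAD_ID = 0          # assumption: index 0 is reserved for padding

def pad_document(doc, max_sentences=MAX_SENTENCES, max_words=MAX_WORDS, pad_id=PAD_ID):
    """Truncate or pad a document (a list of sentences, each a list of
    word indices) to a fixed max_sentences x max_words grid."""
    grid = []
    for sentence in doc[:max_sentences]:
        clipped = sentence[:max_words]
        grid.append(clipped + [pad_id] * (max_words - len(clipped)))
    # Add all-padding sentences until the document has max_sentences rows.
    while len(grid) < max_sentences:
        grid.append([pad_id] * max_words)
    return grid

# One short sentence and one overlong sentence (150 word indices).
doc = [[5, 9, 2], list(range(1, 151))]
grid = pad_document(doc)
print(len(grid), len(grid[0]))  # 18 100
```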


Abstract

The invention relates to an attention-based dual-layer LSTM method for analyzing the emotional tendency of long texts, and belongs to the fields of natural language processing and machine learning. It mainly aims to solve the difficulty of accurately judging the emotional tendency of a full text, which arises because long text reviews are lengthy, positive and negative emotional features are discretely distributed, and sentences contribute to the emotional semantics to different degrees. The method first learns sentence-level emotional vector representations using an LSTM. It then uses a bidirectional LSTM to encode the emotional semantics of all sentences in a document together with the semantic relationships between the sentences and, based on an attention mechanism, assigns weights to sentences according to their emotional semantic contribution. Finally, the sentence-level emotional vector representations are weighted to obtain a document-level emotional vector representation of the long text, and a Softmax layer outputs the emotional tendency of the long text. Experiments on the Yelp2015 corpus and the IMDb film review corpus show that the method achieves a relatively good classification effect and further improves sentiment classification accuracy.
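The last two steps of the abstract (attention weighting of sentence vectors, then a Softmax layer) can be illustrated with a small stdlib-only sketch. It is not the patented model. The sentence vectors stand in for bidirectional LSTM outputs, the scoring function (dot product of a context vector with the tanh of each sentence vector) is a common simplification chosen here as an assumption, and all names and numbers are hypothetical.

```python
import math

def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def attention_pool(sentence_vectors, context):
    """Score each sentence vector against a context vector, normalize the
    scores with softmax, and return (weights, weighted-sum document vector).
    The scoring form is an assumption, not taken from the patent text."""
    scores = [sum(c * math.tanh(h) for c, h in zip(context, vec))
              for vec in sentence_vectors]
    weights = softmax(scores)
    dim = len(sentence_vectors[0])
    doc_vector = [sum(w * vec[k] for w, vec in zip(weights, sentence_vectors))
                  for k in range(dim)]
    return weights, doc_vector

def classify(doc_vector, class_weights, class_bias):
    """Linear layer followed by softmax over sentiment classes."""
    logits = [sum(w * x for w, x in zip(row, doc_vector)) + b
              for row, b in zip(class_weights, class_bias)]
    return softmax(logits)

# Toy stand-ins for three sentence encodings (hypothetical values).
sentence_vectors = [[0.1, -0.2], [0.9, 0.8], [-0.5, 0.3]]
context = [1.0, 1.0]  # would be a learned parameter in a trained model
weights, doc_vector = attention_pool(sentence_vectors, context)
probs = classify(doc_vector, [[1.0, 1.0], [-1.0, -1.0]], [0.0, 0.0])
print(weights, probs)
```

In this toy input the second sentence receives the largest weight, so the document vector is dominated by it. That is the behavior the abstract describes for sentences with a higher emotional semantic contribution.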

Description

Technical Field

[0001] The invention relates to an attention-based double-layer LSTM method for analyzing the emotional tendency of long texts, and belongs to the fields of natural language processing and machine learning.

Background Technique

[0002] The sentiment analysis method of long text comments has gradually begun to use deep learning methods, among which deep learning methods represented by CNN, RNN and LSTM have achieved good results in the field of sentiment analysis.

[0003] 1. CNN-based method

[0004] Although the CNN-based algorithm can effectively classify text, the problem of text sentiment analysis is not a simple text classification problem. Because both the training corpus and the prediction corpus are texts containing the author's emotions, it is necessary to consider the text context connection to achieve accurate discrimination of the emotional tendency of long texts. Therefore, an ideal text sentiment analysis algorithm needs to consider recording...


Application Information

IPC(8): G06F17/27, G06F17/30, G06K9/62
CPC: G06F16/35, G06F40/30, G06F18/2414
Inventor: 潘丽敏, 白崇有, 罗森林, 毛焱颖, 吴舟婷
Owner: BEIJING INSTITUTE OF TECHNOLOGY