Sentiment word extension-based short text sentiment classifying method

A sentiment classification technique based on sentiment words, applied in special data processing applications, instruments, and electronic digital data processing. The effect of accuracy

Inactive Publication Date: 2018-08-07
BEIJING INSTITUTE OF TECHNOLOGY

AI Technical Summary

Problems solved by technology

[0003] The basic problems to be solved by the short text sentiment classification method based on sentiment word expansion are: 1. Short text comments are brief and contain few sentiment feature words, which easily leads to sparse sentiment features; 2. Existing short text sentiment analysis methods do not perform feature dimension reduction well; 3. The choice of classifier in supervised sentiment analysis methods has a great influence on th

Examples

Embodiment Construction

[0030] In order to better illustrate the purpose and advantages of the present invention, the implementation of the method of the present invention will be further described in detail below in conjunction with examples.

[0031] The specific process is:

[0032] Step 1, preprocessing the text.

[0033] Step 1.1, segment the text into sentence sets.
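
Step 1.1 can be illustrated with a minimal sketch. The source does not say which delimiters it uses, so the set of sentence-ending punctuation marks and the function name `split_sentences` below are assumptions made for illustration only.

```python
import re
from typing import List

# Assumed sentence terminators (Chinese and ASCII); the source does not list them.
_SENTENCE = re.compile(r"[^。！？!?；;\n]+[。！？!?；;]?")

def split_sentences(text: str) -> List[str]:
    """Split a comment into sentences, keeping each sentence's end punctuation."""
    return [s.strip() for s in _SENTENCE.findall(text) if s.strip()]

sentences = split_sentences("手机很好用。电池不太行！还会再买吗？")
# sentences is a list of three strings, one per sentence
```

Keeping the terminating punctuation with each sentence is a design choice of this sketch; a later step could strip it if punctuation is not needed as a feature.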

[0034] Step 1.2, use the jieba word segmentation tool for word segmentation and part-of-speech tagging. The experimental data used are shown in Table 1:

[0035] Table 1. Experimental data of short text sentiment analysis (articles)

[0036]
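
The segmentation and tagging in Step 1.2 might look like the sketch below. It uses `jieba.posseg.cut`, which yields pairs with `word` and `flag` attributes. The filter that keeps only nouns, verbs, adjectives, and adverbs is an assumption based on the parts of speech named in the abstract, and the helper names `tag_sentence` and `keep_content_words` are hypothetical.

```python
from typing import Iterable, List, Tuple

# Assumption: jieba's ICTCLAS-style tag set, where noun tags start with "n",
# verb tags with "v", adjective tags with "a", and adverb tags with "d".
CONTENT_TAG_PREFIXES = ("n", "v", "a", "d")

def tag_sentence(sentence: str) -> List[Tuple[str, str]]:
    """Segment and POS-tag one sentence with jieba (requires `pip install jieba`)."""
    import jieba.posseg as pseg  # lazy import so the filter below works without jieba
    return [(pair.word, pair.flag) for pair in pseg.cut(sentence)]

def keep_content_words(tagged: Iterable[Tuple[str, str]]) -> List[Tuple[str, str]]:
    """Keep only (word, tag) pairs whose tag marks a noun, verb, adjective, or adverb."""
    return [(w, f) for w, f in tagged if f.startswith(CONTENT_TAG_PREFIXES)]

# Hand-written (word, tag) pairs stand in for real jieba output here.
sample = [("手机", "n"), ("非常", "d"), ("好用", "a"), ("的", "uj"), ("。", "x")]
content = keep_content_words(sample)
```

With jieba installed, `keep_content_words(tag_sentence(sentence))` would chain the two steps for each sentence produced in Step 1.1. The exact tags jieba assigns depend on its dictionary, so no output is shown for that call.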

[0037] Step 2, sentiment feature expansion.

[0038] Step 2.1, train GloVe to generate word vectors. Word vectors represent words as real-valued vectors: the model is trained on a given training corpus and maps text content to vectors in a vector space, drawing on the idea of neural networks. The cosine similarity in the vector space represents the semantic similarity of wor...
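
As a rough illustration of how cosine similarity between word vectors could drive the expansion, the sketch below adds a candidate word to the feature set when it is close enough to any seed sentiment word. The toy two-dimensional vectors, the threshold of 0.8, and the function names are assumptions; the source does not give a threshold, and real GloVe vectors would have many more dimensions.

```python
import math
from typing import Dict, List, Sequence

def cosine_similarity(u: Sequence[float], v: Sequence[float]) -> float:
    """Cosine of the angle between two vectors; 0.0 if either vector is all zeros."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)

def expand_features(seeds: List[str], candidates: List[str],
                    vectors: Dict[str, Sequence[float]],
                    threshold: float = 0.8) -> List[str]:
    """Return the seeds plus every candidate similar enough to at least one seed."""
    expanded = list(seeds)
    for word in candidates:
        if word in expanded or word not in vectors:
            continue  # skip duplicates and words without a trained vector
        if any(s in vectors and cosine_similarity(vectors[word], vectors[s]) >= threshold
               for s in seeds):
            expanded.append(word)
    return expanded

# Toy vectors standing in for trained GloVe embeddings (hypothetical values).
toy_vectors = {"good": [1.0, 0.0], "great": [0.9, 0.1], "table": [0.0, 1.0]}
expanded = expand_features(["good"], ["great", "table"], toy_vectors)
```

Candidates are compared only against the original seeds, not against words added during expansion, which keeps the set from drifting away from the seed meanings.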

Abstract

The invention relates to a short text sentiment classification method based on sentiment word expansion, and belongs to the technical field of computers and information science. The method comprises the following steps. First, a comment text is segmented into a sentence set, and the jieba word segmentation tool is used for word segmentation and part-of-speech tagging to obtain a preprocessed result. Then, for each short text comment, the word vector of each word is obtained by training GloVe on a Wikipedia corpus; the word vectors are used to compute the semantic similarity between other words and the primary sentiment features with parts of speech N, V, Adj, and Adv, and semantically similar words are added to the primary sentiment feature set. Next, DF-TF-MI is proposed: a conventional feature dimension reduction method is improved by using statistical features between words, yielding a low-dimensional feature set, and the sentiment features are weighted. Finally, sentiment tendency classification is performed on the resulting feature vectors by an RADA algorithm formed by weighting weak classifiers. The method addresses the problem of out-of-vocabulary words that are missing from the sentiment dictionary, as well as the sparsity of sentiment features caused by the small number of effective sentiment words in short text comments, and improves the performance and accuracy of sentiment tendency analysis.
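
The abstract names DF-TF-MI but this excerpt does not define how the three statistics are combined. The sketch below therefore only computes the three standard ingredients separately (term frequency, document frequency, and pointwise mutual information between a term and a class) on made-up toy data. It should not be read as the patent's DF-TF-MI formula; all names and data are hypothetical.

```python
import math
from collections import Counter
from typing import List, Sequence, Tuple

Doc = Tuple[Sequence[str], str]  # (tokens, class label)

def term_frequency(docs: List[Doc]) -> Counter:
    """Total number of occurrences of each term across all documents."""
    tf = Counter()
    for tokens, _ in docs:
        tf.update(tokens)
    return tf

def document_frequency(docs: List[Doc]) -> Counter:
    """Number of documents in which each term appears at least once."""
    df = Counter()
    for tokens, _ in docs:
        df.update(set(tokens))
    return df

def mutual_information(term: str, label: str, docs: List[Doc]) -> float:
    """Pointwise MI, log(P(term, label) / (P(term) * P(label))), at document level."""
    n = len(docs)
    with_term = sum(1 for tokens, _ in docs if term in tokens)
    in_class = sum(1 for _, lab in docs if lab == label)
    both = sum(1 for tokens, lab in docs if lab == label and term in tokens)
    if both == 0:
        return float("-inf")  # the term never co-occurs with this class
    return math.log(both * n / (with_term * in_class))

# Made-up toy corpus, for illustration only.
TOY_DOCS: List[Doc] = [
    (["good", "phone"], "pos"),
    (["good", "good", "screen"], "pos"),
    (["bad", "phone"], "neg"),
    (["bad", "battery"], "neg"),
]
mi_good_pos = mutual_information("good", "pos", TOY_DOCS)
```

A term such as "phone", which appears equally often in both classes of the toy corpus, gets a mutual information of zero, which is the kind of low-information feature a dimension reduction step would typically discard.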

Description

Technical field

[0001] The invention relates to a short text sentiment classification method based on sentiment word expansion, and belongs to the technical field of computer and information science.

Background

[0002] Chinese short texts are short and carry little information. However, short user comments still contain the user's sentiment tendency. Short text sentiment analysis can effectively mine the users' true opinions and attitudes hidden beneath the surface text. The present invention therefore provides a short text sentiment classification method based on sentiment word expansion to improve the accuracy of sentiment classification of comment texts, thereby enhancing the practical value of systems such as user evaluation analysis and product recommendation, which has important theoretical significance and commercial value.

[0003] The basic problems that need to be solved in the short text sentiment classification method b...

Application Information

IPC(8): G06F17/27, G06F17/30
CPC: G06F16/355, G06F40/242, G06F40/289
Inventor: 罗森林, 李东超, 潘丽敏, 毛焱颖
Owner: BEIJING INSTITUTE OF TECHNOLOGY