Short microblog text-oriented sentiment analysis method and system

A sentiment analysis technology for short texts, applied in semantic analysis, special data processing applications, instruments, etc.

Active Publication Date: 2016-12-07
GUANGZHOU DATASTORY INFORMATION TECH CO LTD

AI Technical Summary

Problems solved by technology

[0009] The purpose of the present invention is to solve the above-mentioned problems in the prior art by providing a sentiment analysis method and system for microblog short texts. The invention belongs to the field of network information processing technology and addresses sentiment orientation recognition when the sentiment orientation distribution of a Chinese microblog data set is unbalanced. It is simple to implement, achieves a high recognition rate, and has strong practical application value.

Examples

Embodiment 1

[0060] As shown in Figures 1 and 2, this embodiment is a sentiment analysis method for microblog short texts, comprising the following steps:

[0061] A pseudo-sample generation step, a preprocessing step, a microblog expansion step, a feature extraction step, a sentiment analysis model training step, and a sentiment tendency recognition step.

[0062] The specific content of each step is described below:

[0063] 1. Pseudo-sample generation step: generate pseudo-samples using a Gaussian mixture distribution.

[0064] In this embodiment, a Gaussian mixture model is used to generate pseudo-samples for the minority classes in the training set, that is, the sentiment tendency classes that account for a minority of the training samples, so as to construct a training set with a balanced sentiment tendency distribution.

[0065] The Gaussian mixture pseudo-sample generation technique of the present invention is divided into the foll...
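
The concrete sub-steps of the pseudo-sample technique are cut off in this excerpt, so the following is only a minimal sketch of the general idea in [0064], not the patented procedure. It assumes that samples are already numeric feature vectors, that each minority-class sample acts as the mean of one equally weighted isotropic Gaussian component, and that the spread `sigma` is a free parameter. The function name `balance_with_gaussian_mixture` and all parameters are hypothetical.

```python
import random
from collections import Counter


def balance_with_gaussian_mixture(features, labels, sigma=0.1, seed=0):
    """Oversample minority classes until every class matches the largest one.

    Assumption (not from the patent text): each minority sample is the mean
    of one equally weighted, isotropic Gaussian component with standard
    deviation `sigma`. Pseudo-samples are drawn from that mixture.
    """
    rng = random.Random(seed)
    counts = Counter(labels)
    target = max(counts.values())  # size of the majority class

    balanced_features = [list(x) for x in features]
    balanced_labels = list(labels)

    for cls, n in counts.items():
        members = [x for x, y in zip(features, labels) if y == cls]
        for _ in range(target - n):  # zero iterations for the majority class
            center = rng.choice(members)  # pick a mixture component uniformly
            pseudo = [rng.gauss(v, sigma) for v in center]  # sample from it
            balanced_features.append(pseudo)
            balanced_labels.append(cls)

    return balanced_features, balanced_labels
```

With four "pos" vectors and one "neg" vector as input, the function returns the five original samples followed by three "neg" pseudo-samples, so both classes have four members. A fitted mixture with fewer components (for example via expectation-maximization) would be a natural alternative; the per-sample components here are chosen only to keep the sketch short.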

Embodiment 2

[0088] The sentiment analysis system for microblog short texts in this embodiment includes the following execution modules:

[0089] A pseudo-sample generation module, a preprocessing module, a microblog expansion module, a feature extraction module, a sentiment analysis model training module, and a sentiment tendency recognition module.

[0090] The specific content of each module is described below:

[0091] 1. Pseudo-sample generation module: generates pseudo-samples using a Gaussian mixture distribution.

[0092] In this embodiment, a Gaussian mixture model is used to generate pseudo-samples for the minority classes in the training set, that is, the sentiment tendency classes that account for a minority of the training samples, so as to construct a training set with a balanced sentiment tendency distribution.

[0093] The Gaussian mixture pseudo-sample generation technique of the present invention is divided into the following steps:

[0094] (1) ...

Abstract

The invention discloses a sentiment analysis method and system for short microblog texts. The method first generates pseudo-samples using a Gaussian mixture distribution: a Gaussian mixture model generates pseudo-samples for the minority classes in the training set, producing a training set with a balanced sentiment tendency distribution and reducing the effect of class imbalance in the data set on sentiment classification. It then performs microblog text preprocessing, Word2vec-based microblog expansion, feature extraction, sentiment analysis model training, and sentiment tendency recognition. The scheme addresses sentiment tendency recognition when the sentiment tendency distribution of a Chinese microblog data set is unbalanced; it is simple to implement, has a high recognition rate, and has strong practical application value.

Description

Technical field

[0001] The invention belongs to the technical field of network information processing, and in particular relates to a sentiment analysis method and system for microblog short texts.

Background

[0002] Weibo, a widely used social platform, carries massive amounts of information, and effectively analyzing and mining users' sentiment on Weibo is of considerable value. In the prior art, as in traditional sentiment analysis, methods for Weibo fall into two categories. The first is based on sentiment lexicons and rules, and identifies sentiment tendency by counting the negative and positive sentiment words in a sentence. The second is based on machine learning, and trains models on appropriately selected features.

[0003] For example, CN104331506A in the existing patent literature discloses a multi-class sentiment analysis method and system for bilingual microblog texts, which belongs to the technical ...
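
The lexicon-and-rule category mentioned in [0002] can be illustrated with a minimal sketch. It assumes the microblog has already been segmented into tokens and that positive and negative word sets are supplied by the caller. The function name `lexicon_polarity` and the toy lexicons in the usage lines are hypothetical and do not come from the patent.

```python
def lexicon_polarity(tokens, positive_words, negative_words):
    """Classify a tokenized sentence by counting sentiment-lexicon hits.

    Assumption: `tokens` is a pre-segmented list of words, and the two
    lexicons are plain sets of words. Ties are reported as "neutral".
    """
    positive_hits = sum(1 for token in tokens if token in positive_words)
    negative_hits = sum(1 for token in tokens if token in negative_words)
    if positive_hits > negative_hits:
        return "positive"
    if negative_hits > positive_hits:
        return "negative"
    return "neutral"


# Toy usage with hypothetical lexicons.
positive = {"happy", "great", "love"}
negative = {"sad", "awful", "hate"}
print(lexicon_polarity(["i", "love", "this", "great", "phone"], positive, negative))
print(lexicon_polarity(["awful", "service"], positive, negative))
```

A pure count like this ignores negation and intensity, which is one reason the background contrasts it with feature-based machine learning methods.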

Application Information

Patent Type & Authority: Applications (China)
IPC(8): G06F17/27
CPC: G06F40/30
Inventors: 梁礼欣, 吴文杰, 李本栋
Owner: GUANGZHOU DATASTORY INFORMATION TECH CO LTD