Topic modeling based multi-granularity sentiment analysis method

A topic modeling and sentiment analysis technology, applied to semantic analysis, database retrieval, network data retrieval, and related fields, addressing the problem that existing research work and applications are insufficient.

Inactive Publication Date: 2015-03-25
ZHEJIANG UNIV
3 Cites, 9 Cited by

AI Technical Summary

Problems solved by technology

[0004] At present, the research work and application of user sen

Examples

Embodiment

[0120] Take as an example the movie review sub-database aclImdb of a social media database: training data is supplied to train the core model of the present invention, and the trained model is then used to return results when users query the sub-database. The training and query-processing steps of the present invention are as follows:

[0121] 1. Use a natural language processing tool to tag the part of speech of each word in the database, and use the resulting part-of-speech tag as the feature of each word;

[0122] 2. Remove uninformative high-frequency words and rare low-frequency words;

[0123] 3. After counting, collect all words that appear in the text into a vocabulary;

[0124] 4. Determine the number of topics and the number of fine-grained sentiments used in modeling, according to parameters set automatically by the system or specified by the user;

[0125] 5. The system automatically sets, or the user specifies, the parameters α, β, γ of each grou...
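
Steps 1 to 3 above can be sketched as follows. This is a minimal illustration, not the patent's implementation: it assumes the documents have already been part-of-speech tagged by some external tool, and the function name `build_vocabulary` and the thresholds `max_doc_ratio` and `min_count` are hypothetical choices that the source does not specify.

```python
from collections import Counter

def build_vocabulary(tagged_docs, max_doc_ratio=0.5, min_count=2):
    """Filter words by frequency and build a vocabulary.

    tagged_docs: list of documents, each a list of (word, pos_tag) pairs
    produced by an external part-of-speech tagger (assumed, not shown).
    The two thresholds are illustrative assumptions.
    """
    term_counts = Counter()  # total occurrences of each word
    doc_counts = Counter()   # number of documents containing each word
    for doc in tagged_docs:
        words = [w.lower() for w, _ in doc]
        term_counts.update(words)
        doc_counts.update(set(words))

    n_docs = len(tagged_docs)
    # Keep words that are neither too rare nor present in too many documents.
    kept = sorted(
        w for w in term_counts
        if term_counts[w] >= min_count and doc_counts[w] / n_docs <= max_doc_ratio
    )
    vocab = {w: i for i, w in enumerate(kept)}

    # Encode each document as (word_id, pos_tag) pairs, dropping filtered words.
    encoded = [
        [(vocab[w.lower()], tag) for w, tag in doc if w.lower() in vocab]
        for doc in tagged_docs
    ]
    return vocab, encoded
```

The part-of-speech tag is carried alongside each word id so that it remains available as the word's feature in later steps.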

Abstract

The invention discloses a topic modeling based multi-granularity sentiment analysis method. The method includes the following steps: extracting words and word features of all data in a social media text database; performing training to obtain a kernel model; acquiring search results from the social media text database according to a search request of a user; determining number of topics and number of fine-grained sentiments needed in topic modeling according to parameters set by a system automatically or specified by the user; allocating one topic and one fine-grained sentiment to each word randomly; computing the topics and the fine-grained sentiments that all words belong to as well as coarse-grained sentiments expressed by searched documents, and feeding results back to the user. The method has the advantages that social network text data can be subjected to topic modeling and multi-granularity sentiment analysis at the same time; correlation can be established between the word features and the fine-grained sentiments expressed by the words, and the user is assisted in comprehending the data.
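
The random allocation step in the abstract can be sketched as follows. This is an illustration under stated assumptions: the function name `random_init` and the count tables it builds are hypothetical, modeled on the bookkeeping commonly used when sampling-based inference is applied to LDA-style models. The source does not state which inference procedure the method uses or what statistics it keeps.

```python
import random

def random_init(encoded_docs, n_topics, n_sentiments, vocab_size, seed=0):
    """Assign one random topic and one random fine-grained sentiment per word.

    encoded_docs: list of documents, each a list of integer word ids.
    Returns the assignments plus two count tables (an assumed layout).
    """
    rng = random.Random(seed)
    assignments = []  # assignments[d][i] = (topic, sentiment) of word i in doc d
    doc_topic = [[0] * n_topics for _ in encoded_docs]
    topic_sent_word = [
        [[0] * vocab_size for _ in range(n_sentiments)]
        for _ in range(n_topics)
    ]
    for d, doc in enumerate(encoded_docs):
        doc_assign = []
        for word_id in doc:
            z = rng.randrange(n_topics)      # random topic
            s = rng.randrange(n_sentiments)  # random fine-grained sentiment
            doc_assign.append((z, s))
            doc_topic[d][z] += 1
            topic_sent_word[z][s][word_id] += 1
        assignments.append(doc_assign)
    return assignments, doc_topic, topic_sent_word
```

Starting from such a random state, an iterative procedure would then revise each word's topic and sentiment until the assignments stabilize; that update rule is not given in the visible text and is not sketched here.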

Description

Technical field

[0001] The invention relates to user sentiment analysis, in particular to a multi-granularity sentiment analysis method based on topic modeling.

Background technique

[0002] At present, with the development of Internet architecture, storage technology, and other related technologies, all kinds of network data are growing rapidly. Besides giving Internet users a better browsing experience and providing more samples for multimedia retrieval applications, this growth makes the efficient organization of such large-scale data a challenge. To meet this challenge, unsupervised hierarchical Bayesian models (or topic models), a typical class of algorithms for clustering media data through "hidden topics", are widely used, such as LDA (Latent Dirichlet Allocation, a widely used traditional topic model) and its extensions. Since it was proposed in 2003, LDA and its derivative models have been used as the core algorithm of variou...

Application Information

IPC(8): G06F17/30
CPC: G06F16/951, G06F40/30, G06Q50/01
Inventor: 汤斯亮, 邵健, 王翰琪, 吴飞, 庄越挺
Owner: ZHEJIANG UNIV