Text data opinion summary mining method based on topic diversity

A topic-diversity-based method for mining opinion summaries from text data, applied in the fields of sentiment analysis and text summarization. It addresses two problems: word-level diversity cannot guarantee that an opinion summary covers the topics of the source text, and topic-irrelevant words distort the summary. It achieves more accurate topic attributes and broad applicability.

Active Publication Date: 2018-07-10
FUZHOU UNIV
Cites: 4; Cited by: 14

AI Technical Summary

Problems solved by technology

Most existing models use the diversity of all words in the text sentences to ensure that the opinion summary covers the main ideas of the text and that the summary itself is diverse. However, word diversity does not guarantee that the opinion summary covers the topics of the source text, and words unrelated to the topic distort the final summary. In addition, existing methods describe the emotional information of the summary through the emotional information of entire sentences, so emotions directed at many topic-irrelevant subjects are also counted. Together, these two factors produce summaries that contain much content and emotional information unrelated to the main topic of the text.



Embodiment Construction

[0019] The present invention will be further explained below in conjunction with the accompanying drawings and specific embodiments.

[0020] The present invention provides a text data opinion summary mining method based on topic diversity, which includes the following steps:
Step S1: preprocess the topic text, filtering out irrelevant texts that lack substantial content or meaning, as well as common stop words;
Step S2: input the topic corpus and the background corpus;
Step S3: extract the topic attributes of the topic corpus;
Step S4: add emotional polarity to the topic attributes obtained in step S3, where emotional polarity includes positive emotion and negative emotion, so that positive topic attributes and negative topic attributes serve as emotional attribute features for vectorizing sentences;
Step S5: take the topic attributes obtained in step S3 as evaluation objects, and use the dynamic word sequence sentiment analysis method oriented to multiple evaluation objects to analyze the emotional polari...
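Steps S1 to S3 can be illustrated with a minimal sketch. The record does not state how topic attributes are extracted, so the smoothed frequency ratio between the topic corpus and the background corpus used here is an assumption, not the patented procedure. The stop-word list, the whitespace tokenizer, and the function names `preprocess` and `extract_topic_attributes` are likewise hypothetical; a system for Chinese microblog text would need a proper word segmenter.

```python
from collections import Counter

# Hypothetical stop-word list for illustration only.
STOP_WORDS = {"the", "a", "is", "of", "and", "to", "it"}


def preprocess(sentences, min_tokens=2):
    """Step S1 (sketch): drop stop words and sentences with too little content."""
    cleaned = []
    for sentence in sentences:
        tokens = [t for t in sentence.lower().split() if t not in STOP_WORDS]
        if len(tokens) >= min_tokens:
            cleaned.append(tokens)
    return cleaned


def extract_topic_attributes(topic_docs, background_docs, top_k=3):
    """Steps S2-S3 (sketch, assumed scoring): rank words by how much more
    frequent they are in the topic corpus than in the background corpus,
    using add-one smoothing."""
    topic_counts = Counter(t for doc in topic_docs for t in doc)
    background_counts = Counter(t for doc in background_docs for t in doc)
    topic_total = sum(topic_counts.values())
    background_total = sum(background_counts.values())
    vocab_size = len(set(topic_counts) | set(background_counts))

    def ratio(word):
        p_topic = (topic_counts[word] + 1) / (topic_total + vocab_size)
        p_background = (background_counts[word] + 1) / (background_total + vocab_size)
        return p_topic / p_background

    # Highest ratio first; ties are broken alphabetically for determinism.
    ranked = sorted(topic_counts, key=lambda w: (-ratio(w), w))
    return ranked[:top_k]


topic_docs = preprocess([
    "the battery is great",
    "battery life is short",
    "the screen is bright",
    "ok",
])
background_docs = preprocess([
    "the weather is great today",
    "life is short and sweet",
])
topic_attributes = extract_topic_attributes(topic_docs, background_docs)
```

Words that also occur in the background corpus, such as "great", receive a low ratio and are pushed down the ranking, which is the intended effect of contrasting the two corpora.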


Abstract

The invention provides a text data opinion summary mining method based on topic diversity. The method comprises the following steps: S1, preprocessing the topic text; S2, inputting a topic corpus and a background corpus; S3, extracting the topic attributes of the topic corpus; S4, adding emotional polarity to the obtained topic attributes for vectorizing sentences; S5, taking the obtained topic attributes as evaluation objects, analyzing the emotional polarity of the evaluation objects contained in each sentence with a dynamic word sequence sentiment analysis method for multiple evaluation objects, obtaining the emotional attribute features contained in the sentence, and vectorizing the sentence into a feature vector; S6, constructing a diversity objective function from the sentence feature vectors obtained in S5. The method obtains an opinion summary of a topic text efficiently and accurately and can be applied to scenarios with larger data sets.
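Steps S4 to S6 can be sketched as follows. The record does not give the form of the diversity objective function or the details of the dynamic word sequence sentiment analysis, so both are replaced here by labeled stand-ins: a toy negative-word lexicon (`NEGATIVE_WORDS`, `toy_polarity`) and a greedy coverage objective that counts distinct (topic attribute, polarity) features. All names in the block are hypothetical.

```python
# Toy lexicon standing in for the sentiment analyzer of step S5 (assumption).
NEGATIVE_WORDS = {"short", "bad", "dim"}


def toy_polarity(tokens, attribute):
    """Return "neg" if the sentence contains any negative word, else "pos"."""
    return "neg" if NEGATIVE_WORDS & set(tokens) else "pos"


def vectorize(tokens, topic_attributes, polarity_of=toy_polarity):
    """Steps S4-S5 (sketch): one feature per (topic attribute, polarity) pair
    that the sentence expresses."""
    features = set()
    for attribute in topic_attributes:
        if attribute in tokens:
            features.add((attribute, polarity_of(tokens, attribute)))
    return features


def select_summary(sentence_features, budget):
    """Step S6 (sketch, assumed objective): greedily pick the sentence that
    adds the most features not yet covered by the summary."""
    covered, chosen = set(), []
    remaining = set(range(len(sentence_features)))
    while remaining and len(chosen) < budget:
        # Ties go to the lowest sentence index.
        best = max(sorted(remaining),
                   key=lambda i: len(sentence_features[i] - covered))
        if not sentence_features[best] - covered:
            break  # no remaining sentence adds a new feature
        chosen.append(best)
        covered |= sentence_features[best]
        remaining.remove(best)
    return chosen


attributes = ["battery", "screen"]
sentences = [
    ["battery", "great"],
    ["battery", "lasts", "long"],
    ["battery", "life", "short"],
    ["screen", "bright"],
]
feature_sets = [vectorize(s, attributes) for s in sentences]
summary_indices = select_summary(feature_sets, budget=3)
```

Because features are built only from topic attributes and their polarity, a sentence that repeats an already covered opinion (the second sentence here) adds nothing to the objective and is skipped, whereas a word-level diversity measure would reward its new but topic-irrelevant words.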

Description

Technical field

[0001] The present invention relates to the fields of text summarization and sentiment analysis. More specifically, it relates to a method for generating short opinion summaries, rich in user emotional information, from massive topic text data in Chinese microblog corpora. The opinion summaries can accurately cover the content discussed in the text, and the method can be applied to practical scenarios such as news summaries and product review summaries.

Background art

[0002] Many techniques and methods are currently available for research in the field of opinion summarization. Traditional opinion summarization models include graph models and ranking models. Representative graph-model methods include TextRank, PageRank, and LexRank. They treat sentences as nodes and a certain relationship between sentences as the edge weights, and iteratively update the sentence scores through a random walk model, so as to realize ...
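The graph-model approach described in the background can be illustrated with a minimal TextRank-style sketch. The background does not specify the relationship used as edge weight, so Jaccard word overlap is assumed here, and the function names, damping factor, and iteration count are illustrative choices rather than values from the record.

```python
def sentence_similarity(a, b):
    """Jaccard word overlap, an assumed stand-in for the edge relationship."""
    a, b = set(a), set(b)
    if not a or not b:
        return 0.0
    return len(a & b) / len(a | b)


def rank_sentences(sentences, damping=0.85, iterations=50):
    """Score sentences with a damped random walk over the similarity graph."""
    n = len(sentences)
    if n == 0:
        return []
    # weights[i][j] is the edge weight between sentence i and sentence j.
    weights = [[0.0 if i == j else sentence_similarity(sentences[i], sentences[j])
                for j in range(n)] for i in range(n)]
    out_sums = [sum(row) for row in weights]
    scores = [1.0 / n] * n
    for _ in range(iterations):
        new_scores = []
        for i in range(n):
            # Each neighbor j passes on its score in proportion to the edge weight.
            incoming = sum(weights[j][i] / out_sums[j] * scores[j]
                           for j in range(n) if out_sums[j] > 0)
            new_scores.append((1 - damping) / n + damping * incoming)
        scores = new_scores
    return scores


tokenized = [
    ["battery", "life", "short"],
    ["battery", "life", "great"],
    ["screen", "bright"],
]
scores = rank_sentences(tokenized)
```

In this toy input the two battery sentences reinforce each other and outrank the isolated screen sentence, which shows why such models favor frequently repeated wording regardless of topic or polarity.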


Application Information

Patent Type & Authority: Applications (China)
IPC: IPC(8): G06F17/30
CPC: G06F16/334, G06F16/374, G06F16/9535
Inventor: 廖祥文, 陈国龙, 赵楠, 杨定达
Owner: FUZHOU UNIV