Opinion mining method for ten-million-scale news comments

An opinion mining technology for news comments, applied in the field of data mining, that addresses the difficulty of mining news comments, which are short and very numerous.

Active Publication Date: 2015-07-15
NAT COMP NETWORK & INFORMATION SECURITY MANAGEMENT CENT

AI Technical Summary

Problems solved by technology

However, due to the large number of news comments, short length, colloquial word



Embodiment Construction

[0031] The present invention will be further described in detail with reference to the accompanying drawings and embodiments.

[0032] An opinion mining method for news comments at the ten-million scale. It is based on data mining, natural language processing and related technologies, and uses Chinese word segmentation, clustering and other methods to analyze tens of millions of news comments and to obtain the aspects that express events or important information from the user's point of view.

[0033] First, according to the news titles of a given event or topic, count the number of comments under each title, and group the news comments under any title whose comment count exceeds a certain value into one category by title. The word segmentation results are then clustered. Next, for each category of news comments, the keyword pairs of that category are extracted, and the proportion and confusion of each category are calculated. Finally, according to the keyword pairs of each categor...
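The first step in [0033], counting comments per title and separating out the titles that reach a threshold, can be sketched as follows. This is a minimal illustration, not the patented implementation: the function name `partition_by_title`, the `(title, text)` input format, and the use of `>=` for the comparison are assumptions made for the example.

```python
from collections import defaultdict


def partition_by_title(comments, k):
    """Group (title, text) pairs by title and split the groups on threshold k.

    Returns two dicts mapping title -> list of comment texts:
    titles with at least k comments, and titles with fewer than k.
    The >= comparison is an assumption; the text only says "exceeding
    a certain value".
    """
    by_title = defaultdict(list)
    for title, text in comments:
        by_title[title].append(text)

    large = {t: c for t, c in by_title.items() if len(c) >= k}
    small = {t: c for t, c in by_title.items() if len(c) < k}
    return large, small


if __name__ == "__main__":
    sample = [("A", "x"), ("A", "y"), ("B", "z")]
    large, small = partition_by_title(sample, k=2)
    print(large)
    print(small)
```

Under this reading, each title in the first dict forms its own category directly, while the comments in the second dict are the ones passed on to word segmentation and clustering.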


Abstract

The invention discloses an opinion mining method for ten-million-scale news comments. The method comprises the following steps: first, counting the quantity of the ten-million-scale news comments; second, judging whether the quantity is greater than or equal to a threshold K, and if so, skipping further processing, otherwise proceeding to the third step; third, using a Chinese word segmentation tool to segment the news headlines and the comments whose quantity is smaller than the threshold K, and tagging parts of speech; fourth, clustering the news comments according to the word segmentation results to obtain category labels; fifth, extracting keyword pairs from the news comments; sixth, computing the proportion and the complexity of the news comments; seventh, screening and extracting representative texts according to the keyword pairs. By using a Chinese word segmentation tool, taking the usage and collocation relationships of the Chinese language into account, and incorporating the role of the news headlines, the method processes ten-million-scale news comments and offers advantages such as high efficiency, robustness and usability.
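Steps five and six of the abstract can be illustrated with a small sketch that works on comments already segmented into word lists. The abstract does not define "keyword pair" or "complexity", so the choices here are assumptions: a keyword pair is taken to be two distinct words co-occurring in one comment, and Shannon entropy of the word distribution stands in for complexity. All function names are hypothetical.

```python
import math
from collections import Counter
from itertools import combinations


def top_keyword_pairs(token_lists, n=3):
    """Return the n most frequent co-occurring word pairs (assumed definition)."""
    pairs = Counter()
    for tokens in token_lists:
        # Count each unordered pair at most once per comment.
        for a, b in combinations(sorted(set(tokens)), 2):
            pairs[(a, b)] += 1
    return pairs.most_common(n)


def category_proportions(labels):
    """Map each cluster label to its share of all comments."""
    counts = Counter(labels)
    total = len(labels)
    return {label: c / total for label, c in counts.items()}


def word_entropy(token_lists):
    """Shannon entropy (bits) of the word distribution; a stand-in for 'complexity'."""
    counts = Counter(tok for tokens in token_lists for tok in tokens)
    total = sum(counts.values())
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


if __name__ == "__main__":
    segmented = [["price", "rise"], ["price", "rise", "bad"], ["price", "bad"]]
    print(top_keyword_pairs(segmented))
    print(category_proportions([0, 0, 1, 1]))
    print(word_entropy(segmented))
```

In practice the word lists would come from the Chinese word segmentation tool of step three, and the labels from the clustering of step four; neither is reproduced here.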

Description

Technical field

[0001] The invention belongs to the field of data mining and relates to opinion mining technology, in particular to an opinion mining method for ten-million-scale news comments.

Background

[0002] With the continuous growth in the number of netizens, social media, represented by forums, Weibo and WeChat, has developed rapidly. It has gradually penetrated every aspect of people's life and work and has had a profound influence on people's behavior patterns and psychological patterns. At the same time, social media generates a large number of short texts every day, containing a large amount of information that expresses events or users' opinions. By analyzing this information, on the one hand, people can understand how information about a certain event or topic spreads; on the other hand, such analysis plays an important role in public opinion monitoring and social media marketing. How to extract keywords that can express events or users' opin...


Application Information

IPC(8): G06F17/30
Inventor: 刘春阳, 程工, 吴俊杰, 张旭, 王卿, 庞琳, 李雄, 袁石
Owner NAT COMP NETWORK & INFORMATION SECURITY MANAGEMENT CENT