Microblog data analysis based hot news prediction method and system

A data analysis and prediction technology, applied in the fields of network data retrieval, network data indexing, and electronic digital data processing. It addresses problems such as the inability to find hot topics early and the inability to comprehensively analyze the characteristics of hot topics, and so achieves early prediction of hot news.

Active Publication Date: 2016-01-06
SOUTH CHINA UNIV OF TECH
3 Cites, 30 Cited by

AI Technical Summary

Problems solved by technology

[0003] Traditional public opinion hot topics are judged only by the number of clicks, reposts, comments and other data, but this ho



Examples


Embodiment 1

[0041] As shown in Figure 1 and Figure 2, the hot news prediction method based on microblog data analysis of this embodiment includes the following steps:

[0042] S1. Collect news reports from mainstream news websites and the microblog user reaction information that they trigger. The news reports include titles and body text. The user reaction information is obtained by searching the microblog platform with news titles as keywords. The microblog result set includes microblog user information, microblog text, and posting time, but excludes news reports posted on the microblog by news media;

[0043] S2. Perform word segmentation and word frequency statistics on the microblog text, calculate the TF-IDF (term frequency-inverse document frequency) value of each word, and convert the text into a microblog topic described using a vector space;

[0044] S3. Classify the microblog topics, describe the three quantitative indicators of the microblog topics, and calculate the three popular...
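Step S2 can be illustrated with a minimal Python sketch. It is not the patent's implementation: the source does not give its TF-IDF formula, so the common form tf(w, d) × log(N / df(w)) is assumed here, and whitespace splitting stands in for real word segmentation (Chinese microblog text would need a dedicated segmenter). The names `tokenize` and `tf_idf_vectors` and the sample posts are hypothetical.

```python
import math
from collections import Counter


def tokenize(text):
    # Assumption: whitespace splitting as a stand-in for word segmentation.
    return text.lower().split()


def tf_idf_vectors(posts):
    """Return (vocab, vectors); each vector holds one TF-IDF weight per vocab word."""
    tokenized = [tokenize(p) for p in posts]

    # Document frequency: number of posts that contain each word.
    df = Counter()
    for tokens in tokenized:
        df.update(set(tokens))

    vocab = sorted(df)
    n = len(posts)
    vectors = []
    for tokens in tokenized:
        counts = Counter(tokens)
        total = len(tokens) or 1  # avoid division by zero on empty posts
        # Assumed weighting: relative term frequency times log(N / df).
        vectors.append(
            [(counts[w] / total) * math.log(n / df[w]) for w in vocab]
        )
    return vocab, vectors


# Hypothetical sample posts, for illustration only.
posts = [
    "flood warning issued for the city",
    "city flood relief volunteers needed",
    "new phone released today",
]
vocab, vectors = tf_idf_vectors(posts)
```

Each post becomes one point in a vector space whose axes are the vocabulary words. Words that occur in every post get weight zero under this formula, and words specific to one post get the largest weights, so vectors of posts about the same news item tend to be similar.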

Embodiment 2

[0078] As shown in Figure 3, the hot news prediction system based on microblog data analysis of this embodiment includes:

[0079] The data collection module, which collects news reports from mainstream websites and the reaction information of microblog users on the microblog platform;

[0080] The text analysis processing module, which performs word segmentation and word frequency statistics on the microblog text, calculates the TF-IDF value of each word, and converts the text into a microblog topic described using a vector space;

[0081] The data statistical analysis module, which classifies microblog topics, counts the quantitative indicators that describe the microblog topics, and calculates the popularity indicators of the news;

[0082] The hot news prediction module, which applies a multiple linear regression algorithm to learn from the sample data, establishes a hot news prediction model, and judges whether subsequent news will become hot news according to the hot news prediction ...
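The role of the hot news prediction module can be sketched as follows. This is an illustration under stated assumptions, not the patented model. The three feature columns, the toy training data, and `HOT_THRESHOLD` are all hypothetical, because the source text does not list its indicators or its decision rule. The fit is ordinary least squares, solved through the normal equations in plain Python.

```python
def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def fit(features, targets):
    """Multiple linear regression via the normal equations (X^T X) w = X^T y."""
    rows = [[1.0] + list(f) for f in features]  # leading 1.0 is the intercept term
    k = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * y for r, y in zip(rows, targets)) for i in range(k)]
    return solve(xtx, xty)


def predict(weights, feature):
    return weights[0] + sum(w * x for w, x in zip(weights[1:], feature))


HOT_THRESHOLD = 3.0  # hypothetical cut-off, not taken from the source


def is_hot(weights, feature):
    return predict(weights, feature) >= HOT_THRESHOLD


# Synthetic toy samples: three hypothetical indicators per news item.
# Targets follow y = 1 + 2*x1 + 0.5*x2 - 1*x3 exactly.
train_x = [(1, 0, 0), (0, 1, 0), (0, 0, 1), (1, 1, 0), (1, 1, 1), (2, 1, 0)]
train_y = [3.0, 1.5, 0.0, 3.5, 2.5, 5.5]
weights = fit(train_x, train_y)
```

With weights learned from labelled samples, `is_hot(weights, indicators)` scores a new news item from its indicator values. For larger or poorly conditioned data, a library least-squares routine would be a safer choice than hand-written normal equations.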


Abstract

The present invention discloses a hot news prediction method and system based on microblog data analysis. The method comprises: acquiring news reports from mainstream news sites and the microblog user response information that the news reports trigger on the microblog; performing word segmentation and word frequency statistics on the microblog text, calculating the TF-IDF value of each word, and converting the text into a microblog topic described using a vector space; classifying the microblog topics, counting each quantitative index that describes the microblog topics, and calculating each popularity index of the news; and using a multiple linear regression algorithm to learn from sample data, establishing a hot news prediction model, and determining whether subsequent news will become hot news. The system comprises a data acquisition module, a text analysis processing module, a data statistical analysis module and a hot news prediction module. The method and system comprehensively analyze how news reported by the media trends in microblog topics in order to predict whether the news will become hot news in public opinion, which addresses the problem of early prediction of hot news.

Description

Technical field

[0001] The invention relates to a hot news prediction method and system, in particular to a hot news prediction method and system based on microblog data analysis, belonging to the field of hot news automatic prediction in government public opinion monitoring.

Background technique

[0002] With the rapid development of Internet technology, Internet public opinion is increasingly affecting the stable development of society. Monitoring Internet public opinion is an important link for the government to maintain social stability. As one of the links in public opinion monitoring, the prediction of hot news is particularly critical. With its unique dissemination characteristics and real-time interaction characteristics, Weibo has changed the way of dissemination of traditional news information. In particular, the combination of Weibo and mobile terminals enables Weibo information to be forwarded or commented on more quickly, and a large number of user comments and...


Application Information

IPC(8): G06F17/30
CPC: G06F16/951
Inventor: 陈健, 韩超
Owner: SOUTH CHINA UNIV OF TECH