Microblog spread prediction method based on user influences and contents

A spread prediction and user influence technology, applied in the fields of data processing applications, special data processing applications, instruments, etc. It addresses the problem that the prediction of information dissemination in microblog networks has not received enough attention, with the effect of improving prediction accuracy and extraction speed.

Inactive Publication Date: 2017-07-28
WUHAN UNIV
7 Cites, 11 Cited by

AI Technical Summary

Problems solved by technology

According to the current research status, the traditional information dissemination model has a relatively mature dissemination theory, but the research and analysis of information dissemination prediction in the microblog network has not received enough attention.

Examples

Embodiment

[0032] I. First, the overall process of the method of the present invention is introduced. It includes:

[0033] Step 1: A Scrapy program creates distributed spiders that take the Sina Weibo IDs of the input user and the user's fans and crawl the personal information of the user and fans (http://weibo.cn/attgroup/opening?uid=id), the retweet relationship between the user and fans, and the retweeted microblogs (http://weibo.cn/id/profile?filter=1&page=1). The captured information includes, for the user: user name, Sina ID, Sina Weibo tags, text content of the microblog to be predicted, release time of the microblog to be predicted, number of fans, and number of accounts followed; for each fan: fan name, Sina ID, Sina tags, total number of microblog posts, number of reposted microblogs, and reposting times;
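The inputs and outputs of Step 1 can be sketched as follows. This is only an illustration of the two URL patterns and the captured fields listed above, written with the Python standard library; it does not use Scrapy itself, and the names `UserRecord`, `FanRecord`, `start_urls`, and the `pages` parameter are hypothetical, not taken from the patent.

```python
from dataclasses import dataclass, field
from typing import List

# URL patterns from the text; "id" in the original is the Sina Weibo ID placeholder.
PROFILE_URL = "http://weibo.cn/attgroup/opening?uid={uid}"
RETWEET_URL = "http://weibo.cn/{uid}/profile?filter=1&page={page}"


@dataclass
class UserRecord:
    """Fields captured for the user whose microblog is to be predicted."""
    name: str
    sina_id: str
    tags: List[str]
    post_text: str        # text of the microblog to be predicted
    post_time: str        # release time of that microblog
    fan_count: int
    following_count: int


@dataclass
class FanRecord:
    """Fields captured for each fan."""
    name: str
    sina_id: str
    tags: List[str]
    total_posts: int
    reposted_posts: int
    repost_times: List[str] = field(default_factory=list)


def start_urls(user_id: str, fan_ids: List[str], pages: int = 1) -> List[str]:
    """Build the pages a spider would request for the user and all fans."""
    ids = [user_id, *fan_ids]
    urls = [PROFILE_URL.format(uid=uid) for uid in ids]
    urls += [
        RETWEET_URL.format(uid=uid, page=page)
        for uid in ids
        for page in range(1, pages + 1)
    ]
    return urls
```

In a real crawler, a list like the one returned by `start_urls` would seed the spiders, and the parsed pages would be stored as records of these two shapes.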

[0034] Step 2: Use PageRank to calculate the user's authority in the microblog network. The calculation formula is:

[0035]

[0036] Where Vi represents the user ID;

[0037] F(Vi)...
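Because the formula itself is not reproduced here, the sketch below uses the standard PageRank iteration rather than the patent's exact expression. It assumes that an edge runs from a fan to each account the fan follows, that the damping factor is 0.85, and that users who follow nobody spread their rank evenly; the function name `pagerank` and the `follows` mapping are illustrative choices.

```python
from typing import Dict, List


def pagerank(follows: Dict[str, List[str]],
             damping: float = 0.85,
             iterations: int = 50) -> Dict[str, float]:
    """Standard PageRank by power iteration.

    follows[u] lists the accounts that u follows, so rank flows
    from a fan to the users the fan follows.
    """
    nodes = set(follows)
    for targets in follows.values():
        nodes.update(targets)
    n = len(nodes)
    rank = {v: 1.0 / n for v in nodes}

    for _ in range(iterations):
        # Rank held by users who follow nobody is redistributed evenly.
        dangling = sum(rank[v] for v in nodes if not follows.get(v))
        new_rank = {v: (1 - damping) / n + damping * dangling / n for v in nodes}
        for u, targets in follows.items():
            if targets:
                share = damping * rank[u] / len(targets)
                for v in targets:
                    new_rank[v] += share
        rank = new_rank
    return rank
```

With `follows = {"fan_a": ["star"], "fan_b": ["star"], "star": []}`, the account `star` ends up with a higher score than either fan, which is the sense in which the score serves as a user authority factor.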

Abstract

The invention relates to a microblog spread prediction method based on user influence and content. The method includes the following steps: first, the Scrapy framework is used to crawl the personal information of a user and the user's fans, the forwarding relationship between them, and the forwarded microblogs; second, user influence is extracted with the PageRank influence analysis technique to form a user authority predictive factor; third, the percentage of microblogs a fan forwards among all microblogs the fan publishes per unit time is used to extract a fan forwarding activity predictive factor; fourth, the importance of the microblog content is analyzed with the TF-IDF word weighting technique to extract a microblog importance predictive factor; fifth, the extracted forwarding relationships are divided by snowball sampling into 10 microblog-forwarding training sets and microblog-ignoring training sets; sixth, a supervised Bayesian network is trained on the training sets until the classifier parameters converge. The method can improve the accuracy of predicting whether specific fans will forward a Sina microblog.
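The third and sixth steps can be illustrated with a small sketch. It is not the patent's classifier: it substitutes a naive Bayes model (the simplest Bayesian network, with the class as the only parent of each factor) with Laplace smoothing, and it discretizes each factor into three bins with arbitrary thresholds. The names `activity`, `bin3`, and `NaiveBayes` are hypothetical.

```python
import math
from collections import Counter, defaultdict


def activity(reposted: int, total: int) -> float:
    """Share of a fan's posts in a time window that are reposts."""
    return reposted / total if total else 0.0


def bin3(x: float) -> str:
    """Discretize a factor in [0, 1]; the thresholds are an assumption."""
    return "low" if x < 1 / 3 else "mid" if x < 2 / 3 else "high"


class NaiveBayes:
    """Categorical naive Bayes over (authority, activity, importance) bins."""

    def fit(self, rows, labels):
        self.classes = Counter(labels)
        self.counts = defaultdict(Counter)   # (class, feature index) -> value counts
        self.values = defaultdict(set)       # feature index -> values seen
        for row, y in zip(rows, labels):
            for i, v in enumerate(row):
                self.counts[(y, i)][v] += 1
                self.values[i].add(v)
        return self

    def predict(self, row):
        total = sum(self.classes.values())
        best, best_lp = None, -math.inf
        for y, ny in self.classes.items():
            lp = math.log(ny / total)
            for i, v in enumerate(row):
                k = len(self.values[i])
                # Laplace smoothing keeps unseen values from zeroing the product.
                lp += math.log((self.counts[(y, i)][v] + 1) / (ny + k))
            if lp > best_lp:
                best, best_lp = y, lp
        return best
```

A fan who reposted 3 of 12 posts has `activity(3, 12) == 0.25`, which `bin3` maps to `"low"`. Rows of three binned factors labeled `"forward"` or `"ignore"` can then be passed to `NaiveBayes().fit(rows, labels)` and queried with `predict`.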

Description

Technical field

[0001] The invention relates to data mining in computer science, the Scrapy framework, HTML packet parsing, machine learning, computer networks, and probability theory and mathematical statistics, and in particular to a microblog propagation prediction method based on user influence and content.

Background technique

[0002] Scrapy is a mature, fast, high-level web crawling framework written in Python. It provides several base spider classes for extracting structured information from web pages. PageRank is a technique that ranks web pages based on the hyperlinks between them; today it is widely used to calculate the importance of nodes in a network structure. TF-IDF is a statistical method from information retrieval that evaluates how important a word is to a document within a document collection or corpus.

[0003] The mic...
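As a rough illustration of the TF-IDF weighting mentioned in the background, the sketch below computes term weights for pre-tokenized microblogs using the common definition tf × log(N / df). Tokenization is assumed to happen elsewhere, and the aggregation of term weights into a single score in `importance` (a plain mean) is an assumption made for illustration, not something stated in the source.

```python
import math
from collections import Counter
from typing import Dict, List


def tf_idf(docs: List[List[str]]) -> List[Dict[str, float]]:
    """Return one {term: weight} mapping per tokenized document."""
    n = len(docs)
    # Document frequency: in how many documents each term appears.
    df = Counter(term for doc in docs for term in set(doc))
    weights = []
    for doc in docs:
        tf = Counter(doc)
        weights.append(
            {t: (c / len(doc)) * math.log(n / df[t]) for t, c in tf.items()}
        )
    return weights


def importance(term_weights: Dict[str, float]) -> float:
    """Hypothetical aggregate: mean TF-IDF weight of a microblog's terms."""
    if not term_weights:
        return 0.0
    return sum(term_weights.values()) / len(term_weights)
```

A term that occurs in every microblog gets weight 0 because log(N / N) = 0, while a term confined to one microblog is weighted up, so posts built from distinctive terms receive a higher score.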

Application Information

Patent Type & Authority: Applications (China)
IPC(8): G06F17/30, G06Q50/00
CPC: G06F16/951, G06Q50/01
Inventor: 郭晓东, 刘金硕, 王丽娜, 章岚昕, 杨广益, 陈煜森, 李扬眉
Owner: WUHAN UNIV