Link prediction method in large-scale microblog heterogeneous information network

A link prediction method for large-scale microblog heterogeneous information networks. It addresses the problem that existing similarity measures have no proven direct correlation with the generation of new link relationships.

Inactive Publication Date: 2016-08-24
SICHUAN UNIV
Cites: 2; Cited by: 17

AI Technical Summary

Problems solved by technology

[0006] (2) Important laws and phenomena in sociology, such as the Matthew effect and the 80/20 rule (Pareto principle), are difficult to represent by similarity alone.
However, these similarities characterize only certain aspects of the network and have no proven direct correlation with the generation of new link relationships between network nodes.

Examples

Embodiment 1

[0098] Embodiment 1 provides a link prediction method in a large-scale microblog heterogeneous information network, the method includes steps (1) to (5).

[0099] Step (1), filter users according to the preset strategy; the set of edges in the network after filtering is E. In step (1), users are filtered according to the number of followers, the ratio of the number of followers to the number of fans, and the page ranking (PageRank) value.
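The filtering in step (1) can be sketched as follows. This is a minimal illustration under stated assumptions, not the patent's implementation: the edge `(u, v)` is taken to mean "u follows v", the ratio is read as accounts followed divided by fans, and the function names and thresholds (`min_followers`, `max_follow_ratio`, `min_rank`) are hypothetical, since the excerpt does not give concrete values.

```python
from collections import defaultdict


def pagerank(edges, damping=0.85, iterations=50):
    """Power-iteration PageRank on a directed edge list (u follows v)."""
    nodes = sorted({n for edge in edges for n in edge})
    out_links = {n: [] for n in nodes}
    for u, v in edges:
        out_links[u].append(v)
    rank = {n: 1.0 / len(nodes) for n in nodes}
    for _ in range(iterations):
        # Rank held by nodes without outgoing links is spread uniformly.
        dangling = sum(rank[n] for n in nodes if not out_links[n])
        base = (1.0 - damping + damping * dangling) / len(nodes)
        new_rank = {n: base for n in nodes}
        for u, targets in out_links.items():
            if not targets:
                continue
            share = damping * rank[u] / len(targets)
            for v in targets:
                new_rank[v] += share
        rank = new_rank
    return rank


def filter_users(edges, min_followers=2, max_follow_ratio=5.0, min_rank=0.0):
    """Keep users that pass all three (hypothetical) thresholds.

    Returns the kept users and E, the edges between kept users.
    """
    fans = defaultdict(int)       # in-degree: how many users follow n
    following = defaultdict(int)  # out-degree: how many users n follows
    for u, v in edges:
        following[u] += 1
        fans[v] += 1
    rank = pagerank(edges)
    keep = set()
    for n in rank:
        if fans[n] < min_followers:
            continue
        # Assumed reading of the ratio: accounts followed / fans.
        if following[n] / max(fans[n], 1) > max_follow_ratio:
            continue
        if rank[n] < min_rank:
            continue
        keep.add(n)
    E = [(u, v) for u, v in edges if u in keep and v in keep]
    return keep, E
```

Filtering first keeps the later feature computation tractable on a large network, because inactive or spam-like accounts are dropped before any pairwise features are computed.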

[0100] Step (2), extract several links from the network, where the set of positive examples is $E_T$ and the set of negative examples is $E_F$.
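One way to draw the two example sets in step (2) is sketched below. It assumes, as is common in link prediction but not spelled out in this excerpt, that positive examples are existing edges sampled from E and negative examples are ordered node pairs that are not in E. The function name `sample_links` and its parameters are hypothetical.

```python
import random


def sample_links(E, n_pos, n_neg, seed=0):
    """Sample positive links from E and negative (absent) links.

    E is an iterable of directed edges (u, v) without self-loops.
    Returns (E_T, E_F) as sets of node pairs.
    """
    rng = random.Random(seed)
    edge_set = set(E)
    nodes = sorted({n for edge in edge_set for n in edge})

    # Guard against an endless loop when too many negatives are requested.
    max_neg = len(nodes) * (len(nodes) - 1) - len(edge_set)
    if n_neg > max_neg:
        raise ValueError("not enough absent links to sample from")

    # Positive examples: existing edges chosen without replacement.
    E_T = set(rng.sample(sorted(edge_set), n_pos))

    # Negative examples: random ordered pairs that are not edges.
    E_F = set()
    while len(E_F) < n_neg:
        u, v = rng.choice(nodes), rng.choice(nodes)
        if u != v and (u, v) not in edge_set:
            E_F.add((u, v))
    return E_T, E_F
```

Rejection sampling is adequate here because real follow graphs are sparse, so a random pair is almost always a non-edge.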

[0101] Step (3), in the network $E - E_T - E_F$, calculate the features of all nodes and links in $E_T \cup E_F$, and convert the node features into features of link relations; the final feature set of link relations is X.
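The sketch below illustrates the idea behind step (3): features are computed on the residual network, with the sampled links removed, so that the label of a sampled link cannot leak into its own features. The specific features (degrees of both endpoints and a count of two-step paths) are illustrative assumptions; the excerpt does not list the patent's actual feature set, and `link_features` is a hypothetical name.

```python
from collections import defaultdict


def link_features(E, E_T, E_F):
    """Build a feature vector for every sampled link.

    Features are computed on the residual network E - E_T - E_F.
    Returns X as a dict mapping (u, v) to a list of numbers.
    """
    residual = set(E) - set(E_T) - set(E_F)
    out_nb = defaultdict(set)  # users that n follows
    in_nb = defaultdict(set)   # users that follow n
    for u, v in residual:
        out_nb[u].add(v)
        in_nb[v].add(u)

    X = {}
    for u, v in sorted(set(E_T) | set(E_F)):
        # Node features of both endpoints become part of the link feature.
        node_part = [len(out_nb[u]), len(in_nb[u]),
                     len(out_nb[v]), len(in_nb[v])]
        # Link feature: number of intermediaries w with u -> w -> v.
        two_step_paths = len(out_nb[u] & in_nb[v])
        X[(u, v)] = node_part + [two_step_paths]
    return X
```

Concatenating the endpoint features is the simplest way to turn node features into a link feature; other combinations, such as products or differences, would fit the same interface.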

[0102] Step (4), divide $E_F \cup E_T$ into a training set, a validation set, and a test set; train the model on the training set, and select the model hyperparameters that make the prediction result optimal on...
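The three-way split in step (4) might look like the following sketch. The 60/20/20 ratios, the per-class (stratified) shuffling, and the name `split_examples` are assumptions made for illustration; the excerpt does not state how the sets are sized.

```python
import random


def split_examples(E_T, E_F, ratios=(0.6, 0.2, 0.2), seed=0):
    """Split labelled links into training, validation, and test sets.

    Positive links get label 1 and negative links label 0. Each class is
    shuffled and split separately so all three sets keep the class balance.
    """
    rng = random.Random(seed)
    splits = {"train": [], "validation": [], "test": []}
    for label, links in ((1, E_T), (0, E_F)):
        links = sorted(links)  # fixed order so the seed is reproducible
        rng.shuffle(links)
        n_train = int(ratios[0] * len(links))
        n_val = int(ratios[1] * len(links))
        splits["train"] += [(link, label) for link in links[:n_train]]
        splits["validation"] += [
            (link, label) for link in links[n_train:n_train + n_val]
        ]
        # Whatever remains goes to the test set.
        splits["test"] += [(link, label) for link in links[n_train + n_val:]]
    return splits
```

Keeping the validation set separate from the test set matters because hyperparameters are tuned on the former; the test set then gives an estimate that the tuning has not touched.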

Abstract

The invention relates to the field of Internet technology and provides a link prediction method in a large-scale microblog heterogeneous information network. The method comprises the following steps: filtering users according to a preset strategy; extracting a number of links from the network, where the positive example set is $E_T$ and the negative example set is $E_F$; calculating the features of all nodes in $E_T \cup E_F$ and the features of the links in the network $E - E_T - E_F$, and converting the node features into features of link relations; dividing $E_F \cup E_T$ into a training set, a validation set, and a test set, training models on the training set, and selecting the model hyperparameters that give the best prediction result on the validation set to obtain a final model $h_\theta(x)$ and a threshold $\theta$; and feeding any link relation in the test set into the model to obtain the probability P that the link relation will be generated. Experiments show that the area under the curve (AUC) and the F-measure of the method are markedly improved compared with methods based on local information similarity and path similarity, and that the method has more stable top-K accuracy.
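To make the last two steps concrete, here is a minimal sketch of a model $h_\theta(x)$ that outputs a link probability P, together with threshold selection on the validation set. Logistic regression trained by gradient descent and the F-measure as the selection criterion are assumptions; the excerpt names neither the model family nor the criterion. In the code, `theta` denotes the model parameters and `threshold` the decision threshold, and all function names are hypothetical.

```python
import math


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def predict_proba(theta, x):
    """h_theta(x): probability P that the link with features x is generated."""
    return sigmoid(theta[0] + sum(w * xj for w, xj in zip(theta[1:], x)))


def train_logistic(X, y, lr=0.1, epochs=500, l2=0.0):
    """Batch gradient descent; l2 is a hyperparameter to tune on validation."""
    theta = [0.0] * (len(X[0]) + 1)  # theta[0] is the bias term
    n = len(X)
    for _ in range(epochs):
        grad = [0.0] * len(theta)
        for x, target in zip(X, y):
            err = predict_proba(theta, x) - target
            grad[0] += err
            for j, xj in enumerate(x):
                grad[j + 1] += err * xj
        for j in range(len(theta)):
            reg = l2 * theta[j] if j > 0 else 0.0  # bias is not regularized
            theta[j] -= lr * (grad[j] / n + reg)
    return theta


def f_measure(y_true, y_pred):
    """Harmonic mean of precision and recall for the positive class."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def select_threshold(theta, X_val, y_val, candidates=None):
    """Pick the threshold with the best F-measure on the validation set."""
    candidates = candidates or [i / 20 for i in range(1, 20)]
    probs = [predict_proba(theta, x) for x in X_val]
    return max(
        candidates,
        key=lambda c: f_measure(y_val, [int(p >= c) for p in probs]),
    )
```

A test link is then predicted as a future link when `predict_proba(theta, x) >= threshold`. Hyperparameters such as `l2` can be chosen the same way, by training one model per candidate value and keeping the one that scores best on the validation set.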

Description

Technical field

[0001] The invention belongs to the technical field of the Internet, and in particular relates to a link prediction method in a large-scale microblog heterogeneous information network.

Background

[0002] With the rapid development of the mobile Internet and the wide application of search engines, portal media, social networks, and similar services, the Internet has gradually become a platform containing massive amounts of information. Sina Weibo is the most widely used microblog system in China. Since its launch in 2009, the number of registered users has exceeded 500 million. In the microblog system, users can post microblogs (short messages of fewer than 140 characters), comment on microblogs, repost microblogs, and so on. The link relationships in Weibo include the friend relationship, the follow relationship, the @ (mention) relationship, and so on. These relationships are all directed and can be represented as a directed graph. Microblog is a typical heterogeneous informati...

Application Information

Patent Type & Authority: Applications (China)
IPC(8): G06F17/30
CPC: G06F16/955, G06F16/95
Inventors: 李川, 李旺龙
Owner: SICHUAN UNIV