Generalized maximum degree random walk graph sampling algorithm

A random walk graph sampling technique, applied in computing and data processing, that addresses the poor estimation accuracy of existing sampling algorithms and the aggravated repeated-sample problem.

Status: Inactive
Publication Date: 2015-03-25
SHENZHEN UNIV
Cites: 4, Cited by: 10

AI Technical Summary

Problems solved by technology

The larger the chi-square distance between the two, the worse the estimation accuracy of the sampling algorithm.
This approach leads to more self-loops, which exacerbates the "repeated sample problem".

Embodiment Construction

[0019] Specific embodiments of the present invention are described in detail below with reference to the drawings.

[0020] The present invention provides a new generalized maximum degree random walk algorithm, hereinafter referred to as the GMD algorithm.

[0021] The GMD algorithm adds a parameter C (a non-negative integer) to the MD algorithm to control the number of self-loops. Its transition probability equation is as follows:

[0022]

[0023] where C is a non-negative integer.
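
The equation of paragraph [0022] is missing from this extract. For orientation only, one form that is consistent with the description (a parameter C added to the maximum degree walk to control self-loops) can be written as follows. This is an assumption, not the patent's own text; the symbols d_u (degree of node u) and N(u) (neighbor set of u) are introduced here for illustration.

$$
P_{u,v} =
\begin{cases}
\dfrac{1}{\max(C,\, d_u)} & \text{if } v \in N(u), \\[2ex]
1 - \dfrac{d_u}{\max(C,\, d_u)} & \text{if } v = u, \\[2ex]
0 & \text{otherwise.}
\end{cases}
$$

Under this assumed form, a walk with C at least as large as the maximum degree behaves like the MD walk, a walk with C no larger than the minimum degree has no self-loops and behaves like a simple random walk, and the stationary probability of u is proportional to max(C, d_u).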

[0024] Specifically, the GMD algorithm consists of two steps. First, samples are collected by a random walk on the graph using the above transition probability. Second, an unbiased estimate is constructed from the collected samples. The detailed process of the first step is as follows:

[0025] Input: graph G = (V, E)

[0026] Output: the collected sample point set S

[0027] 1 Randomly select node u in the graph as the initial node,...
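
The remaining steps of the procedure are cut off in this extract, so the following is only a minimal Python sketch of the two steps described in [0024], not the patent's listing. It assumes a transition rule in which the walk moves to each neighbor of u with probability 1/max(C, deg(u)) and otherwise stays at u; under that assumption the stationary probability of u is proportional to max(C, deg(u)), so samples are reweighted by the inverse. The names `gmd_walk`, `estimate_mean`, `adj`, and `f` are hypothetical.

```python
import random


def gmd_walk(adj, C, num_steps, seed=None):
    """Collect num_steps samples with a GMD-style walk (assumed rule).

    adj maps each node to a list of its neighbors (undirected graph).
    """
    rng = random.Random(seed)
    u = rng.choice(sorted(adj))  # random initial node, as in step 1
    samples = []
    for _ in range(num_steps):
        d_u = len(adj[u])
        if d_u == 0:
            raise ValueError("walk reached a node with no neighbors")
        # Assumed rule: leave u with total probability d_u / max(C, d_u),
        # choosing the neighbor uniformly; otherwise take a self-loop.
        if rng.random() < d_u / max(C, d_u):
            u = rng.choice(adj[u])
        samples.append(u)
    return samples


def estimate_mean(samples, adj, C, f):
    """Reweighted (ratio) estimate of the average of f over all nodes.

    Each sample is weighted by 1 / max(C, deg(u)), the inverse of its
    assumed stationary probability up to a constant factor.
    """
    weights = [1.0 / max(C, len(adj[u])) for u in samples]
    total = sum(w * f(u) for w, u in zip(weights, samples))
    return total / sum(weights)


if __name__ == "__main__":
    # Small hypothetical graph: a triangle with one pendant node.
    graph = {0: [1, 2], 1: [0, 2], 2: [0, 1, 3], 3: [2]}
    walk = gmd_walk(graph, C=2, num_steps=10000, seed=42)
    # Estimate the average degree of the graph from the samples.
    print(estimate_mean(walk, graph, 2, lambda u: len(graph[u])))
```

In this sketch a larger C produces more self-loops (more repeated samples) but a stationary distribution closer to uniform, while C = 0 removes self-loops entirely; this is the trade-off that the parameter C is described as controlling.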

Abstract

The invention discloses a generalized maximum degree random walk graph sampling algorithm. The algorithm comprises the following steps: collecting samples by a random walk on a graph, and constructing an unbiased estimate from the collected samples. The "large deviation problem" of the RW algorithm and the "repeated sample problem" of the MD algorithm can both be addressed effectively, which improves the overall efficiency of collecting sample points from the Internet.

Description

Technical field

[0001] The invention belongs to the technical field of large graph data mining, and in particular relates to a generalized maximum degree random walk graph sampling algorithm.

Background

[0002] In recent years, online social network analysis has attracted extensive attention in both academia and industry. Among the studies related to online social network analysis, one of the most basic research problems is to estimate the properties of nodes in social networks and the topological properties of the entire social network. However, many online social network companies, such as Tencent, Sina Weibo, Facebook, and Twitter, do not release their social network graph data to third parties, and the size of the entire social graph is often unknown to third parties. Therefore, most researchers and developers engaged in social network analysis face a very difficult data collection problem. The main difficulty here is...

Application Information

IPC(8): G06F17/30, G06Q50/00
CPC: G06F16/00, G06F16/9024
Inventor: 李荣华, 邱宇轩, 毛睿, 秦璐, 金檀, 蔡涛涛
Owner: SHENZHEN UNIV