
Large-scale graphical partition method based on vertex cut and community detection

A large-scale graph partitioning technology, applied in the field of computer science, that addresses problems such as the difficulty of determining distribution function parameters, prolonged task completion time, and insufficient processing performance on natural graphs, so as to reduce network traffic and improve task throughput.

Active Publication Date: 2014-04-02
HUAZHONG UNIV OF SCI & TECH
5 Cites, 42 Cited by

AI Technical Summary

Problems solved by technology

[0003] In a distributed environment, the graph partitioning algorithm used by a graph processing framework directly affects the framework's processing efficiency. Existing frameworks all use a simple hash algorithm; although it is simple and fast, it satisfies only load balancing. The traditional MGP (Multilevel Graph Partition) scheme performs poorly on natural graphs because it does not take their power-law distribution into account. Partitioning efficiency is therefore low, and the communication traffic between nodes often becomes a bottleneck during iterations, which greatly lengthens task completion time and in turn degrades the computing performance and service quality of the overall platform.
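To make the trade-off in [0003] concrete, the following minimal sketch hash-partitions a toy graph and counts the edges that cross partitions. The graph, the modulo-based `hash_partition` helper, and the `edge_cut` metric are illustrative assumptions, not code from the patent or from any particular framework.

```python
from collections import Counter

def hash_partition(vertices, k):
    """Assign each vertex to one of k partitions by vertex id modulo k
    (a stand-in for a framework's hash function; assumption)."""
    return {v: v % k for v in vertices}

def edge_cut(edges, assignment):
    """Count edges whose two endpoints land in different partitions."""
    return sum(1 for u, v in edges if assignment[u] != assignment[v])

# Toy power-law-like graph (hypothetical): vertex 0 is a hub linked to all others.
edges = [(0, v) for v in range(1, 9)] + [(1, 2), (3, 4)]
vertices = sorted({v for e in edges for v in e})

assignment = hash_partition(vertices, k=2)
loads = Counter(assignment.values())   # vertices per partition
cut = edge_cut(edges, assignment)      # edges that need cross-node messages

print("vertices per partition:", dict(loads))
print("cut edges:", cut, "of", len(edges))
```

In this toy case the vertex counts are nearly even, yet most edges cross the partition boundary, which is the communication cost the paragraph describes.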
[0004] As research has deepened, some newer solutions have emerged. Streaming-based solutions abstract graph loading as incremental streaming data and apply simple heuristic partitioning algorithms, with different variants that target minimum edge cut and vertex balance, but they cannot solve the partitioning of power-law graphs.
Another approach treats the generation of the partitioning result as the generation of a binary tree and combines vertex task allocation with graph partitioning to propose a partitioning scheme with a distributed M/S structure for the cloud environment; it also cannot solve the partitioning of power-law graphs.
A further approach uses label propagation from community clustering to guide partitioning: it iteratively computes the label of each vertex until the label values no longer change, and then partitions with the traditional MGP algorithm. This too cannot solve the partitioning of power-law graphs.
There is also a vertex-cut-based method that addresses communication overhead in natural graph partitioning. It derives the maximum expected number of vertex cuts from the probability density function of the graph and uses it to guide a corresponding greedy heuristic partitioning algorithm. However, it relies on the power-law distribution function as a guide, and determining the parameters of that distribution function is itself a difficult problem.
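As an illustration of the vertex-cut idea mentioned above, here is a simplified greedy edge-placement sketch. It is not the cited method: the placement rules, the `slack` capacity parameter, and the `replication_factor` metric are assumptions chosen to keep the example short. Edges are assigned to partitions, and a vertex is replicated on every partition that holds one of its edges.

```python
import math
from collections import defaultdict

def greedy_vertex_cut(edges, k, slack=1.0):
    """Place each edge on one of k partitions, preferring partitions that
    already hold a replica of an endpoint (simplified rules; assumption)."""
    cap = math.ceil(slack * len(edges) / k)  # per-partition edge budget
    replicas = defaultdict(set)              # vertex -> partitions holding a copy
    load = [0] * k                           # edges per partition
    placement = {}
    for u, v in edges:
        open_parts = {i for i in range(k) if load[i] < cap}
        shared = replicas[u] & replicas[v] & open_parts
        either = (replicas[u] | replicas[v]) & open_parts
        candidates = shared or either or open_parts
        # Break ties by lowest load, then lowest partition id.
        p = min(candidates, key=lambda i: (load[i], i))
        placement[(u, v)] = p
        replicas[u].add(p)
        replicas[v].add(p)
        load[p] += 1
    return placement, replicas, load

def replication_factor(replicas):
    """Average number of partitions on which each vertex is replicated."""
    return sum(len(parts) for parts in replicas.values()) / len(replicas)

# Same hypothetical hub graph: vertex 0 is connected to every other vertex.
edges = [(0, v) for v in range(1, 9)] + [(1, 2), (3, 4)]
placement, replicas, load = greedy_vertex_cut(edges, k=2)
rf = replication_factor(replicas)
print("edges per partition:", load)
print("replication factor:", round(rf, 3))
```

The hub ends up replicated on both partitions while its edges are split evenly between them. Note that this sketch needs no distribution parameters; the method criticized in the paragraph instead derives its guidance from a fitted power-law distribution.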


Examples


Embodiment Construction

[0022] In order to make the object, technical solution and advantages of the present invention clearer, the present invention will be further described in detail below in conjunction with the accompanying drawings and embodiments. It should be understood that the specific embodiments described here are only used to explain the present invention, not to limit the present invention.

[0023] As shown in Figure 1, the large-scale graph partitioning method based on vertex cutting and community detection of the present invention includes the following steps:

[0024] (1) Initialize the partitioning cluster, including setting the cluster's software and hardware parameters, starting the cluster, and deploying the partitioning algorithm code.

[0025] The cluster software and hardware parameters include the disk size, memory size, IP address, image directory, etc. of the computing nodes; code deployment includes deployment of the run scripts and deployment of algorithm core...


Abstract

The invention discloses a multilevel k-way graph partitioning method based on vertex cut and community detection. The method considers the distribution of a natural graph through statistical analysis and provides a corresponding vertex-cutting algorithm that cuts the vertices responsible for longer task completion times. A community detection algorithm based on label propagation is then run iteratively on the cut graph to determine the label of each vertex and thus the community to which each vertex belongs. Finally, partitioning is performed with a traditional multilevel k-way graph partitioning algorithm to consolidate efficiency. For most applications in large-scale iterative graph processing, the distributed computing nodes satisfy load balancing, the extra communication traffic that each processing node generates between adjacent iteration steps because of iteration dependencies is greatly reduced, the task execution efficiency of the graph processing framework is greatly improved, and task throughput is increased.
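The first two stages of the pipeline described in the abstract can be sketched as follows. This is a minimal illustration under stated assumptions: the fixed degree `threshold` stands in for the patent's statistically derived cutting criterion, the label propagation uses a deterministic smallest-label tie-break, and the final multilevel k-way partitioning step is not shown. All function names are hypothetical.

```python
from collections import Counter, defaultdict

def build_adjacency(edges):
    """Build an undirected adjacency map from an edge list."""
    adj = defaultdict(set)
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)
    return adj

def cut_high_degree(adj, threshold):
    """Set aside vertices whose degree exceeds threshold (assumed criterion)
    and return them with the adjacency of the remaining graph."""
    cut = {v for v, nbrs in adj.items() if len(nbrs) > threshold}
    rest = {v: nbrs - cut for v, nbrs in adj.items() if v not in cut}
    return cut, rest

def label_propagation(adj, max_iters=100):
    """Each vertex repeatedly adopts the most frequent label among its
    neighbors until no label changes (ties go to the smallest label)."""
    labels = {v: v for v in adj}
    for _ in range(max_iters):
        changed = False
        for v in sorted(adj):
            if not adj[v]:
                continue  # isolated vertex keeps its own label
            counts = Counter(labels[n] for n in adj[v])
            best = max(counts.values())
            new = min(l for l, c in counts.items() if c == best)
            if new != labels[v]:
                labels[v] = new
                changed = True
        if not changed:
            break
    return labels

# Hypothetical graph: hub 0 joins two triangles, {1, 2, 3} and {4, 5, 6}.
edges = [(0, v) for v in range(1, 7)]
edges += [(1, 2), (2, 3), (1, 3), (4, 5), (5, 6), (4, 6)]

cut, rest = cut_high_degree(build_adjacency(edges), threshold=3)
labels = label_propagation(rest)
print("cut vertices:", cut)
print("community labels:", labels)
```

Once the hub is set aside, the two triangles converge to two distinct labels. A multilevel k-way partitioner would then operate on these communities rather than on individual vertices.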

Description

Technical field

[0001] The invention belongs to the technical field of computer science, and more specifically relates to a large-scale graph partitioning method based on vertex cutting and community detection.

Background

[0002] With the development of computer technology and the wide adoption of Web 2.0, the amount of data on the Internet keeps growing, and processing it poses more and more challenges. One of these is the processing of massive graph data (graph computing), such as PageRank computation over massive web page data, social relationship analysis in social networks, and network document relationship analysis. Because graph computing requires multiple iterations and its computing units need to communicate with each other, the traditional full-scale computing framework MapReduce is not suitable for graph computing, so a number of dedicated large-scale graph computing framew...


Application Information

IPC(8): G06F17/30
CPC: G06F16/958
Inventor: 谢夏金海吴延赞柯西江
Owner: HUAZHONG UNIV OF SCI & TECH