Cluster-based graph data division method

A graph data block-division technology, applied in electrical digital data processing, special data processing applications, instruments, etc. It addresses problems of existing graph data division methods, namely the inability to handle large-scale graph data, a high repetition rate, and unsatisfactory division results, and achieves the effects of reducing memory overhead, reducing the repetition rate, and maintaining integrity.

Inactive Publication Date: 2017-09-22
HUAZHONG UNIV OF SCI & TECH
0 Cites, 19 Cited by

AI Technical Summary

Problems solved by technology

[0005] In view of the defects of the prior art, the purpose of the present invention is to solve the technical problem that the prior art graph data division metho

Example Embodiment

[0035] In order to make the objectives, technical solutions, and advantages of the present invention clearer, the following further describes the present invention in detail with reference to the accompanying drawings and embodiments. It should be understood that the specific embodiments described here are only used to explain the present invention, but not to limit the present invention.

[0036] Figure 1 is a schematic flow chart of a cluster-based graph data division method provided by an embodiment of the present invention. As shown in Figure 1, the method includes steps S101 to S106.

[0037] S101: Determine the degree value of each of V vertices included in graph data, where the graph data includes E edges.

[0038] Those skilled in the art will understand that graph data contains multiple vertices and multiple edges. If there is an edge between two vertices, those two vertices are the two endpoints of that edge.
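The degree computation in step S101 can be illustrated with a minimal sketch. It assumes the graph is undirected and given as a list of endpoint pairs, and that the degree value of a vertex is the number of edges it is an endpoint of; the source text shown here does not fix a data representation, so the function name `vertex_degrees` and the sample edge list are hypothetical.

```python
from collections import Counter


def vertex_degrees(edges):
    """Count, for each vertex, how many edges it is an endpoint of.

    Assumption: `edges` is an iterable of (u, v) endpoint pairs of an
    undirected graph. A self-loop (u == v) adds 2 to that vertex.
    Vertices that appear in no edge are not listed in the result.
    """
    degrees = Counter()
    for u, v in edges:
        degrees[u] += 1
        degrees[v] += 1
    return degrees


# Hypothetical sample graph with V = 4 vertices and E = 4 edges.
edges = [("a", "b"), ("a", "c"), ("a", "d"), ("b", "c")]
degrees = vertex_degrees(edges)
```

A single pass over the E edges is enough, so the cost of this step grows linearly with the number of edges.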

[0039] Specificall...

Abstract

The invention discloses a cluster-based graph data division method. The method comprises: determining the degree value of each of the V vertices included in graph data, wherein the graph data includes E edges; sorting the V vertices in descending order of degree value, and determining the M vertices whose degree values are greater than a first threshold; determining T paths according to the V vertices, the E edges and the M vertices; assigning those of the V vertices that are not included in the T paths to the T paths, thereby forming the T paths after assignment, or creating N paths from the T paths; determining M clusters according to the M vertices, the T paths and the N paths; and dividing the M clusters of the graph data into P blocks according to the degree of association among the M clusters and a preset block number P. Because vertices with relatively large degree values serve as the endpoints of the paths, the repetition rate of graph data division can be reduced; because paths are taken as the unit, the graph scale is reduced and memory overhead is lowered; and large-scale graph data can be processed.
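The sorting and threshold step described in the abstract can be sketched as follows. This is only an illustration under stated assumptions: the degree values are taken as already computed, the tie-breaking rule among vertices of equal degree is an arbitrary choice made here, and the names `select_high_degree_vertices`, `ranked` and `hubs` are hypothetical. The later steps (paths, clusters, blocks) are not sketched because their details are not given in the text shown here.

```python
def select_high_degree_vertices(degrees, first_threshold):
    """Sort vertices by descending degree and pick those above a threshold.

    `degrees` maps each vertex to its degree value. Ties are broken by
    the string form of the vertex, purely to make the order deterministic
    (an assumption of this sketch, not something the source specifies).
    Returns (all vertices in sorted order, the M vertices whose degree
    is strictly greater than `first_threshold`).
    """
    ranked = sorted(degrees, key=lambda v: (-degrees[v], str(v)))
    hubs = [v for v in ranked if degrees[v] > first_threshold]
    return ranked, hubs


# Hypothetical degree values for a graph with V = 4 vertices.
sample_degrees = {"a": 3, "b": 2, "c": 2, "d": 1}
ranked, hubs = select_high_degree_vertices(sample_degrees, first_threshold=1)
```

In this sample the M selected vertices are `a`, `b` and `c`; `d` has degree 1, which is not greater than the threshold, so it is left out.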

Description

Technical field

[0001] The invention belongs to the field of big data processing, and more specifically relates to a cluster-based method for dividing graph data.

Background

[0002] Data mining of real-world relational graphs, such as social network analysis and web page ranking, is currently a hot topic. Whether these graphs are processed with multi-threaded parallelism on a single machine or with a distributed system, graph data partitioning is a key step that cannot be avoided. However, graph data is large in scale and complex in its relationships, so dividing it properly is a very challenging task. The graph size reflects the number of vertices and edges included in the graph data; graph data that includes a large number of vertices and edges is large-scale graph data.

[0003] In recent years, as graph computing has become widespread, many graph data partitioning algorithms have been proposed. Random hash (...

Application Information

IPC(8): G06F17/30
CPC: G06F16/9024
Inventor: 袁平鹏, 金海, 龙浩
Owner: HUAZHONG UNIV OF SCI & TECH