Looking for breakthrough ideas for innovation challenges? Try Patsnap Eureka!

Graph data processing method applied to distributed computing node cluster and medium

A distributed computing and computing node technology, applied in special data processing applications, other database retrieval, other database indexing, etc., can solve the problems of large average communication distance, poor scalability, performance impact, etc., to improve communication efficiency, Effect of reducing the amount of transmitted data and communication volume

Active Publication Date: 2020-10-02
INST OF COMPUTING TECH CHINESE ACAD OF SCI
View PDF9 Cites 8 Cited by
  • Summary
  • Abstract
  • Description
  • Claims
  • Application Information

AI Technical Summary

Problems solved by technology

[0005] However, the average communication distance of these existing implementation methods is large, and with the increase of the number of communication nodes and the amount of communication data, the communication overhead increases exponentially
This leads to high network communication overhead when the BFS algorithm is applied on a distributed cluster, its performance is seriously affected, and its scalability is not good

Method used

the structure of the environmentally friendly knitted fabric provided by the present invention; figure 2 Flow chart of the yarn wrapping machine for environmentally friendly knitted fabrics and storage devices; image 3 Is the parameter map of the yarn covering machine
View more

Image

Smart Image Click on the blue labels to locate them in the text.
Viewing Examples
Smart Image
  • Graph data processing method applied to distributed computing node cluster and medium
  • Graph data processing method applied to distributed computing node cluster and medium
  • Graph data processing method applied to distributed computing node cluster and medium

Examples

Experimental program
Comparison scheme
Effect test

Embodiment Construction

[0027] In order to make the object, technical solution and advantages of the present invention clearer, the present invention will be further described in detail below through specific embodiments in conjunction with the accompanying drawings. It should be understood that the specific embodiments described here are only used to explain the present invention, not to limit the present invention.

[0028] As mentioned in the background technology section, in the field of graph processing, when graph traversal is performed based on distributed computing node clusters, as the number of computing nodes and the amount of communication data increase, the communication overhead increases exponentially. The impact of graph processing efficiency is getting more and more serious. The invention reorders the vertex IDs, compresses the data and communicates in a ring through the out-degree of the vertices, reduces the amount of transmitted data and communication between computing nodes, and ...

the structure of the environmentally friendly knitted fabric provided by the present invention; figure 2 Flow chart of the yarn wrapping machine for environmentally friendly knitted fabrics and storage devices; image 3 Is the parameter map of the yarn covering machine
Login to View More

PUM

No PUM Login to View More

Abstract

The embodiment of the invention provides a graph data processing method, which comprises the following steps of: S1, obtaining graph data containing a plurality of vertexes, sorting the vertexes according to the descending order of the vertexes, and taking a sorting serial number as a first resorting ID; S2, sequentially distributing the vertexes of the graph data to each distributed computing node in a distributed computing node cluster in a polling manner according to the first reordering ID and preset granularity; S3, traversing the obtained partial graph data by the computational nodes byusing a hybrid BFS algorithm, and obtaining a local next-layer active vertex set by each computational node after each layer is traversed; S4, after each layer is traversed, performing loop communication between adjacent computing nodes to transmit a local next-layer active vertex set, and determining the compression mode of the local next-layer active vertex set to be transmitted this time beforethe local next-layer active vertex set is transmitted after partial-layer traversal. According to the invention, vertex IDs are subjected to reordering, data compression and loop communication through the out-degree of the vertexes, and the communication efficiency is improved.

Description

technical field [0001] The present invention relates to the field of graph data processing, in particular to a distributed graph processing method based on breadth-first search, and more specifically to a graph data processing method and medium applied to distributed computing node clusters. Background technique [0002] Graph is a mathematical object representing the relationship between elements (people, intersections, documents, etc.). Many application scenarios in real life need to be represented by graph data structures, such as protein structure prediction, shortest time path, scientific literature citation relationship, social network analysis, etc. [0003] Breadth First Search (BFS algorithm for short) is a classic graph traversal algorithm. The traditional BFS algorithm adopts a top-down (Top-down) idea to find child nodes through parent nodes. The application of BFS algorithm has the typical characteristics of data-intensive applications such as poor data locali...

Claims

the structure of the environmentally friendly knitted fabric provided by the present invention; figure 2 Flow chart of the yarn wrapping machine for environmentally friendly knitted fabrics and storage devices; image 3 Is the parameter map of the yarn covering machine
Login to View More

Application Information

Patent Timeline
no application Login to View More
Patent Type & Authority Applications(China)
IPC IPC(8): G06F16/903G06F16/901
CPCG06F16/90335G06F16/9024
Inventor 聂娜王国波曹华伟叶笑春
Owner INST OF COMPUTING TECH CHINESE ACAD OF SCI
Who we serve
  • R&D Engineer
  • R&D Manager
  • IP Professional
Why Patsnap Eureka
  • Industry Leading Data Capabilities
  • Powerful AI technology
  • Patent DNA Extraction
Social media
Patsnap Eureka Blog
Learn More
PatSnap group products