Optimization method and application of global height number vertex set communication

A graph-computing communication technique, applied in the fields of electrical digital data processing, digital computer components, and multiprogramming arrangements, that addresses problems such as severely unbalanced vertex degrees and unbalanced graph data load.

Pending Publication Date: 2022-08-09
RESEARCH INSTITUTE OF TSINGHUA UNIVERSITY IN SHENZHEN

AI Technical Summary

Problems solved by technology

In large-scale graph computing, the graph data load is severely unbalanced: the number of edges varies greatly between vertices, so vertex degrees differ widely. Under these conditions, both one-dimensional and two-dimensional partitioning face scalability problems. One-dimensional vertex partitioning causes too many heavy vertices to deploy near-global agents, while two-dimensional vertex partitioning causes too many vertices to deploy agents on rows and columns.

Examples

Embodiment 1

[0035] I. First embodiment: 1.5-dimensional graph partitioning based on three degree levels

[0036] According to an embodiment of the present invention, a mixed-dimension partitioning method based on three classes of vertices by degree is provided. The vertex set is divided into extremely high-degree vertices (E for Extreme; for example, the degree is greater than the total number of nodes), high-degree vertices (H for High; for example, the degree is greater than the number of nodes in a supernode), and regular vertices (R for Regular). R-type vertices are sometimes also called L-type vertices (L for Low, that is, vertices with a relatively low degree). For directed graphs, the division is made by in-degree and out-degree separately; the in-degree and out-degree partition sets are denoted Xi and Xo, where X is E, H, or R. In this embodiment, a predetermined number of nodes also form a supernode, and communication between nodes within a supernode is faster than the ...
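The three-way classification in [0036] can be illustrated with a minimal Python sketch. It uses the two example thresholds quoted above (total number of nodes for E, supernode size for H); the function names and the sample node counts are hypothetical and are not taken from the patent.

```python
from enum import Enum


class DegreeClass(Enum):
    E = "extreme"   # degree > total number of nodes (example threshold from [0036])
    H = "high"      # degree > number of nodes in a supernode
    R = "regular"   # everything else (also called L, for Low)


def classify_degree(degree, total_nodes, supernode_size):
    """Classify one degree value into E, H, or R."""
    if degree > total_nodes:
        return DegreeClass.E
    if degree > supernode_size:
        return DegreeClass.H
    return DegreeClass.R


def classify_directed(in_degree, out_degree, total_nodes, supernode_size):
    """For a directed graph, classify in-degree and out-degree separately.

    Returns a pair (Xi, Xo), where each element is E, H, or R.
    """
    xi = classify_degree(in_degree, total_nodes, supernode_size)
    xo = classify_degree(out_degree, total_nodes, supernode_size)
    return xi, xo


if __name__ == "__main__":
    # Hypothetical machine: 1024 nodes, 256 nodes per supernode.
    print(classify_directed(5000, 300, total_nodes=1024, supernode_size=256))
```

A vertex can fall into different classes for its incoming and outgoing edges, for example Ei and Ho in the usage above.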

Embodiment 2

[0056] II. Second Embodiment: Sub-Iterative Adaptive Direction Selection

[0057] According to an embodiment of the present invention, a sub-iteration adaptive direction selection supporting fast exit is also proposed.

[0058] In many graph algorithms, the direction in which the graph is traversed, i.e. "pushing" from source vertices to target vertices or "pulling" source vertices from target vertices, can greatly affect performance. For example:

[0059] 1. In BFS (breadth-first search), if "push" mode is used for the wider middle iterations, a large number of vertices will be activated repeatedly; if "pull" mode is used for the narrower head and tail iterations, the proportion of active vertices is very low, which results in many useless computations. Automatic switching between the two directions is therefore required;

[0060] 2. In PageRank, the subgraphs that traverse the graph locally and reduce should use the "pull" mode to achieve optimal computing performance ("pull" is random...
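The push/pull trade-off described for BFS can be sketched as follows. This is a generic direction-switching BFS, not the patent's sub-iteration scheme, whose details are truncated above. The switching rule (pull when the frontier exceeds a fraction `pull_threshold` of all vertices) and the early `break` in pull mode are assumptions made for illustration.

```python
def bfs_adaptive(out_adj, in_adj, source, pull_threshold=0.05):
    """BFS that picks push or pull per iteration.

    out_adj[u] lists targets of u; in_adj[v] lists sources of v.
    Returns (dist, directions), where directions records the mode of each iteration.
    """
    n = len(out_adj)
    dist = [-1] * n
    dist[source] = 0
    frontier = {source}
    directions = []
    level = 0
    while frontier:
        level += 1
        # Assumed heuristic: wide frontiers pull, narrow frontiers push.
        use_pull = len(frontier) > pull_threshold * n
        next_frontier = set()
        if use_pull:
            # Pull: every unvisited vertex looks for a parent in the frontier.
            for v in range(n):
                if dist[v] != -1:
                    continue
                for u in in_adj[v]:
                    if u in frontier:
                        dist[v] = level
                        next_frontier.add(v)
                        break  # stop scanning once one parent is found
        else:
            # Push: every frontier vertex activates its unvisited targets.
            for u in frontier:
                for v in out_adj[u]:
                    if dist[v] == -1:
                        dist[v] = level
                        next_frontier.add(v)
        directions.append("pull" if use_pull else "push")
        frontier = next_frontier
    return dist, directions


if __name__ == "__main__":
    out_adj = [[1, 2], [3], [3], []]
    in_adj = [[], [0], [0], [1, 2]]
    print(bfs_adaptive(out_adj, in_adj, source=0))
```

Both modes compute the same distances; only the amount of work per iteration differs, which is why the choice can be made independently for each iteration.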

Embodiment 3

[0073] III. Third Embodiment: Segmentation Subgraph Data Structure Optimization for EH Two-Dimensional Subgraphs

[0074] In the power-law distribution graph, it is found by simulation that the number of edges of the subgraph EHo→EHi will account for more than 60% of the whole graph, which is the most important computational optimization goal. For this sub-graph, an optimized implementation scheme is proposed by introducing a segmented sub-graph (SSG) data structure.

[0075] The Compressed Sparse Row (CSR) format is the most common storage format for graphs and sparse matrices. In this format, all adjacent edges of the same vertex are stored contiguously, supplemented by an offset array to support its indexing function. Because the range of vertices in a single large graph is too large, the spatiotemporal locality of the data accessing neighboring vertices is very poor. The following segmented subgraph method is proposed: For the subgraph (sparse matrix) of both the source v...
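For reference, the CSR layout described in [0075] can be sketched in a few lines of Python. This shows only the standard format (an offset array plus a contiguous neighbor array); the segmented subgraph (SSG) structure itself is not reproduced here because its description is truncated above. The function names are hypothetical.

```python
def build_csr(num_vertices, edges):
    """Build CSR arrays from a list of (source, target) edges.

    offsets has num_vertices + 1 entries; the neighbors of vertex v are
    stored contiguously in neighbors[offsets[v]:offsets[v + 1]].
    """
    offsets = [0] * (num_vertices + 1)
    for src, _ in edges:
        offsets[src + 1] += 1            # count out-edges per source
    for v in range(num_vertices):
        offsets[v + 1] += offsets[v]     # prefix sum gives start positions

    neighbors = [0] * len(edges)
    cursor = offsets[:-1]                # slice copy: next free slot per vertex
    for src, dst in edges:
        neighbors[cursor[src]] = dst
        cursor[src] += 1
    return offsets, neighbors


def neighbors_of(offsets, neighbors, v):
    """Return the adjacent targets of vertex v using the offset array."""
    return neighbors[offsets[v]:offsets[v + 1]]


if __name__ == "__main__":
    offsets, neighbors = build_csr(3, [(0, 1), (0, 2), (2, 0), (1, 2)])
    print(offsets, neighbors, neighbors_of(offsets, neighbors, 0))
```

Because the targets stored for one source can span the whole vertex range, accesses to per-target data jump across memory, which is the locality problem the paragraph above points to.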

Abstract

The invention provides a graph computing method based on distributed parallel computing, a distributed parallel computing system, and a computer-readable medium. The graph computing method comprises obtaining the data of a graph to be computed, the graph comprising a plurality of vertices and edges. For subgraphs in which the degrees of both the source vertices and the target vertices are greater than a preset threshold, the following operations are performed for reduce-scatter (Reduce Scatter) communication of the target vertices: for reductions on rows and on columns, a ring algorithm is used; for global reduction, messages are first transposed locally so that the data changes from row-major to column-major order, then reduce-scatter on rows is called, followed by reduce-scatter on columns, so that cross-supernode communication on columns is reduced. With this collective communication optimization, cross-supernode communication on columns is reduced, and the hierarchical reduce-scatter eliminates redundant data within supernodes in the first reduction step, so that cross-supernode communication in the second step is minimal and free of redundancy.
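The global reduction described in the abstract (local transpose, then reduce-scatter on rows, then on columns) can be simulated in plain Python to check that the two stages together give the same result as a single global reduce-scatter. This is a sketch of the semantics only: it does not implement the ring algorithm or any real communication, each message chunk is a single number, and all function names are hypothetical.

```python
def reduce_scatter(buffers):
    """Semantics of reduce-scatter in a group of k members.

    Each buffer holds k equal-sized blocks; member i receives the
    elementwise sum of block i over all members.
    """
    k = len(buffers)
    block = len(buffers[0]) // k
    return [
        [sum(buf[i * block + j] for buf in buffers) for j in range(block)]
        for i in range(k)
    ]


def transpose_local(buf, rows, cols):
    """Reorder a destination-indexed buffer from row-major to column-major."""
    return [buf[r * cols + c] for c in range(cols) for r in range(rows)]


def hierarchical_reduce_scatter(data, rows, cols):
    """data[(r, c)] holds one value per destination process, row-major.

    Returns {(r, c): sum over all senders of the value destined for (r, c)}.
    """
    # Step 0: local transpose so chunks for one destination column are contiguous.
    col_major = {p: transpose_local(buf, rows, cols) for p, buf in data.items()}

    # Step 1: reduce-scatter along each row; (r, c) keeps destination column c.
    after_row = {}
    for r in range(rows):
        outs = reduce_scatter([col_major[(r, c)] for c in range(cols)])
        for c in range(cols):
            after_row[(r, c)] = outs[c]   # length rows, indexed by destination row

    # Step 2: reduce-scatter along each column; (r, c) keeps destination row r.
    result = {}
    for c in range(cols):
        outs = reduce_scatter([after_row[(r, c)] for r in range(rows)])
        for r in range(rows):
            result[(r, c)] = outs[r][0]
    return result


if __name__ == "__main__":
    rows, cols = 2, 3
    data = {(r, c): [(r * cols + c) * 100 + d for d in range(rows * cols)]
            for r in range(rows) for c in range(cols)}
    print(hierarchical_reduce_scatter(data, rows, cols))
```

After step 1 each process holds only `rows` values instead of `rows * cols`, so the column stage moves already-reduced data. This is consistent with the abstract's statement that the first step removes redundancy before the cross-supernode step, presumably because the members of a row share a supernode.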

Description

Technical field

[0001] The present invention generally relates to a 1.5-dimensional graph partitioning method that is aware of three classes of vertex degree, and to its applications, and more particularly relates to large-scale graph computing methods, distributed parallel computing systems, and computer-readable media.

Background

[0002] Graph computing frameworks are a class of general-purpose programming frameworks used to support graph computing applications. On China's new generation of Shenwei supercomputer, a new generation of the "Shentu" ultra-large-scale graph computing framework is provided to support large-scale graph computing applications at full-machine scale, with tens of trillions of vertices and three trillion edges.

[0003] Graph computing applications are a class of data-intensive applications that perform computations on data consisting of vertices and of edges connecting two vertices. Typical applications include PageRank for web page importa...

Application Information

Patent Type & Authority: Applications (China)
IPC(8): G06F15/173, G06F9/50
CPC: G06F15/17368, G06F15/17381, G06F9/5083, Y02D10/00
Inventor: 曹焕琦, 王元炜
Owner: RESEARCH INSTITUTE OF TSINGHUA UNIVERSITY IN SHENZHEN