Online graph division method for heterogeneous graph data

An online graph division (partitioning) method for heterogeneous graph data. It addresses the problems that different node and edge types have different computing times and storage footprints, that real-time constraints are difficult to meet, and that graph computing systems run inefficiently on such graphs. The method itself has low time complexity.

Pending Publication Date: 2022-07-26
INST OF SOFTWARE - CHINESE ACAD OF SCI +1

AI Technical Summary

Problems solved by technology

Existing graph partitioning algorithms do not account for two properties of heterogeneous graphs during graph computation. First, the data stored by different types of nodes and edges may occupy different amounts of storage space. Second, different node and edge types may be processed by different algorithms, so their computation times also differ.
None of the existing graph partitioning algorithms balance memory and computing load with these properties in mind.
For example, the partitioning algorithms used in mainstream graph computing systems such as GraphX and PowerGraph (hash partitioning, balanced partitioning, block partitioning, and so on) do not take into account the unbalanced computation speed and storage space of heterogeneous graph node and edge data. As a result, the graph computing system runs inefficiently on such graphs and has difficulty meeting the real-time constraints that the periodic arrival of massive data imposes on the computation. (See Gonzalez J E, Xin R S, Dave A, et al. GraphX: Graph processing in a distributed dataflow framework[C]//11th USENIX Symposium on Operating Systems Design and Implementation (OSDI 14). 2014: 599-613; Gonzalez J E, Low Y, Gu H, et al. PowerGraph: Distributed graph-parallel computation on natural graphs[C]//10th USENIX Symposium on Operating Systems Design and Implementation (OSDI 12). 2012: 17-30.)
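
To make the imbalance concrete, here is a minimal sketch of how type-blind hash partitioning can give each partition the same number of vertices but very different compute load. The per-type costs in COST_BY_TYPE and the toy vertex stream are hypothetical values chosen for illustration; they do not come from the patent or from GraphX or PowerGraph.

```python
from collections import defaultdict

# Hypothetical compute cost per vertex type (arbitrary units); an assumption.
COST_BY_TYPE = {1: 1.0, 2: 5.0, 3: 20.0}


def hash_partition(vertex_id: int, num_partitions: int) -> int:
    """Plain modulo hashing: the vertex type is ignored entirely."""
    return vertex_id % num_partitions


def compute_load(vertices, num_partitions):
    """Sum the assumed compute cost that lands on each partition."""
    load = defaultdict(float)
    for vertex_id, vertex_type in vertices:
        load[hash_partition(vertex_id, num_partitions)] += COST_BY_TYPE[vertex_type]
    return [load[p] for p in range(num_partitions)]


# Toy stream: the expensive type-3 vertices happen to have even ids.
vertices = [(i, 3 if i % 2 == 0 else 1) for i in range(8)]
loads = compute_load(vertices, 2)
print(loads)  # [80.0, 4.0]
```

Both partitions receive four vertices, yet under these assumed costs one carries twenty times the compute load of the other.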

Embodiment Construction

[0025] The technical solutions of the present invention will be clearly and completely described below with reference to the embodiments and the accompanying drawings. It should be understood that the described embodiments are only a part of the embodiments of the present invention, but not all of the embodiments. Based on the embodiments of the present invention, all other embodiments obtained by those skilled in the art without creative efforts shall fall within the protection scope of the present invention.

[0026] Figure 3 is a flow chart of graph division and graph computation in the present invention. The invention is described below using a heterogeneous graph that contains three types of node functions and carries different amounts of data on its edges. The vertex data of the heterogeneous graph that has arrived is:

[0027]
vertexId  vertexType  vertexAttr1  vertexAttr2  vertexAttr...
1         3           0            0            ...
2         2           0            0            ...
3         2           0            0            ....
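
The vertex records above can be read into a small typed structure before partitioning. The sketch below is one possible representation, not part of the patent: the Vertex class and parse_vertex_row helper are hypothetical names, and only the columns visible in the table are parsed, since the remaining attributes are elided in the source.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Vertex:
    vertex_id: int
    vertex_type: int
    attrs: tuple  # only the attribute columns visible in the table


def parse_vertex_row(row: str) -> Vertex:
    """Parse one whitespace-separated row, skipping the elided '...' columns."""
    fields = [f for f in row.split() if not f.startswith("...")]
    vertex_id, vertex_type, *attrs = (int(f) for f in fields)
    return Vertex(vertex_id, vertex_type, tuple(attrs))


rows = ["1 3 0 0 ...", "2 2 0 0 ...", "3 2 0 0 ...."]
vertices = [parse_vertex_row(r) for r in rows]
print(vertices[0])
```

Keeping vertexType as a separate field means a partitioner can look up per-type costs without inspecting the attribute payload.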

Abstract

The invention discloses an online graph division method for heterogeneous graph data. The method comprises the following steps: 1) evaluating the computation speed imbalance and the storage space imbalance of a graph computing system, where the computation speed imbalance is determined from the node function time complexity T of the different types of nodes in the heterogeneous graph computation, and the storage space imbalance is determined from the storage space Sv occupied by the data carried by the different types of nodes and the storage space Se occupied by the data carried by the different types of edges; and 2) distributing the heterogeneous graph data currently to be processed to different partitions according to the node function time complexity T and the storage space Sv corresponding to the different types of nodes and the storage space Se corresponding to the different types of edges. The method optimizes task allocation in graph computation so that load and memory use are more balanced during the computation, which improves the running efficiency of graph computation.
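
As an illustration of step 2, the sketch below places each arriving vertex on the partition with the lowest combined compute and storage load. It is a generic greedy heuristic written under stated assumptions, not the allocation rule claimed by the patent, which the abstract does not spell out. The cost tables T, S_V and S_E, the edge type labels "a" and "b", the weight alpha, and the normalization are all hypothetical.

```python
from dataclasses import dataclass

# Hypothetical per-type costs; the abstract names T, Sv and Se but gives no values.
T = {1: 1.0, 2: 4.0, 3: 9.0}      # node function time cost per vertex type
S_V = {1: 8, 2: 64, 3: 256}       # storage carried per vertex type
S_E = {"a": 4, "b": 128}          # storage carried per edge type


@dataclass
class Partition:
    compute: float = 0.0
    storage: float = 0.0


def assign(partitions, vertex_type, edge_types, alpha=0.5):
    """Place one arriving vertex and its edges on the least-loaded partition.

    The score is a weighted sum of normalized compute and storage load after
    placement; alpha and the normalization are assumptions of this sketch.
    """
    cost = T[vertex_type]
    size = S_V[vertex_type] + sum(S_E[e] for e in edge_types)
    max_c = max(p.compute for p in partitions) + cost
    max_s = max(p.storage for p in partitions) + size

    def score(p):
        return (alpha * (p.compute + cost) / max_c
                + (1 - alpha) * (p.storage + size) / max_s)

    best = min(range(len(partitions)), key=lambda i: score(partitions[i]))
    partitions[best].compute += cost
    partitions[best].storage += size
    return best


parts = [Partition(), Partition()]
stream = [(3, ["b"]), (1, ["a"]), (1, ["a"]), (2, ["a"])]
placements = [assign(parts, vt, et) for vt, et in stream]
print(placements)  # [0, 1, 1, 1]
```

In this toy stream the single expensive type-3 vertex occupies partition 0 while the three cheaper vertices share partition 1. Each decision costs time linear in the number of partitions, which fits an online setting where vertices are placed as they arrive.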

Description

Technical field

[0001] The invention belongs to the technical field of graph computing and real-time computing, and in particular relates to an online graph division method for heterogeneous graph data.

Background

[0002] A graph is an important data structure with strong expressive power. It can integrate data of different sources and types into one graph for analysis and yield results that are difficult to find through independent analysis. Graph computing is therefore widely used in social networks, recommender systems, network security, text retrieval, and biomedicine. With the support of graph theory, many problems can be solved efficiently using graph algorithms. In recent years, with the rapid development of technologies such as big data, machine learning, and data mining, the scale of the graphs abstracted in many fields has grown exponentially. In order to cope with the challenges brought by massive d...

Application Information

IPC(8): G06F16/901, G06F9/50
CPC: G06F16/9024, G06F9/5083
Inventor: 乔颖, 赵新朋, 王宏安, 刘道伟, 赵高尚, 冷昶, 郭超平
Owner INST OF SOFTWARE - CHINESE ACAD OF SCI