ART-based (Adaptive Radix Tree based) distributed system graph storage and computing system and method

A distributed graph storage and computing technology, applied in computing, file systems, and database distribution/replication. It addresses problems such as high communication overhead, unbalanced load, and the difficulty of representing and partitioning natural graphs, with the effect of reducing preprocessing time, reducing memory usage, and improving indexing efficiency.

Active Publication Date: 2017-05-31
NAT UNIV OF DEFENSE TECH

AI Technical Summary

Problems solved by technology

This feature makes natural graphs difficult to represent and partition in a distributed environment.
The edge-splitting (edge-cut) system distributes vertices evenly by cutting the edges between subgraphs, but for high-degree vertices this causes unbalanced load in computation and communication.
The point-splitting (vertex-cut) system evenly distributes the edges of high-degree vertices by splitting vertices instead of the edges between subgraphs, but for low-degree vertices this will cause high co...
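The trade-off between the edge-splitting (edge-cut) and point-splitting (vertex-cut) strategies can be illustrated with a small sketch. It is not taken from the patent: the function names, the modulo placement of vertices, and the round-robin placement of edges are all assumptions chosen only to make the load pattern visible.

```python
from collections import Counter

def edge_cut_load(edges, num_parts):
    """Edge-cut sketch: a vertex lives on one partition and each edge is
    stored with its source vertex (placement by id modulo is an assumption)."""
    load = Counter()
    for src, _dst in edges:
        load[src % num_parts] += 1
    return [load[p] for p in range(num_parts)]

def vertex_cut_replicas(edges, num_parts):
    """Vertex-cut sketch: each edge is placed on its own (round-robin here,
    as an assumption); a vertex is replicated on every partition that holds
    one of its edges."""
    load = Counter()
    replicas = {}
    for i, (src, dst) in enumerate(edges):
        part = i % num_parts
        load[part] += 1
        replicas.setdefault(src, set()).add(part)
        replicas.setdefault(dst, set()).add(part)
    return [load[p] for p in range(num_parts)], replicas

# A star graph: vertex 0 is a high-degree hub with 8 outgoing edges.
star = [(0, v) for v in range(1, 9)]
print(edge_cut_load(star, 4))           # all hub edges land on one partition
print(vertex_cut_replicas(star, 4)[0])  # hub edges are spread evenly

# A path graph: every vertex has low degree, yet vertex-cut replicates them.
path = [(0, 1), (1, 2), (2, 3), (3, 4)]
_, reps = vertex_cut_replicas(path, 4)
print(sum(len(parts) for parts in reps.values()))  # total vertex copies
```

On the star graph the edge-cut placement puts every hub edge on a single partition, while the vertex-cut placement balances the edges at the cost of copying the hub to every partition. On the path graph the vertex-cut placement creates extra copies of vertices that have only two edges each.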

Examples

Embodiment Construction

[0026] The present invention provides a specific embodiment of an ART-based distributed graph storage and computing system. It implements an independent graph computing engine, GraphA, on Spark. GraphA provides an adaptive, unified graph partitioning algorithm that splits the data set in a load-balanced manner by using a hash function with increasing sequence numbers, and it introduces an ART-indexed adjacency list storage algorithm to achieve efficient storage. Experimental results show that GraphA outperforms some existing graph computing systems, such as GraphX, in storage overhead, graph loading and partitioning time, and graph computing time, on both real-world natural graphs and synthetic graphs.
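The paragraph above does not spell out how the hash function with increasing sequence numbers works, so the following is only one plausible reading, written as a minimal sketch. The function name `partition_edges`, the `degree_threshold` parameter, the use of in-degree, and the modulo hash are all assumptions and are not taken from the patent.

```python
from collections import Counter

def partition_edges(edges, num_parts, degree_threshold):
    """Return the partition index chosen for each (src, dst) edge.

    ASSUMED rule (not confirmed by the source):
    - edges of a low-degree target vertex are hashed by the target id alone,
      so they stay together on one partition;
    - edges of a high-degree target vertex are hashed by the target id plus
      an increasing per-vertex sequence number, so they are spread evenly.
    """
    in_degree = Counter(dst for _src, dst in edges)
    next_seq = Counter()  # increasing sequence number per high-degree vertex
    assignment = []
    for _src, dst in edges:
        if in_degree[dst] <= degree_threshold:
            part = dst % num_parts
        else:
            part = (dst + next_seq[dst]) % num_parts
            next_seq[dst] += 1
        assignment.append(part)
    return assignment

# Vertex 0 receives 8 edges (high degree); vertex 2 receives 2 (low degree).
edges = [(s, 0) for s in range(1, 9)] + [(1, 2), (3, 2)]
parts = partition_edges(edges, num_parts=4, degree_threshold=2)
print(parts[:8])  # edges of vertex 0, spread across partitions
print(parts[8:])  # edges of vertex 2, kept on one partition
```

Under these assumptions the edges of the high-degree vertex rotate through all partitions, while the edges of the low-degree vertex stay together, which is the kind of adaptive behavior the paragraph attributes to the partitioning algorithm.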

[0027] The system includes a data source unit, a data partitioning unit, a data storage unit, and a graph computing unit; the data source unit is provided with a data acquisition module, and the ...

Abstract

The invention discloses an ART-based (Adaptive Radix Tree based) distributed graph storage and computing system and method, and relates to the technical field of distributed graph computing. The system comprises a data source unit, a data partitioning unit, a data storage unit, and a graph computing unit. The data source unit is provided with a data acquisition module used for acquiring graph data; the data storage unit comprises a database, a file system, a distributed file system, and HBase; and the data partitioning unit comprises a data loading module and an adaptive partitioning algorithm module.
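The abstract does not describe the ART index itself. As a rough illustration of how an adjacency list could be indexed by an adaptive radix tree, the sketch below keys the tree on the bytes of a source vertex id and stores the neighbor list at the last level. The node capacities 4, 16, 48, and 256 are the standard ART node sizes; everything else (class names, the 4-byte key width, sorted-array nodes, no path compression) is a simplifying assumption and not the patented storage algorithm.

```python
import bisect

NODE_SIZES = (4, 16, 48, 256)  # standard ART node capacities

class Node:
    """Inner node that grows to the next capacity only when it is full."""
    def __init__(self):
        self.capacity = NODE_SIZES[0]
        self.keys = []      # sorted key bytes
        self.children = []  # child Node, or a neighbor list at the last level

    def find(self, byte):
        i = bisect.bisect_left(self.keys, byte)
        if i < len(self.keys) and self.keys[i] == byte:
            return self.children[i]
        return None

    def add(self, byte, child):
        if len(self.keys) == self.capacity:
            # Adaptive step: switch to the next larger node size.
            self.capacity = NODE_SIZES[NODE_SIZES.index(self.capacity) + 1]
        i = bisect.bisect_left(self.keys, byte)
        self.keys.insert(i, byte)
        self.children.insert(i, child)

class ArtAdjacency:
    """Adjacency lists indexed by a radix tree over 4-byte vertex ids
    (hypothetical class; ids must fit in 32 bits)."""
    KEY_BYTES = 4

    def __init__(self):
        self.root = Node()

    def add_edge(self, src, dst):
        key = src.to_bytes(self.KEY_BYTES, "big")
        node = self.root
        for byte in key[:-1]:
            child = node.find(byte)
            if child is None:
                child = Node()
                node.add(byte, child)
            node = child
        neighbors = node.find(key[-1])
        if neighbors is None:
            neighbors = []
            node.add(key[-1], neighbors)
        neighbors.append(dst)

    def neighbors(self, src):
        node = self.root
        for byte in src.to_bytes(self.KEY_BYTES, "big"):
            node = node.find(byte)
            if node is None:
                return []
        return list(node)

g = ArtAdjacency()
g.add_edge(1, 2)
g.add_edge(1, 3)
g.add_edge(300, 1)
print(g.neighbors(1), g.neighbors(300), g.neighbors(7))
```

Because a node only occupies the smallest capacity that fits its children, sparse regions of the vertex id space cost little memory, which is the general property that makes a radix-tree index attractive for adjacency storage.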

Description

Technical field

[0001] The invention relates to the technical field of distributed graph computing, in particular to an ART-based distributed graph storage and computing system and a method thereof.

Background

[0002] Large-scale graph computing is critical to a wide range of machine learning and data mining applications, from natural language processing to social networking. Single-machine graph computing models have been studied in depth, and many systems, such as GridGraph, GraphQ, GraphChi, and X-Stream, have achieved very high computing performance. The rapid growth of dataset sizes now poses serious challenges to single-machine models, but it has also promoted the development of graph-parallel systems such as Pregel, GraphLab, PowerGraph, GraphX, and PowerLyra.

[0003] GraphX (Gonzalez, Joseph E., et al. "GraphX: Graph processing in a distributed dataflow framework." 11th USENIX Symposium on Operating Systems Design...

Application Information

IPC(8): G06F17/30
CPC: G06F16/13, G06F16/182, G06F16/2246, G06F16/2255, G06F16/27
Inventors: 章成飞, 张一鸣, 李东升
Owner: NAT UNIV OF DEFENSE TECH