A compressed storage method for large-scale graph data

A compressed storage technique for graph data, applied in the field of high-performance computing. It addresses the problems of large data storage space, many random reads, and low thread parallelism.

Active Publication Date: 2016-08-31
HUAZHONG UNIV OF SCI & TECH

AI Technical Summary

Problems solved by technology

[0005] In view of this, the object of the present invention is to provide a method for compressing and storing large-scale graph data. The method expresses relational graph data as a tree structure and applies row compression and column compression to the graph data according to the characteristics of that tree structure. It aims to solve the problems of large data storage space, a large number of random reads, and low thread parallelism in existing methods.



Embodiment Construction

[0023] In order to make the object, technical solution and advantages of the present invention clearer, the present invention will be further described in detail below in conjunction with the accompanying drawings and embodiments. It should be understood that the specific embodiments described here are only used to explain the present invention, not to limit the present invention.

[0024] As shown in Figure 1, the compressed storage method for large-scale graph data in the embodiment of the present invention includes the following steps:

[0025] (1) Process the original graph data in adjacency matrix format and store it row by row as a binary adjacency matrix. The adjacency matrix is denoted by the variable M, and the out-degree of each node is recorded;
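Step (1) can be illustrated with a minimal Python sketch. It assumes the input is a small dense 0/1 adjacency matrix held in memory; the function name `rows_and_out_degrees` and the dictionary layout are hypothetical choices made for illustration and are not taken from the patent.

```python
from typing import Dict, List, Tuple


def rows_and_out_degrees(
    matrix: List[List[int]],
) -> Tuple[Dict[int, List[int]], Dict[int, int]]:
    """Turn a dense 0/1 adjacency matrix into per-row end-point lists
    and record the out-degree of every node (hypothetical helper)."""
    rows: Dict[int, List[int]] = {}
    out_degree: Dict[int, int] = {}
    for start, row in enumerate(matrix):
        # An entry of 1 in column `end` means there is an edge start -> end.
        ends = [end for end, bit in enumerate(row) if bit]
        rows[start] = ends
        out_degree[start] = len(ends)
    return rows, out_degree


# Example graph with edges 0->1, 0->2, 1->3, 2->3.
M = [
    [0, 1, 1, 0],
    [0, 0, 0, 1],
    [0, 0, 0, 1],
    [0, 0, 0, 0],
]
rows, out_degree = rows_and_out_degrees(M)
```

Keeping one list of end points per starting node matches the row-by-row view of M, and the out-degree is simply the length of that list.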

[0026] (2) Establish a hash index HashIndex according to the offset value of each row in the adjacency matrix M, where the information stored in each row is the start point, the number of end points and the end po...
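One possible reading of steps (1) and (2) is sketched below in Python. It assumes each row is serialized as little-endian unsigned 32-bit integers in the order start point, number of end points, end points, and that the hash index maps a start point to the byte offset of its row. The integer width, the byte order, and the names `pack_rows` and `read_row` are assumptions made for illustration; the source text is truncated before it gives these details.

```python
import struct
from typing import Dict, List, Tuple


def pack_rows(rows: Dict[int, List[int]]) -> Tuple[bytes, Dict[int, int]]:
    """Serialize rows as (start, count, end points...) and build an index
    from start point to byte offset (assumed layout, not the patent's)."""
    buf = bytearray()
    hash_index: Dict[int, int] = {}
    for start in sorted(rows):
        ends = rows[start]
        hash_index[start] = len(buf)  # offset where this row begins
        buf += struct.pack("<II", start, len(ends))
        for end in ends:
            buf += struct.pack("<I", end)
    return bytes(buf), hash_index


def read_row(data: bytes, hash_index: Dict[int, int], start: int) -> List[int]:
    """Jump straight to one row through the index and decode its end points."""
    offset = hash_index[start]
    row_start, count = struct.unpack_from("<II", data, offset)
    assert row_start == start  # sanity check on the assumed layout
    return [
        struct.unpack_from("<I", data, offset + 8 + 4 * i)[0]
        for i in range(count)
    ]


rows = {0: [1, 2], 1: [3], 2: [3], 3: []}
data, hash_index = pack_rows(rows)
neighbours_of_2 = read_row(data, hash_index, 2)
```

With such an index a single row can be located without scanning the rows before it, which is the kind of access the offset-based index appears intended to support.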


Abstract

The invention discloses a compressed storage method for large-scale graph data. The method comprises the following steps: (1) storing the original graph data row by row as a binary adjacency matrix M; (2) establishing a hash index according to the offset value of each row in the adjacency matrix M; (3) sorting the starting points of the rows of the adjacency matrix M in ascending order of out-degree; (4) recording the nodes with in-degree 0 as root nodes and sorting them in descending order of out-degree to form a root node list; (5) taking each node in the root node list in turn as the starting node and assigning IDs sequentially according to a depth-first strategy; (6) traversing the adjacency matrix M, converting it according to the newly assigned IDs, and storing the result in edge list format; (7) sorting the data in edge list format; (8) compressing the data in edge list format row-wise and storing it. The method requires little data storage space, performs few random reads, and achieves a high degree of thread parallelism.
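Steps (4) to (7) of the abstract can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the patented implementation: ties between root nodes are broken by their old ID, nodes that cannot be reached from any root (for example nodes on a cycle) receive the remaining IDs in old-ID order, and the names `relabel_depth_first` and `to_sorted_edge_list` are hypothetical. Steps (3) and (8) are omitted because the text available here does not say how they are carried out.

```python
from typing import Dict, List, Tuple


def relabel_depth_first(rows: Dict[int, List[int]]) -> Dict[int, int]:
    """Assign new IDs in depth-first order, starting from the root nodes."""
    in_degree = {v: 0 for v in rows}
    for ends in rows.values():
        for end in ends:
            in_degree[end] = in_degree.get(end, 0) + 1
    # Step (4): roots have in-degree 0, sorted by descending out-degree.
    # Tie-breaking by old ID is an assumption of this sketch.
    roots = sorted(
        (v for v, d in in_degree.items() if d == 0),
        key=lambda v: (-len(rows.get(v, [])), v),
    )
    new_id: Dict[int, int] = {}
    # Step (5): iterative depth-first traversal from each root in turn.
    for root in roots:
        stack = [root]
        while stack:
            node = stack.pop()
            if node in new_id:
                continue
            new_id[node] = len(new_id)
            # Push in reverse so the first listed end point is visited first.
            for end in reversed(rows.get(node, [])):
                if end not in new_id:
                    stack.append(end)
    # Assumption: nodes unreachable from every root get the remaining IDs.
    for node in sorted(in_degree):
        if node not in new_id:
            new_id[node] = len(new_id)
    return new_id


def to_sorted_edge_list(
    rows: Dict[int, List[int]], new_id: Dict[int, int]
) -> List[Tuple[int, int]]:
    """Steps (6) and (7): rewrite edges with the new IDs, then sort them."""
    edges = [(new_id[s], new_id[e]) for s, ends in rows.items() for e in ends]
    edges.sort()
    return edges


rows = {0: [3], 1: [3, 4], 2: [], 3: [5], 4: [], 5: []}
new_id = relabel_depth_first(rows)
edges = to_sorted_edge_list(rows, new_id)
```

In this example node 1 has the largest out-degree among the roots, so its subtree is numbered first (1, 3, 5, 4 become 0, 1, 2, 3), which places nodes of the same subtree at adjacent IDs in the sorted edge list.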

Description

Technical field

[0001] The invention relates to the field of high-performance computing, and more specifically to a method for compressing and storing large-scale graph data.

Background

[0002] Processing and mining large real-world relational graphs, and designing scalable systems to do so, has become an extremely pressing problem. Social network graphs, web link graphs, and protein interaction graphs are particularly challenging because they cannot really be divided into small pieces that can be processed in parallel. This lack of parallelism has drawn considerable industry attention to distributed computing.

[0003] In recent years, several graph computing models have been proposed, such as Pregel, GraphLab, and other vertex-centric models. In such a model, users define a program that runs locally on each vertex. In addition, some high-performance graph computing systems are based on key-v...


Application Information

Patent Type & Authority: Patents (China)
IPC: IPC(8): H03M7/30
Inventor: 袁平鹏, 金海, 张文娅, 吴步文
Owner: HUAZHONG UNIV OF SCI & TECH