Large-scale associated data division method and system based on attribute graph

A technology of associated data and attribute graphs, applied in the clustering/classification, retrieval, and indexing of other databases. It addresses problems such as the loss of semantic information between data, huge communication overhead, and high cost, with the effects of improving data query efficiency, reducing cross-partition communication, and balancing the number of paths.

Active Publication Date: 2019-09-17
HUAZHONG UNIV OF SCI & TECH
2 Cites, 4 Cited by

AI Technical Summary

Problems solved by technology

Hash partitioning can produce balanced partitions very quickly in the early stage of data partitioning, but later operations on the graph data become very time-consuming.
Although hash partitioning does distribute the data relatively evenly among the partitions, it does not take the graph structure of the data into account, so data with a high degree of semantic correlation or close association is very likely not to be placed in the same partition. This not only invalidates the semantic information between the data, but also forces later queries to perform costly distributed join operations that merge intermediate results to obtain the final output, which involves a lot of cross-partition communication.
Later-stage data operations under hash partitioning therefore incur very high costs, and operations that run in parallel incur very large communication overhead.
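
The trade-off described above can be illustrated with a minimal sketch. The chain graph, the `v % k` hash rule, and the function names below are assumptions chosen for illustration and do not come from the patent text; the sketch only counts how many edges end up crossing partitions.

```python
from typing import Dict, List, Tuple

Edge = Tuple[int, int]

def hash_partition(vertices: List[int], k: int) -> Dict[int, int]:
    """Assign each vertex to partition (id mod k); graph structure is ignored."""
    return {v: v % k for v in vertices}

def count_cut_edges(edges: List[Edge], assignment: Dict[int, int]) -> int:
    """Count edges whose endpoints lie in different partitions."""
    return sum(1 for u, v in edges if assignment[u] != assignment[v])

# Hypothetical example: a chain 0-1-2-3-4-5 split across k = 2 partitions.
vertices = list(range(6))
edges = [(i, i + 1) for i in range(5)]

# Hash partitioning: balanced (3 vertices each), but neighbours are separated.
hashed = hash_partition(vertices, 2)

# A structure-aware split that keeps connected vertices together.
structured = {0: 0, 1: 0, 2: 0, 3: 1, 4: 1, 5: 1}

print(count_cut_edges(edges, hashed))      # 5: every edge crosses partitions
print(count_cut_edges(edges, structured))  # 1: only edge (2, 3) crosses
```

Both assignments are equally balanced, yet the hash-based one cuts every edge, so any query that follows the chain would need a cross-partition join at each hop.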

Examples

Detailed Description of Embodiments

[0061] In order to make the object, technical solution and advantages of the present invention clearer, the present invention will be further described in detail below in conjunction with the accompanying drawings and embodiments. It should be understood that the specific embodiments described here are only used to explain the present invention, not to limit the present invention. In addition, the technical features involved in the various embodiments of the present invention described below can be combined with each other as long as they do not constitute a conflict with each other.

[0062] The large-scale associated data division method based on the attribute graph provided by the present invention, as shown in Figure 1, includes:

[0063] (1) Construct an attribute graph according to the associated data to be divided;

[0064] When constructing the attribute graph, a specific attribute graph data model can be defined according to actual application requirements; in an opt...

Abstract

The invention discloses a large-scale associated data division method and system based on an attribute graph, and belongs to the field of data division. The method comprises the steps of (1) building the attribute graph according to the to-be-divided associated data; (2) obtaining the initial vertexes of all paths in the attribute graph so as to obtain an initial vertex set; (3) traversing the initial vertex set, taking the traversed initial vertex as a path starting point, obtaining the paths meeting the constraints between path attributes in the attribute graph, and forming a path group, thereby obtaining a plurality of path groups after the traversing is finished; (4) dividing the associated data by taking the path group as a unit so as to obtain k divided blocks, wherein k is the number of machine nodes in the distributed graph data management system. According to the present invention, the cross-partition communication during the data query process can be reduced, and the data query efficiency is improved.
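
The four steps in the abstract can be sketched roughly as follows. This is an illustration under stated assumptions, not the patented method: an "initial vertex" is taken to be a vertex with no incoming edge, the "constraint between path attributes" is reduced to a hypothetical predicate `edge_ok` on edge labels, and step (4) uses a simple greedy rule that places each path group on the block currently holding the fewest paths. All function names and the sample data are hypothetical.

```python
from collections import defaultdict
from typing import Callable, Dict, List, Set, Tuple

Edge = Tuple[str, str, str]  # (source, label, target)

def build_graph(edges: List[Edge]):
    """Step 1: build a small attribute graph as labelled adjacency lists."""
    out_edges: Dict[str, List[Tuple[str, str]]] = defaultdict(list)
    vertices: Set[str] = set()
    for src, label, dst in edges:
        out_edges[src].append((label, dst))
        vertices.update((src, dst))
    return vertices, out_edges

def initial_vertices(vertices, out_edges) -> List[str]:
    """Step 2 (assumption): initial vertices are those with no incoming edge."""
    has_incoming = {dst for targets in out_edges.values() for _, dst in targets}
    return sorted(vertices - has_incoming)

def path_group(start: str, out_edges, edge_ok: Callable[[str], bool]) -> List[List[str]]:
    """Step 3: collect maximal paths from `start` whose edges satisfy edge_ok."""
    paths: List[List[str]] = []

    def walk(vertex: str, path: List[str]) -> None:
        # Skip edges that fail the constraint or would revisit a vertex (cycle guard).
        nxt = [d for l, d in out_edges.get(vertex, []) if edge_ok(l) and d not in path]
        if not nxt:
            paths.append(path)
            return
        for d in nxt:
            walk(d, path + [d])

    walk(start, [start])
    return paths

def partition(groups: Dict[str, List[List[str]]], k: int):
    """Step 4 (assumption): greedily place each group on the least-loaded block."""
    blocks: List[List[str]] = [[] for _ in range(k)]
    loads = [0] * k
    for start, paths in sorted(groups.items(), key=lambda kv: -len(kv[1])):
        i = loads.index(min(loads))
        blocks[i].append(start)
        loads[i] += len(paths)
    return blocks, loads

# Hypothetical sample data.
edges = [("a", "knows", "b"), ("b", "knows", "c"), ("a", "likes", "d"),
         ("e", "knows", "f"), ("g", "knows", "f")]
vertices, out_edges = build_graph(edges)
groups = {s: path_group(s, out_edges, lambda label: True)
          for s in initial_vertices(vertices, out_edges)}
blocks, loads = partition(groups, k=2)
print(blocks, loads)
```

Because a whole path group is placed on one block, a query that follows any path from its initial vertex stays within that block. Vertices shared by several groups (such as `f` here) are not deduplicated in this sketch.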

Description

Technical field

[0001] The invention belongs to the field of data division, and more specifically relates to a large-scale associated data division method and system based on an attribute graph.

Background

[0002] Owing to the rapid development of fields such as social network analysis, machine learning, and data mining, Linked Data is experiencing explosive growth. Linked Data is a specification recommended by the World Wide Web Consortium (W3C) for publishing and connecting various data, information and knowledge. As the amount of data has increased, the semantic relationships between linked data have also become very complicated.

[0003] As the scale of linked data continues to expand, it becomes increasingly difficult to perform storage operations on a single node, and the storage capacity of a single computing node falls far short of the growth of the data. At present, the main solution to the problem that a single node cannot handle large data is to divide large-scale da...

Application Information

IPC(8): G06F16/901, G06F16/906
CPC: G06F16/9024, G06F16/906
Inventor: 袁平鹏, 金海, 庞皓翰
Owner: HUAZHONG UNIV OF SCI & TECH