Method for partitioning and parallel distributed processing of ultra-large-scale RDF graph data

A partitioning and parallel distributed processing technique for ultra-large-scale graph data, applied in the field of big data processing. It addresses unbalanced task load, low partition quality, and long partition time, and achieves fast, high-quality partitioning.

Active Publication Date: 2017-12-29
HUAZHONG UNIV OF SCI & TECH

AI Technical Summary

Problems solved by technology

[0005] In view of the above deficiencies of, and needs for improvement in, the prior art, the present invention provides a method and system for partitioning and parallel distributed processing of ultra-large-scale graph data. The hyperedge data on the path is divided equally, which accounts for both uniform data distribution and balanced task load. By using bit-block transmission and pipelined processing, the invention addresses the long partition time, low partition quality, and unbalanced task load of existing partitioning methods.



Embodiment Construction

[0073] In order to make the object, technical solution and advantages of the present invention clearer, the present invention will be further described in detail below in conjunction with the accompanying drawings and embodiments. It should be understood that the specific embodiments described here are only used to explain the present invention, not to limit the present invention. In addition, the technical features involved in the various embodiments of the present invention described below may be combined with each other as long as they do not constitute a conflict with each other.

[0074] As shown in Figure 1, the method of the present invention for partitioning and parallel distributed processing of ultra-large-scale RDF graph data comprises the following steps:

[0075] (1) Preprocess the original RDF graph data to generate the corresponding hash dictionary file and integer triple-table data, and convert the integer triple-table data into an association matrix M;

[0076] (2) E...
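Step (1) can be illustrated with a minimal sketch. It assumes that the hash dictionary maps each RDF term to an integer ID, that the triple-table data is the list of integer-encoded triples, and that the association matrix M can be held sparsely as a map from each term ID to the indices of the triples containing it. The function name `preprocess` and the sample data are hypothetical; the patent text shown here does not specify these details.

```python
from collections import defaultdict

def preprocess(triples):
    """Encode string triples as integers and build a sparse incidence map.

    Assumption: rows of M are triples and columns are terms, so M is stored
    here as {term_id: set of triple indices}.
    """
    dictionary = {}               # term string -> integer ID
    int_triples = []              # integer-encoded (s, p, o) rows
    incidence = defaultdict(set)  # sparse stand-in for the matrix M

    def encode(term):
        # IDs are assigned in first-seen order.
        if term not in dictionary:
            dictionary[term] = len(dictionary)
        return dictionary[term]

    for idx, (s, p, o) in enumerate(triples):
        row = (encode(s), encode(p), encode(o))
        int_triples.append(row)
        for term_id in row:
            incidence[term_id].add(idx)
    return dictionary, int_triples, dict(incidence)

# Hypothetical sample data.
triples = [
    ("ex:alice", "ex:knows", "ex:bob"),
    ("ex:bob", "ex:knows", "ex:carol"),
]
dictionary, int_triples, incidence = preprocess(triples)
```

A dictionary of sets is used instead of a dense matrix because each triple touches at most three terms, so a dense M would be almost entirely zeros at this scale.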


Abstract

The invention discloses a method for partitioning and parallel distributed processing of ultra-large-scale RDF graph data, comprising: preprocessing the original RDF graph data to generate a corresponding hash dictionary file and integer triple-table data, and converting the integer triple-table data into an association matrix M; establishing a hypergraph model of the association matrix M, in which the subjects, predicates, and objects of M are hyperedges and the data related to a hyperedge is hyperedge data; determining whether the RDF graph data is a connected or a disconnected graph and, if it is disconnected, dividing it into multiple connected graphs; and, based on the hypergraph model, performing concurrent breadth-first traversal and equally dividing the hyperedge data on the placement path, whereby the hyperedge data is classified, sorted, divided equally into K parts, and placed on K slave nodes, while the mapping between hyperedge data and slave nodes is established. The invention offers fast partitioning, high partition quality, balanced data and task load, high parallelism, and fast query processing.
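The connectivity check and equal division described in the abstract can be sketched roughly as follows. This is not the patented procedure: it runs a sequential breadth-first traversal rather than a concurrent one, it assumes two triples are adjacent when they share a subject or object term, and it assumes "equally divided" means contiguous parts whose sizes differ by at most one. The names `bfs_order` and `split_evenly` and the sample triples are hypothetical.

```python
from collections import defaultdict, deque

def bfs_order(int_triples):
    """Return triple indices in BFS order, component by component,
    plus the number of connected components found."""
    by_term = defaultdict(list)  # term ID -> triples using it as subject/object
    for idx, (s, _p, o) in enumerate(int_triples):
        by_term[s].append(idx)
        by_term[o].append(idx)

    seen, order, components = set(), [], 0
    for start in range(len(int_triples)):
        if start in seen:
            continue
        components += 1          # every unseen start opens a new component
        seen.add(start)
        queue = deque([start])
        while queue:
            cur = queue.popleft()
            order.append(cur)
            s, _p, o = int_triples[cur]
            for term in (s, o):
                for nxt in by_term[term]:
                    if nxt not in seen:
                        seen.add(nxt)
                        queue.append(nxt)
    return order, components

def split_evenly(order, k):
    """Cut the ordered list into k contiguous parts (sizes differ by at most
    one) and return the mapping {triple index: slave node ID}."""
    base, extra = divmod(len(order), k)
    placement, pos = {}, 0
    for node in range(k):
        size = base + (1 if node < extra else 0)
        for idx in order[pos:pos + size]:
            placement[idx] = node
        pos += size
    return placement

# Hypothetical integer triples forming two connected components.
int_triples = [(0, 1, 2), (2, 1, 3), (4, 1, 5), (3, 6, 0), (5, 6, 7)]
order, components = bfs_order(int_triples)
placement = split_evenly(order, 2)
```

More than one component means the graph is disconnected. Cutting the BFS order into contiguous runs tends to keep neighbouring triples on the same slave node while keeping part sizes balanced, and `placement` plays the role of the mapping between data and slave nodes.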

Description

Technical field

[0001] The invention belongs to the field of big data processing, and more specifically relates to a method for partitioning and parallel distributed processing of ultra-large-scale RDF graph data.

Background

[0002] The Resource Description Framework (RDF) is the core of the Semantic Web architecture and is widely used to describe information resources on the Internet. As RDF data keeps growing, processing it on a single machine is no longer feasible, so the data must be partitioned across multiple machines.

[0003] For partitioning ultra-large-scale RDF graph data, commonly used methods include heuristic partitioning and parallel hierarchical partitioning. A heuristic method generally defines an objective function and then partitions in the direction that optimizes it, but choosing the objective function is difficult. For parallel hierarchical partitioning, i...


Application Information

Patent Type & Authority Patents(China)
IPC IPC(8): G06F17/30
Inventor Yuan Pingpeng (袁平鹏), Jin Hai (金海), Xie Changfeng (谢昌凤), Luo Yi (罗毅)
Owner HUAZHONG UNIV OF SCI & TECH