Distributed SPARQL query optimization method based on minimum attribute cut

A query optimization technique applied in the field of distributed systems. It addresses the large impact on query performance, the low query efficiency, and the heavy restrictions of existing approaches, and it achieves the effect of reducing data communication time, improving the filtering effect, and reducing the number of ...

Pending Publication Date: 2022-03-01
HUNAN UNIV

AI Technical Summary

Problems solved by technology

However, inter-partition joins incur data communication and additional computation overhead, which significantly degrades query performance.
Moreover, under traditional vertex-based partitioning, only star-shaped queries can be executed independently, which is quite restrictive. General queries usually require distributed joins, so query efficiency is low.
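
To make the star-shaped restriction concrete, here is a minimal sketch of a check for whether a query is a star. It assumes a query is given as a list of (subject, predicate, object) triple patterns and treats a query as star-shaped when one vertex appears in every pattern. The function name `is_star_query` and the sample patterns are illustrative and do not come from the patent.

```python
from typing import List, Tuple

# A triple pattern is (subject, predicate, object); variables start with "?".
TriplePattern = Tuple[str, str, str]


def is_star_query(patterns: List[TriplePattern]) -> bool:
    """Return True if some vertex occurs (as subject or object) in every pattern."""
    if not patterns:
        return False
    # Candidate centers are the endpoints of the first pattern.
    common = {patterns[0][0], patterns[0][2]}
    for s, _, o in patterns[1:]:
        common &= {s, o}
    return bool(common)


# Every pattern touches ?x, so this query is a star centered on ?x.
star = [("?x", "name", "?n"), ("?x", "worksAt", "?u"), ("?p", "advises", "?x")]
# A three-edge path has no vertex shared by all patterns.
path = [("?x", "knows", "?y"), ("?y", "knows", "?z"), ("?z", "worksAt", "?u")]

print(is_star_query(star))
print(is_star_query(path))
```

Under this assumption, the path query is the kind of general query that vertex-based partitioning would typically have to answer with distributed joins.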

Examples

Embodiment Construction

[0021] The specific embodiments of the present invention are further described below with reference to the accompanying drawings, so that those skilled in the art can understand the present invention more easily. It should be noted that the embodiments described below are only some, not all, of the embodiments of the present invention. Other embodiments obtained by those skilled in the art based on the embodiments of the present invention without creative effort all fall within the protection scope of the present invention.

[0022] In order to facilitate description and understanding, the symbols and concepts involved in the embodiments of the present invention are explained:

[0023] G: RDF data graph.

[0024] L: The attribute set of edges in the RDF data graph.

[0025] q(v): the query graph to which the query vertex v belongs.

[0026] G[L′]: the subgraph induced by the attribute set L′, that is, the subgraph consisting of the edges whose attribute is in L′.

[0027] DS(L'): The...
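
The induced subgraph G[L′] defined above can be illustrated with a short sketch. It assumes the RDF data graph G is stored as a list of (subject, attribute, object) edges. The function name `induced_subgraph` and the sample data are hypothetical and not taken from the patent.

```python
from typing import Iterable, List, Set, Tuple

# An edge of the RDF data graph: (subject, attribute, object).
Edge = Tuple[str, str, str]


def induced_subgraph(edges: Iterable[Edge], attrs: Set[str]) -> Tuple[Set[str], List[Edge]]:
    """Return the vertices and edges of G[L'], keeping only edges whose attribute is in attrs."""
    kept = [e for e in edges if e[1] in attrs]
    vertices = {v for s, _, o in kept for v in (s, o)}
    return vertices, kept


# Hypothetical data graph G with attribute set L = {"knows", "livesIn"}.
G = [
    ("alice", "knows", "bob"),
    ("bob", "knows", "carol"),
    ("alice", "livesIn", "changsha"),
]

vertices, kept = induced_subgraph(G, {"knows"})
print(sorted(vertices))
print(kept)
```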

Abstract

The invention discloses a distributed SPARQL query optimization method based on minimum attribute cut, which belongs to the field of distributed systems and comprises the following steps: (1) reading the original RDF data graph and storing the edge attribute set L; (2) computing the weakly connected components and the corresponding cost of each edge attribute; (3) selecting as many internal attributes as possible to obtain a coarsened graph of the data graph; (4) performing vertex partitioning on the coarsened graph, then uncoarsening it to obtain the final partitioning; (5) decomposing the SPARQL query into a set of sub-queries that can be executed independently; and (6) executing the decomposed sub-queries in each partition in parallel to obtain the matching results. The method expands the types of queries that can be executed independently in a distributed RDF system, reduces joins across partitions, shortens data communication time, and improves query efficiency.
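
Steps (2) and (3) can be sketched as follows. This is an illustration under stated assumptions, not the patented procedure: it uses a union-find structure (my choice) to compute weakly connected components, it omits the cost function because the text here does not define it, and it takes the set of internal attributes as an input rather than selecting it. The names `DisjointSet`, `weak_components`, and `coarsen` are hypothetical.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Set, Tuple

Edge = Tuple[str, str, str]  # (subject, attribute, object)


class DisjointSet:
    """Union-find with path compression."""

    def __init__(self) -> None:
        self.parent: Dict[str, str] = {}

    def find(self, x: str) -> str:
        self.parent.setdefault(x, x)
        root = x
        while self.parent[root] != root:
            root = self.parent[root]
        while self.parent[x] != root:  # compress the path to the root
            self.parent[x], x = root, self.parent[x]
        return root

    def union(self, a: str, b: str) -> None:
        ra, rb = self.find(a), self.find(b)
        if ra != rb:
            self.parent[ra] = rb


def weak_components(edges: Iterable[Edge], attrs: Set[str]) -> List[Set[str]]:
    """Weakly connected components of the subgraph formed by edges with an attribute in attrs."""
    ds = DisjointSet()
    for s, p, o in edges:
        if p in attrs:
            ds.union(s, o)  # edge direction is ignored
    groups: Dict[str, Set[str]] = defaultdict(set)
    for v in list(ds.parent):
        groups[ds.find(v)].add(v)
    return list(groups.values())


def coarsen(edges: Iterable[Edge], internal_attrs: Set[str]):
    """Merge vertices joined by internal-attribute edges into super-vertices.

    Returns a vertex -> super-vertex map and the remaining edges between super-vertices.
    """
    edges = list(edges)
    ds = DisjointSet()
    for s, p, o in edges:
        ds.find(s)
        ds.find(o)
        if p in internal_attrs:
            ds.union(s, o)
    super_of = {v: ds.find(v) for v in list(ds.parent)}
    coarse_edges = [(super_of[s], p, super_of[o]) for s, p, o in edges if p not in internal_attrs]
    return super_of, coarse_edges


# Hypothetical data graph.
G = [
    ("a", "knows", "b"), ("b", "knows", "c"), ("d", "knows", "e"),
    ("a", "livesIn", "x"), ("d", "livesIn", "x"),
]

components = weak_components(G, {"knows"})
super_of, coarse_edges = coarsen(G, {"knows"})
print(sorted(len(c) for c in components))
print(len(set(super_of.values())), len(coarse_edges))
```

In this sketch, any edge whose attribute is internal ends up inside a single super-vertex, so a vertex partitioning of the coarsened graph cannot cut it. Only the `livesIn` edges remain as candidates for crossing partitions.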

Description

Technical Field

[0001] The present invention relates to the field of distributed systems, and more specifically to data partitioning and query processing in distributed RDF systems.

Background Technique

[0002] RDF (Resource Description Framework) is a data model proposed by the W3C. It uses triples of the form <subject, predicate, object> to represent the attributes and relationships of web resources, and it is currently applied in knowledge graphs, social network analysis, and other fields. The RDF data model is flexible: it can be represented as a table in a relational database or as a graph. When RDF is expressed as a graph, each triple corresponds to a directed edge from the subject to the object: the subject and the object are the two vertices of the edge, and the predicate is the label on the edge. While W3C proposed RDF, it also proposed ...
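
The graph view of RDF described in the background can be sketched in a few lines. The sketch assumes triples are plain Python tuples, and both the `build_graph` helper and the sample triples are illustrative rather than part of the patent.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Set, Tuple

Triple = Tuple[str, str, str]  # (subject, predicate, object)


def build_graph(triples: Iterable[Triple]) -> Tuple[Set[str], Dict[str, List[Tuple[str, str]]]]:
    """Turn triples into a directed labeled graph: subject -> [(predicate, object), ...]."""
    out_edges: Dict[str, List[Tuple[str, str]]] = defaultdict(list)
    vertices: Set[str] = set()
    for s, p, o in triples:
        out_edges[s].append((p, o))  # directed edge s -> o labeled with the predicate
        vertices.update((s, o))
    return vertices, dict(out_edges)


# Hypothetical triples.
triples = [
    ("Alice", "knows", "Bob"),
    ("Alice", "worksAt", "HNU"),
    ("Bob", "worksAt", "HNU"),
]

vertices, out_edges = build_graph(triples)
print(sorted(vertices))
print(out_edges["Alice"])
```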

Application Information

Patent Type & Authority: Applications (China)
IPC(8): G06F16/2453, G06F16/242
CPC: G06F16/2453, G06F16/2433
Inventor: 彭鹏, 田桢, 秦拯
Owner: HUNAN UNIV