SPARQL query optimization method based on graph traversal

A SPARQL query optimization technique based on graph traversal, applied in special data processing applications, instruments, and electrical digital data processing. It targets problems such as increased query time, strict data partitioning requirements, and reduced system operating efficiency. Its stated effects are eliminating the dependence on data structure, reducing the generation of intermediate data, and speeding up query and analysis.

Active Publication Date: 2017-10-24
COMP NETWORK INFORMATION CENT CHINESE ACADEMY OF SCI

AI Technical Summary

Problems solved by technology

[0008] 3) Strict requirements on data partitioning
If there are many associations between different partitions, SPARQL subgraph matching cannot be executed in parallel on multiple data nodes, which reduces the operating efficiency of the system.
[0009] Because of these problems, federated SPARQL query methods struggle to cope with the rapid growth of large-scale RDF linked data and to meet the real-time query requirements of knowledge-network applications: query time increases as the data size grows.
Bigtable-based data processing technology, in turn, is difficult to apply to massive RDF knowledge-network linked data because it lacks support for large-scale table join operations (Join).

Method used



Examples


Embodiment Construction

[0041] In order to make the above-mentioned features and advantages of the present invention more comprehensible, the following specific embodiments are described in detail in conjunction with the accompanying drawings.

[0042] As shown in Figure 2, a SPARQL execution engine based on graph traversal consists of three parts: Bigtable data storage, SPARQL-to-Gremlin conversion, and graph traversal execution. Based on whether the object of an RDF triple is an rdf:literal or an rdf:resource, the present invention uses an attribute graph to represent the association relationships and literal values of RDF triples, uses the Bigtable data model to store and manage the RDF data, and uses graph traversal to implement SPARQL query and analysis of RDF linked data.
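The literal/resource split described in [0042] can be illustrated with a minimal Python sketch. It is not the patented storage layout: the row structure (`props` and `out` groups keyed by subject), the function name `build_attribute_graph`, and the sample `ex:` data are all assumptions made for illustration. Literal objects become vertex properties, and resource objects become outgoing edges.

```python
from collections import defaultdict

# Hypothetical triple encoding: (subject, predicate, object, object_is_literal).
# The ex: identifiers and values are made-up sample data.
TRIPLES = [
    ("ex:strain1", "ex:name", "Strain One", True),
    ("ex:strain1", "ex:isolatedBy", "ex:lab7", False),
    ("ex:lab7", "ex:name", "Lab 7", True),
]


def build_attribute_graph(triples):
    """Fold literal objects into vertex properties and resource objects into edges.

    Each vertex gets one row keyed by its identifier, loosely imitating a
    wide-column row with two column groups: "props" and "out".
    """
    rows = defaultdict(lambda: {"props": defaultdict(list), "out": defaultdict(list)})
    for subject, predicate, obj, is_literal in triples:
        if is_literal:
            # rdf:literal object: store the value on the subject vertex.
            rows[subject]["props"][predicate].append(obj)
        else:
            # rdf:resource object: store an edge from subject to object.
            rows[subject]["out"][predicate].append(obj)
            _ = rows[obj]  # make sure the target vertex also has a row
    return rows


graph = build_attribute_graph(TRIPLES)
```

With this layout, everything known about one resource sits in a single row, so following an edge is a row lookup by key rather than a scan of a triple table.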

[0043] Currently, data warehouses for RDF data use the triple as the basic unit for storing and managing RDF knowledge-network data, and rely on table self-joins to perform subgraph matching, thereby realizing SPARQL query and ...
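To make the self-join approach in [0043] concrete, the following sketch uses Python's built-in `sqlite3` module as a stand-in for a triple-table warehouse. The table name `triples`, its columns, and the sample data are assumptions for illustration only. A two-pattern query (`?s ex:isolatedBy ?lab . ?lab ex:name ?labName`) needs the triple table joined with itself once; each additional pattern would add another join.

```python
import sqlite3

# In-memory triple table; schema and data are hypothetical.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE triples (s TEXT, p TEXT, o TEXT)")
conn.executemany(
    "INSERT INTO triples VALUES (?, ?, ?)",
    [
        ("ex:strain1", "ex:isolatedBy", "ex:lab7"),
        ("ex:strain2", "ex:isolatedBy", "ex:lab7"),
        ("ex:strain1", "ex:name", "Strain One"),
        ("ex:lab7", "ex:name", "Lab 7"),
    ],
)

# Subgraph matching via self-join: the object of the first pattern
# must equal the subject of the second pattern.
rows = conn.execute(
    """
    SELECT t1.s, t2.o
    FROM triples AS t1
    JOIN triples AS t2 ON t1.o = t2.s
    WHERE t1.p = 'ex:isolatedBy' AND t2.p = 'ex:name'
    ORDER BY t1.s
    """
).fetchall()
conn.close()
```

Every join materializes candidate pairs before filtering, which is the kind of intermediate data that a traversal-based approach tries to avoid.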


Abstract

The invention discloses a SPARQL query optimization method based on graph traversal. The method comprises the steps of: 1) representing the triples in RDF (Resource Description Framework) data as an attribute graph and storing the RDF data using the Bigtable model, thereby obtaining Bigtable data corresponding to the RDF data; 2) converting the SPARQL query into a traversal of the RDF attribute graph; and 3) traversing all nodes in the Bigtable data that satisfy the conditions, following the traversal order obtained in step 2), thereby completing the SPARQL query. According to the method, ...
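Steps 2) and 3) of the abstract can be sketched in plain Python under strong simplifying assumptions. The sketch handles only a chain-shaped pattern in which each triple pattern's subject is the previous pattern's object. The step names `out` and `has`, the functions `to_steps` and `run`, and the in-memory `GRAPH` dictionary are hypothetical and do not reproduce the patent's SPARQL-to-Gremlin conversion.

```python
# Hypothetical in-memory stand-in for the stored rows: one row per vertex.
GRAPH = {
    "ex:strain1": {"props": {"ex:name": ["Strain One"]}, "out": {"ex:isolatedBy": ["ex:lab7"]}},
    "ex:strain2": {"props": {"ex:name": ["Strain Two"]}, "out": {"ex:isolatedBy": ["ex:lab9"]}},
    "ex:lab7": {"props": {"ex:name": ["Lab 7"]}, "out": {}},
    "ex:lab9": {"props": {"ex:name": ["Lab 9"]}, "out": {}},
}

# Query being illustrated (a chain-shaped basic graph pattern):
#   SELECT ?s WHERE { ?s ex:isolatedBy ?lab . ?lab ex:name "Lab 7" }
PATTERNS = [("?s", "ex:isolatedBy", "?lab"), ("?lab", "ex:name", '"Lab 7"')]


def to_steps(patterns):
    """Turn a chain of triple patterns into an ordered list of traversal steps."""
    steps = []
    for _subject, predicate, obj in patterns:
        if obj.startswith('"'):
            # Literal object: filter the current vertices on a property value.
            steps.append(("has", predicate, obj.strip('"')))
        else:
            # Variable or resource object: follow an outgoing edge.
            steps.append(("out", predicate))
    return steps


def run(graph, steps):
    """Return the start vertices from which every step can be satisfied in order."""
    results = []
    for start in graph:
        frontier = [start]
        for step in steps:
            if step[0] == "out":
                frontier = [t for v in frontier for t in graph[v]["out"].get(step[1], [])]
            else:
                frontier = [v for v in frontier if step[2] in graph[v]["props"].get(step[1], [])]
        if frontier:
            results.append(start)
    return results


steps = to_steps(PATTERNS)
matches = run(GRAPH, steps)
```

Here the only state carried between steps is the current frontier of vertices, rather than a joined table of partial matches.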

Description

Technical field

[0001] The invention relates to a SPARQL query execution method based on graph traversal, and in particular to a method and system for association-oriented storage and query of big data.

Background

[0002] Graph data mining and analysis is a new field of big data. By establishing association relationships among web resources, microbial strain resources, and scientific research resources, it supports information mining and scientific discovery based on data association. The Resource Description Framework (RDF) is a language for expressing information about World Wide Web resources. It can express information about anything that can be identified on the Web, such as a page's title, author, and modification time, as well as the relationships between different data. The RDF specification provides a basic vocabulary for describing resources, and defines the rules that must be followed when applying such as the WDCM (Mircen World Data Center for Microo...

Claims


Application Information

Patent Type & Authority: Applications (China)
IPC(8): G06F17/30
CPC: G06F16/2452, G06F16/2453
Inventor: 李亮, 沈志宏, 周园春, 黎建辉, 朱小杰, 刘东江, 李跃鹏
Owner: COMP NETWORK INFORMATION CENT CHINESE ACADEMY OF SCI