SPARQL parallel query method facing large-scale RDF graph data

A query method for large-scale RDF graph data, classified under electrical digital data processing and special data processing applications. It addresses problems such as the very large data volume and the failure of existing approaches to make full use of the graph structure of SPARQL statements, with the aim of improving query speed, meeting query requirements, and making the data easier to use and manage.

Active Publication Date: 2014-05-07
TIANJIN UNIV

AI Technical Summary

Problems solved by technology

[0006] (2) Not making full use of the graph structure features of SPARQL statements
[0007] (3) The


Examples

Embodiment Construction

[0024] The technical scheme adopted in the present invention is:

[0025] 1) Use the BSP (bulk synchronous parallel) model to describe the RDF graph data: each resource in the RDF graph data is represented as a computing unit in the BSP model that can perform calculations;

[0026] 2) Use the URI (Uniform Resource Identifier) of each resource to label the computing unit corresponding to that resource;

[0027] 3) For each triple (S, P, O) in the RDF graph data set, establish a directed edge e from the subject computing unit S to the object computing unit O, use the URI of the predicate P as the label of e, and store the relevant information of e in the local data field of the subject computing unit S;

[0028] 4) For each edge e in 3), create a reverse edge e_r, use URI_r (where URI is the URI of the predicate P) as the label of e_r, and store the relevant information of e_r in the local data field of the object computing unit O;

[0029] 5) Obtain the SPARQL query request q submitted by ...
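The graph construction described in the steps above can be illustrated with a minimal Python sketch. It is not the patented implementation: the `ComputeUnit` class, the `build_units` function, the `^r` suffix used for reverse-edge labels, and the `ex:` URIs are all hypothetical names chosen for illustration, and the sketch runs in a single process rather than on a BSP runtime.

```python
from dataclasses import dataclass, field


@dataclass
class ComputeUnit:
    """One computing unit, labeled with the URI of its resource (hypothetical class)."""
    uri: str
    # Local data field: (edge label, neighbour URI) pairs stored at this unit.
    edges: list = field(default_factory=list)


def reverse_label(predicate_uri):
    # Assumed convention for the reverse-edge label; the source only says
    # that the label is derived from the URI of the predicate P.
    return predicate_uri + "^r"


def build_units(triples):
    """Create one unit per resource and store forward and reverse edges locally."""
    units = {}

    def unit(uri):
        if uri not in units:
            units[uri] = ComputeUnit(uri)
        return units[uri]

    for s, p, o in triples:
        # Forward edge e: labeled with P, kept in the subject unit's local data.
        unit(s).edges.append((p, o))
        # Reverse edge: labeled with the reverse label, kept in the object unit.
        unit(o).edges.append((reverse_label(p), s))
    return units


triples = [
    ("ex:alice", "ex:knows", "ex:bob"),
    ("ex:bob", "ex:worksAt", "ex:tju"),
]
units = build_units(triples)
for u in units.values():
    print(u.uri, u.edges)
```

Storing the reverse edge at the object unit means a unit can reach its neighbours in both directions using only its own local data, which is what a message-passing model such as BSP needs.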


Abstract

The invention relates to RDF (Resource Description Framework) graph data processing. To provide an efficient parallel query processing method for large-scale RDF graph data, reduce the number of disk reads and writes, and improve query efficiency, the invention adopts the following technical scheme: a SPARQL (SPARQL Protocol and RDF Query Language) parallel query method for large-scale RDF graph data, comprising the following steps: 1) describing the RDF graph data using the bulk synchronous parallel (BSP) model; 2) labeling each computing unit with the URI (Uniform Resource Identifier) of its resource; 3) for each triple (S, P, O) in the RDF graph data set, where S is the subject computing unit, P the predicate, and O the object computing unit, establishing a directed edge e from S to O, using the URI of the predicate P as the label of e, and storing the relevant information of e in the local data field of S; 4) for each edge e in step 3, using URI_r as the label of a reverse edge e_r; 5) acquiring a query request q_0 submitted by a user; 6) selecting different propagation paths along which to propagate; 7) estimating the amount of information contained in each clause of q_{i-1} using a greedy algorithm; 8) repeating steps 6 and 7 until all clauses are bound. The SPARQL parallel query method is mainly applied to graph data processing.
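The loop in steps 5 to 8 can be pictured with a small sequential Python sketch. The abstract does not say how the information in a clause is estimated or how propagation paths are chosen, so the scoring rule here (count the constant or already-bound positions of a triple pattern) and the traversal rule (follow forward edges from a known subject, reverse edges from a known object) are assumptions. All function names and the `ex:` URIs are hypothetical, and the code runs in one process rather than as parallel BSP supersteps.

```python
from collections import defaultdict


def is_var(term):
    return term.startswith("?")


def build_index(triples):
    out_edges, in_edges = defaultdict(list), defaultdict(list)
    for s, p, o in triples:
        out_edges[s].append((p, o))  # forward edge, kept at the subject
        in_edges[o].append((p, s))   # reverse edge, kept at the object
    return out_edges, in_edges


def unify(pattern, value, binding):
    """Return the binding extended with pattern=value, or None on a mismatch."""
    if is_var(pattern):
        if pattern in binding:
            return binding if binding[pattern] == value else None
        return {**binding, pattern: value}
    return binding if pattern == value else None


def candidates(clause, binding, out_edges, in_edges):
    """Pick a path: forward from a known subject, else backward from a known object."""
    s, _, o = clause
    s_val, o_val = binding.get(s, s), binding.get(o, o)
    if not is_var(s_val):
        return [(s_val, p, obj) for p, obj in out_edges.get(s_val, [])]
    if not is_var(o_val):
        return [(subj, p, o_val) for p, subj in in_edges.get(o_val, [])]
    return [(subj, p, obj) for subj, lst in out_edges.items() for p, obj in lst]


def extend(clause, binding, out_edges, in_edges):
    results = []
    for triple in candidates(clause, binding, out_edges, in_edges):
        b = binding
        for pattern, value in zip(clause, triple):
            b = unify(pattern, value, b)
            if b is None:
                break
        else:
            results.append(b)
    return results


def score(clause, bound_vars):
    # Assumed greedy estimate: more constants or bound variables = more selective.
    return sum(1 for t in clause if not is_var(t) or t in bound_vars)


def evaluate(clauses, triples):
    out_edges, in_edges = build_index(triples)
    bindings, remaining, bound_vars = [{}], list(clauses), set()
    while remaining:  # repeat until every clause is bound
        best = max(remaining, key=lambda c: score(c, bound_vars))
        remaining.remove(best)
        bindings = [b2 for b in bindings
                    for b2 in extend(best, b, out_edges, in_edges)]
        bound_vars.update(t for t in best if is_var(t))
    return bindings


triples = [
    ("ex:alice", "ex:knows", "ex:bob"),
    ("ex:bob", "ex:worksAt", "ex:tju"),
    ("ex:carol", "ex:knows", "ex:dave"),
]
query = [("?x", "ex:knows", "?y"), ("?y", "ex:worksAt", "ex:tju")]
result = evaluate(query, triples)
print(result)
```

In this example the second clause is evaluated first because it has two constant positions, and the first clause is then resolved by following the reverse `ex:knows` edge stored at `ex:bob`, so the unrelated `ex:carol` triple is never visited.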

Description

Technical field

[0001] The present invention relates to the field of RDF (Resource Description Framework) graph data processing and querying, and in particular to parallel querying of large-scale RDF graph data, that is, a SPARQL (SPARQL Protocol and RDF Query Language) parallel query method for large-scale RDF graph data.

Background art

[0002] Information on the Internet is sent and received by a large number of computers, but computers do not currently understand that information. In response to this situation, Tim Berners-Lee proposed the concept of the Semantic Web in 1998. The Resource Description Framework (RDF) is the fundamental data format of the Semantic Web. Because the RDF graph data format is highly scalable and flexible, more and more fields, such as social networks and bioinformatics, use the RDF format to publish data. Realizing the query of RDF graphs is the basis ...

Application Information

IPC(8): G06F17/30
CPC: G06F16/24532, G06F16/2471
Inventors: 吕雪栋, 冯志勇, 王鑫
Owner: TIANJIN UNIV