The invention relates to RDF (Resource Description Framework) graph
data processing. To provide an efficient parallel query processing method for large-scale RDF graph data, reduce the number of disk reads and writes, and improve query efficiency, the invention adopts a technical scheme in which a SPARQL (SPARQL Protocol and RDF Query Language) parallel query method for large-scale
RDF graph data comprises the following steps: 1, describing the RDF graph data using a bulk synchronous parallel (BSP) model; 2, labeling resources with their URIs (Uniform Resource Identifiers); 3, for each triple in the RDF graph data set, i.e. a subject computing unit S, a predicate P and an object computing unit O, establishing a directed edge e from the subject computing unit S to the object computing unit O, using the URI of the predicate P as the label of e, and storing the related information of e in the local data field of the subject computing unit S; 4, for each edge e in step 3, using a URIr as the label of an er; 5, acquiring a query request q0 submitted by a user; 6, selecting different propagation paths along which to propagate; 7, estimating the amount of information contained in each clause of qi-1 using a greedy algorithm; 8, repeating steps 6 and 7 until all the clauses are bound. The
SPARQL parallel query method is mainly applied to graph data processing.
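
The graph construction in steps 2 and 3 can be illustrated with a minimal Python sketch. It is not the claimed implementation: the names ComputeUnit, build_graph and match_clause, the in-memory dictionaries, and the example "ex:" URIs are all assumptions made for illustration, and the BSP supersteps, the er edges of step 4, and the greedy estimate of step 7 are not modeled.

```python
from collections import defaultdict
from dataclasses import dataclass, field


@dataclass
class ComputeUnit:
    """Hypothetical vertex type: one unit per resource, identified by its URI (step 2)."""
    uri: str
    # Local data field of the unit: predicate URI -> list of object URIs.
    out_edges: dict = field(default_factory=lambda: defaultdict(list))


def build_graph(triples):
    """Create one unit per subject/object and one labeled edge per triple (step 3)."""
    units = {}
    for s, p, o in triples:
        subject = units.setdefault(s, ComputeUnit(s))
        units.setdefault(o, ComputeUnit(o))
        # The directed edge S -> O is labeled with the predicate URI
        # and stored only in the subject unit's local data.
        subject.out_edges[p].append(o)
    return units


def match_clause(units, predicate, obj=None):
    """Evaluate one clause (?s, predicate, obj) by reading each unit's local edges.

    If obj is None, the object position is treated as an unbound variable.
    """
    results = []
    for unit in units.values():
        for target in unit.out_edges.get(predicate, []):
            if obj is None or target == obj:
                results.append((unit.uri, target))
    return results


# Example data (made-up URIs, for illustration only).
TRIPLES = [
    ("ex:alice", "ex:knows", "ex:bob"),
    ("ex:alice", "ex:worksAt", "ex:acme"),
    ("ex:bob", "ex:knows", "ex:carol"),
]

if __name__ == "__main__":
    graph = build_graph(TRIPLES)
    print(match_clause(graph, "ex:knows"))
```

Because each edge is kept in the local data of its subject unit, a clause with a known predicate can be checked by every unit independently, which is the property a vertex-centric BSP execution would rely on.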