Distributed origin guarantee regular path query algorithm based on Pregel

A distributed, path-based technique in the field of computing and special-purpose data processing. It addresses the high communication cost of existing methods and the lack of an efficient, scalable regular path query algorithm, while ensuring correctness, reducing query response time, and improving performance.

Inactive Publication Date: 2018-09-11
TIANJIN UNIV

AI Technical Summary

Problems solved by technology

However, this method may incur high communication costs when processing large-scale RDF graph data.
[0009] To the best of our knowledge, there is currently no efficient and scalable regular path query algorithm under distributed origin guarantee semantics for large RDF graph data.



Embodiment Construction

[0050] To further explain the content, features and effects of the present invention, the following embodiment is described in detail with reference to the accompanying drawings:

[0051] The Pregel-based distributed origin guarantee regular path query algorithm of the present invention comprises the following steps:

[0052] 1) For a given regular path query Q = (x, r, y), compute the first, last and follow sets from the regular expression r. Regular expressions are defined recursively as r ::= ε | p | r/r | r|r | r*, where ε is the empty string, p is any symbol in the alphabet Σ, / denotes concatenation, | denotes alternation, and * denotes the Kleene closure. The first set is the set of states corresponding to the first symbol of any string in the language L(r) denoted by the regular expression r, the last set is the set of states corresponding to the last symbol of any string in L(r), and the follow set is t...
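As an illustration of step 1), the following is a minimal Python sketch of the standard position-based computation of the first, last and follow sets. It is not taken from the patent text: the AST classes (`Eps`, `Sym`, `Cat`, `Alt`, `Star`), the function `analyze`, and the integer position numbering are all hypothetical names chosen for this example, and the example expression a/(b|c)* is likewise an assumption.

```python
from dataclasses import dataclass
from typing import Dict, Set, Tuple


@dataclass(frozen=True)
class Eps:            # the empty string ε
    pass


@dataclass(frozen=True)
class Sym:            # one occurrence of a label p; pos is unique per occurrence
    label: str
    pos: int


@dataclass(frozen=True)
class Cat:            # concatenation r/r
    left: object
    right: object


@dataclass(frozen=True)
class Alt:            # alternation r|r
    left: object
    right: object


@dataclass(frozen=True)
class Star:           # Kleene closure r*
    inner: object


def analyze(r, follow: Dict[int, Set[int]]) -> Tuple[bool, Set[int], Set[int]]:
    """Return (nullable, first, last) for r and fill `follow` in place."""
    if isinstance(r, Eps):
        return True, set(), set()
    if isinstance(r, Sym):
        follow.setdefault(r.pos, set())
        return False, {r.pos}, {r.pos}
    if isinstance(r, Alt):
        n1, f1, l1 = analyze(r.left, follow)
        n2, f2, l2 = analyze(r.right, follow)
        return n1 or n2, f1 | f2, l1 | l2
    if isinstance(r, Cat):
        n1, f1, l1 = analyze(r.left, follow)
        n2, f2, l2 = analyze(r.right, follow)
        for p in l1:                      # anything ending the left part
            follow[p] |= f2               # may be followed by a start of the right part
        first = (f1 | f2) if n1 else f1
        last = (l1 | l2) if n2 else l2
        return n1 and n2, first, last
    if isinstance(r, Star):
        _, f, l = analyze(r.inner, follow)
        for p in l:                       # the loop: an end may be followed by a start
            follow[p] |= f
        return True, f, l
    raise TypeError(f"unknown regular expression node: {r!r}")


# Example (assumed): r = a/(b|c)* with positions a=1, b=2, c=3.
r = Cat(Sym("a", 1), Star(Alt(Sym("b", 2), Sym("c", 3))))
follow: Dict[int, Set[int]] = {}
nullable, first, last = analyze(r, follow)
```

For this expression the sketch yields first = {1}, last = {1, 2, 3}, and follow(p) = {2, 3} for every position p, which matches the intuition that a path must start with a and may then continue with any number of b or c edges.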


Abstract

The invention discloses a Pregel-based distributed origin guarantee regular path query algorithm. The algorithm comprises the following steps: 1) for a given regular path query Q = (x, r, y), compute the first, last and follow sets from the regular expression r; 2) build a Glushkov automaton A = (St, Σ, δ, q0, F) equivalent to the regular expression r; 3) map the regular path query over RDF graph data onto the Pregel message-passing model, and obtain the result paths through vertex-centric computation driven by message passing; 4) collect all result paths that satisfy the regular expression r as the query result. By using the Glushkov automaton, the algorithm performs origin guarantee regular path queries over large-scale RDF graph data, and the optimization strategies it introduces reduce query time and intermediate results, improving the scalability of the algorithm.
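Steps 2) to 4) can be sketched on a single machine as follows. This is a simplified simulation under stated assumptions, not the patented implementation: `build_glushkov` and `pregel_rpq` are hypothetical helper names, the first, last and follow sets for the example expression a/(b|c)* are written out by hand, the edge list is invented for illustration, and each superstep is simulated with an in-memory inbox instead of a real Pregel runtime. The sketch keeps one witness path per (source vertex, automaton state) pair at each vertex; it does not reproduce the patent's optimization strategies or its full origin guarantee semantics.

```python
from collections import defaultdict


def build_glushkov(labels, first, last, follow, nullable):
    """Build transitions δ and final states F; state 0 is the initial state q0."""
    delta = defaultdict(set)
    for p in first:
        delta[(0, labels[p])].add(p)
    for p, successors in follow.items():
        for q in successors:
            delta[(p, labels[q])].add(q)
    finals = set(last) | ({0} if nullable else set())
    return delta, finals


def pregel_rpq(edges, delta, finals):
    """Vertex-centric evaluation: messages are (source, state, witness path)."""
    out = defaultdict(list)
    vertices = set()
    for s, label, o in edges:
        out[s].append((label, o))
        vertices.update((s, o))

    seen = {v: set() for v in vertices}       # (source, state) pairs already handled
    results = {}                              # (source, target) -> first witness path
    inbox = {v: [(v, 0, (v,))] for v in vertices}   # superstep 0: start at q0

    while inbox:                              # one loop iteration = one superstep
        outbox = defaultdict(list)
        for v, messages in inbox.items():
            for src, state, path in messages:
                if (src, state) in seen[v]:
                    continue                  # prune repeated work, ensures termination
                seen[v].add((src, state))
                if state in finals:
                    results.setdefault((src, v), path)
                for label, w in out[v]:
                    for nxt in delta.get((state, label), ()):
                        outbox[w].append((src, nxt, path + (label, w)))
        inbox = dict(outbox)                  # vertices with no messages stay halted
    return results


# Assumed example: r = a/(b|c)* with positions a=1, b=2, c=3.
labels = {1: "a", 2: "b", 3: "c"}
delta, finals = build_glushkov(
    labels,
    first={1},
    last={1, 2, 3},
    follow={1: {2, 3}, 2: {2, 3}, 3: {2, 3}},
    nullable=False,
)
edges = [("v1", "a", "v2"), ("v2", "b", "v3"), ("v3", "c", "v2"), ("v3", "a", "v4")]
results = pregel_rpq(edges, delta, finals)
```

On this small graph the sketch reports the pairs (v1, v2), (v1, v3) and (v3, v4). Deduplicating on (source, state) per vertex bounds the number of messages even though the graph contains the cycle v2 → v3 → v2.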

Description

Technical field

[0001] The invention relates to the field of distributed graph query, in particular to regular path queries over large-scale RDF graph data.

Background

[0002] With the increasing popularity of knowledge graphs, more and more fields adopt the Resource Description Framework (RDF) as the standard format for data representation and storage. Compared with the traditional relational model, RDF describes real-world things and the connections between them more naturally. With the large-scale emergence of graph data, efficient distributed graph query on multi-machine clusters has become an inevitable choice. Regular Path Queries (RPQs) are an indispensable basic graph query operation: they find all paths that satisfy a regular expression by navigating the graph, and generally return a set of matching pairs of data nodes. The standard query language for RDF graph data recommended by W3C, SPARQL, also introduced the proper...


Application Information

Patent Type & Authority: Applications (China)
IPC: IPC(8): G06F17/30
Inventors: 王鑫 (Wang Xin), 辛月祺 (Xin Yueqi)
Owner: TIANJIN UNIV