Large-scale graph data query method in distributed environment based on Datalog

A Datalog-based method for querying large-scale graph data in a distributed environment. It addresses two problems: writing graph data processing scripts is cumbersome and inefficient for users, and the performance of large-scale data queries struggles to meet application requirements. The method achieves inter-rule optimization, intra-rule optimization, and operation function optimization.
CN102799624B · Inactive · Publication Date: 2015-03-04 · PEKING UNIV

Patent Information

Authority / Receiving Office: CN · China
Patent Type: Patents (China)
Current Assignee / Owner: PEKING UNIV
Publication Date: 2015-03-04
Estimated Expiration: Not applicable · inactive patent


Abstract

The invention discloses a Datalog-based method for querying large-scale graph data in a distributed environment. The method comprises the following steps: 1) parsing a large-scale graph query instruction, input by the user as a Datalog rule set, and producing the corresponding syntax tree; 2) constructing from the syntax tree an execution plan whose unit is the Datalog rule, and constructing a corresponding Map execution function and Reduce execution function for each Datalog rule; and 3) applying inter-rule optimization, intra-rule optimization, and operation function optimization, using equivalence rules and statistical data, to improve the efficiency of the large-scale graph query execution plan. The method reduces the cost to the end user of writing graph query scripts: it provides an extended recursive Datalog query, so the user can express a large-scale graph query in a simple descriptive language. The invention also provides a method for constructing a MapReduce execution plan for recursive Datalog queries, so that Datalog graph queries can be executed under the MapReduce framework.
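The idea of turning one Datalog rule into a Map function and a Reduce function can be illustrated with a minimal Python sketch. It is not the patent's implementation: the rule set (`edge`, `reach`), every function name, the in-memory `shuffle` that stands in for a real MapReduce framework, and the naive fixpoint loop used for recursion are all assumptions made for illustration.

```python
from collections import defaultdict

# Hypothetical Datalog rule set (not taken from the patent):
#   reach(X, Y) :- edge(X, Y).
#   reach(X, Y) :- reach(X, Z), edge(Z, Y).


def map_join(relation, tuples):
    """Map step for the recursive rule: key each tuple on the join variable Z."""
    for first, second in tuples:
        if relation == "reach":
            # reach(X, Z): Z is the second field, X is carried as the value
            yield second, ("reach", first)
        else:
            # edge(Z, Y): Z is the first field, Y is carried as the value
            yield first, ("edge", second)


def shuffle(pairs):
    """Group mapped values by key (stand-in for the framework's shuffle phase)."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_join(values):
    """Reduce step: combine reach(X, Z) with edge(Z, Y) to derive reach(X, Y)."""
    xs = [v for tag, v in values if tag == "reach"]
    ys = [v for tag, v in values if tag == "edge"]
    for x in xs:
        for y in ys:
            yield (x, y)


def evaluate_reach(edges):
    """Naive fixpoint: rerun the simulated job until no new facts are derived."""
    edges = set(edges)
    reach = set(edges)  # base rule: reach(X, Y) :- edge(X, Y)
    while True:
        pairs = list(map_join("reach", reach)) + list(map_join("edge", edges))
        derived = set()
        for values in shuffle(pairs).values():
            derived.update(reduce_join(values))
        if derived <= reach:  # nothing new, so the fixpoint is reached
            return reach
        reach |= derived


if __name__ == "__main__":
    print(sorted(evaluate_reach({(1, 2), (2, 3), (3, 4)})))
```

Each pass of the loop corresponds to one simulated MapReduce job for the recursive rule. The sketch re-derives known facts on every pass, which is the kind of redundancy that optimization of the execution plan would aim to reduce.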

Description

Technical field

[0001] The invention relates to querying large-scale data in a distributed environment, in particular to a Datalog-based method for querying large-scale data in a distributed environment, and belongs to the field of information technology.

Background art

[0002] Graphs are used ever more widely in modern society. The rapid development of social networks, bioinformatics, and traffic navigation has produced large-scale graph data. Managing such data effectively poses many challenges. First, the traditional stand-alone computing model struggles to support it: the storage capacity of a single machine is limited, so loading an entire large graph into memory is difficult, and a single machine's processing capability is also insufficient to support the various complex operations on large graph data. Secondly, the ...
