Subgraph matching query method

A subgraph matching query method, applied in the query field, addresses the low query efficiency of existing methods and achieves lower communication cost, reduced communication traffic, and reduced memory usage.

Inactive Publication Date: 2015-03-04
BEIJING INSTITUTE OF TECHNOLOGY

AI Technical Summary

Problems solved by technology

[0014] In order to solve the problem of low query efficiency of existing distributed subgraph matching methods, the...



Examples


Example 1

[0088] Take Figure 1 as an example: the input is the query graph in Figure 1(a), and the average node degree of the data graph in Figure 1(b) is 2.

[0089] [1] Enumerate every node as a candidate root node of the tree;

[0090] Taking Figure 1(a) as an example, enumerate u1 through u7 as the root node of the tree.

[0091] [2] Use breadth-first traversal to obtain the query tree corresponding to each root node;

[0092] See Method 1.1 for details.

[0093] [3] For all candidate query trees generated, select the one with the smallest value as the final query tree;

[0094] Take Figure 1(b) and (c) as an example. The average node degree of the data graph in Figure 1(b) is 2; according to formulas (1) and (2), the query tree corresponding to Figure 1(a) is the one shown in Figure 1(c).
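Steps [1] to [3] amount to a small selection loop, sketched below in Python. This is a minimal sketch rather than the patent's implementation: `build_tree` and `cost` are hypothetical callables supplied by the caller, because Method 1.1 and formulas (1) and (2) are not reproduced in this section.

```python
from typing import Any, Callable, Dict, Hashable, List, Tuple

Node = Hashable
Graph = Dict[Node, List[Node]]  # adjacency lists of the query graph


def select_query_tree(
    query_graph: Graph,
    build_tree: Callable[[Graph, Node], Any],  # hypothetical stand-in for Method 1.1
    cost: Callable[[Any], float],              # hypothetical stand-in for formulas (1) and (2)
) -> Tuple[Node, Any]:
    """Return (root, tree) for the candidate tree with the smallest cost value."""
    best = None
    for root in query_graph:                   # step [1]: every node is a candidate root
        tree = build_tree(query_graph, root)   # step [2]: one candidate tree per root
        value = cost(tree)
        if best is None or value < best[1]:    # step [3]: keep the smallest value
            best = (root, value, tree)
    if best is None:
        raise ValueError("query graph has no nodes")
    return best[0], best[2]
```

When several candidates share the smallest value, this sketch keeps the first one encountered; the source does not say how ties are broken.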

[0095] Method 1.1: Given a root node, use breadth-first traversal to generate the query tree and output it

[0096] Input: query graph, root node

[0097] Output: query tree

[0098] [1] Use the access set ...

Example 1.1

[0106] Example 1.1: Take u1 in Figure 1(a) as the root node, for example:

[0107] [1] Use the edge visited set evis to record whether the edge (u, v) is in the tree, use the node visited set nvis to record whether a node has been visited, and use the first-in-first-out queue q as the queue for the breadth-first traversal;

[0108] In the initialization phase, evis, nvis, and q are all empty.

[0109] [2] Add the root node of the query tree to the queue q;

[0110] Add u1 to queue q.

[0111] [3] Dequeue the head node u and enumerate its adjacent nodes v in turn;

[0112] Taking u1 as an example, visit u1's adjacent nodes u2, u3, u4 in turn.

[0113] [4] If the edge (u, v) is already in the tree, skip the edge (u, v);

[0114] Taking u5 as an example: the adjacent node of u5 is u2. The edge (u2, u5) was already added to the tree when the adjacent nodes of u2 were visited, so it is skipped.

[0115] [5] If the edge (u, v...
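The traversal described in steps [1] to [4] can be sketched in Python as follows. This is a minimal sketch under stated assumptions, not the patent's Method 1.1. Step [5] is truncated in this section, so the handling of an edge whose far end was already visited is an assumption: here such an edge is set aside in a hypothetical `non_tree_edges` list. As a result, `evis` in this sketch holds every edge already handled, not only tree edges. The function name and the dictionary representation of the tree are also illustrative.

```python
from collections import deque
from typing import Dict, Hashable, List, Tuple

Node = Hashable
Graph = Dict[Node, List[Node]]  # undirected query graph as adjacency lists


def build_query_tree(query_graph: Graph, root: Node) -> Tuple[Dict[Node, List[Node]], List[Tuple[Node, Node]]]:
    """Breadth-first construction of a query tree rooted at `root`."""
    evis = set()           # edges already handled, stored as unordered pairs
    nvis = set()           # nodes already visited
    q = deque()            # FIFO queue for the breadth-first traversal

    q.append(root)         # step [2]: enqueue the root
    nvis.add(root)
    children: Dict[Node, List[Node]] = {root: []}
    non_tree_edges: List[Tuple[Node, Node]] = []

    while q:
        u = q.popleft()                    # step [3]: dequeue the head node
        for v in query_graph[u]:           # enumerate adjacent nodes in order
            edge = frozenset((u, v))
            if edge in evis:               # step [4]: edge already handled, skip it
                continue
            evis.add(edge)
            if v in nvis:
                # ASSUMPTION: the source's step [5] is truncated; this sketch
                # only records the edge instead of placing it in the tree.
                non_tree_edges.append((u, v))
                continue
            nvis.add(v)
            children[u].append(v)          # (u, v) becomes a tree edge
            children[v] = []
            q.append(v)
    return children, non_tree_edges


# Toy graph (not the full Figure 1(a)): u1 is adjacent to u2, u3, u4; u5 only to u2.
toy = {"u1": ["u2", "u3", "u4"], "u2": ["u1", "u5"], "u3": ["u1"], "u4": ["u1"], "u5": ["u2"]}
tree, extra = build_query_tree(toy, "u1")
```

On the toy graph, the edge (u2, u5) enters the tree while u2's neighbours are enumerated, so it is skipped when u5 is dequeued later, which mirrors paragraph [0114].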

Example 2

[0141] Take Figure 1(b) and (c) as an example.

[0142] [1] Load the data graph into a distributed cluster, with each machine storing one part of the graph (a subgraph). Adjacent nodes located on different machines communicate over the network; all other nodes communicate through memory;

[0143] Assume the cluster has three machines. The nodes of Figure 1(b) are divided into three groups, {v1, v2, v3, v4}, {v5, v6, v7, v8}, and {v9, v10, v11, v12}, which are loaded onto the three machines respectively. Communication from v1 to v2 uses memory, and communication from v1 to v5 uses the network.
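The placement rule in [0142] and [0143] can be illustrated with a short Python sketch. The helper names `build_placement` and `channel` are hypothetical, and the sketch only decides which kind of communication applies; it does not perform any actual messaging.

```python
from typing import Dict, List


def build_placement(groups: List[List[str]]) -> Dict[str, int]:
    """Map each data graph node to the index of the machine that stores it."""
    return {node: machine for machine, group in enumerate(groups) for node in group}


def channel(placement: Dict[str, int], src: str, dst: str) -> str:
    """Return 'memory' for nodes on the same machine, 'network' otherwise."""
    return "memory" if placement[src] == placement[dst] else "network"


# The three groups from paragraph [0143].
groups = [
    ["v1", "v2", "v3", "v4"],
    ["v5", "v6", "v7", "v8"],
    ["v9", "v10", "v11", "v12"],
]
placement = build_placement(groups)
v1_to_v2 = channel(placement, "v1", "v2")  # same machine
v1_to_v5 = channel(placement, "v1", "v5")  # different machines
```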

[0144] [2] Obtain the label set of the nodes at layer (query tree height - 1);

[0145] In Figure 1(c), the obtained label set is {B, D, C}.

[0146] [3] Select the data nodes that meet the label set as the computing node set;

[0147] In Figure 1(b), the computing node set is v2, v3, v4, corresponding to {B, D, C}.
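Steps [2] and [3] can be sketched in Python as below. This is an illustrative sketch, not the patent's code: the function names are hypothetical, layers are assumed to be numbered from 1 at the root (matching the root height of 1 used in step [4]), and the toy tree and labels are invented for the example because Figure 1 is not reproduced here.

```python
from typing import Dict, List, Set


def layer_label_set(children: Dict[str, List[str]], labels: Dict[str, str], root: str) -> Set[str]:
    """Labels of the query tree nodes at layer (height - 1), with the root at layer 1."""
    layers = [[root]]
    while True:
        nxt = [c for u in layers[-1] for c in children.get(u, [])]
        if not nxt:
            break
        layers.append(nxt)
    height = len(layers)
    if height < 2:
        raise ValueError("query tree needs at least two layers")
    return {labels[u] for u in layers[height - 2]}  # list index height-2 is layer height-1


def computing_nodes(data_labels: Dict[str, str], label_set: Set[str]) -> List[str]:
    """Data graph nodes whose label belongs to the label set."""
    return sorted(v for v, lab in data_labels.items() if lab in label_set)


# Hypothetical toy query tree of height 3 (not the actual Figure 1(c)).
q_children = {"q1": ["q2", "q3", "q4"], "q2": ["q5"], "q3": [], "q4": [], "q5": []}
q_labels = {"q1": "A", "q2": "B", "q3": "D", "q4": "C", "q5": "E"}
# Hypothetical toy data graph labels (not the actual Figure 1(b)).
d_labels = {"v1": "A", "v2": "B", "v3": "D", "v4": "C", "v5": "E"}

label_set = layer_label_set(q_children, q_labels, "q1")
nodes = computing_nodes(d_labels, label_set)
```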

[0148] [4] Set i = 1 and N = query tree height, where the root node has height 1;

[0149] i=...


Abstract

The invention relates to a subgraph matching query method and belongs to the field of databases and distributed graph analysis and processing. The method comprises the following steps: first, the query graph is converted into a query tree; second, from the leaf nodes to the root node, data graph nodes are matched layer by layer against the query tree to obtain a matching count, and each count is sent to the adjacent nodes until the root node is matched; third, from the root node to the leaf nodes, matching requirements are sent layer by layer among the data graph nodes according to the query tree until they reach the leaf nodes; finally, from the leaf nodes to the root node, the data graph nodes send subtree matching results back to the source of each requirement according to the query tree. Compared with existing distributed methods, the method greatly reduces communication traffic and computational cost.

Description

Technical field

[0001] The invention relates to a query method, in particular to a subgraph matching query method for processing large-scale graph data in a distributed system, and belongs to the field of databases and distributed graph analysis and processing.

Background

[0002] Graph models have important applications in many fields, such as social networks, Web networks, planning problems, and biological information. With the wide application of computers and networks, the volume of graph data has grown exponentially. In 2013, Facebook estimated that the amount of newly generated data per day had reached 500 TB. At the same time, the complexity of most graph processing methods is usually much higher than O(n). For example, Floyd's algorithm, the most common method for the shortest-path problem, has a computational complexity of O(n^3). Under the large-scale data volume, the computational complexity of single-machine processing i...


Application Information

IPC(8): G06F17/30
CPC: G06F16/2246
Inventor: 金福生杨艺峰颜震薛野韩翔宇
Owner: BEIJING INSTITUTE OF TECHNOLOGY