Distributed keyword query method based on RDF graph

A distributed keyword query method, applied in the field of information retrieval, that addresses two problems of existing graph partitioning approaches: data imbalance between subgraphs caused by differing numbers of edges per vertex, and the inability of traditional algorithms to keep up with explosive data growth.

Inactive Publication Date: 2018-12-07
ZHENGZHOU UNIV

AI Technical Summary

Problems solved by technology

The subgraphs produced by the existing partitioning method contain the same (or a similar) number of vertices, but the number of edges associated with each vertex differs, which results in data imbalance between subgraphs.
At the same time, as graphs continue to grow, the scale limitations of traditional algorithms (such as KL, DFEP, and VSEP) leave them unable to meet the demands of explosive data growth.
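
The imbalance described above can be illustrated with a small sketch. The toy graph, the `partition_load` helper, and the rule of charging an edge to every partition it touches are assumptions made for illustration only; they are not taken from the patent.

```python
from collections import defaultdict

def partition_load(edges, assignment):
    """Count vertices and incident edges per partition.

    edges: iterable of (u, v) pairs.
    assignment: dict mapping each vertex to a partition id.
    An edge is charged once to every partition that holds one of its endpoints.
    """
    vertices = defaultdict(int)
    incident = defaultdict(int)
    for part in assignment.values():
        vertices[part] += 1
    for u, v in edges:
        for part in {assignment[u], assignment[v]}:
            incident[part] += 1
    return dict(vertices), dict(incident)

# Hypothetical toy graph: a hub "a" with several edges, plus a sparse pair.
edges = [("a", "b"), ("a", "c"), ("a", "d"), ("b", "c"), ("e", "f")]
assignment = {"a": 0, "b": 0, "c": 0, "d": 1, "e": 1, "f": 1}

vertex_counts, edge_counts = partition_load(edges, assignment)
print(vertex_counts)  # both partitions hold three vertices
print(edge_counts)    # but partition 0 touches twice as many edges
```

Both partitions hold three vertices, yet partition 0 carries four incident edges against two for partition 1, which is the kind of imbalance a vertex-count criterion alone does not detect.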

Examples


Embodiment

[0055] Embodiment: this patent uses the real data set SwetoDblp (http://lsdis.cs.uga.edu/Projects/SemDis/Swetodblp), which describes published articles in the field of computer science. The data set contains 681636 triples in total and occupies 53.6 MB of storage; the numbers of edges and vertices are 1026375 and 373219, respectively.

[0056] The flow chart of the distributed keyword query method based on an RDF graph is shown in Figure 1. As the figure shows, the method comprises three main stages:

[0057] The first stage: the RDF graph in Figure 2 is transformed into an RDF sentence graph, as shown in Figure 3. The unweighted, directed RDF graph becomes an undirected RDF sentence graph with weighted vertices; the number on each vertex of the RDF sentence graph is the number of RDF triples contained in that sentence.
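
A minimal sketch of such a transformation follows. The excerpt does not define how triples are grouped into sentences, so the sketch assumes the common convention that triples sharing a blank node form one RDF sentence, and that two sentences are linked when they share a non-blank term. The `_:` blank-node prefix, the function names, and the sample triples are all hypothetical.

```python
from itertools import combinations

def is_blank(term):
    # Assumption: blank nodes carry a "_:" prefix, as in N-Triples.
    return term.startswith("_:")

def build_sentence_graph(triples):
    """Group triples into sentences and link sentences that share a term.

    Returns (weights, edges): weights maps a sentence id to its triple
    count, edges is a set of frozensets of two sentence ids.
    """
    parent = list(range(len(triples)))  # union-find over triple indices

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    # Assumed grouping rule: triples sharing a blank node join one sentence.
    first_seen = {}
    for idx, (s, _, o) in enumerate(triples):
        for term in (s, o):
            if is_blank(term):
                if term in first_seen:
                    parent[find(idx)] = find(first_seen[term])
                else:
                    first_seen[term] = idx

    weights, terms = {}, {}
    for idx, (s, _, o) in enumerate(triples):
        root = find(idx)
        weights[root] = weights.get(root, 0) + 1  # vertex weight = triple count
        terms.setdefault(root, set()).update(t for t in (s, o) if not is_blank(t))

    # Assumed linking rule: an undirected edge joins sentences sharing a term.
    edges = {
        frozenset((a, b))
        for a, b in combinations(weights, 2)
        if terms[a] & terms[b]
    }
    return weights, edges

triples = [
    ("paper1", "author", "_:b1"),
    ("_:b1", "name", "Alice"),
    ("paper1", "year", "2018"),
    ("paper2", "cites", "paper3"),
]
weights, edges = build_sentence_graph(triples)
print(weights)
print(edges)
```

In this toy input the first two triples merge into one sentence of weight 2 because they share the blank node `_:b1`, and that sentence is linked to the third triple's sentence through the shared term `paper1`.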

[0058] The second stage: edge segment...


Abstract

The invention provides a distributed keyword query method based on an RDF graph and belongs to the field of information retrieval. The method first converts the RDF data graph into an RDF sentence graph. Second, using a conditional depth-first algorithm and a simulated annealing algorithm, it segments the RDF sentence graph according to two basic principles: minimizing the edge cut set and balancing the data between the resulting subgraphs. Finally, it refines the segmented RDF sentence graph back into the RDF data graph, obtains a vertex cut set of the RDF graph, and uses a reverse search algorithm together with the Hadoop distributed computing framework to answer keyword queries efficiently. While preserving the atomicity and semantic integrity of the RDF data, the method removes the limitation that traditional algorithms place on segmentation efficiency for large-scale data sets and greatly improves the efficiency of keyword queries.
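
The two segmentation principles named in the abstract (a small edge cut and balanced subgraphs) can be combined into a single cost and minimized with simulated annealing. The sketch below is a generic illustration under stated assumptions, not the patent's procedure: the cost formula, the `balance_penalty` weight, the cooling schedule, and the toy graph are all hypothetical, and the conditional depth-first step is not modeled.

```python
import math
import random

def cost(edges, weights, assignment, k, balance_penalty=1.0):
    """Edge cut plus a penalty on the spread of partition weights (assumed form)."""
    cut = sum(1 for u, v in edges if assignment[u] != assignment[v])
    loads = [0] * k
    for v, part in assignment.items():
        loads[part] += weights[v]
    return cut + balance_penalty * (max(loads) - min(loads))

def anneal_partition(edges, weights, k=2, steps=2000, t0=2.0, cooling=0.995, seed=0):
    rng = random.Random(seed)
    vertices = sorted(weights)
    current = {v: i % k for i, v in enumerate(vertices)}  # round-robin start
    current_cost = cost(edges, weights, current, k)
    best, best_cost = dict(current), current_cost
    temperature = t0
    for _ in range(steps):
        v = rng.choice(vertices)
        old = current[v]
        current[v] = rng.randrange(k)  # propose moving one vertex
        new_cost = cost(edges, weights, current, k)
        delta = new_cost - current_cost
        # Accept improvements always, worse moves with Boltzmann probability.
        if delta <= 0 or rng.random() < math.exp(-delta / temperature):
            current_cost = new_cost
            if new_cost < best_cost:
                best, best_cost = dict(current), new_cost
        else:
            current[v] = old  # reject the move
        temperature = max(temperature * cooling, 1e-6)
    return best, best_cost

# Hypothetical sentence graph: two triangles joined by a single edge.
weights = {v: 1 for v in range(6)}
edges = [(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (3, 5), (2, 3)]
initial_cost = cost(edges, weights, {v: v % 2 for v in weights}, 2)
best, best_cost = anneal_partition(edges, weights)
print(initial_cost, best_cost, best)
```

Because the best assignment seen so far is tracked separately from the current one, the returned cost is never worse than the round-robin starting point, even when the search accepts uphill moves along the way.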

Description

Technical field

[0001] The invention relates to a distributed keyword query method based on an RDF graph, belonging to the field of information retrieval.

Background

[0002] The graph is a ubiquitous data structure widely used in various fields. Keyword query over RDF graph structures is a current research hotspot; it allows users to obtain query results efficiently without using a complex structured query language. Most current query algorithms are implemented in a centralized environment, that is, keyword queries can only be processed on a single computer. As the scale of RDF graphs continues to expand, performing keyword queries on a single machine becomes very time-consuming. Graph processing and storage in a distributed environment therefore has important theoretical value and practical significance.

[0003] At present, the commonly used keyword query technique represents RDF data as a labeled directed graph, the vertices in the...
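
To make the labeled directed graph view of keyword query concrete, the sketch below runs a simple single-machine backward expansion: starting from the vertices that match each keyword, it follows edges in reverse to find vertices that can reach a match for every keyword. This is a generic illustration and not the patent's reverse search or its Hadoop implementation; substring matching on vertex labels, the hop-count score, and the sample triples are assumptions.

```python
from collections import deque

def backward_search(triples, keywords):
    """Return {root: total hops} for vertices reaching a match of every keyword."""
    reverse = {}  # object -> set of subjects pointing at it
    vertices = set()
    for s, _, o in triples:
        reverse.setdefault(o, set()).add(s)
        vertices.update((s, o))

    distances = []
    for kw in keywords:
        # Assumption: a vertex matches when its label contains the keyword.
        sources = [v for v in vertices if kw.lower() in v.lower()]
        dist = {v: 0 for v in sources}
        queue = deque(sources)
        while queue:  # breadth-first search along reversed edges
            node = queue.popleft()
            for pred in reverse.get(node, ()):
                if pred not in dist:
                    dist[pred] = dist[node] + 1
                    queue.append(pred)
        distances.append(dist)

    if not distances:
        return {}
    roots = set(distances[0]).intersection(*distances[1:])
    return {r: sum(d[r] for d in distances) for r in roots}

triples = [
    ("paper1", "author", "Alice"),
    ("paper1", "topic", "RDF graph"),
    ("paper2", "author", "Bob"),
    ("paper2", "cites", "paper1"),
]
result = backward_search(triples, ["alice", "rdf"])
print(result)
print(min(result, key=result.get))
```

Here `paper1` reaches both keyword matches in one hop each, so it scores lower (better) than `paper2`, which reaches them only through its citation of `paper1`.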


Application Information

Patent Type & Authority: Applications (China)
IPC(8): G06F17/30
Inventor: 郑志蕴, 丁阳, 李钝, 张行进, 王振飞
Owner: ZHENGZHOU UNIV