Hadoop-based distributed query processing method for massive Resource Description Framework (RDF) data

A distributed query processing method, applied in the field of distributed query processing, that addresses the low query efficiency of massive RDF data, alleviating pressure and improving query efficiency.

Inactive Publication Date: 2013-05-22
CHONGQING UNIV
2 Cites, 36 Cited by

AI Technical Summary

Problems solved by technology

[0004] The purpose of the present invention is to use the Hadoop platform to solve the problem of low query efficiency of massive RDF data, and propose a distributed query processing



Examples


Embodiment

[0049] The RDF data set used in this embodiment is the standard data set provided by SP2Bench, together with its standard SPARQL query statements. SP2Bench is an open-source public benchmark platform for SPARQL queries. It provides a standard RDF dataset generator and multiple complex standard SPARQL statements. The generator can produce datasets of any size, and the generated data is stored in N3 format files. The SPARQL statements provided by SP2Bench are comprehensive and cover various operators such as OPTIONAL and UNION.
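
To make the input format concrete, the following is a minimal Python sketch of reading such a file into (subject, predicate, object) triples. It is an illustration only, not part of the described method: it assumes one statement per line with full IRIs, blank nodes, or quoted literals, and it does not handle the rest of N3 syntax (prefix directives, `;` and `,` abbreviations, and so on). The names `TERM`, `TRIPLE`, `parse_triples`, and `SAMPLE`, and the example.org IRIs, are hypothetical.

```python
import re

# One term: an IRI in angle brackets, a blank node, or a quoted literal
# with an optional datatype or language tag. (Assumed subset of N3.)
TERM = r'(<[^>]*>|_:\w+|"(?:[^"\\]|\\.)*"(?:\^\^<[^>]*>|@[\w-]+)?)'
TRIPLE = re.compile(r"^\s*" + TERM + r"\s+" + TERM + r"\s+" + TERM + r"\s*\.\s*$")


def parse_triples(text):
    """Return a list of (subject, predicate, object) tuples."""
    triples = []
    for line in text.splitlines():
        match = TRIPLE.match(line)
        if match:  # blank lines, comments, and unsupported syntax are skipped
            triples.append(match.groups())
    return triples


# Hypothetical sample data, not taken from SP2Bench output.
SAMPLE = '''
<http://example.org/article1> <http://purl.org/dc/elements/1.1/title> "Distributed SPARQL" .
<http://example.org/article1> <http://purl.org/dc/elements/1.1/creator> _:author1 .
'''

triples = parse_triples(SAMPLE)
```

Each tuple keeps the terms exactly as written in the file, so a later loading stage can decide how to key them in storage.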

[0050] The Hadoop platform is built from 2 master nodes and 8 slave nodes. The two master nodes serve as the namenode and jobtracker nodes respectively, each configured with a 2-core Intel Pentium 4 CPU, 2 GB of memory, and an 80 GB hard disk; the 8 slave nodes serve as datanode / tasktracker nodes, each configured with a 2-core Intel Pentium 4 CPU, 1.5 GB of memory, and an 80 GB hard disk. Comparing the currently popular Semantic Web fram...



Abstract

The invention discloses a Hadoop-based distributed query processing method for massive Resource Description Framework (RDF) data and belongs to the field of computers. The method mainly comprises the following steps. Step a: RDF data is uploaded to the Hadoop Distributed File System (HDFS), read by the MapReduce framework of the Hadoop platform, and stored in the distributed database HBase. Step b: the SPARQL Protocol and RDF Query Language (SPARQL) query statement submitted by the user is preprocessed; the statement is parsed and its prefix declarations, result variables, and graph pattern clause are extracted. Step c: the prefixed names in the graph pattern clause are restored to their full form, and the restored graph pattern clause is converted into a tree model. Step d: the tree model is parsed; the tree nodes are traversed bottom-up and left to right, a query plan is generated for each node, and the final query plan is sent to the Hadoop platform. Step e: data is read from HBase through the MapReduce framework, the distributed query is executed according to the query plan, and finally the values of the result variables are returned as the query result.
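
Steps b to d can be sketched in Python as follows. This is a rough illustration under stated assumptions, not the patented implementation: the function names (`preprocess`, `restore_prefixes`, `build_tree`, `post_order`), the regex-based parsing, the tuple-based tree, and the `SCAN` / `JOIN` plan labels are all hypothetical, and only a simple SELECT query with a flat list of triple patterns is handled (no OPTIONAL, UNION, or FILTER).

```python
import re


def preprocess(query):
    """Step b (sketch): extract prefix declarations, result variables,
    and the graph pattern clause from a simple SELECT query."""
    prefixes = dict(re.findall(r"PREFIX\s+(\w*):\s*<([^>]*)>", query, re.IGNORECASE))
    select = re.search(r"SELECT\s+(.*?)\s+WHERE", query, re.IGNORECASE | re.DOTALL)
    variables = re.findall(r"\?\w+", select.group(1))
    # Graph pattern: everything between the first '{' and the last '}'.
    pattern = query[query.index("{") + 1 : query.rindex("}")].strip()
    return prefixes, variables, pattern


def restore_prefixes(pattern, prefixes):
    """Step c (sketch): expand prefixed names such as dc:title to full IRIs."""
    def expand(match):
        prefix, local = match.group(1), match.group(2)
        if prefix in prefixes:
            return "<" + prefixes[prefix] + local + ">"
        return match.group(0)  # unknown prefix: leave the text unchanged
    return re.sub(r"(?<![\w<:/])(\w+):(\w+)", expand, pattern)


def build_tree(pattern):
    """Step c (sketch): fold the triple patterns into a left-deep join tree."""
    parts = re.split(r"\s+\.(?:\s+|$)", pattern)
    triples = [p.strip() for p in parts if p.strip()]
    tree = ("TRIPLE", triples[0])
    for triple in triples[1:]:
        tree = ("JOIN", tree, ("TRIPLE", triple))
    return tree


def post_order(node):
    """Step d (sketch): visit children left to right before their parent,
    emitting one plan step per node."""
    if node[0] == "TRIPLE":
        return ["SCAN " + node[1]]
    steps = []
    for child in node[1:]:
        steps.extend(post_order(child))
    steps.append(node[0])
    return steps


# Hypothetical query used only to exercise the sketch.
QUERY = """
PREFIX dc: <http://purl.org/dc/elements/1.1/>
SELECT ?doc ?title
WHERE { ?doc dc:title ?title . ?doc dc:creator ?author . }
"""

prefixes, variables, pattern = preprocess(QUERY)
plan = post_order(build_tree(restore_prefixes(pattern, prefixes)))
```

Here `plan` lists one scan per triple pattern followed by the join that combines them, which mirrors the bottom-up, left-to-right order named in step d. How each plan step maps onto a MapReduce job over HBase is not specified in the abstract and is left out of the sketch.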

Description

Technical field

[0001] The invention belongs to the technical field of computers, and in particular relates to a Hadoop-based distributed query processing method for massive RDF data sets.

Background

[0002] At present, the Semantic Web is developing rapidly and the Resource Description Framework (RDF) is widely used. The volume of data described in RDF is growing exponentially, and how to store and query massive RDF data with high performance and easy scalability has become an urgent problem. Traditional Semantic Web tools such as Jena, Sesame, and RDF-3X process data centrally on a single machine and use a relational database as the storage system. Faced with massive RDF data, their storage capacity and query efficiency are severely limited.

[0003] Cloud computing uses distributed technology to provide a set of high-performance, easily scalable distributed storage and computing systems, and has become an optimal solution f...


Application Information

IPC(8): G06F17/30
Inventor 张小洪, 杨丹, 李珩, 谢娟, 成正斌, 洪明坚, 葛永新, 杨梦宁, 徐玲, 胡海波
Owner CHONGQING UNIV