Method for configuring and querying an HBase multidimensional query system based on a Hilbert curve and an R-tree

A technique for building and querying a multidimensional query system on HBase, based on a Hilbert curve and an R-tree. It addresses poor usability and real-time performance, difficulty in meeting application requirements, and low query efficiency, and achieves fast queries.

Active Publication Date: 2015-03-11
BEIJING INSTITUTE OF TECHNOLOGY

AI Technical Summary

Problems solved by technology

HBase provides indexes only for one-dimensional data, not for multidimensional data.
A multidimensional query therefore has to scan the full table and filter the rows with a Filter, so query efficiency is low.
Ease of use and real-time performance are poor, which makes it difficult to meet the needs of most applications.

Examples

Embodiment 1

[0037] This embodiment uses five IBM X3650 M4 servers as the test platform. The hardware and software configuration of each server is as follows. CPU: 2 × Xeon E5-2620 (6 cores, 12 threads); memory: 32 GB; hard disk: 6 TB, 10,000 rpm, RAID 5; operating system: CentOS 6.4 x86_64; development tools: Eclipse, GNU toolkits (GCC, G++, GDB), Vim, etc.; development languages: Java, C++; Hadoop version: Cloudera hadoop-2.3.0-cdh5.0.1; HBase version: Cloudera hbase-0.96.1.1-cdh5.0.1.

[0038] Suppose the objects to be indexed have N = 2 dimensions, the multidimensional range to be managed is ([0, 800], [0, 400]), and each grid cell measures 200 × 100. There are 6 multidimensional data objects, each identified by a unique ID; the two-dimensional attributes of the point corresponding to each ID are shown in Table 1.
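The grid in [0038] can be illustrated with a minimal sketch. With a range of ([0, 800], [0, 400]) and cells of 200 × 100, the space splits into a 4 × 4 grid, and each cell can be numbered along a Hilbert curve of order 2. The function names `point_to_cell` and `hilbert_index` are hypothetical, the curve orientation follows the common textbook bit-manipulation algorithm and may differ from the orientation used in the patent, and the sample point is invented for illustration (it is not taken from Table 1).

```python
# Sketch only: assumes a 4 x 4 grid over ([0, 800], [0, 400]) with 200 x 100 cells.
# Function names and the sample point are hypothetical, not from the patent.

GRID_SIDE = 4            # cells per axis (800 / 200 = 400 / 100 = 4)
CELL_W, CELL_H = 200, 100


def point_to_cell(px, py):
    """Map a point to its (column, row) grid cell, clamping the upper boundary."""
    cx = min(int(px // CELL_W), GRID_SIDE - 1)
    cy = min(int(py // CELL_H), GRID_SIDE - 1)
    return cx, cy


def hilbert_index(n, x, y):
    """Return the position of cell (x, y) along a Hilbert curve on an n x n grid.

    n must be a power of two. This is the standard iterative conversion.
    """
    d = 0
    s = n // 2
    while s > 0:
        rx = 1 if (x & s) else 0
        ry = 1 if (y & s) else 0
        d += s * s * ((3 * rx) ^ ry)
        # Rotate/flip the quadrant so the sub-curve has the right orientation.
        if ry == 0:
            if rx == 1:
                x = n - 1 - x
                y = n - 1 - y
            x, y = y, x
        s //= 2
    return d


# A hypothetical point (450, 120) falls in cell (2, 1), which is Hilbert ID 13.
cell = point_to_cell(450, 120)
hid = hilbert_index(GRID_SIDE, *cell)
```

A Hilbert ID computed this way could serve as (part of) a one-dimensional row key, since nearby cells tend to receive nearby IDs.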

[0039] Table 1 Two-dimensional attributes of each point object

[0040]

[0041] Suppose the query range of query Q i...

Abstract

The invention discloses a method for configuring and querying an HBase multidimensional query system based on a Hilbert curve and an R-tree. On the one hand, the invention reduces multidimensional data to one-dimensional data by means of a Hilbert curve; on the other hand, it builds an R-tree over the multidimensional data on HBase. A correspondence is established between the Hilbert ID, the identifier on the mapped one-dimensional Hilbert curve, and the ID of the original high-dimensional data. Through the R-tree, a query over high-dimensional data can be efficiently mapped to a set of one-dimensional Hilbert IDs, which enables fast multidimensional queries on HBase.
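The mapping from a multidimensional range query to a set of one-dimensional Hilbert IDs can be sketched as follows. This is not the patent's R-tree lookup: as a simplification, the sketch enumerates the grid cells that overlap the query rectangle by brute force, converts each to a Hilbert ID, and merges consecutive IDs into runs that could each be served by one contiguous scan. The grid parameters come from Embodiment 1; the function names and the example queries are hypothetical.

```python
# Sketch only: brute-force cell enumeration stands in for the R-tree lookup.
# All function names and the example queries are hypothetical.

def hilbert_index(n, x, y):
    """Position of cell (x, y) along a Hilbert curve on an n x n grid (n a power of two)."""
    d = 0
    s = n // 2
    while s > 0:
        rx = 1 if (x & s) else 0
        ry = 1 if (y & s) else 0
        d += s * s * ((3 * rx) ^ ry)
        if ry == 0:
            if rx == 1:
                x = n - 1 - x
                y = n - 1 - y
            x, y = y, x
        s //= 2
    return d


def query_to_hilbert_ids(query, cell_size, n):
    """Return the sorted Hilbert IDs of all grid cells overlapping the query rectangle.

    query is ((x_min, x_max), (y_min, y_max)); the grid origin is assumed to be (0, 0).
    """
    (qx0, qx1), (qy0, qy1) = query
    cw, ch = cell_size

    def clamp(v):
        return max(0, min(n - 1, v))

    cx0, cx1 = clamp(int(qx0 // cw)), clamp(int(qx1 // cw))
    cy0, cy1 = clamp(int(qy0 // ch)), clamp(int(qy1 // ch))
    return sorted(
        hilbert_index(n, cx, cy)
        for cx in range(cx0, cx1 + 1)
        for cy in range(cy0, cy1 + 1)
    )


def merge_runs(ids):
    """Collapse sorted IDs into inclusive (start, end) runs of consecutive values."""
    runs = []
    for h in ids:
        if runs and h == runs[-1][1] + 1:
            runs[-1][1] = h
        else:
            runs.append([h, h])
    return [tuple(r) for r in runs]


# Hypothetical query covering the leftmost column of the 4 x 4 grid.
ids = query_to_hilbert_ids(((0, 199), (0, 399)), (200, 100), 4)
runs = merge_runs(ids)
```

Each candidate cell still has to be checked against the exact query bounds, because a cell that overlaps the rectangle may contain points outside it.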

Description

Technical field

[0001] The invention relates to a method for constructing and querying an HBase multidimensional query system based on a Hilbert curve and an R-tree.

Background art

[0002] In recent years, data management technologies have been continually innovating. Among them, the Hadoop family of open source products has been widely adopted in commercial practice and has almost become the de facto industry-standard platform for big data management. One of the main features of the Hadoop cloud computing platform is that it provides scalable computing power and storage capacity. Interactive data query is also a concern for users and is a key factor in the success of cloud computing. As a NoSQL storage system, HBase is designed specifically for fast random reads and writes of large-scale data. HBase is a distributed, column-oriented open source database and a sub-project of the Apache Hadoop project; it differs from general relational database...

Application Information

Patent Type & Authority: Applications (China)
IPC(8): G06F17/30
CPC: G06F16/221, G06F16/2246, G06F16/2264, G06F16/24
Inventors: 王国仁, 王波涛, 黄山, 祝景阳, 刘增兰
Owner: BEIJING INSTITUTE OF TECHNOLOGY