HBase table conjunctive query optimization method

A table join query technique based on external tables, applied in the field of computer information technology. It addresses the lack of a table join query in HBase and offers improved query efficiency, strong practicability, and easy adoption.

Inactive Publication Date: 2014-04-16
LANGCHAO ELECTRONIC INFORMATION IND CO LTD

AI Technical Summary

Problems solved by technology

Among the commands and APIs that HBase itself provides, only scan is available for querying data, and there is no table join query function. A method for performing join queries on HBase tables is therefore needed.



Examples


Embodiment

[0030] Embodiment: First, deploy a distributed cluster environment by following the steps on the official website. The hardware environment of the cluster is shown in Table 1, and the operating system is CentOS 6.3. Then start the HDFS, MapReduce, HBase, and Hive services on the server cluster in the correct order.

[0031] Table 1. Hardware environment of the cluster

Machine type | CPU model | Number of cores | Memory | Disk capacity | Number of machines
5280m3 | Xeon(R) CPU E5-2620 0 2.00GHz | 24 | 96G | 6050G | 11

[0032] In this example, the format of the source data text is shown in Table 2. The file contains 1 billion records and is 11 GB in size. The file is uploaded to HDFS and then imported into an HBase table. In addition, a small table is created in HBase, and a join query is performed on the two tables. The structure of the HBase table join query is shown in Figure 1. First, establish associations with Hive tables according to the tabl...
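The association between HBase tables and Hive tables described in this embodiment can be sketched as follows. This is a minimal illustration, not the patent's own statements: it builds the HQL as strings in Python, using Hive's `HBaseStorageHandler` and the `hbase.columns.mapping` and `hbase.table.name` properties. All table names, column names, and the column family `cf` are hypothetical, since the source text does not list them.

```python
def hbase_external_table_ddl(hive_table, hbase_table, key_col, columns):
    """Build HQL that maps an existing HBase table onto a Hive external table.

    columns: list of (hive_column, hive_type, "family:qualifier") tuples.
    """
    col_defs = [f"{key_col} string"] + [f"{name} {typ}" for name, typ, _ in columns]
    # ":key" binds the first Hive column to the HBase row key.
    mapping = ",".join([":key"] + [hb for _, _, hb in columns])
    return (
        f"CREATE EXTERNAL TABLE {hive_table} ({', '.join(col_defs)})\n"
        "STORED BY 'org.apache.hadoop.hive.hbase.HBaseStorageHandler'\n"
        f"WITH SERDEPROPERTIES ('hbase.columns.mapping' = '{mapping}')\n"
        f"TBLPROPERTIES ('hbase.table.name' = '{hbase_table}');"
    )


def join_query(big, small, big_fk, small_key, select_cols):
    """Build an HQL equi-join between the two Hive tables."""
    return (
        f"SELECT {', '.join(select_cols)}\n"
        f"FROM {big} b JOIN {small} s ON (b.{big_fk} = s.{small_key});"
    )


# Hypothetical table and column names, for illustration only.
big_ddl = hbase_external_table_ddl(
    "hive_big", "hbase_big", "rowkey",
    [("region_id", "string", "cf:region_id"), ("amount", "string", "cf:amount")],
)
small_ddl = hbase_external_table_ddl(
    "hive_small", "hbase_small", "region_id",
    [("region_name", "string", "cf:region_name")],
)
query = join_query(
    "hive_big", "hive_small", "region_id", "region_id",
    ["b.rowkey", "s.region_name", "b.amount"],
)

if __name__ == "__main__":
    print("\n\n".join([big_ddl, small_ddl, query]))
```

The generated statements could be pasted into the Hive CLI once both HBase tables exist. Using `EXTERNAL` tables means that dropping the Hive table leaves the HBase data in place.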


Abstract

The invention provides a method for optimizing HBase table join (conjunctive) queries. The method comprises the following steps: Hive is combined with HBase, and HBase table join queries are implemented in HQL, the query language provided by Hive; the query is then optimized by setting parameters that influence the underlying MapReduce job, which improves join query performance. Compared with the prior art, the method reduces unnecessary programming effort, allows join query tasks to be processed in parallel, improves query efficiency, is highly practical, and is easy to adopt.
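The parameter-setting step can be illustrated with a short sketch. The text visible here does not say which parameters the method tunes or what values it uses, so the settings below are assumptions: they are real Hive and Hadoop configuration keys, but the values are placeholders, and the query and its table names are hypothetical.

```python
# Assumed settings with placeholder values; the patent's actual choices
# are not given in the visible text.
TUNING = {
    "hive.exec.parallel": "true",      # run independent job stages concurrently
    "hive.auto.convert.join": "true",  # allow a map-side join when one table is small
    "mapred.reduce.tasks": "32",       # fixed number of reducers for the job
}


def with_settings(query, settings):
    """Prefix an HQL query with session-level SET statements."""
    lines = [f"SET {key}={value};" for key, value in settings.items()]
    return "\n".join(lines + [query])


# Hypothetical join over two Hive tables that are mapped onto HBase tables.
script = with_settings(
    "SELECT b.rowkey, s.region_name FROM hive_big b "
    "JOIN hive_small s ON (b.region_id = s.region_id);",
    TUNING,
)

if __name__ == "__main__":
    print(script)
```

Because `SET` applies to the current session only, the same query can be timed under different settings without changing the cluster configuration.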

Description

Technical field

[0001] The invention relates to the field of computer information technology, and in particular to a method for optimizing join queries on HBase tables.

Background

[0002] With the spread of large-scale Internet applications and the massive growth of network data, big data has had a large impact on national governance, corporate decision-making, and personal life. Distributed file systems and distributed databases are technologies suited to big data. Unlike a general relational database, HBase is a distributed database suited to storing unstructured data. As its performance and stability have continued to improve, HBase has been adopted by many large companies. HBase is an open-source, non-relational (NoSQL), scalable distributed database. It is column-oriented and suited to storing very large, loosely structured data sets, and it supports random reads and writes on large data. However, amon...
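To make the missing join function concrete, the sketch below shows the kind of code a client would otherwise have to write by hand: a hash join over the rows returned by two separate scans. It is an illustration only and uses plain Python dictionaries in place of real scan results; the row layout and field names are hypothetical.

```python
def hash_join(big_rows, small_rows, big_fk, small_key):
    """Join two row streams, as a client would have to on top of two scans."""
    lookup = {}
    for row in small_rows:  # build phase: index the small table in memory
        lookup.setdefault(row[small_key], []).append(row)
    for row in big_rows:    # probe phase: stream the large table once
        for match in lookup.get(row[big_fk], []):
            # On a field-name clash, the value from the large table wins.
            yield {**match, **row}


# Stand-ins for the output of two table scans (hypothetical data).
big = [
    {"rowkey": "r1", "region_id": "a", "amount": "10"},
    {"rowkey": "r2", "region_id": "b", "amount": "5"},
    {"rowkey": "r3", "region_id": "z", "amount": "7"},
]
small = [
    {"region_id": "a", "region_name": "North"},
    {"region_id": "b", "region_name": "South"},
]

joined = list(hash_join(big, small, "region_id", "region_id"))

if __name__ == "__main__":
    for row in joined:
        print(row)
```

A single-process join like this runs on one machine and needs the small table to fit in memory, which is part of the motivation for pushing the join into a parallel MapReduce job instead.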


Application Information

Patent Type & Authority: Applications (China)
IPC(8): G06F17/30
CPC: G06F16/2456, G06F16/24532
Inventors: 宗栋瑞, 郭美思
Owner: LANGCHAO ELECTRONIC INFORMATION IND CO LTD