
Method and system for statistically querying HBase based on analysis performed by Hive on HFile

A technique for statistically querying HBase by having Hive parse HFiles. It addresses problems such as difficult machine switchover, long operation waiting times, and storage that is not separated from computation, and it improves response efficiency.

Status: Inactive | Publication Date: 2015-12-02
北京思特奇信息技术股份有限公司
Cites: 2 | Cited by: 15

AI Technical Summary

Problems solved by technology

[0006] The three existing solutions above have defects in three areas: tight coupling of the system architecture, high concurrency, and data consistency.
Tight coupling: the system is difficult to separate and cut over.
High concurrency: when Hive is used for statistics, the HBase service nodes carry a heavy load and operations wait a long time.



Examples


Embodiment Construction

[0031] The principles and features of the present invention are described below in conjunction with the accompanying drawings, and the examples given are only used to explain the present invention, and are not intended to limit the scope of the present invention.

[0032] Figure 1 is a flow chart of the method for statistically querying HBase based on Hive parsing of HFile according to the present invention.

[0033] As shown in Figure 1, a method for statistically querying HBase based on Hive parsing of HFile includes the following steps:

[0034] Step S1, creating a detailed list storage query table;

[0035] Step S2, based on the detailed list storage query table, the original detailed list is loaded into HBase and stored as an HFile in the HDFS of HBase;

[0036] Step S3, Hive performs associative statistical queries based on the HFile.

[0037] It also includes direct access to HBase for detailed query.
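The flow of steps S1 to S3 plus the direct detail query can be illustrated with a small in-memory model. This is only a sketch under stated assumptions: `ToyDetailStore`, `statistics_from_hfiles`, the row keys, and the `user`/`fee` fields are all hypothetical names, and nothing here calls the real HBase or Hive APIs. It shows the separation the method relies on: detail lookups go through the store, while statistics scan the persisted files without involving the store's serving path.

```python
from collections import defaultdict
from dataclasses import dataclass, field


@dataclass
class ToyDetailStore:
    """Toy stand-in for an HBase table whose data is persisted as HFile-like files."""
    memstore: dict = field(default_factory=dict)  # recent writes, keyed by row key
    hfiles: list = field(default_factory=list)    # immutable, sorted (row_key, record) lists

    def put(self, row_key, record):
        # Step S2 (write side): load an original detail record into the table.
        self.memstore[row_key] = record

    def flush(self):
        # Step S2 (storage side): persist buffered rows as one sorted, immutable file.
        if self.memstore:
            self.hfiles.append(sorted(self.memstore.items()))
            self.memstore = {}

    def get(self, row_key):
        # Detail query ([0037]): served by the store itself, newest data first.
        if row_key in self.memstore:
            return self.memstore[row_key]
        for hfile in reversed(self.hfiles):
            for key, record in hfile:
                if key == row_key:
                    return record
        return None


def statistics_from_hfiles(hfiles, group_field, value_field):
    """Step S3: aggregate by scanning the files only, never calling the store."""
    latest = {}
    for hfile in hfiles:  # older files first, so newer versions of a row win
        for key, record in hfile:
            latest[key] = record
    totals = defaultdict(int)
    for record in latest.values():
        totals[record[group_field]] += record[value_field]
    return dict(totals)


store = ToyDetailStore()  # Step S1: create the detail storage/query table
store.put("u1#001", {"user": "u1", "fee": 30})
store.put("u2#001", {"user": "u2", "fee": 12})
store.flush()
store.put("u1#002", {"user": "u1", "fee": 5})
store.flush()

detail = store.get("u2#001")                                   # direct detail query
stats = statistics_from_hfiles(store.hfiles, "user", "fee")    # file-based statistics
print(detail)
print(stats)
```

In this toy model the statistics function receives only the list of files, which mirrors the idea that heavy aggregation does not add load to the nodes serving detail reads.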

[0038] As shown in Figure 2, the specific process is: ...


Abstract

The present invention relates to a method and system for statistically querying HBase based on analysis performed by Hive on HFile. The method comprises the following steps: step S1, creating a detailed-list storage and query table; step S2, adding an original detailed list to HBase based on the detailed-list storage and query table, and storing the original detailed list as an HFile in the HDFS of HBase; and step S3, Hive performing associative statistical queries based on the HFile. The method also comprises directly accessing HBase to query the detailed list. With the method and system provided by the present invention, concurrent data reads can be completed efficiently. Distributed computing is implemented with MapReduce, so computing tasks are executed in a distributed environment and their results are output; the computations are independent of one another and can be scaled dynamically. Response efficiency for highly concurrent reads on big data is greatly improved, and no backup copy of the data is kept, which meets the requirement of data consistency.
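The claim that the computations are independent of one another can be illustrated with a map/reduce-style aggregation. This is a minimal sketch, not Hadoop MapReduce itself: a thread pool stands in for the cluster, and `map_split`, `reduce_partials`, and the sample `(user, fee)` records are hypothetical. Each split is processed without reference to any other split, so adding workers or splits does not change the logic.

```python
from collections import Counter
from concurrent.futures import ThreadPoolExecutor


def map_split(split):
    """Map phase: aggregate one split on its own, with no shared state."""
    partial = Counter()
    for user, fee in split:
        partial[user] += fee
    return partial


def reduce_partials(partials):
    """Reduce phase: merge the per-split partial sums into one result."""
    total = Counter()
    for partial in partials:
        total.update(partial)  # Counter.update adds counts key by key
    return dict(total)


# Hypothetical detail records, already divided into independent splits.
splits = [
    [("u1", 30), ("u2", 12)],
    [("u1", 5)],
    [("u3", 7), ("u2", 1)],
]

# The thread pool is an assumption standing in for distributed workers.
with ThreadPoolExecutor(max_workers=3) as pool:
    partials = list(pool.map(map_split, splits))

result = reduce_partials(partials)
print(result)
```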

Description

Technical field

[0001] The invention relates to a method and system for statistically querying HBase based on Hive parsing of HFile.

Background technique

[0002] There are three schemes for querying HBase in the prior art.

[0003] Solution 1: Direct access to HBase to read data. The storage application issues a read command. The HBase client accepts the read request and, based on the data to be read and the currently cached region information, determines which RegionServer should handle it. If the region is not cached, the client asks ZooKeeper for the RegionServer. The client then sends the read request to that RegionServer, which first looks in its data cache and returns directly on a hit. Otherwise, the data is read from the distributed file system, cached, and returned to the client.

[0004] Solution 2: Based on distributed computing, use Hive to associate HBase with statistical que...
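The read path described for Solution 1 in [0003] can be traced with a toy model. This is a sketch under stated assumptions, following the paragraph's simplified description rather than the real HBase client: `ToyRegionServer`, `ToyCoordinator` (standing in for the ZooKeeper lookup), `ToyClient`, and the `region_of` function are all hypothetical.

```python
class ToyRegionServer:
    def __init__(self, name, files):
        self.name = name
        self.files = files       # stands in for data in the distributed file system
        self.data_cache = {}     # checked before the file system
        self.file_reads = 0

    def read(self, row_key):
        if row_key in self.data_cache:       # cache hit: return directly
            return self.data_cache[row_key]
        self.file_reads += 1                 # cache miss: read from the file system
        value = self.files.get(row_key)
        self.data_cache[row_key] = value     # cache, then return to the client
        return value


class ToyCoordinator:
    """Hypothetical stand-in for asking ZooKeeper which server holds a region."""

    def __init__(self, servers_by_region):
        self.servers_by_region = servers_by_region
        self.lookups = 0

    def locate(self, region):
        self.lookups += 1
        return self.servers_by_region[region]


class ToyClient:
    def __init__(self, coordinator, region_of):
        self.coordinator = coordinator
        self.region_of = region_of   # assumed mapping: row key -> region name
        self.region_cache = {}       # regions the client has already located

    def get(self, row_key):
        region = self.region_of(row_key)
        server = self.region_cache.get(region)
        if server is None:                           # region not cached on the client
            server = self.coordinator.locate(region)
            self.region_cache[region] = server
        return server.read(row_key)


server = ToyRegionServer("rs1", {"u1#001": {"fee": 30}})
coordinator = ToyCoordinator({"region-u1": server})
client = ToyClient(coordinator, region_of=lambda key: "region-" + key.split("#")[0])

first = client.get("u1#001")    # locates the region, then reads from the file system
second = client.get("u1#001")   # served from the client's region cache and the data cache
print(first, coordinator.lookups, server.file_reads)
```

Every read in this path passes through a RegionServer, which is why heavy statistical workloads on the same path add load to the serving nodes.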


Application Information

IPC(8): G06F17/30
CPC: G06F16/182, G06F16/2471
Inventor: 牛晓亮
Owner: 北京思特奇信息技术股份有限公司