Hbase-based data hash processing method and device

A processing method and device, applied in the fields of electrical digital data processing, special data processing applications, instruments, etc., that addresses the problems of high node load and low access efficiency and achieves balanced access pressure and improved access efficiency.

Inactive Publication Date: 2014-12-31
FUJIAN NEWLAND SOFTWARE ENGINEERING CO LTD

AI Technical Summary

Problems solved by technology

[0005] The technical problem to be solved by the present invention is to provide an HBase-based data hash processing method and device that effectively solve the prior-art problem in which frequently accessed data is concentrated on one node, leaving that node heavily loaded and access efficiency low.




Detailed Description of Embodiments

[0016] In order to describe the technical content, achieved goals and effects of the present invention in detail, the following descriptions will be made in conjunction with the embodiments and accompanying drawings.

[0017] The key idea of the present invention is to generate a random seed from the KEY field of each data record, take that seed modulo the number of physical storage nodes, and distribute the data to different nodes according to the result, so that access pressure is shared evenly and access efficiency is improved.
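The seed-and-modulo step in [0017] can be sketched as follows. This is a minimal illustration, not the patented implementation: the visible text does not say how the seed is derived from the KEY field, so the use of an MD5 digest here is an assumption, and the names `seed_from_key` and `node_number` are hypothetical.

```python
import hashlib


def seed_from_key(key: str) -> int:
    """Derive a deterministic integer seed from a record's KEY field.

    Assumption: MD5 is used only as a stable, well-mixed hash. Python's
    built-in hash() is avoided because it is salted per process for strings.
    """
    digest = hashlib.md5(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big")


def node_number(key: str, node_count: int) -> int:
    """Map a KEY to a node number in the range [0, node_count)."""
    if node_count <= 0:
        raise ValueError("node_count must be positive")
    return seed_from_key(key) % node_count


if __name__ == "__main__":
    # Hypothetical keys; the same key always maps to the same node.
    for key in ["13800000001", "13800000002", "13800000003"]:
        print(key, "->", node_number(key, 4))
```

Because the seed depends only on the key, a reader can recompute the node number for any key at query time without a lookup table.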

[0018] Referring to Figure 1, which is the functional framework diagram of the HBase-based data distribution device, the details of each layer of the device are as follows:

[0019] Acquisition layer: responsible for loading interface data into HDFS (Hadoop Distributed File System) and normalizing it into HFILE (the storage format of KeyValue data in HBase);

[0020] Processing layer: responsible for the definition of HBASE tab...



Abstract

The invention discloses an HBase-based data hash processing method. The method includes: uploading an HBase interface file to HDFS (Hadoop Distributed File System); converting the interface file into the HFILE storage format; acquiring the KEY field of each data record in the normalized interface file; generating a random seed from each KEY field and taking the seed modulo the number of HBase physical storage nodes to obtain a node number; and uploading the data corresponding to each KEY field to the physical storage node with that node number. The invention further discloses an HBase-based data hash processing device. With this arrangement, the loads of the nodes are balanced and node access efficiency is improved.
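The last three steps of the abstract (read the KEY field, compute a node number, route the record to that node) can be sketched as an in-memory grouping. This is an illustration under stated assumptions: the record layout, the `"KEY"` field name, the `distribute` helper, and the MD5-based seed are all hypothetical, and the actual HDFS upload and HFILE conversion are not modeled.

```python
import hashlib
from collections import defaultdict


def node_number(key: str, node_count: int) -> int:
    # Assumption: the seed is taken from an MD5 digest of the KEY field.
    seed = int.from_bytes(hashlib.md5(key.encode("utf-8")).digest()[:8], "big")
    return seed % node_count


def distribute(records, node_count, key_field="KEY"):
    """Group records by the node number computed from their KEY field."""
    buckets = defaultdict(list)
    for record in records:
        buckets[node_number(record[key_field], node_count)].append(record)
    return dict(buckets)


# Hypothetical records standing in for rows of a normalized interface file.
records = [{"KEY": f"user{i:05d}", "value": i} for i in range(1000)]
buckets = distribute(records, node_count=4)

for node in sorted(buckets):
    print(f"node {node}: {len(buckets[node])} records")
```

Each bucket stands in for the batch that would be uploaded to one physical storage node; printing the bucket sizes gives a quick check on how evenly the keys spread.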

Description

Technical field

[0001] The invention relates to the field of big data processing, in particular to an HBase-based data hash processing method and device.

Background

[0002] In the era of the mobile Internet, the volume of user behavior data in the mobile communication industry has surged. Big data technologies such as HBase are used for data analysis and data access. In practice, however, a particular batch of data is often accessed by many requests in parallel. This is mainly because, when data is written with monotonically increasing or sequential keys, HBase gathers the frequently accessed data into a single region (physical node). Access is then concentrated on that region, and the performance of the cluster cannot be fully exploited.

[0003] There are two ways to implement an HBase query: one is the get method, which obtains the only record matching a specified RowKey, and the other is the scan method, which obtains a ...
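The hot-spot effect described in the background can be illustrated with a small simulation. It is a sketch only: the split points, the date-based key format, and both placement functions are hypothetical and do not reproduce HBase's actual region assignment.

```python
import bisect
import hashlib
from collections import Counter

# Hypothetical region boundaries: three split points give four key ranges.
SPLIT_POINTS = ["20141201", "20141211", "20141221"]


def region_by_range(row_key: str) -> int:
    """Place a key by sorted key range, as with sequential row keys."""
    return bisect.bisect_right(SPLIT_POINTS, row_key)


def region_by_hash(row_key: str, region_count: int) -> int:
    """Place a key by hash modulo the region count (MD5 is an assumption)."""
    digest = hashlib.md5(row_key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % region_count


# Recently written, monotonically increasing keys: date followed by a sequence number.
recent_keys = [f"201412{day:02d}{seq:04d}" for day in range(25, 32) for seq in range(100)]

range_load = Counter(region_by_range(k) for k in recent_keys)
hash_load = Counter(region_by_hash(k, 4) for k in recent_keys)

print("range placement:", dict(range_load))  # every recent key falls in the last range
print("hash placement: ", dict(hash_load))
```

With range placement, all of the most recent keys sort after the last split point and land in one region, which is the concentration the background describes; hashing the key first spreads the same keys across regions.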


Application Information

Patent Type & Authority: Applications (China)
IPC(8): G06F17/30
CPC: G06F16/182, G06F16/3331
Inventor: 朱爱军, 叶潇, 陈威, 林菓
Owner: FUJIAN NEWLAND SOFTWARE ENGINEERING CO LTD