Consistent hash based structural data storage, inquiry and migration method

A structured-data storage technology based on consistent hashing, applied in the fields of structured data retrieval, database indexing, and electronic digital data processing. It addresses the problems of large database query delay, unbalanced distribution of data blocks, and reduced parallel efficiency when traversing data.

Active Publication Date: 2014-10-01
SHANDONG UNIV

AI Technical Summary

Problems solved by technology

[0004] 1. Unbalanced storage seriously affects the efficiency of parallel traversal
[0005] When HDFS stores table data, it selects storage nodes for data blocks according to the load of each data node in the cluster, preferring data nodes with less load. This storage strategy does not consider the association between the stored data blocks. When the data flow is heavy, most of the data blocks will be stored on nodes with less load, so the distribution of data blocks belo



Embodiment Construction

[0068] The present invention will be further described below in conjunction with the accompanying drawings and embodiments.

[0069] Definition 1 (hash value of a data block): For a data block B in the HDFS system, consistent hashing is applied with the block's label as the key; the resulting hash value H_b(B) is called the hash value of data block B.

[0070] Definition 2 (hash value of a node): For a data node D in the HDFS system, consistent hashing is applied with the node's physical address as the key; the resulting hash value H_d(D) is called the hash value of data node D.
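
Definitions 1 and 2 can be illustrated with a minimal sketch. The source does not say which hash function or hash-space size is used, so MD5 truncated to 32 bits, the sample block label, and the sample address are assumptions made for illustration only; the function names are hypothetical.

```python
import hashlib

RING_BITS = 32  # assumed size of the hash space; the source does not state one


def consistent_hash(key: str) -> int:
    """Map a string key to an integer in [0, 2**RING_BITS).

    MD5 is an assumption; any stable, well-distributed hash would do.
    """
    digest = hashlib.md5(key.encode("utf-8")).digest()
    return int.from_bytes(digest[: RING_BITS // 8], "big")


def block_hash(block_label: str) -> int:
    """H_b(B): hash of a data block, keyed by its block label (Definition 1)."""
    return consistent_hash(block_label)


def node_hash(physical_address: str) -> int:
    """H_d(D): hash of a data node, keyed by its physical address (Definition 2)."""
    return consistent_hash(physical_address)


if __name__ == "__main__":
    # Hypothetical block label and node address, for illustration only.
    print(block_hash("blk_1073741825"))
    print(node_hash("192.168.1.10:50010"))
```

Both kinds of key go through the same function, so block hashes and node hashes fall in the same hash space and can be compared directly.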

[0071] Definition 3 (node hash chain): Let $\langle H_{d_1}, H_{d_2}, \ldots, H_{d_n} \rangle$ be the sequence obtained by sorting the hash values of all data nodes in the HDFS system in ascending order, where $H_{d_k} < H_{d_{k+1}}$ $(1 \le k < n)$, and let $DN(H_{d_k})$ denote the data node corresponding to $H_{d_k}$. Then the linear structure $[DN(H_{d_1}), DN(H_{d_2}), \ldots, DN(H_{d_n})]$ is called the node hash chain of the HDFS system, where $DN(H_{d_{k+1}})$ is called $DN(H_{d_k})$'s suc...
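
A node hash chain as described in Definition 3 can be sketched as a list of nodes sorted by hash value. The source text is truncated before it finishes describing how neighbouring nodes relate, so the wrap-around from the last node back to the first is an assumption borrowed from ordinary consistent-hashing rings. The hash function (MD5 truncated to 32 bits) and all function names are likewise hypothetical.

```python
import hashlib


def node_hash(physical_address: str) -> int:
    """H_d(D), assumed here to be the first 4 bytes of MD5 of the address."""
    digest = hashlib.md5(physical_address.encode("utf-8")).digest()
    return int.from_bytes(digest[:4], "big")


def build_node_hash_chain(addresses):
    """Return [(hash, address), ...] sorted by hash value in ascending order."""
    return sorted((node_hash(addr), addr) for addr in addresses)


def successor(chain, k):
    """Return the entry that follows position k in the chain.

    Wrapping from the last entry to the first is an assumption; the
    source is cut off before it covers this case.
    """
    return chain[(k + 1) % len(chain)]


if __name__ == "__main__":
    # Hypothetical data node addresses.
    chain = build_node_hash_chain(["10.0.0.1", "10.0.0.2", "10.0.0.3"])
    for k, (h, addr) in enumerate(chain):
        print(k, h, addr, "->", successor(chain, k)[1])
```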


Abstract

The invention discloses a consistent-hashing-based method for storing, querying, and migrating structured data. The method comprises the following steps: a consistent-hashing-based HDFS (Hadoop Distributed File System) data storage model is established; data storage and data query are performed on the basis of the model; and data migration is performed when a data node is added or fails. The data storage method is as follows: the data blocks to be written into a file are hashed by consistent hashing to obtain their hash values; the storage nodes of the data blocks are looked up in the node hash chain according to those hash values; and the content of the data blocks is stored on those storage nodes. Building on the master/slave architecture of an HDFS cluster, the method applies consistent hashing to distribute structured data uniformly across the data nodes of the cluster, which improves the efficiency of parallel data traversal. When the number of data nodes changes, the number of nodes involved in data migration and the total amount of migrated data are greatly reduced, which improves the operating performance of the data storage system.
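
The storage lookup and the migration behaviour summarized in the abstract can be sketched as follows. This is not the patent's own procedure, whose details are not visible in this excerpt. It assumes the common consistent-hashing rule that a block is placed on the first node whose hash is greater than or equal to the block's hash, wrapping to the first node in the chain. The hash function, node addresses, block labels, and function names are all hypothetical.

```python
import bisect
import hashlib


def consistent_hash(key: str) -> int:
    """Assumed hash: first 4 bytes of MD5, giving a value in [0, 2**32)."""
    digest = hashlib.md5(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:4], "big")


def build_chain(addresses):
    """Node hash chain: (hash, address) pairs in ascending hash order."""
    return sorted((consistent_hash(addr), addr) for addr in addresses)


def locate(chain, block_label):
    """Return the address of the node assumed to store the given block.

    Rule (an assumption): first node with hash >= block hash, wrapping
    around to the first node of the chain when no such node exists.
    """
    hashes = [h for h, _ in chain]
    i = bisect.bisect_left(hashes, consistent_hash(block_label))
    return chain[i % len(chain)][1]


def blocks_to_migrate(old_nodes, new_nodes, block_labels):
    """Map each block whose storage node changes to its (source, target)."""
    old_chain, new_chain = build_chain(old_nodes), build_chain(new_nodes)
    moves = {}
    for label in block_labels:
        src, dst = locate(old_chain, label), locate(new_chain, label)
        if src != dst:
            moves[label] = (src, dst)
    return moves


if __name__ == "__main__":
    nodes = ["10.0.0.1:50010", "10.0.0.2:50010", "10.0.0.3:50010"]
    added = "10.0.0.4:50010"
    blocks = [f"blk_{i}" for i in range(1000)]
    moves = blocks_to_migrate(nodes, nodes + [added], blocks)
    print(len(moves), "of", len(blocks), "blocks move")
    print("source nodes involved:", {src for src, _ in moves.values()})
```

Under this placement rule, adding a node moves blocks only onto the new node, and only from the single node that follows it in the chain, which is consistent with the abstract's claim that few nodes take part in a migration.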

Description

Technical field

[0001] The invention relates to the field of computer application technology, in particular to a method for storing, querying, and migrating structured data based on consistent hashing.

Background

[0002] For the storage and management of massive structured data, a relational database with the Hadoop Distributed File System (HDFS) as the underlying storage is currently the main solution. The basic idea of HDFS is to divide a file into several fixed-size data blocks for storage. Its architecture is a master/slave structure: an HDFS cluster includes one name node (Namenode) and several data nodes (Datanode). The name node is the master node; it controls access by external clients and stores the metadata of the whole system, which includes the namespace, the mapping of files to data blocks, system configuration information, and so on. The data nodes store the actual file data, i.e. HDFS data blocks. In order to improve the reliabilit...


Application Information

IPC(8): G06F17/30
CPC: G06F16/182, G06F16/214, G06F16/22
Inventor: 程杰杨萌萌
Owner: SHANDONG UNIV