
Hadoop data distribution method based on hybrid coding

A Hadoop cluster and data distribution technology, applied in the field of distributed systems, that addresses the large storage footprint of three-copy replication and achieves the effects of saving storage space, ensuring system reliability, and reducing hardware requirements.

Active Publication Date: 2016-10-12
HUAZHONG UNIV OF SCI & TECH
Cites: 8; Cited by: 10

AI Technical Summary

Problems solved by technology

[0006] However, this three-copy redundancy mechanism still has a serious problem: storing three redundant copies takes up a lot of space.



Examples


Embodiment Construction

[0027] In order to make the object, technical solution and advantages of the present invention clearer, the present invention will be further described in detail below in conjunction with the accompanying drawings and embodiments. It should be understood that the specific embodiments described here are only used to explain the present invention, not to limit the present invention. In addition, the technical features involved in the various embodiments of the present invention described below can be combined with each other as long as they do not constitute a conflict with each other.

[0028] As shown in Figure 1, the HDFS structure in the embodiment of the present invention is described as follows:

[0029] The HDFS architecture is mainly composed of a Namenode and Datanodes. Because HDFS is rack-aware, one rack contains multiple Datanodes, and one cluster has multiple racks.
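The rack and Datanode layout described here can be modelled with a small data structure. The following is a minimal Python sketch, not the patent's implementation: the function name `place_stripe`, the dictionary layout, and the round-robin choice of racks are all assumptions made for illustration. It only demonstrates the rule stated in the abstract that a block and its copy are placed on nodes in different racks.

```python
def place_stripe(block_ids, racks):
    """Assign each block and its copy to Datanodes in different racks.

    racks maps a rack name to a list of Datanode names (hypothetical layout).
    Returns {block_id: ((rack, node) of the block, (rack, node) of the copy)}.
    The round-robin policy is an illustrative assumption.
    """
    if len(racks) < 2:
        raise ValueError("need at least two racks")
    if any(not nodes for nodes in racks.values()):
        raise ValueError("every rack needs at least one Datanode")

    rack_names = sorted(racks)
    n = len(rack_names)
    placement = {}
    for i, block_id in enumerate(block_ids):
        # Neighbouring racks in the sorted order; distinct because n >= 2.
        primary_rack = rack_names[i % n]
        copy_rack = rack_names[(i + 1) % n]
        # Rotate through the nodes of a rack on each full pass over the racks.
        round_no = i // n
        primary_node = racks[primary_rack][round_no % len(racks[primary_rack])]
        copy_node = racks[copy_rack][round_no % len(racks[copy_rack])]
        placement[block_id] = ((primary_rack, primary_node), (copy_rack, copy_node))
    return placement


cluster = {"rack1": ["dn1", "dn2"], "rack2": ["dn3"], "rack3": ["dn4"]}
layout = place_stripe(["b0", "b1", "b2", "b3"], cluster)
```

With this layout, the loss of a whole rack never removes both a block and its copy, which is the property the rack-aware placement is meant to provide.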

[0030] As shown in Figure 2, the data writing p...



Abstract

The present invention discloses a Hadoop data distribution method based on hybrid coding. The method replaces the three-copy storage mode of a traditional Hadoop system with distributed storage that combines RAID1 and RAID5 techniques. When a user writes a file into the cluster, the file is divided into a plurality of blocks. Following the RAID1 concept, one copy is generated for each block. A stripe length is set for storing the blocks, and the blocks in one stripe and their copies are placed on nodes in different racks. Following RAID5, the blocks in one stripe are XORed to generate a check block, which is also stored in the RAID5 manner. When a block is lost because of node damage, the file can be recovered from the copies or the check block, which guarantees the reliability of file storage. Because it stores data in the RAID1+5 mode, the method has relatively good fault tolerance and, compared with the traditional Hadoop three-copy storage mode, saves storage space.
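The XOR step and the two recovery paths in the abstract can be illustrated with a short Python sketch. It is a hedged illustration under stated assumptions, not the patented implementation: the names `xor_blocks`, `make_stripe`, `recover_block`, and `storage_overhead` are hypothetical, all blocks in a stripe are assumed to have equal length, and the overhead figure assumes exactly one copy per block plus one unreplicated XOR-generated (parity) block per stripe.

```python
from functools import reduce


def xor_blocks(blocks):
    """Byte-wise XOR of equal-length blocks (equal length is an assumption)."""
    if not blocks:
        raise ValueError("at least one block is required")
    length = len(blocks[0])
    if any(len(b) != length for b in blocks):
        raise ValueError("all blocks in a stripe must have the same length")
    return bytes(reduce(lambda x, y: x ^ y, column) for column in zip(*blocks))


def make_stripe(data_blocks):
    """RAID1 part: one copy per block. RAID5 part: one XOR block per stripe."""
    data = list(data_blocks)
    return {"data": data, "copies": list(data), "parity": xor_blocks(data)}


def recover_block(stripe, lost_index, copy_available=True):
    """Recover a lost block from its copy, or from the XOR block if the copy is gone."""
    if copy_available:
        return stripe["copies"][lost_index]
    survivors = [b for i, b in enumerate(stripe["data"]) if i != lost_index]
    # XOR of the surviving blocks and the parity block yields the lost block.
    return xor_blocks(survivors + [stripe["parity"]])


def storage_overhead(stripe_length):
    """Stored blocks per data block: k blocks + k copies + 1 parity, over k."""
    return (2 * stripe_length + 1) / stripe_length


stripe = make_stripe([b"abcd", b"efgh", b"ijkl"])
rebuilt = recover_block(stripe, 1, copy_available=False)
```

Under these assumptions a stripe of length 4 stores 2.25 blocks per data block, against 3 for three-copy replication, which is consistent with the space saving the abstract claims. The exact ratio depends on the stripe length and on how the check block is actually stored.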

Description

Technical field

[0001] The invention belongs to the field of distributed systems, and more specifically relates to a Hadoop data distribution method based on hybrid coding.

Background technique

[0002] With the popularization and development of computer technology and network applications, the increasing number of users has led to explosive growth in data storage. How to store big data more reliably and efficiently poses a challenge to the data storage system.

[0003] Traditional network storage systems use centralized storage servers to store all data. Storage servers become the bottleneck of system performance and the focus of reliability and security, which cannot meet the needs of large-scale storage applications. The distributed network storage system adopts a scalable system structure, uses multiple storage servers to share the storage load, uses location servers to locate and store information, and disperses and stores data on multiple independent devices, which no...


Application Information

IPC(8): H04L29/08
CPC: H04L67/1095, H04L67/1097
Inventor: 王芳, 文可, 胡燏翀, 张晓阳, 常拴霞, 肖仁智, 吴锋, 李宗玮
Owner HUAZHONG UNIV OF SCI & TECH