Method and device for storing and reading data in Hadoop Distributed File System (HDFS)

The technology distinguishes hot data from cold data and is applied in the field of data storage and reading. It addresses the high cost of storing and reading cold data, lowering the storage cost of cold data and reducing the number of losses.

Status: Inactive; Publication Date: 2013-04-24
XIAMEN MEIYA PICO INFORMATION

AI Technical Summary

Problems solved by technology

[0004] The technical problem that those skilled in the art urgently need to solve is therefore how to detect hot and cold data in HDFS and how to store and read cold data differently from hot data, so as to reduce the high cost of storing and reading cold data in HDFS.

Embodiment Construction

[0023] In order to make the above objects, features and advantages of the present invention more comprehensible, the present invention will be further described in detail below in conjunction with the accompanying drawings and specific embodiments.

[0024] Referring to figure 1, which shows a flow chart of an embodiment of the method for storing data in HDFS according to the present invention, the method includes: step S11, obtaining the hot-cold value of a file data block in HDFS through a data hot-cold discrimination mechanism, and comparing the hot-cold value with a set discrimination threshold; step S12, if the hot-cold value is not greater than the discrimination threshold, the file data block is cold data; step S13, dividing the cold file data block into n data blocks and calculating the m check code blocks corresponding to the n data blocks, where m and n are positive integers and m < n ...
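
Steps S11 to S13 can be illustrated with a minimal Python sketch. Everything named here is an assumption made for illustration: the source does not say how the hot-cold value is computed or which check code is used, so the value is taken as a plain number and a single XOR parity block (m = 1) stands in for the m check code blocks. The function names are hypothetical.

```python
from typing import List, Optional, Tuple


def is_cold(hot_cold_value: float, threshold: float) -> bool:
    """Step S12: a block is cold when its value is not greater than the threshold."""
    return hot_cold_value <= threshold


def split_into_blocks(data: bytes, n: int) -> List[bytes]:
    """Split data into n equal-sized blocks, zero-padding the tail if needed."""
    if n <= 0:
        raise ValueError("n must be a positive integer")
    block_size = -(-len(data) // n) or 1  # ceiling division, at least one byte
    padded = data.ljust(block_size * n, b"\x00")
    return [padded[i * block_size:(i + 1) * block_size] for i in range(n)]


def xor_parity(blocks: List[bytes]) -> bytes:
    """Byte-wise XOR of equal-length blocks (stand-in for the check code)."""
    parity = bytearray(len(blocks[0]))
    for block in blocks:
        for i, byte in enumerate(block):
            parity[i] ^= byte
    return bytes(parity)


def encode_if_cold(
    data: bytes, hot_cold_value: float, threshold: float, n: int
) -> Optional[Tuple[List[bytes], List[bytes]]]:
    """Steps S11 to S13: return (data blocks, check blocks) for cold data, else None."""
    if n < 2:
        raise ValueError("with m = 1, the constraint m < n requires n >= 2")
    if not is_cold(hot_cold_value, threshold):
        return None  # hot data is left to the regular replication path
    blocks = split_into_blocks(data, n)
    return blocks, [xor_parity(blocks)]
```

A caller would then store the returned data blocks and check block in place of full replicas. A production scheme with m > 1 would need a stronger code than XOR parity.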

Abstract

The invention provides a method and device for storing and reading data in the Hadoop Distributed File System (HDFS). The storage method includes: obtaining the hot-cold value of a file data block in HDFS through a data hot-cold discrimination mechanism, and comparing the hot-cold value with a set discrimination threshold; if the hot-cold value is not greater than the discrimination threshold, classifying the file data block as cold data; dividing the cold file data block into n data blocks and calculating m check code blocks corresponding to the n data blocks, where m and n are positive integers and m < n; and storing the n data blocks and the m check code blocks. The storage method and storage system distinguish cold data from hot data and store cold data differently, which reduces the storage cost of cold data in HDFS. The reading method and reading system likewise distinguish cold data from hot data and read and recover cold data differently, which reduces the time and cost of reading and recovering cold data.
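
The recovery side mentioned in the abstract can be sketched in the same spirit. This is an illustration under stated assumptions, not the patented procedure: the source does not specify the check code, so a single XOR parity block (m = 1) is assumed, which can rebuild at most one lost data block. The names `xor_blocks` and `recover_missing` are hypothetical.

```python
from typing import List, Optional


def xor_blocks(blocks: List[bytes]) -> bytes:
    """Byte-wise XOR of equal-length blocks."""
    out = bytearray(len(blocks[0]))
    for block in blocks:
        for i, byte in enumerate(block):
            out[i] ^= byte
    return bytes(out)


def recover_missing(data_blocks: List[Optional[bytes]], parity: bytes) -> List[bytes]:
    """Rebuild at most one lost data block (marked None) from the survivors and the parity."""
    missing = [i for i, block in enumerate(data_blocks) if block is None]
    if len(missing) > 1:
        raise ValueError("a single XOR parity block can repair only one lost block")
    repaired = list(data_blocks)
    if missing:
        survivors = [block for block in data_blocks if block is not None]
        # XOR of all surviving data blocks and the parity equals the lost block.
        repaired[missing[0]] = xor_blocks(survivors + [parity])
    return repaired


# Usage: lose the middle block of three, then rebuild it.
original = [b"ab", b"cd", b"ef"]
parity = xor_blocks(original)
restored = recover_missing([b"ab", None, b"ef"], parity)
```

Reading cold data then means fetching the n data blocks, repairing a lost one if necessary, and concatenating them.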

Description

Technical field

[0001] The invention relates to the field of data storage and reading, and in particular to a method for storing and reading data in HDFS and to a storage and reading system.

Background

[0002] Hadoop is a distributed cluster project led by the Apache Foundation. It mainly comprises two core modules: the MapReduce programming model and the HDFS (Hadoop Distributed File System) distributed file system. HDFS relies mainly on a multiple-copy mechanism for file data blocks (usually three copies) and on a heartbeat mechanism to achieve high data availability, cluster scalability, and high-speed data reads and writes. Because of these characteristics, nearly a thousand well-known enterprises currently build cloud storage on HDFS.

[0003] There is no problem with the HDFS storage mechanism for hot data storage and reading and writing, but for cold data storage and reading and writing, since the storage mechanism does not take into a...
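
The cost argument can be made concrete with a small calculation. The sketch below compares the raw storage needed per byte of user data under replication with the cost of keeping n data blocks plus m check code blocks. The values n = 10 and m = 4 are hypothetical examples, not figures from the source, and the function names are illustrative.

```python
def replication_overhead(copies: int = 3) -> float:
    """Raw bytes stored per byte of user data when every block is kept `copies` times."""
    if copies <= 0:
        raise ValueError("copies must be a positive integer")
    return float(copies)


def check_code_overhead(n: int, m: int) -> float:
    """Raw bytes stored per byte of user data with n data blocks and m check code blocks."""
    if not 0 < m < n:
        raise ValueError("expected positive integers with m < n")
    return (n + m) / n


# Hypothetical parameters: three replicas versus n = 10, m = 4.
replicated = replication_overhead(3)
coded = check_code_overhead(10, 4)
saving = 1 - coded / replicated  # fraction of raw storage saved for cold data
```

Because m < n, the coded overhead is always below 2, so it stays under the cost of three replicas for any valid choice of n and m.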

Application Information

Patent Type & Authority: Applications (China)
IPC(8): G06F17/30
Inventor: 庄进发, 章正道
Owner: XIAMEN MEIYA PICO INFORMATION