Data storage method based on RS (Reed-Solomon) erasure codes

A data storage technology based on erasure codes, applied in the field of information processing, that addresses problems such as wasted space and increased cost, with the effects of reducing cost, ensuring efficiency, and saving storage space

Inactive Publication Date: 2010-09-22
SHANGHAI JIAO TONG UNIV

AI-Extracted Technical Summary

Problems solved by technology

But the biggest disadvantage of this technology is that it is a waste of space: a fi...

Method used

In this example, up to 2 damaged data blocks in the same data group are successfully restored, and storage space is greatly reduced: with only one backup, the (5, 3) RS coding in the present embodiment saves 4/9 of the space...

Abstract

The invention belongs to the technical field of information processing and relates to a data storage method based on RS (Reed-Solomon) erasure codes. The method comprises the following steps: dividing a file to be stored into blocks and grouping the blocks; transmitting the original data blocks of each data group to data nodes, which perform RS encoding so that each data group gains a number of redundant data blocks in addition to its original data blocks; storing the encoded data blocks of the same data group on several racks, such that the number of data blocks of the same data group on any one rack does not exceed the number of redundant data blocks; and, when data blocks of a stored file are damaged, using the RS erasure code to restore them. The invention saves a large amount of storage space while maintaining everyday usage efficiency. The required reliability of data storage can be set flexibly according to the importance of each file. Setting the number of backup copies and the encoding parameters provides a richer set of storage strategies and greatly reduces the cost of data storage.

Application Domain

Memory addressing/allocation/relocation, Static storage +1

Technology Topic

Computer hardware, Information processing +5

Example Embodiment

[0024] Example
[0025] This embodiment stores a 180M file and comprises the following steps:
[0026] In the first step, the 180M file to be stored is split into 6 original data blocks of a fixed, equal size of 30M, and these 6 original data blocks are divided into two groups of 3 original data blocks each.
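The blocking and grouping of the first step can be sketched as follows. This is a minimal illustration, not the patent's implementation: the function name `split_into_groups`, the zero-padding of a short final block, and the scaled-down 180-byte stand-in for the 180M file are all assumptions made for the example.

```python
def split_into_groups(data: bytes, block_size: int, blocks_per_group: int):
    """Split data into fixed-size blocks, then group consecutive blocks.

    Hypothetical helper: the source only states the block size and the
    group size, not how the split is implemented.
    """
    blocks = [data[i:i + block_size] for i in range(0, len(data), block_size)]
    # Assumption: zero-pad a short final block so all blocks are equal in size.
    if blocks and len(blocks[-1]) < block_size:
        blocks[-1] = blocks[-1].ljust(block_size, b"\x00")
    return [blocks[i:i + blocks_per_group]
            for i in range(0, len(blocks), blocks_per_group)]


# Scaled-down stand-in for the 180M file: 180 bytes split into 30-byte blocks.
groups = split_into_groups(bytes(180), block_size=30, blocks_per_group=3)
print(len(groups), [len(g) for g in groups])  # prints: 2 [3, 3]
```

With the sizes of the embodiment (180M file, 30M blocks, 3 blocks per group) the same call shape would yield the 2 groups of 3 original data blocks described above.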
[0027] In the second step, the original data blocks of each data group are transmitted block by block from the client to the data nodes. When the i-th data group (1≤i≤2) is transmitted, each data node that receives data of that group forwards it to a data node designated by the management node, so that the designated data node obtains all the original data blocks of the group.
[0028] In the third step, the data node that has obtained all the original data blocks of the i-th data group performs (5, 3) RS encoding on the group, so that the i-th data group grows from 3 original data blocks to 3 original data blocks plus 2 redundant data blocks.
[0029] The RS coding in this embodiment processes the 4-bit data in each original data block sequentially according to the following formula, producing two 4-bit redundant data for every three 4-bit original data. In this example, 1536 4-bit first redundant data and 1536 4-bit second redundant data are obtained in total; the 1536 4-bit first redundant data are combined in order into the first redundant data block, and the 1536 4-bit second redundant data are combined in order into the second redundant data block, so that two 30M redundant data blocks are obtained. The formula is:
[0030] FD=C, (Formula 1)
[0031] where F is a 2×3 Vandermonde matrix, D is the 3×1 matrix composed of three 4-bit original data in the i-th data group, and C is the 2×1 matrix composed of the two 4-bit redundant data of the i-th data group after encoding.
[0032] In this example, $F = \begin{pmatrix} 1 & 1 & 1 \\ 1 & 2 & 3 \end{pmatrix}$.
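A minimal sketch of the encoding C = FD on 4-bit data is given below. The source does not state the arithmetic used; the sketch assumes the Galois field GF(2^4) with the primitive polynomial x^4 + x + 1, which is a common choice for 4-bit Reed-Solomon symbols. The function names `gf16_mul`, `encode_symbols`, and `encode_group` are hypothetical.

```python
def gf16_mul(a: int, b: int) -> int:
    """Multiply two 4-bit values in GF(2^4).

    Assumption: reduction polynomial x^4 + x + 1 (0x13); the source does
    not name the field or the polynomial.
    """
    result = 0
    for _ in range(4):
        if b & 1:
            result ^= a          # addition in GF(2^4) is XOR
        b >>= 1
        a <<= 1
        if a & 0x10:             # degree reached 4: reduce
            a ^= 0x13
    return result


# The 2x3 matrix F given in the example.
F = [[1, 1, 1],
     [1, 2, 3]]


def encode_symbols(d):
    """Return C = F * D for one column D of three 4-bit original data."""
    c = []
    for row in F:
        acc = 0
        for coeff, value in zip(row, d):
            acc ^= gf16_mul(coeff, value)
        c.append(acc)
    return c


def encode_group(blocks):
    """Encode three equal-length lists of 4-bit values position by position.

    Returns the first and second redundant data blocks.
    """
    first, second = [], []
    for d in zip(*blocks):
        c1, c2 = encode_symbols(d)
        first.append(c1)
        second.append(c2)
    return first, second


print(encode_symbols([5, 7, 9]))
```

Each position of the three original data blocks is encoded independently, which matches the description of processing the 4-bit data sequentially and then concatenating the results into two redundant data blocks.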
[0033] In the fourth step, the encoded data blocks of the same data group are stored across several racks, with no more than two data blocks of the same data group on any one rack.
[0034] In the fifth step, when a data block of the stored file is damaged, the RS erasure code is used to recover it.
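The rack constraint of the fourth step can be illustrated with a simple round-robin placement. The source does not specify a placement algorithm, so round-robin is an assumption, as are the function name `place_group` and the rack names.

```python
def place_group(num_blocks: int, racks: list, max_per_rack: int) -> dict:
    """Assign block indices of one data group to racks in round-robin order.

    Hypothetical helper: round-robin puts at most ceil(num_blocks / len(racks))
    blocks on a rack, which stays within max_per_rack whenever
    len(racks) * max_per_rack >= num_blocks.
    """
    if len(racks) * max_per_rack < num_blocks:
        raise ValueError("not enough racks to respect the per-rack limit")
    placement = {rack: [] for rack in racks}
    for block_index in range(num_blocks):
        placement[racks[block_index % len(racks)]].append(block_index)
    return placement


# (5, 3) coding: 5 blocks per group, at most 2 (the redundant count) per rack.
placement = place_group(5, ["rack-a", "rack-b", "rack-c"], max_per_rack=2)
print(placement)
# prints: {'rack-a': [0, 3], 'rack-b': [1, 4], 'rack-c': [2]}
```

Because no rack holds more blocks of a group than there are redundant blocks, losing an entire rack damages at most 2 blocks of that group, which the (5, 3) code can still repair.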
[0035] Taking the first data group as an example, data blocks are recovered as follows:
[0036] 1) When the number of damaged data blocks is less than or equal to 2, the original data are first solved from the following formula; a damaged original data block is then obtained directly, and a damaged redundant data block is regenerated by RS encoding:
[0037] A'D=E', (Formula 2)
[0038] where $A = \begin{pmatrix} I \\ F \end{pmatrix}$ and $E = \begin{pmatrix} D \\ C \end{pmatrix}$; I is the 3×3 identity matrix, F is the 2×3 Vandermonde matrix, D is the 3×1 matrix composed of three 4-bit original data in the first data group, C is the 2×1 matrix composed of the two 4-bit redundant data of the encoded first data group, A' is the matrix obtained by removing from A the rows corresponding to the damaged data blocks, and E' is the matrix obtained by removing the same rows from E.
[0039] Because F is a Vandermonde matrix, any three rows of A are linearly independent, so after the rows for at most two damaged data blocks are removed, A' still has rank 3 and D can be solved from (Formula 2) by Gaussian elimination. Applying (Formula 2) to each set of 4-bit data in turn recovers all the original data blocks.
[0040] 2) When the number of damaged data blocks is greater than 2, the damaged data blocks cannot be recovered.
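The recovery cases above can be sketched as solving A'D = E' by Gaussian elimination. As with encoding, the arithmetic is assumed to be over GF(2^4) with the polynomial x^4 + x + 1, which the source does not state; the names `gf16_mul`, `gf16_inv`, and `recover_symbols` are hypothetical, and A is taken to be the 3×3 identity matrix stacked on F.

```python
def gf16_mul(a: int, b: int) -> int:
    """Multiply in GF(2^4); assumed reduction polynomial x^4 + x + 1."""
    result = 0
    for _ in range(4):
        if b & 1:
            result ^= a
        b >>= 1
        a <<= 1
        if a & 0x10:
            a ^= 0x13
    return result


def gf16_inv(a: int) -> int:
    """Multiplicative inverse by exhaustive search (the field has 15 units)."""
    if a == 0:
        raise ZeroDivisionError("0 has no inverse in GF(2^4)")
    return next(x for x in range(1, 16) if gf16_mul(a, x) == 1)


# A is the 3x3 identity stacked on F; row k corresponds to block k of a group.
A = [[1, 0, 0], [0, 1, 0], [0, 0, 1], [1, 1, 1], [1, 2, 3]]


def recover_symbols(survivors: dict) -> list:
    """Solve A'D = E' for D from the surviving values of one position.

    survivors maps a row index of A (0-4) to the surviving 4-bit value.
    """
    if len(survivors) < 3:
        raise ValueError("more than 2 blocks damaged: cannot recover")
    rows = sorted(survivors)[:3]             # any 3 surviving rows suffice
    m = [A[r][:] + [survivors[r]] for r in rows]   # augmented 3x4 matrix
    for col in range(3):
        pivot = next(r for r in range(col, 3) if m[r][col] != 0)
        m[col], m[pivot] = m[pivot], m[col]
        inv = gf16_inv(m[col][col])
        m[col] = [gf16_mul(inv, v) for v in m[col]]
        for r in range(3):
            if r != col and m[r][col]:
                factor = m[r][col]
                m[r] = [v ^ gf16_mul(factor, p) for v, p in zip(m[r], m[col])]
    return [m[r][3] for r in range(3)]


# Original data (5, 7, 9) encode to redundant data (11, 3) under the same
# assumptions; here the first two original blocks are treated as damaged.
print(recover_symbols({2: 9, 3: 11, 4: 3}))  # prints: [5, 7, 9]
```

Once D is known, a damaged redundant value can be regenerated by applying FD = C again, and passing fewer than three surviving values raises an error, mirroring case 2).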
[0041] In this example, any damage to no more than 2 data blocks in the same data group is successfully repaired, and storage space is greatly reduced: with only one backup, the (5, 3) RS encoding in this embodiment stores 5 blocks per data group instead of the 9 blocks required by the three backups of the prior art, saving 4/9 of the space. This saves considerable cost in large-scale use, and setting the backup parameters and the encoding parameters allows more flexible configuration and finer-grained reliability levels.
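The 4/9 figure follows from counting stored blocks per data group, as the short calculation below shows. The helper name `blocks_stored` is hypothetical; the block counts come from the embodiment.

```python
from fractions import Fraction


def blocks_stored(original_blocks: int, copies: int = 1,
                  redundant_blocks: int = 0) -> int:
    """Blocks kept on disk for one data group (hypothetical helper)."""
    return copies * (original_blocks + redundant_blocks)


replicated = blocks_stored(3, copies=3)                    # three backups
encoded = blocks_stored(3, copies=1, redundant_blocks=2)   # (5, 3) RS, one backup
saving = Fraction(replicated - encoded, replicated)
print(replicated, encoded, saving)  # prints: 9 5 4/9
```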

