An Erasure Code Archiving Method Based on Spark Streaming Computing

A streaming-computing and erasure-coding technology, applied in computing, program control design, multi-programming devices, etc. It addresses the problem that disk-based archiving schemes cannot meet requirements, avoids network transmission and CPU computing bottlenecks, and reduces the amount of data transmitted and the associated burden.

Active Publication Date: 2022-02-15
HUAZHONG UNIV OF SCI & TECH

AI Technical Summary

Problems solved by technology

[0004] At present, most archiving schemes studied for Hadoop clusters operate on disk. As cluster memory expands and more and more applications move their computation into cluster memory, disk-based archiving schemes fall far short of requirements.



Embodiment Construction

[0026] To make the objects, technical solutions, and advantages of the present invention clearer, the invention is described in further detail below with reference to the accompanying drawings and embodiments. The specific embodiments described here serve only to explain the invention, not to limit it. In addition, the technical features of the embodiments described below may be combined with one another as long as they do not conflict.

[0027] The technical terms used in the present invention are first explained below:

[0028] RS erasure code: a (k+r, k) RS erasure code stripe consists of k data blocks and r check blocks. The purpose of RS codes is to improve the reliability of channel transmission by adding redundant codes;

[0029] RS encoding process: the process of calculating t...
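As a rough illustration of the (k+r, k) stripe defined in [0028], the sketch below compares the storage efficiency of an RS stripe with that of plain replication. The function names and the (9, 6) parameters are assumptions chosen for the example and do not come from the patent text; the only facts relied on are that a stripe stores k data blocks plus r check blocks and that an RS code can recover from the loss of any r blocks in a stripe.

```python
from fractions import Fraction


def rs_storage_efficiency(k: int, r: int) -> Fraction:
    """Fraction of raw capacity holding user data in a (k+r, k) stripe."""
    return Fraction(k, k + r)


def replication_storage_efficiency(copies: int) -> Fraction:
    """Fraction of raw capacity holding user data with n-copy replication."""
    return Fraction(1, copies)


def rs_tolerated_losses(r: int) -> int:
    """An RS stripe survives the loss of any r of its k + r blocks."""
    return r


if __name__ == "__main__":
    k, r = 6, 3  # hypothetical (9, 6) stripe, chosen only for illustration
    print("RS efficiency:", rs_storage_efficiency(k, r))
    print("three-copy efficiency:", replication_storage_efficiency(3))
    print("RS tolerated block losses:", rs_tolerated_losses(r))
```

Under these assumed parameters the stripe keeps two thirds of the raw capacity for user data, against one third for three-copy replication, while tolerating three lost blocks per stripe.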


Abstract

The invention discloses an erasure code archiving method based on Spark stream computing, which belongs to the field of computer storage. The method selects the data blocks to be archived from the nodes of the cluster and constructs multiple RDDs under the Spark framework, then performs erasure code archiving with the RDD as the basic unit. The nodes that hold the data blocks of an RDD undertake the computation of the intermediate check blocks of their respective erasure code stripes. The computation proceeds as a pipeline: starting from the head node, each node sends the intermediate check block to the next node, which updates the received intermediate check block using its own data and computing power, until the tail node generates the final check block from the intermediate check block delivered to it and sends it to the check nodes of the cluster. The method adopts the Map/Reduce model to perform erasure code archiving under the Spark big data processing framework. Because the archiving process is pipelined and the encoding computation is distributed across multiple nodes, archiving performance is greatly improved.
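The pipelined update of intermediate check (verification) blocks described in the abstract can be sketched in plain Python, without Spark. This is a minimal illustration under stated assumptions, not the patented implementation: the GF(2^8) reduction polynomial 0x11D and the Vandermonde-style coefficients in `coefficient` are hypothetical choices (the excerpt does not specify the coding matrix, and this choice is not claimed to be MDS for every k and r), and each loop iteration stands in for one node receiving the intermediate blocks from its predecessor.

```python
def gf_mul(a: int, b: int) -> int:
    """Multiply two bytes in GF(2^8), reduction polynomial 0x11D (assumed)."""
    result = 0
    while b:
        if b & 1:
            result ^= a
        a <<= 1
        if a & 0x100:
            a ^= 0x11D
        b >>= 1
    return result


def gf_pow(a: int, n: int) -> int:
    """Raise a to the n-th power in GF(2^8) by repeated multiplication."""
    result = 1
    for _ in range(n):
        result = gf_mul(result, a)
    return result


def coefficient(j: int, i: int) -> int:
    """Hypothetical coding coefficient for check block j and data block i."""
    return gf_pow(i + 1, j)


def update_intermediate(intermediate, block, i):
    """Node i folds its local data block into every intermediate check block."""
    updated = []
    for j, check in enumerate(intermediate):
        c = coefficient(j, i)
        updated.append(bytes(p ^ gf_mul(c, d) for p, d in zip(check, block)))
    return updated


def pipeline_encode(data_blocks, r):
    """Pass r intermediate check blocks from the head node to the tail node."""
    size = len(data_blocks[0])
    intermediate = [bytes(size)] * r  # the head node starts from all zeros
    for i, block in enumerate(data_blocks):  # head -> ... -> tail
        intermediate = update_intermediate(intermediate, block, i)
    return intermediate  # the tail node now holds the final check blocks


def centralized_encode(data_blocks, r):
    """Reference: one node reads all k data blocks and computes every check block."""
    size = len(data_blocks[0])
    checks = []
    for j in range(r):
        out = bytearray(size)
        for i, block in enumerate(data_blocks):
            c = coefficient(j, i)
            for pos, d in enumerate(block):
                out[pos] ^= gf_mul(c, d)
        checks.append(bytes(out))
    return checks


if __name__ == "__main__":
    blocks = [b"\x01\x02", b"\x10\x20", b"\xff\x00"]  # k = 3 toy data blocks
    assert pipeline_encode(blocks, 2) == centralized_encode(blocks, 2)
```

Because addition in GF(2^8) is XOR, the order in which nodes contribute does not change the result, so the pipelined result matches the centralized one. In the pipeline, each hop carries only the r intermediate blocks rather than gathering all k data blocks on a single encoder.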

Description

Technical field

[0001] The invention belongs to the field of computer storage and, more specifically, relates to an erasure code archiving method based on Spark stream computing.

Background

[0002] With the development of big data and information storage, storage capacity is increasing day by day, new storage media and technologies keep emerging, and the amount of data processed by applications is growing sharply. This places higher demands on data fault tolerance, storage security, and space utilization. Data in a distributed storage cluster is usually stored with a three-copy redundancy mechanism which, combined with a node recovery strategy, allows quick repair after a failure. Compared with three-copy redundancy, erasure codes offer higher storage efficiency, and their fault tolerance can be customized on demand.

[0003] Apache Spark is a fast and general-purpose computing engine designed for la...


Application Information

Patent Type & Authority: Patents (China)
IPC(8): G06F9/50, G06F16/27, G06F16/182
CPC: G06F9/5083, G06F16/182, G06F16/27
Inventors: 黄建忠, 曹强, 谢长生, 蔡奇, 王爽, 汤思雨
Owner: HUAZHONG UNIV OF SCI & TECH