SPARK streaming computation-based erasure code filing method

A streaming computation and erasure coding technology, applied in computing, program control design, multi-programming arrangements, etc. It addresses the problem that disk-based archiving schemes cannot meet requirements, avoids network transmission and CPU computation bottlenecks, maximizes system resource utilization, and improves archiving performance.

Active Publication Date: 2018-07-20
HUAZHONG UNIV OF SCI & TECH

AI Technical Summary

Problems solved by technology

[0004] At present, most archiving schemes studied for Hadoop clusters operate on disks. As cluster memory expands and more and more applications move their computation into cluster memory, disk-based archiving schemes fall far short of the demand.



Embodiment Construction

[0026] In order to make the object, technical solution and advantages of the present invention clearer, the present invention will be further described in detail below in conjunction with the accompanying drawings and embodiments. It should be understood that the specific embodiments described here are only used to explain the present invention, not to limit the present invention. In addition, the technical features involved in the various embodiments of the present invention described below can be combined with each other as long as they do not constitute a conflict with each other.

[0027] The technical terms involved in the present invention are first explained below:

[0028] RS erasure code: a (k+r, k) RS erasure code stripe consists of k data blocks and r check blocks. The purpose of using an RS code is to improve the reliability of channel transmission by adding redundant codes;

[0029] RS encoding process: the process of calculating t...
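To make the (k+r, k) stripe definition above concrete, the following minimal sketch compares the raw storage consumed per byte of user data by an RS stripe and by plain replication. The function names and the example parameters (k = 6, r = 3) are illustrative assumptions, not values taken from the patent. The fault tolerance figures rely on the standard property that an RS stripe survives the loss of any r blocks, while n copies survive the loss of n - 1 copies.

```python
from fractions import Fraction


def rs_storage_overhead(k: int, r: int) -> Fraction:
    """Raw blocks stored per data block for a (k+r, k) RS stripe."""
    if k <= 0 or r < 0:
        raise ValueError("k must be positive and r non-negative")
    return Fraction(k + r, k)


def replication_storage_overhead(copies: int) -> Fraction:
    """Raw blocks stored per data block when every block is kept `copies` times."""
    if copies <= 0:
        raise ValueError("copies must be positive")
    return Fraction(copies, 1)


def tolerated_losses_rs(r: int) -> int:
    """An RS stripe with r check blocks can lose any r of its k+r blocks."""
    return r


def tolerated_losses_replication(copies: int) -> int:
    """With `copies` replicas, all but one replica may be lost."""
    return copies - 1


if __name__ == "__main__":
    # Hypothetical stripe parameters chosen only for illustration.
    k, r = 6, 3
    print("RS overhead:", rs_storage_overhead(k, r))
    print("3-copy overhead:", replication_storage_overhead(3))
```

With these assumed parameters the stripe stores 1.5 blocks per data block and tolerates three lost blocks, whereas three-copy replication stores 3 blocks per data block and tolerates two lost copies.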


Abstract

The invention discloses a SPARK streaming computation-based erasure code filing method and belongs to the field of computer storage. The method comprises the following steps: selecting the data blocks to be filed from the nodes of a cluster and building a plurality of RDDs under the SPARK framework; carrying out erasure code filing with the RDD as the basic unit, wherein the nodes holding the data blocks of an RDD undertake the computation of the intermediate check blocks of their respective erasure codes; passing the intermediate check blocks from the first node to the subsequent nodes in a pipelined manner; at each subsequent node, updating the received intermediate check block using that node's own intermediate check block and computing capacity, until the tail node generates the final check block from the intermediate check block it receives; and sending the final check block to a check node of the cluster. The method performs erasure code filing with a Map / Reduce model under the SPARK big data processing framework, implements the filing process as a pipeline, and distributes the encoding computation across a plurality of nodes, so that filing performance is greatly improved.
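The pipelined update of intermediate check (verification) blocks described in the abstract can be sketched in plain Python without Spark. This is only an illustration under stated assumptions: the field GF(2^8) with reducing polynomial 0x11d, the coding coefficients, and all function names are hypothetical choices, since the source does not specify them. Each loop iteration in `pipelined_check_block` stands for one node that folds its local data block into the check block received from the previous node.

```python
def gf_mul(a: int, b: int) -> int:
    """Multiply two bytes in GF(2^8) using the (assumed) polynomial 0x11d."""
    result = 0
    while b:
        if b & 1:
            result ^= a
        a <<= 1
        if a & 0x100:
            a ^= 0x11D
        b >>= 1
    return result


def update_check_block(partial: bytes, block: bytes, coeff: int) -> bytes:
    """One pipeline stage: add coeff * block to the incoming partial check block."""
    return bytes(p ^ gf_mul(coeff, d) for p, d in zip(partial, block))


def pipelined_check_block(blocks: list, coeffs: list) -> bytes:
    """Pass a partial check block from the first node to the tail node."""
    partial = bytes(len(blocks[0]))  # the first node starts from all zeros
    for block, coeff in zip(blocks, coeffs):  # one iteration per node
        partial = update_check_block(partial, block, coeff)
    return partial  # the tail node holds the final check block


def centralized_check_block(blocks: list, coeffs: list) -> bytes:
    """Reference: a single node gathers every data block and encodes alone."""
    out = bytearray(len(blocks[0]))
    for i in range(len(out)):
        for block, coeff in zip(blocks, coeffs):
            out[i] ^= gf_mul(coeff, block[i])
    return bytes(out)


if __name__ == "__main__":
    data = [b"\x01\x02", b"\x03\x04", b"\x05\x06"]  # hypothetical data blocks
    coeffs = [1, 2, 3]                               # hypothetical coefficients
    assert pipelined_check_block(data, coeffs) == centralized_check_block(data, coeffs)
```

Because addition in GF(2^8) is XOR and therefore associative, the pipelined result equals the centralized one, while each node only ever transmits a single block-sized partial result to its successor.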

Description

Technical field

[0001] The invention belongs to the field of computer storage, and more specifically relates to an erasure code archiving method based on SPARK streaming computation.

Background technique

[0002] With the development of big data and information storage, memory capacity grows day by day, new storage media and technologies keep emerging, and the amount of data processed by applications is also increasing greatly, which places higher demands on the data fault tolerance, storage security, and space utilization of memory. Data in a distributed storage cluster is usually stored with a three-copy redundancy mechanism which, combined with a node recovery strategy, allows quick repair after a failure. Compared with three-copy redundancy, erasure codes offer higher storage efficiency, and their fault tolerance can be customized on demand.

[0003] Apache Spark is a fast and general-purpose computing engine designed for la...


Application Information

IPC(8): G06F9/50, G06F17/30
CPC: G06F9/5083, G06F16/182, G06F16/27
Inventor: 黄建忠, 曹强, 谢长生, 蔡奇, 王爽, 汤思雨
Owner: HUAZHONG UNIV OF SCI & TECH