A method of compression and decompression based on hardware accelerator card on distributed file system

A distributed file system and hardware acceleration technology, applied in the fields of instruments, computing, and electrical digital data processing, which solves problems such as high CPU resource usage and reduced system processing capacity, and achieves the effect of improving effective bandwidth.

Active Publication Date: 2018-07-31
中科天玑数据科技股份有限公司
AI Technical Summary

Problems solved by technology

[0004] Traditional software-based data compression or decompression methods, such as GZip, can reduce the storage overhead of the system, but they occupy a large amount of CPU resources during compression or decompression, which may cause a decrease in system processing capacity.


Examples


Embodiment 1

[0043] This embodiment implements a hardware-accelerator-card-based compression prototype on Apache HDFS (Hadoop Distributed File System). HDFS is an open-source implementation of Google's GFS and underlies many projects in the Hadoop ecosystem.

[0044] HDFS upper-layer applications use clients to write or read files. A file in HDFS is divided into multiple file blocks of the same size, and the last block may be smaller than the others. Different blocks belonging to the same file may be stored on different data nodes, and each block has three replicas across the data nodes.
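To make the block layout concrete, the following is a minimal sketch (not from the patent) of how a file of arbitrary length maps onto fixed-size blocks, with the last block possibly smaller; the 128 MB block size and the helper names are illustrative assumptions.

```java
// Illustrative sketch only: computes HDFS-style block sizes for a file.
// The 128 MB block size and the method names are assumptions, not the patent's API.
import java.util.ArrayList;
import java.util.List;

public class BlockLayout {
    static final long BLOCK_SIZE = 128L * 1024 * 1024; // assumed block size

    // All blocks are BLOCK_SIZE bytes except possibly the last one.
    static List<Long> blockSizes(long fileLength) {
        List<Long> sizes = new ArrayList<>();
        long remaining = fileLength;
        while (remaining > 0) {
            long size = Math.min(remaining, BLOCK_SIZE);
            sizes.add(size);
            remaining -= size;
        }
        return sizes;
    }

    public static void main(String[] args) {
        // A 300 MB file yields two full 128 MB blocks and one smaller 44 MB block.
        System.out.println(blockSizes(300L * 1024 * 1024));
    }
}
```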

[0045] See Figure 1, a system structure diagram of the compression and decompression method based on a hardware accelerator card on a distributed file system. The compression and decompression method provided in this embodiment completes the compression or decompression of file blocks by invoking a hardware accelerator card, and the compression or decompres...
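The write path described here can be sketched as follows. This is a hedged illustration, not the patent's actual code: the AcceleratorCard and DataNodeChannel interfaces, the fragment size, and all names are assumptions standing in for the card's driver wrapper and the client-to-data-node channel.

```java
// Hypothetical sketch of the client-side write path: split a file block into
// fragments, compress each fragment on the accelerator card, then send the
// compressed fragments to a data node. All interfaces here are assumptions.
import java.util.Arrays;

interface AcceleratorCard {                 // assumed wrapper around the card's driver
    byte[] compress(byte[] fragment);
    byte[] decompress(byte[] fragment);
}

interface DataNodeChannel {                 // assumed network channel to a data node
    void sendFragment(byte[] compressedFragment);
}

public class CompressingBlockWriter {
    private static final int FRAGMENT_SIZE = 64 * 1024;   // assumed fragment size

    static void writeBlock(byte[] block, AcceleratorCard card, DataNodeChannel channel) {
        for (int off = 0; off < block.length; off += FRAGMENT_SIZE) {
            int end = Math.min(off + FRAGMENT_SIZE, block.length);
            byte[] fragment = Arrays.copyOfRange(block, off, end);
            // Offload compression to the accelerator card instead of the CPU.
            byte[] compressed = card.compress(fragment);
            channel.sendFragment(compressed);
        }
    }
}
```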

Embodiment 2

[0073] See Figure 6, a schematic diagram of the system structure of another compression and decompression method based on a hardware accelerator card in a distributed system. This embodiment is a further extension of Embodiment 1. In addition to being used on HDFS clients and data nodes, the hardware accelerator card can also serve upper-layer applications. HDFS upper-layer applications, such as distributed databases, distributed data warehouses, the MapReduce framework, and other applications that need to store data, can independently call the hardware accelerator card to compress or decompress data in the form of data streams, and then store the processed data in the distributed file system or a local file system, transmit it over the network, or use it for other purposes. The compression method library drives the hardware accelerator card to compress or decompress through the driver program and can create data input streams and output streams. The stream...
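A minimal sketch of how such a stream-based library call might look to an upper-layer application, assuming the hypothetical AcceleratorCard interface from the earlier sketch; the class name, buffer handling, and constructor are illustrative, not the patent's actual API.

```java
// Hypothetical stream wrapper: an OutputStream that compresses buffered data on
// the accelerator card before passing it to the underlying stream. Names are assumptions.
import java.io.FilterOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.util.Arrays;

public class HwCompressionOutputStream extends FilterOutputStream {
    private final AcceleratorCard card;    // assumed wrapper around the card's driver
    private final byte[] buffer;           // small cache, as described in the abstract
    private int filled = 0;

    public HwCompressionOutputStream(OutputStream out, AcceleratorCard card, int bufferSize) {
        super(out);
        this.card = card;
        this.buffer = new byte[bufferSize];
    }

    @Override
    public void write(int b) throws IOException {
        buffer[filled++] = (byte) b;
        if (filled == buffer.length) {
            flushFragment();
        }
    }

    @Override
    public void flush() throws IOException {
        flushFragment();
        super.flush();
    }

    private void flushFragment() throws IOException {
        if (filled == 0) return;
        // Compress the buffered fragment on the accelerator, then write it downstream.
        byte[] compressed = card.compress(Arrays.copyOf(buffer, filled));
        out.write(compressed);
        filled = 0;
    }
}
```

An application could wrap, for example, a local or HDFS file output stream in this class and write data as usual, with compression transparently offloaded to the card.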



Abstract

The invention discloses a compression and decompression method based on a hardware accelerator card on a distributed file system, and belongs to the technical field of distributed file systems. The technology includes: when a client writes data to a data node, the file block is first divided into fragments, each fragment is compressed by the hardware accelerator card, and the compressed fragments are sent to the data node; when the client reads data from a data node, it first retrieves the fragments containing the requested data from the data node, calls the hardware accelerator card to decompress and combine the fragments, and sends the combined data to the upper-layer application. An upper-layer application can also independently use the hardware accelerator card to compress or decompress data in the form of data streams. The proposed technology uses a hardware accelerator card in the distributed file system and in upper-layer applications, needs only a small cache to realize data compression or decompression, offloads the large amount of CPU resources consumed by traditional compression methods, and is completely transparent to users of the system.
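As a counterpart to the earlier write-path sketch, the read path outlined in the abstract could look roughly like the following; it reuses the hypothetical AcceleratorCard interface from above, and the class and method names are assumptions for illustration only.

```java
// Hypothetical read-path sketch: take the compressed fragments retrieved from the
// data node, decompress them on the accelerator card, and combine the results
// before handing them to the upper-layer application.
import java.io.ByteArrayOutputStream;
import java.util.List;

public class DecompressingBlockReader {
    // 'compressedFragments' are the fragments covering the requested data;
    // 'card' is the assumed accelerator wrapper defined in the earlier sketch.
    static byte[] readBlock(List<byte[]> compressedFragments, AcceleratorCard card) {
        ByteArrayOutputStream combined = new ByteArrayOutputStream();
        for (byte[] fragment : compressedFragments) {
            byte[] plain = card.decompress(fragment);   // hardware-offloaded decompression
            combined.write(plain, 0, plain.length);
        }
        return combined.toByteArray();                  // combined data for the upper layer
    }
}
```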

Description

Technical field

[0001] The invention relates to the technical field of distributed file systems, in particular to a compression and decompression method based on a hardware accelerator card on a distributed file system.

Background technique

[0002] With the advent of the data age, the amount of data to be processed on the Internet keeps increasing. To ensure high data reliability, current distributed file systems generally adopt a multi-copy strategy; in a large-scale cluster, however, this brings a storage overhead that cannot be ignored. At the same time, systems or applications on top of the distributed file system, such as distributed databases, distributed data warehouses, the MapReduce framework, or other applications, may also generate redundant data, making the data expansion rate even higher, and I/O performance has become an increasingly obvious bottleneck of the system. It is difficult for the existing distributed file system to meet th...


Application Information

Patent Type & Authority Patents(China)
IPC IPC(8): G06F17/30
Inventor 刘佳王锐坚查礼程学旗
Owner 中科天玑数据科技股份有限公司