Compression and decompression method based on hardware accelerator card on distributive-type file system

A distributed file system and hardware acceleration technology, applied in special data processing applications, instruments, electrical digital data processing, etc.; it can solve problems such as heavy CPU resource consumption and a decline in system processing capacity, and achieves the effect of improving effective bandwidth.

Active Publication Date: 2013-04-03
中科天玑数据科技股份有限公司

AI Technical Summary

Problems solved by technology

[0004] Traditional software-based data compression and decompression methods, such as GZip, can reduce the storage overhead of the system, but they occupy a large amount of CPU resources during compression or decompression, which may cause a decline in the system's processing capacity.

Method used



Examples


Embodiment 1

[0043] This embodiment implements a compression prototype based on a hardware accelerator card on top of Apache HDFS (Hadoop Distributed File System). HDFS is an open-source implementation of Google's GFS and is the basis of various projects in the Hadoop ecosystem.

[0044] HDFS upper-layer applications write and read files through a client. A file in HDFS is divided into multiple file blocks of the same size, and the last file block may be smaller than the others. File blocks belonging to the same file may be stored on different data nodes, and each file block is kept as three replicas across the data nodes.
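As an illustration of the block layout just described, the following Java sketch splits an input stream into fixed-size blocks where only the last block may be shorter. The 128 MB block size, the example file name, and the class itself are illustrative assumptions, not part of the patent.

```java
// Minimal sketch (not from the patent): splitting a file into fixed-size blocks
// the way HDFS does, where only the last block may be smaller than the rest.
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class BlockSplitter {
    // Illustrative HDFS-style block size; real deployments configure this value.
    static final int BLOCK_SIZE = 128 * 1024 * 1024;

    /** Reads the input stream and returns it as a list of blocks of at most BLOCK_SIZE bytes. */
    static List<byte[]> split(InputStream in) throws IOException {
        List<byte[]> blocks = new ArrayList<>();
        byte[] buf = new byte[BLOCK_SIZE];
        int filled;
        while ((filled = readFully(in, buf)) > 0) {
            blocks.add(Arrays.copyOf(buf, filled)); // last block may be shorter
        }
        return blocks;
    }

    /** Fills buf as far as possible; returns the number of bytes actually read. */
    static int readFully(InputStream in, byte[] buf) throws IOException {
        int off = 0;
        while (off < buf.length) {
            int n = in.read(buf, off, buf.length - off);
            if (n < 0) break; // end of stream
            off += n;
        }
        return off;
    }

    public static void main(String[] args) throws IOException {
        // "example.dat" is a hypothetical input file used only for this sketch.
        try (InputStream in = Files.newInputStream(Paths.get("example.dat"))) {
            List<byte[]> blocks = split(in);
            System.out.println("File split into " + blocks.size() + " blocks");
        }
    }
}
```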

[0045] See Figure 1, a system structure diagram of the compression and decompression method based on a hardware accelerator card on a distributed file system. The compression and decompression method provided in this embodiment completes the compression or decompression of file blocks by invoking a hardware accelerator card, and the compression or decompres...
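The write path of this embodiment can be pictured with the following hedged Java sketch: a file block is cut into fragments and each fragment is compressed before being forwarded to a data node. The HardwareAccelerator interface, the fragment size, and the Deflater-based software stand-in are assumptions; the patent's accelerator card is driven through a native driver that is not shown here.

```java
// Hedged sketch of the write path: the client cuts a file block into fragments,
// compresses each fragment, and would then forward the results to a data node.
import java.io.ByteArrayOutputStream;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.zip.Deflater;

public class WritePathSketch {

    /** Hypothetical abstraction over the accelerator card. */
    interface HardwareAccelerator {
        byte[] compress(byte[] fragment);
    }

    /** Software stand-in so the sketch runs without the card. */
    static class SoftwareFallback implements HardwareAccelerator {
        public byte[] compress(byte[] fragment) {
            Deflater deflater = new Deflater();
            deflater.setInput(fragment);
            deflater.finish();
            ByteArrayOutputStream out = new ByteArrayOutputStream();
            byte[] buf = new byte[8192];
            while (!deflater.finished()) {
                out.write(buf, 0, deflater.deflate(buf));
            }
            deflater.end();
            return out.toByteArray();
        }
    }

    /** Splits one file block into fixed-size fragments, compresses each, returns them in order. */
    static List<byte[]> compressBlock(byte[] block, int fragmentSize, HardwareAccelerator card) {
        List<byte[]> compressedFragments = new ArrayList<>();
        for (int off = 0; off < block.length; off += fragmentSize) {
            int end = Math.min(off + fragmentSize, block.length);
            byte[] fragment = Arrays.copyOfRange(block, off, end);
            compressedFragments.add(card.compress(fragment)); // one accelerator call per fragment
        }
        return compressedFragments; // a real client would now stream these to the data node
    }

    public static void main(String[] args) {
        byte[] block = new byte[1 << 20]; // 1 MB block of zeros, purely illustrative
        List<byte[]> fragments = compressBlock(block, 64 * 1024, new SoftwareFallback());
        System.out.println("Block compressed into " + fragments.size() + " fragments");
    }
}
```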

Embodiment 2

[0073] See Figure 6, a schematic diagram of the system structure of another compression and decompression method based on a hardware accelerator card in a distributed system. This embodiment is a further extension of Embodiment 1. In addition to being used on HDFS clients and data nodes, the hardware accelerator card can also be used by upper-layer applications. HDFS upper-layer applications, such as distributed databases, distributed data warehouses, the MapReduce framework, and other applications that need to store data, can call the hardware accelerator card independently to compress or decompress data in the form of data streams; the processed data can then be stored in the distributed file system or the local file system, transmitted over the network, or used for other purposes. The compression method library drives the hardware accelerator card to compress or decompress through the driver program, and can create data input streams and output streams. The stream ...
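A minimal sketch of the stream-style interface described for upper-layer applications follows. The class and method names are assumptions, and java.util.zip.DeflaterOutputStream stands in for the accelerator-backed output stream that the compression method library would create through the card's driver.

```java
// Hedged sketch of a stream interface for upper-layer applications (Embodiment 2):
// the application writes ordinary bytes and compression happens transparently.
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.util.zip.DeflaterOutputStream;

public class StreamApiSketch {

    /** Wraps a destination stream so that everything written to it is compressed. */
    static OutputStream compressedStream(OutputStream destination) {
        // A real implementation would route this through the accelerator card's driver;
        // DeflaterOutputStream is a software stand-in with the same stream semantics.
        return new DeflaterOutputStream(destination);
    }

    public static void main(String[] args) throws IOException {
        // Hypothetical usage by a distributed database or MapReduce job writing a local file.
        try (OutputStream out = compressedStream(new FileOutputStream("table-part-0.z"))) {
            out.write("row1,alpha\nrow2,beta\n".getBytes("UTF-8"));
        }
    }
}
```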


PUM

No PUM

Abstract

The invention discloses a compression and decompression method based on a hardware accelerator card on a distributed file system, which belongs to the technical field of distributed file systems. The method comprises the following steps: when a client writes data to a data node, the file block is first fragmented, the fragments are compressed by the hardware accelerator card, and the compressed fragments are transmitted to the data node; when the client reads data from a data node, all fragments containing the requested data are first retrieved from the data node, the hardware accelerator card is called to decompress and combine the fragments, and the combined data is transmitted to the upper-layer application. An upper-layer application can also use the hardware accelerator card independently to compress or decompress data in the form of data streams. Because the hardware accelerator card is used both in the distributed file system and in upper-layer applications, data compression or decompression can be realized with only a small cache, the large amount of central processing unit (CPU) resources consumed by traditional compression methods can be offloaded, and complete transparency to users of the system is achieved.
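To complement the write-path sketch above, the following hedged Java sketch mirrors the read path summarized in the abstract: the compressed fragments covering the requested data are decompressed one by one and combined before being returned to the upper-layer application. The Inflater-based decompression is a software stand-in for the accelerator call, and all names are illustrative.

```java
// Hedged sketch of the read path: decompress each retrieved fragment in order,
// then concatenate the results for the upper-layer application.
import java.io.ByteArrayOutputStream;
import java.util.List;
import java.util.zip.DataFormatException;
import java.util.zip.Inflater;

public class ReadPathSketch {

    /** Decompresses one fragment (software stand-in for the accelerator call). */
    static byte[] decompress(byte[] compressedFragment) throws DataFormatException {
        Inflater inflater = new Inflater();
        inflater.setInput(compressedFragment);
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        byte[] buf = new byte[8192];
        while (!inflater.finished()) {
            int n = inflater.inflate(buf);
            if (n == 0 && inflater.needsInput()) break; // truncated input, stop defensively
            out.write(buf, 0, n);
        }
        inflater.end();
        return out.toByteArray();
    }

    /** Decompresses the fragments in order and combines them into one byte array. */
    static byte[] combine(List<byte[]> compressedFragments) throws DataFormatException {
        ByteArrayOutputStream combined = new ByteArrayOutputStream();
        for (byte[] fragment : compressedFragments) {
            byte[] plain = decompress(fragment);
            combined.write(plain, 0, plain.length);
        }
        return combined.toByteArray();
    }

    public static void main(String[] args) throws Exception {
        // Round trip against the software stand-in used in the write-path sketch.
        java.util.zip.Deflater deflater = new java.util.zip.Deflater();
        deflater.setInput("hello fragments".getBytes("UTF-8"));
        deflater.finish();
        byte[] buf = new byte[8192];
        int n = deflater.deflate(buf);
        deflater.end();
        List<byte[]> fragments =
            java.util.Collections.singletonList(java.util.Arrays.copyOf(buf, n));
        System.out.println(new String(combine(fragments), "UTF-8")); // prints "hello fragments"
    }
}
```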

Description

Technical field

[0001] The invention relates to the technical field of distributed file systems, and in particular to a compression and decompression method based on a hardware accelerator card on a distributed file system.

Background technique

[0002] With the advent of the data age, the amount of data the Internet has to process keeps increasing. To ensure high data reliability, current distributed file systems generally adopt a multi-copy strategy; in a large-scale cluster, however, this brings a storage overhead that cannot be ignored. At the same time, systems and applications built on the distributed file system, such as distributed databases, distributed data warehouses, the MapReduce framework, and other applications, may also generate redundant data, pushing the data expansion rate even higher, and I/O performance has become an increasingly obvious bottleneck of the system. It is difficult for the existing distributed file system to meet th...

Claims


Application Information

Patent Type & Authority Applications(China)
IPC IPC(8): G06F17/30
Inventor 刘佳胡肖查礼
Owner 中科天玑数据科技股份有限公司