Method and device of processing repeated data

A repeated data processing method and device, applied in the field of data processing. It addresses problems of existing deduplication approaches, such as occupied system bandwidth, ineffective use of bandwidth, and a large impact on the read and write performance of SSD primary storage arrays, and thereby reduces the impact of deduplication on read and write performance.

Active Publication Date: 2011-06-08
CHENGDU HUAWEI TECH

AI Technical Summary

Problems solved by technology

[0004] In the process of implementing the present invention, the inventors found the following problems in the prior art. In method one, the data can only be written into the SSD after the deduplication operation is completed, and the data to be written resides in memory in the meantime, which cannot be used effectively. The data deduplication operation therefore has a considerable impact on the read and write performance of the SSD primary storage array.
In method two, a separate thread must be started to perform the data deduplication operation, which adds input/output (hereinafter referred to as I/O) overhead and occupies system bandwidth, so the deduplication operation again greatly affects the read and write performance of the SSD primary storage array.

Examples

Embodiment 1

[0025] Figure 1 is a flow chart of Embodiment 1 of the repeated data processing method of the present invention. As shown in Figure 1, Embodiment 1 of the present invention provides a repeated data processing method, the method comprising:

[0026] Step 100, receiving a read request for reading data in a physical block, where the read request includes information of a mapping table corresponding to the physical block;

[0027] Step 101, writing the data in the physical block into memory according to the information of the mapping table, so as to read the data in the physical block;

[0028] Step 102, performing a data deduplication operation on the data of the physical block that has been written into memory.

[0029] Specifically, after a read request for reading the data in a physical block of the SSD is received, the data in the physical block is written into memory according to the information in the mapping table, wherein the data in the physical block is not p...
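
Steps 100 to 102 can be sketched as follows. This is a minimal illustration, not the patented implementation: the names `MappingEntry`, `handle_read`, and `deduplicate`, the dictionary standing in for the SSD, and the use of a SHA-256 fingerprint index to detect duplicates are all assumptions made for the example, since the visible text does not say how duplicates are identified.

```python
import hashlib
from dataclasses import dataclass


@dataclass
class MappingEntry:
    """Hypothetical mapping-table information carried by a read request."""
    block_id: int      # logical block the host asked for
    physical_id: int   # physical block currently backing it


def deduplicate(entry, data, fingerprints):
    """Step 102: deduplicate a block that is already in memory.

    `fingerprints` maps a content hash to the physical block that first
    stored that content (an assumed index structure).
    """
    digest = hashlib.sha256(data).hexdigest()
    owner = fingerprints.setdefault(digest, entry.physical_id)
    if owner != entry.physical_id:
        # The same content is already stored elsewhere: repoint the mapping.
        entry.physical_id = owner
        return True
    return False


def handle_read(entry, ssd, fingerprints):
    """Steps 100 to 102, all triggered by a single read request."""
    data = ssd[entry.physical_id]                            # Step 101
    was_duplicate = deduplicate(entry, data, fingerprints)   # Step 102
    return data, was_duplicate


# Physical blocks 0 and 1 hold identical content.
ssd = {0: b"hello" * 4, 1: b"hello" * 4, 2: b"world" * 4}
fingerprints = {}
a = MappingEntry(block_id=10, physical_id=0)
b = MappingEntry(block_id=11, physical_id=1)

data_a, dup_a = handle_read(a, ssd, fingerprints)
data_b, dup_b = handle_read(b, ssd, fingerprints)
```

In this sketch, deduplication works only on data that the read has already brought into memory, so it issues no I/O of its own.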

Embodiment 2

[0032] The repeated data processing method provided by Embodiment 2 of the present invention builds on Embodiment 1 above, with the difference that, optionally, the mapping table corresponding to the physical block includes a MAP value and flag bits.

[0033] Figure 2 is a schematic diagram of the mapping table corresponding to the physical block in the SSD in Embodiment 2 of the repeated data processing method of the present invention. As shown in Figure 2, each SSD can be divided into 32K units to establish physical blocks, and the mapping table corresponding to the physical blocks includes MAP values and flag bits. The first three bits of the mapping table are flag bits representing the state of the physical block: the first flag bit is a reuse flag bit, representing whether the data in the physical block is reused, and the second flag bit is an insert-index-table flag bit, repres...
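
The layout of a mapping table entry can be illustrated with a few bit operations. The sketch below assumes a 32-bit entry whose three most significant bits are the flag bits and whose remaining 29 bits hold the MAP value. The entry width, the bit order, and all names are assumptions, and the third flag is left unnamed because the visible text is cut off before describing it.

```python
ENTRY_BITS = 32                      # assumed entry width
FLAG_BITS = 3                        # the first three bits are flag bits
MAP_BITS = ENTRY_BITS - FLAG_BITS
MAP_MASK = (1 << MAP_BITS) - 1

REUSE_FLAG = 1 << (ENTRY_BITS - 1)         # first flag bit: data is reused
INSERT_INDEX_FLAG = 1 << (ENTRY_BITS - 2)  # second flag bit: insert-index-table flag
THIRD_FLAG = 1 << (ENTRY_BITS - 3)         # third flag bit: meaning not shown in the text


def pack_entry(map_value, reuse=False, insert_index=False, third=False):
    """Combine a MAP value and the three flag bits into one integer entry."""
    if not 0 <= map_value <= MAP_MASK:
        raise ValueError("MAP value does not fit in the entry")
    entry = map_value
    if reuse:
        entry |= REUSE_FLAG
    if insert_index:
        entry |= INSERT_INDEX_FLAG
    if third:
        entry |= THIRD_FLAG
    return entry


def unpack_entry(entry):
    """Split an entry back into its MAP value and flag bits."""
    return {
        "map_value": entry & MAP_MASK,
        "reuse": bool(entry & REUSE_FLAG),
        "insert_index": bool(entry & INSERT_INDEX_FLAG),
        "third": bool(entry & THIRD_FLAG),
    }


entry = pack_entry(0x1234, reuse=True)
fields = unpack_entry(entry)
```

Keeping the flags in the same word as the MAP value means the state of a physical block can be checked with a single mask when the entry is read.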

Embodiment 3

[0067] Figure 6 is a schematic structural diagram of Embodiment 3 of the repeated data processing device of the present invention. As shown in Figure 6, Embodiment 3 of the present invention provides a repeated data processing device, which includes a receiving module 1, a writing module 2, and a processing module 3.

[0068] The receiving module 1 is configured to receive a read request for reading data in a physical block, where the read request includes information of a mapping table corresponding to the physical block;

[0069] The writing module 2 is configured to write the data in the physical block into memory according to the information of the mapping table, so as to read the data in the physical block;

[0070] The processing module 3 is configured to perform a data deduplication operation on the data of the physical block that has been written into memory.

[0071] In the repeated data processing device provided by Embodiment 3 of the present invention, after the receiv...
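
The division into three modules might be mirrored in code roughly as follows. This is only a structural sketch: the class and method names, the dictionary-based request format, and the content index used to spot duplicates are hypothetical and are not taken from the patent text.

```python
class ReceivingModule:
    """Module 1: accepts a read request and extracts the mapping information."""

    def receive(self, request):
        return request["mapping"]


class WritingModule:
    """Module 2: copies the addressed physical block into memory."""

    def __init__(self, ssd, memory):
        self.ssd = ssd
        self.memory = memory

    def load(self, mapping):
        physical_id = mapping["physical_id"]
        self.memory[physical_id] = self.ssd[physical_id]
        return self.memory[physical_id]


class ProcessingModule:
    """Module 3: deduplicates data that is already in memory."""

    def __init__(self):
        self.index = {}  # content -> first physical block seen with it (assumed)

    def deduplicate(self, mapping, data):
        owner = self.index.setdefault(data, mapping["physical_id"])
        mapping["physical_id"] = owner
        return owner


class RepeatedDataDevice:
    """Wires the three modules together in the order the text describes."""

    def __init__(self, ssd):
        self.memory = {}
        self.receiving = ReceivingModule()
        self.writing = WritingModule(ssd, self.memory)
        self.processing = ProcessingModule()

    def read(self, request):
        mapping = self.receiving.receive(request)
        data = self.writing.load(mapping)
        self.processing.deduplicate(mapping, data)
        return data


device = RepeatedDataDevice({0: b"abc", 1: b"abc"})
first = {"mapping": {"physical_id": 0}}
second = {"mapping": {"physical_id": 1}}
result_first = device.read(first)
result_second = device.read(second)
```

After the second read, the mapping in `second` points at physical block 0, because block 1 held the same content.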

Abstract

The embodiment of the invention provides a method and a device for processing repeated data. The method comprises: receiving a read request for reading data in a physical block, wherein the read request comprises the information of a mapping table corresponding to the physical block; writing the data in the physical block into memory according to the information of the mapping table; and deleting repeated data from the data of the physical block written into memory. When the read request is received, the deduplication thread is triggered by the read request, which turns deduplication into a passive mode; the deduplication operation therefore has less influence on reading and writing data, and no extra I/O overhead is needed. The data in the physical block is written directly into memory without CACHE processing, thereby reducing the influence of the deduplication operation on the read and write performance of an SSD primary storage array.

Description

Technical field

[0001] The invention relates to data processing technology, in particular to a repeated data processing method and device.

Background

[0002] Solid-state drives (hereinafter referred to as SSD) have been applied to primary storage arrays because of their high performance. However, because SSD storage media are expensive, the storage space in the SSD must be used fully, and the data stored in the SSD may contain duplicates. Duplicate data occupies storage space in the SSD, so duplicate data stored in the SSD needs to be deleted through data deduplication technology.

[0003] In the prior art, there are many methods for data deduplication, for example: Method 1, the synchronous method, also called the in-band method, in which, when data in memory is written into the SSD, the data first resides in memory, and the deduplication thread is then called to identify the data to be writte...

Application Information

Patent Type & Authority: Patents (China)
IPC(8): G06F12/06
Inventor: 梁尚冬
Owner: CHENGDU HUAWEI TECH