Data deduplication method, device, electronic device, and computer-readable storage medium

A technology for data and data blocks, applied in the field of data processing, that addresses IO performance degradation, the inability to achieve data deduplication, and the inability to search massive data for duplicates, with the effect of saving storage space.

Active Publication Date: 2022-04-29
ALIBABA CLOUD COMPUTING LTD

AI Technical Summary

Problems solved by technology

Currently, Windows Server 2012 implements deduplication of data within a single disk on the New Technology File System (NTFS), but this implementation has the following defects: 1. data deduplication can only be performed within a single disk, not globally; 2. it does not support searching for duplicate data in massive data; 3. searching for duplicate data degrades IO performance.

Examples

Embodiment Construction

[0090] Hereinafter, exemplary embodiments of the present invention will be described in detail with reference to the accompanying drawings so that those skilled in the art can easily implement them. Also, for clarity, parts not related to describing the exemplary embodiments are omitted from the drawings.

[0091] In the embodiments of the present invention, it should be understood that terms such as "comprising" or "having" are intended to indicate the presence of the features, numbers, steps, acts, components, parts, or combinations thereof disclosed in this specification, and are not intended to exclude the possibility that one or more other features, numbers, steps, acts, components, parts, or combinations thereof exist or are added.

[0092] In addition, it should be noted that, in the case of no conflict, the embodiments of the present invention and the features in the embodiments can be combined with each other. The embodiments of the present invention will be descr...

Abstract

The embodiment of the present invention discloses a data deduplication method, device, electronic device, and computer-readable storage medium. The method includes: obtaining a data container to be processed; taking, as the target data container, a data container whose data similarity with the data container to be processed meets a preset condition; and comparing the data container to be processed with the target data container, then confirming and deleting duplicate data in a post-processing flow. This technical solution can deduplicate massive data on a global scale and save storage space without reducing user IO performance.
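
The flow in the abstract can be illustrated with a minimal Python sketch. This is not the patented implementation: the `Container` class, the SHA-256 block fingerprints, the Jaccard similarity measure, and the 0.5 threshold are all assumptions chosen for illustration, since the excerpt does not specify how similarity is computed or what the preset condition is. The sketch runs as a batch pass over containers that are already stored, standing in for the post-processing flow, so no work is done on the write path.

```python
import hashlib
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Tuple


def fingerprint(block: bytes) -> str:
    """Content fingerprint of one data block (SHA-256 is an assumption)."""
    return hashlib.sha256(block).hexdigest()


@dataclass
class Container:
    """Hypothetical data container: an ordered list of data blocks."""
    name: str
    blocks: List[Optional[bytes]]
    # block index -> (target container name, target block index)
    refs: Dict[int, Tuple[str, int]] = field(default_factory=dict)

    def fingerprints(self) -> set:
        return {fingerprint(b) for b in self.blocks if b is not None}


def similarity(a: Container, b: Container) -> float:
    """Jaccard similarity of fingerprint sets (an assumed measure)."""
    fa, fb = a.fingerprints(), b.fingerprints()
    if not fa or not fb:
        return 0.0
    return len(fa & fb) / len(fa | fb)


def pick_target(pending: Container, candidates: List[Container],
                threshold: float = 0.5) -> Optional[Container]:
    """Return the most similar candidate if it meets the assumed threshold."""
    best = max(candidates, key=lambda c: similarity(pending, c), default=None)
    if best is None or similarity(pending, best) < threshold:
        return None
    return best


def deduplicate(pending: Container, target: Container) -> int:
    """Replace blocks of `pending` that also exist in `target` with references."""
    index: Dict[str, int] = {}
    for j, block in enumerate(target.blocks):
        if block is not None:
            index.setdefault(fingerprint(block), j)
    removed = 0
    for i, block in enumerate(pending.blocks):
        if block is None:
            continue
        j = index.get(fingerprint(block))
        # Confirm with a byte-for-byte comparison before deleting the copy.
        if j is not None and target.blocks[j] == block:
            pending.blocks[i] = None
            pending.refs[i] = (target.name, j)
            removed += 1
    return removed
```

Restricting the block comparison to one similar container, rather than searching every stored block, is what keeps the lookup bounded in this sketch; a production system would typically replace the exhaustive `max` over candidates with a sampled or indexed similarity search.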

Description

Technical field

[0001] Embodiments of the present invention relate to the technical field of data processing, and in particular to a data deduplication method, device, electronic device, and computer-readable storage medium that can be executed in a post-processing flow.

Background

[0002] With the development of data technology, users have increasingly high requirements for high-performance storage, especially in cloud computing block device storage, such as higher input/output operations per second (IOPS) and lower latency. As a result, the cost of high-performance storage, such as all-flash storage arrays and Non-Volatile Memory Express solid-state drives (NVMe SSDs), has also increased significantly. In this case, it becomes very meaningful if the space occupied by storage can be saved without reducing the performance of user input and output (IO, Input Ou...

Application Information

Patent Type & Authority: Patents (China)
IPC(8): G06F3/06, G06F16/13, G06F16/17, G06F16/174, G06F16/182
CPC: G06F3/0608, G06F3/0641, G06F3/067, G06F16/1734, G06F16/1748, G06F16/134, G06F16/182
Inventor: 佘海斌
Owner: ALIBABA CLOUD COMPUTING LTD