A cloud storage file-level deduplication retrieval system and method

A deduplication and retrieval system technology, applied in digital data information retrieval, electronic digital data processing, and special data processing applications. It addresses the waste of cloud resources caused by duplicate data and achieves a high deduplication rate and high execution efficiency.

Active Publication Date: 2022-01-25
WUHAN WUTOS +1

AI Technical Summary

Problems solved by technology

[0003] The technical problem to be solved by the present invention is that, in the prior art, duplicate data in the cloud space wastes precious cloud resources and generates additional overhead, and that comparing duplicate files is inefficient. To address these defects, the invention provides a cloud storage file-level deduplication retrieval system and method.



Examples


Embodiment Construction

[0033] In order to make the object, technical solution and advantages of the present invention clearer, the present invention will be further described in detail below in conjunction with the accompanying drawings and embodiments. It should be understood that the specific embodiments described here are only used to explain the present invention, not to limit the present invention.

[0034] As shown in Figure 1, the cloud storage file-level deduplication retrieval system of the embodiment of the present invention includes a client, a cloud storage platform, a fingerprint server, and a name server. The cloud storage platform is composed of a plurality of data nodes, wherein:

[0035] The data nodes are connected to the fingerprint server through the name server; the fingerprint server stores the feature information of the files in the data nodes; the client sends requests for searching and filtering files; Feature information performs coarse fil...
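The component roles in paragraphs [0034] and [0035] can be pictured with a small in-memory sketch. It is only an illustration under stated assumptions: the class names, method names, and the use of an opaque string as "feature information" are hypothetical and do not come from the patent text, which does not specify interfaces.

```python
from dataclasses import dataclass, field


@dataclass
class DataNode:
    """One storage node of the cloud storage platform (hypothetical model)."""
    node_id: str
    files: dict[str, bytes] = field(default_factory=dict)  # file name -> content


@dataclass
class FingerprintServer:
    """Stores feature information for the files held on the data nodes."""
    # Assumption: the feature is an opaque string mapped to (node_id, file name) pairs.
    index: dict[str, list[tuple[str, str]]] = field(default_factory=dict)

    def record(self, feature: str, node_id: str, name: str) -> None:
        self.index.setdefault(feature, []).append((node_id, name))

    def lookup(self, feature: str) -> list[tuple[str, str]]:
        return list(self.index.get(feature, []))


@dataclass
class NameServer:
    """Sits between the data nodes and the fingerprint server."""
    fingerprints: FingerprintServer
    nodes: dict[str, DataNode] = field(default_factory=dict)

    def add_node(self, node: DataNode) -> None:
        self.nodes[node.node_id] = node

    def store(self, node_id: str, name: str, data: bytes, feature: str) -> None:
        # Write the file to a data node and register its feature information.
        self.nodes[node_id].files[name] = data
        self.fingerprints.record(feature, node_id, name)

    def search(self, feature: str) -> list[tuple[str, str]]:
        # Serve a client search request by consulting the fingerprint server.
        return self.fingerprints.lookup(feature)
```

In this sketch a client would call `NameServer.search` with the feature of the file it wants to store and receive the locations of candidate files for further comparison.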



Abstract

The invention discloses a cloud storage file-level deduplication retrieval system and method. In the method, a fingerprint server stores the feature information of files. When a client submits a file storage request, coarse filtering is performed first: the fingerprint server is searched for file records with the same features. If none is found, the file is treated as a new file. If matching records are found, fine filtering is performed: the matching files are taken as comparison files, and random points and feature intervals of each comparison file are selected in turn for an exact comparison to confirm whether the requested file already exists. If it does, the metadata of the requested file in the name server is set to point to the metadata of the comparison file; if not, the file is stored and its feature information is recorded in the fingerprint server. Through the two steps of coarse and fine filtering, the invention can greatly reduce the storage of duplicate files, offers high execution efficiency and a high deduplication rate, and is suitable for big data and cloud storage environments.
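The two-step flow in the abstract can be sketched roughly as follows. This is a minimal illustration, not the patented implementation. The abstract does not say what the feature information is or how random points and feature intervals are chosen, so the choices here are assumptions: file size plus a SHA-256 hash of the first 4 KiB as the coarse feature, 16 random byte offsets, and three 1 KiB windows (head, middle, tail) as the intervals. All names are hypothetical.

```python
import hashlib
import random

HEAD_BYTES = 4096    # assumed prefix length hashed for the coarse feature
SAMPLE_POINTS = 16   # assumed number of random offsets probed in fine filtering
WINDOW = 1024        # assumed length of each compared interval


def coarse_feature(data: bytes) -> tuple[int, str]:
    """Hypothetical feature information: file size plus a hash of the head."""
    return len(data), hashlib.sha256(data[:HEAD_BYTES]).hexdigest()


def fine_match(candidate: bytes, stored: bytes, rng: random.Random) -> bool:
    """Fine filtering: compare random points first, then a few intervals."""
    if len(candidate) != len(stored):
        return False
    size = len(candidate)
    if size == 0:
        return True
    for _ in range(SAMPLE_POINTS):
        i = rng.randrange(size)
        if candidate[i] != stored[i]:
            return False
    # Assumed intervals: windows at the head, middle, and tail of the file.
    for start in (0, max(0, size // 2 - WINDOW // 2), max(0, size - WINDOW)):
        if candidate[start:start + WINDOW] != stored[start:start + WINDOW]:
            return False
    return True


class DedupStore:
    """In-memory stand-in for the fingerprint server, name server, and data nodes."""

    def __init__(self, seed: int = 0) -> None:
        self.fingerprints: dict[tuple[int, str], list[str]] = {}  # feature -> stored names
        self.names: dict[str, str] = {}    # file name -> name of the stored copy
        self.blobs: dict[str, bytes] = {}  # stored file contents
        self.rng = random.Random(seed)

    def put(self, name: str, data: bytes) -> bool:
        """Store a file; return True if it was deduplicated against an existing one."""
        key = coarse_feature(data)
        for other in self.fingerprints.get(key, []):            # coarse filtering
            if fine_match(data, self.blobs[other], self.rng):   # fine filtering
                self.names[name] = other  # point metadata at the existing file
                return True
        self.blobs[name] = data
        self.names[name] = name
        self.fingerprints.setdefault(key, []).append(name)
        return False

    def get(self, name: str) -> bytes:
        return self.blobs[self.names[name]]
```

With these assumptions, storing the same content under a second name adds only a name entry, while a file that shares the coarse feature but differs in a sampled point or window is stored as new. Because the fine comparison in this sketch is sample-based, a deployment that needs byte-exact guarantees might add a full comparison as a last step.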

Description

Technical field

[0001] The invention relates to the deletion and retrieval of duplicate data in computer storage and cloud storage, and in particular to a file-level duplicate data deletion retrieval system and method for cloud storage.

Background

[0002] The rapid development of the Internet has produced massive amounts of data, leading to an increasing number of transmission and storage scenarios for massive data. In this context, data storage technology has developed rapidly, and deduplication and compression are technologies that can save a large amount of storage space. Data deduplication minimizes the amount of data by identifying duplicate content, removing it, and leaving pointers in the corresponding storage locations. At present, only a few primary storage arrays provide deduplication as an additional function of the product; duplicate data wastes valuable cloud resources and generates additional overhead, and it is reported that less ...


Application Information

Patent Type & Authority Patents(China)
IPC IPC(8): G06F16/174
Inventor 董志勇邱琳赵航刘梦
Owner WUHAN WUTOS