Repeating data deleting algorithm in cloud storage

A data deduplication technology, applied in digital data processing, data-processing input/output, computing and related fields, that addresses problems such as difficult retrieval and wasted cloud resources. It avoids erroneous and missed deletions and offers high deletion accuracy and good detection performance.

Status: Inactive. Publication Date: 2017-05-03
SICHUAN YONGLIAN INFORMATION TECH CO LTD

AI Technical Summary

Problems solved by technology

[0004] Similarly, for users who lease cloud space, the space fills up with a large amount of duplicate data, wh...

Examples

Example Embodiment

[0019] Step 1: Data chunking

[0020] Cloud storage involves three roles: the client, which collects user information and generates commands; the server, which manages user operations and requests and is therefore responsible for identifying redundant data; and the cloud space, which stores user information. The algorithm first divides the data into blocks. Assume that the information F consists of several files and that each file is divided into blocks, each treated as a complete data block. The complete data information flow is then obtained as follows:

[0021]

[0022] Here, t_0 and t_g denote the stagnation step counts of the individual extremum and the global extremum of the data block boundary offset, respectively; T_0 and T_g denote the corresponding stagnation step thresholds at which the individual extremum and the global extremum need to be disturbed.
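
The formula referred to in [0021] is not reproduced in this record, so the following is only a minimal sketch of the bookkeeping that [0022] describes: a counter of stagnation steps (t_0 or t_g) compared against a threshold (T_0 or T_g). The class name `StagnationCounter`, the assumption that smaller values are better, and the choice to trigger the disturbance as soon as the counter reaches the threshold are all hypothetical and are not taken from the patent.

```python
class StagnationCounter:
    """Track how long an extremum has failed to improve (hypothetical helper)."""

    def __init__(self, threshold):
        self.threshold = threshold  # plays the role of T_0 or T_g
        self.steps = 0              # plays the role of t_0 or t_g
        self.best = None            # best value seen so far (assumed: lower is better)

    def update(self, value):
        """Record a candidate value; return True when a disturbance is due."""
        if self.best is None or value < self.best:
            # The extremum improved, so the stagnation count starts over.
            self.best = value
            self.steps = 0
            return False
        # No improvement: one more stagnation step.
        self.steps += 1
        if self.steps >= self.threshold:
            # Assumed policy: signal a disturbance and reset the counter.
            self.steps = 0
            return True
        return False


counter = StagnationCounter(threshold=3)
signals = [counter.update(v) for v in [5, 4, 4, 4, 4, 3]]
print(signals)
```

One such counter would be kept per individual extremum and one for the global extremum; how the disturbance itself changes the data block boundary offset is not specified in the text available here.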

[0023] Step 2: Generate a verification information storag...

Abstract

The invention provides an algorithm for deleting duplicate data in cloud storage. Fourth-order cumulant slicing is used to aggregate the energy and suppress the noise of the duplicate data information flow in the cloud storage system; post-filtering is then applied after duplicate data detection, and multi-threaded information flow feature encoding is created to delete the duplicate data. The algorithm avoids the erroneous deletions and missed deletions caused by interference features in the data information flow. Its performance in detecting duplicate data in the cloud storage system is good, the accuracy of deleting duplicate data is high, and its overall performance is better than that of the traditional algorithm.
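
The abstract does not say which slice of the fourth-order cumulant is used. As a hedged illustration only, the sketch below estimates the common diagonal slice c4(τ, τ, τ) = E[x(n) x(n+τ)^3] − 3 R(τ) R(0) of a real, mean-removed sequence, where R is the autocorrelation. Fourth-order cumulants of Gaussian noise are zero in theory, which is the usual reason such a slice is associated with noise suppression. The function name `fourth_order_cumulant_slice` and the sample-average estimator are assumptions, not the patent's definitions.

```python
def fourth_order_cumulant_slice(x, max_lag):
    """Estimate the diagonal slice c4(tau, tau, tau) for tau = 0..max_lag.

    Assumed estimator: sample averages over the overlapping part of the
    sequence after removing the sample mean.
    """
    n = len(x)
    mean = sum(x) / n
    z = [v - mean for v in x]           # zero-mean version of the sequence
    r0 = sum(v * v for v in z) / n      # R(0): variance estimate
    result = []
    for tau in range(max_lag + 1):
        m = n - tau                     # number of overlapping samples
        m4 = sum(z[i] * z[i + tau] ** 3 for i in range(m)) / m
        r_tau = sum(z[i] * z[i + tau] for i in range(m)) / m
        result.append(m4 - 3.0 * r_tau * r0)
    return result


# A +/-1 alternating sequence has E[x^4] = 1 and R(0) = 1, so c4(0,0,0) = 1 - 3 = -2.
signal = [1.0, -1.0] * 8
print(fourth_order_cumulant_slice(signal, max_lag=1))
```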

Description

Technical field

[0001] The invention relates to computer storage, to the deletion and retrieval of duplicate data in cloud storage, and to signal processing.

Background

[0002] With the development of information technology and network technology, big data and massive data have become the main business of data centers, and deduplication and compression are technologies that can save a large amount of data storage. Backup alone is not enough; deduplication and compression will soon become must-have features for primary storage. Data deduplication is a compression technique that minimizes the amount of data by identifying duplicate content, removing the duplicates, and leaving a pointer in the corresponding storage location; this pointer is created by hashing a data pattern of a given size. At present, only a few primary storage arrays provide deduplication as an additional function of the product; it is reported that less than 5% of disk arrays really support onl...
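
The general idea of hash-based deduplication described in the background (identify duplicate content, store it once, and leave a hash-derived pointer in its place) can be sketched as follows. This is a generic illustration, not the patented algorithm: the fixed chunk size, the use of SHA-256 as the fingerprint, and the function names `deduplicate` and `restore` are assumptions made for the example.

```python
import hashlib


def deduplicate(data: bytes, chunk_size: int = 4):
    """Split data into fixed-size chunks and store each distinct chunk once."""
    store = {}     # fingerprint -> chunk bytes, each distinct chunk kept once
    pointers = []  # one fingerprint per chunk, in original order
    for start in range(0, len(data), chunk_size):
        chunk = data[start:start + chunk_size]
        fingerprint = hashlib.sha256(chunk).hexdigest()
        store.setdefault(fingerprint, chunk)  # duplicates are not stored again
        pointers.append(fingerprint)
    return store, pointers


def restore(store, pointers):
    """Rebuild the original data by following the pointers."""
    return b"".join(store[fp] for fp in pointers)


data = b"ABCDABCDEFGHABCD"
store, pointers = deduplicate(data, chunk_size=4)
print(len(pointers), "chunks,", len(store), "stored")
assert restore(store, pointers) == data
```

In this toy input the chunk `ABCD` occurs three times but is stored once; the three occurrences are represented by the same fingerprint in the pointer list.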

Application Information

IPC(8): G06F3/06
CPC: G06F3/0608, G06F3/0641, G06F3/0652, G06F3/067
Inventor: 范勇
Owner: SICHUAN YONGLIAN INFORMATION TECH CO LTD