
Similarity analysis method, device and system

A similarity analysis technique, applied to similarity analysis methods, devices and systems, addresses the performance bottleneck in grouped multi-node deduplication. Its effects include reducing the amount of data retrieved, improving performance, and reducing resource occupation.

Active Publication Date: 2016-01-06
HUAWEI TECH CO LTD
3 Cites, 0 Cited by

AI Technical Summary

Problems solved by technology

[0005] The embodiments of the present invention provide a similarity analysis method, device and system to solve the problem that existing similarity analysis becomes the performance bottleneck of grouped multi-node deduplication.




Embodiment Construction

[0068] To make the purpose, technical solutions and advantages of the embodiments of the present invention clearer, the technical solutions in the embodiments are described clearly and completely below with reference to the accompanying drawings. The described embodiments are some, but not all, of the embodiments of the present invention. All other embodiments obtained by persons of ordinary skill in the art based on these embodiments without creative effort fall within the protection scope of the present invention.

[0069] Figure 1 is a schematic flowchart of a similarity analysis method provided by an embodiment of the present invention. As shown in Figure 1, the method includes:

[0070] 101. Acquire file fingerprint information of a file to be analyzed.

[0071] For example, the data deduplication engine (DataDeduplicateEngine, DDE for short) p...



Abstract

Embodiments of the present invention provide a similarity analysis method, device and system. The method includes: obtaining file fingerprint information of a file to be analyzed; sending an analysis request carrying the file fingerprint information to at least two MDSs, so that each of the at least two MDSs queries its local file fingerprint information set according to the file fingerprint information; selecting at least one group according to the analysis results returned by each MDS, where each analysis result includes the group number and similarity of at least one group, found by that MDS, with the highest similarity to the file fingerprint information; and sending the block fingerprint information of each data block in the file to be analyzed to the MDS to which the selected group belongs, so that the MDS performs a local duplicate-block query within the selected group. Each MDS only needs to query the file fingerprint information set of the groups it is responsible for, which reduces the amount of data retrieved and the time spent waiting for read and write locks on database files.
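
The flow in the abstract can be illustrated with a minimal Python sketch. It is not the patented implementation: the class and function names (`MDS`, `analyze`, `find_duplicates`, `similarity_analysis`) are hypothetical, the file fingerprint information is assumed to be a set of strings, and Jaccard overlap is an assumed similarity measure, since the excerpt does not say how similarity is computed. The sketch also queries the MDSs one after another, whereas a real system would presumably send the requests concurrently.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class MDS:
    """Hypothetical stand-in for one MDS and the groups it is responsible for."""
    name: str
    # group number -> file fingerprint information set held locally (assumed layout)
    file_fps: dict[int, set[str]] = field(default_factory=dict)
    # group number -> block fingerprints already stored in that group (assumed layout)
    block_fps: dict[int, set[str]] = field(default_factory=dict)

    def analyze(self, file_fp: set[str]) -> tuple[int, float] | None:
        """Return (group number, similarity) for the most similar local group."""
        best = None
        for group, fps in self.file_fps.items():
            union = file_fp | fps
            # Jaccard similarity is an assumption made for this sketch.
            sim = len(file_fp & fps) / len(union) if union else 0.0
            if best is None or sim > best[1]:
                best = (group, sim)
        return best

    def find_duplicates(self, group: int, blocks: list[str]) -> list[str]:
        """Local duplicate-block query restricted to one selected group."""
        stored = self.block_fps.get(group, set())
        return [b for b in blocks if b in stored]


def similarity_analysis(file_fp: set[str], blocks: list[str], servers: list[MDS]):
    """Ask every MDS for its best group, select one, then query duplicates there."""
    results = []
    for mds in servers:
        answer = mds.analyze(file_fp)  # each MDS searches only its own groups
        if answer is not None:
            group, sim = answer
            results.append((sim, group, mds))
    if not results:
        return None
    sim, group, owner = max(results, key=lambda r: r[0])
    return owner.name, group, sim, owner.find_duplicates(group, blocks)


mds_a = MDS("mds-a", {1: {"f1", "f2", "f3"}, 2: {"f9"}}, {1: {"b1", "b2"}, 2: {"b9"}})
mds_b = MDS("mds-b", {3: {"f1", "f7"}}, {3: {"b3"}})
print(similarity_analysis({"f1", "f2", "f4"}, ["b1", "b5", "b2"], [mds_a, mds_b]))
# prints ('mds-a', 1, 0.5, ['b1', 'b2'])
```

In this sketch each MDS scans only the fingerprint sets of its own groups, which mirrors the stated benefit: no single node has to search the full fingerprint set, and block-level lookups touch only the selected group.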

Description

Technical field

[0001] The embodiments of the present invention relate to the field of data storage, and in particular to a similarity analysis method, device and system.

Background

[0002] With the development of science and technology, the amount of information in society has increased dramatically. The growing amount of data to be stored, and the resulting increase in storage capacity and storage costs, have become important problems for enterprises. Data deduplication technology stores only a single instance of data that appears multiple times in the stored data, which effectively reduces storage capacity requirements in scenarios such as data backup and saves storage costs. In data deduplication technology, multi-node concurrent deduplication has proved to be an effective way to speed up deduplication processing and improve deduplication performance.

[0003] In the multi-node deduplication so...


Application Information

Patent Type & Authority: Patents (China)
IPC: IPC(8): G06F17/30
CPC: G06F3/0608, G06F3/0641, G06F3/067, G06F3/0611, G06F16/1748, G06F16/284
Inventor: 黄焰
Owner: HUAWEI TECH CO LTD