
Similarity analysis method, device and system

A similarity analysis technology, applied in relational databases, database models, instruments, etc., that addresses the problem of similarity analysis becoming a performance bottleneck in grouped multi-node deduplication, with the effects of reducing the amount of data retrieved, improving performance, and reducing resource occupation.

Active Publication Date: 2013-04-03
HUAWEI TECH CO LTD
Cites: 3; Cited by: 19

AI Technical Summary

Problems solved by technology

[0005] The embodiments of the present invention provide a similarity analysis method, device and system to solve the problem that existing similarity analysis becomes the performance bottleneck of grouped multi-node deduplication.




Embodiment Construction

[0068] To make the purpose, technical solutions and advantages of the embodiments of the present invention clearer, the technical solutions in the embodiments are described below clearly and completely with reference to the accompanying drawings. The described embodiments are some, but not all, of the embodiments of the present invention. All other embodiments obtained by persons of ordinary skill in the art based on these embodiments, without creative effort, fall within the protection scope of the present invention.

[0069] Figure 1 is a schematic flowchart of a similarity analysis method provided by an embodiment of the present invention. As shown in Figure 1, the method includes:

[0070] 101. Acquire file fingerprint information of a file to be analyzed.

[0071] For example, the data deduplication engine (Data Deduplicate Engine, DDE for short) pr...



Abstract

The embodiments of the invention provide a similarity analysis method, device and system. The method comprises the following steps: acquiring file fingerprint information of a file to be analyzed; sending an analysis request carrying the file fingerprint information to at least two metadata servers (MDS), so that each MDS queries all of its local file fingerprint information sets according to the file fingerprint information; selecting at least one sub-group according to the analysis results returned by the MDSs, where each analysis result comprises the group number and the similarity of the at least one sub-group that the MDS found to be most similar to the file fingerprint information; and sending the block fingerprint information of all data blocks in the file to be analyzed to the MDS to which the selected sub-group belongs, so that this MDS performs a duplicate-block query within the selected local sub-group. Because each MDS only needs to query the file fingerprint information sets of the sub-groups it is responsible for, the amount of data retrieved is reduced, and the time spent waiting on read, write and lock operations on files in the database can be reduced.
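The flow described in the abstract can be sketched in a few lines of Python. This is only an illustrative sketch, not the patented implementation: the class and function names (`MDS`, `file_fingerprint`, `deduplicate`) are hypothetical, and the choices of SHA-1 for block fingerprints, the k smallest block hashes as the file fingerprint, and an overlap ratio as the similarity measure are assumptions that the source does not specify. The abstract allows selecting at least one sub-group; the sketch selects exactly one, the highest-scoring.

```python
from __future__ import annotations

import hashlib
from dataclasses import dataclass, field


def block_fingerprint(block: bytes) -> str:
    """Fingerprint one data block (SHA-1 is an assumption, not from the source)."""
    return hashlib.sha1(block).hexdigest()


def file_fingerprint(blocks: list[bytes], k: int = 4) -> frozenset[str]:
    """Hypothetical file fingerprint: the k smallest distinct block fingerprints."""
    return frozenset(sorted({block_fingerprint(b) for b in blocks})[:k])


@dataclass
class MDS:
    """One metadata server that knows only the sub-groups it is responsible for."""
    name: str
    file_fps: dict[int, set[str]] = field(default_factory=dict)   # group number -> file fingerprint set
    block_fps: dict[int, set[str]] = field(default_factory=dict)  # group number -> stored block fingerprints

    def store(self, group: int, blocks: list[bytes]) -> None:
        """Record a file's fingerprints in one local sub-group."""
        self.file_fps.setdefault(group, set()).update(file_fingerprint(blocks))
        self.block_fps.setdefault(group, set()).update(block_fingerprint(b) for b in blocks)

    def analyze(self, fp: frozenset[str]) -> tuple[int, float] | None:
        """Return (group number, similarity) of the most similar local sub-group."""
        if not self.file_fps or not fp:
            return None
        # Assumed similarity: share of the file fingerprint found in the sub-group.
        group = max(self.file_fps, key=lambda g: len(fp & self.file_fps[g]))
        return group, len(fp & self.file_fps[group]) / len(fp)

    def find_duplicates(self, group: int, fps: list[str]) -> list[bool]:
        """Duplicate-block query restricted to the selected local sub-group."""
        known = self.block_fps.get(group, set())
        return [f in known for f in fps]


def deduplicate(blocks: list[bytes], servers: list[MDS]):
    """Ask every MDS for its best sub-group, then query blocks only on the winner."""
    fp = file_fingerprint(blocks)
    results = []
    for mds in servers:
        result = mds.analyze(fp)
        if result is not None:
            group, similarity = result
            results.append((similarity, group, mds))
    if not results:
        return None
    similarity, group, target = max(results, key=lambda r: r[0])
    flags = target.find_duplicates(group, [block_fingerprint(b) for b in blocks])
    return target.name, group, similarity, flags


# Usage: two servers, each holding one sub-group.
mds_a, mds_b = MDS("mds-a"), MDS("mds-b")
mds_a.store(1, [b"alpha", b"beta", b"gamma"])
mds_b.store(2, [b"delta", b"epsilon"])
print(deduplicate([b"alpha", b"beta", b"zeta"], [mds_a, mds_b]))
```

In this sketch the block-level query touches only one sub-group on one server, which mirrors the reduction in retrieved data that the abstract claims.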

Description

Technical field

[0001] The embodiments of the present invention relate to the field of data storage, and in particular to a similarity analysis method, device and system.

Background

[0002] With the development of science and technology, the amount of information in society has increased dramatically. The growing volume of data to be stored, and the resulting increase in storage capacity and storage cost, has become an important problem for enterprises. Data deduplication technology stores only a single instance of data that appears multiple times in the stored data, which effectively reduces storage capacity requirements in scenarios such as data backup and saves storage costs. Within data deduplication, multi-node concurrent deduplication has proved to be an effective way to speed up deduplication processing and improve deduplication performance.

[0003] In the multi-node deduplication so...


Application Information

IPC(8): G06F17/30
CPC: G06F3/0608, G06F3/0641, G06F3/067, G06F3/0611, G06F16/1748, G06F16/284
Inventor: 黄焰
Owner: HUAWEI TECH CO LTD