Data deduplication method and device

A data deduplication technology, applied in the field of data statistics, that improves comparison efficiency and reduces the amount of data stored.

Status: Inactive; Publication Date: 2021-07-27
CHINA ACADEMY OF INFORMATION & COMM
Cites: 7; Cited by: 0

AI Technical Summary

Problems solved by technology

Similar problems may also exist in other data deduplication applications.




Embodiment Construction

[0022] Embodiments of the technical solutions of the present invention are described in detail below with reference to the accompanying drawings. The following embodiments serve only to illustrate the technical solution more clearly; they are examples only and should not be used to limit the protection scope of the present invention.

[0023] It should be noted that, unless otherwise specified, the technical terms or scientific terms used in this application shall have the usual meanings understood by those skilled in the art to which the present invention belongs.

[0024] As shown in Figure 1, this embodiment provides a data deduplication method, including:

[0025] Step S1, constructing the longest common substring table according to the acquired target data.

[0026] The target data is in string format, and each item of target data refers to an object. For example, if the object is an application, the target data may be ...
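Step S1 can be illustrated with a minimal sketch. The visible text does not say how the table is populated, so the pairwise construction below, the `min_length` threshold, the function names, and the sample application names are all assumptions made for illustration, not the patented procedure.

```python
from itertools import combinations


def longest_common_substring(a: str, b: str) -> str:
    """Return one longest contiguous substring shared by a and b (classic DP)."""
    best_len, best_end = 0, 0
    prev = [0] * (len(b) + 1)  # match lengths for the previous row of the DP
    for i in range(1, len(a) + 1):
        curr = [0] * (len(b) + 1)
        for j in range(1, len(b) + 1):
            if a[i - 1] == b[j - 1]:
                curr[j] = prev[j - 1] + 1  # extend the match ending at (i-1, j-1)
                if curr[j] > best_len:
                    best_len, best_end = curr[j], i
        prev = curr
    return a[best_end - best_len:best_end]


def build_lcs_table(target_data, min_length=2):
    """Assumed construction: collect the longest common substring of every pair
    of target strings, ignoring very short matches (hypothetical threshold)."""
    table = set()
    for a, b in combinations(target_data, 2):
        common = longest_common_substring(a, b)
        if len(common) >= min_length:
            table.add(common)
    return table


# Hypothetical application names, used only to exercise the sketch.
table = build_lcs_table(["RedVideo", "BlueVideo", "GreenMusic"])
```

Under this assumption the table ends up holding generic fragments (here `Video`) that unrelated applications share.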


Abstract

The invention belongs to the technical field of data statistics and relates in particular to a data deduplication method and device. The method includes: constructing a longest common substring table from the acquired target data; extracting the longest common substring of two data items to be deduplicated; comparing that longest common substring with the substrings in the longest common substring table; and, if the table contains no substring identical to it, deduplicating the two data items. The method and device do not need to update the data in the table frequently, which reduces the amount of stored data and improves the efficiency of data comparison during deduplication.
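The comparison step in the abstract can be sketched roughly as follows. The table is taken as given (a plain set of strings), the function names are hypothetical, and the handling of pairs with no shared substring is an assumption the source does not address. The longest common substring is obtained here with the standard library's `difflib.SequenceMatcher.find_longest_match`, which is a convenience choice rather than anything the patent specifies.

```python
from difflib import SequenceMatcher


def longest_common_substring(a: str, b: str) -> str:
    """Longest contiguous block shared by a and b, via the standard library."""
    matcher = SequenceMatcher(None, a, b, autojunk=False)
    match = matcher.find_longest_match(0, len(a), 0, len(b))
    return a[match.a:match.a + match.size]


def should_deduplicate(a: str, b: str, lcs_table: set) -> bool:
    """Follow the abstract's rule: deduplicate the pair when its longest common
    substring is NOT found in the table."""
    common = longest_common_substring(a, b)
    if not common:
        # Assumption: strings sharing nothing are not treated as duplicates.
        return False
    return common not in lcs_table


# Hypothetical table content and names, for illustration only.
lcs_table = {"Video"}
same_app = should_deduplicate("Tencent QQ", "QQ", lcs_table)
different_apps = should_deduplicate("RedVideo", "BlueVideo", lcs_table)
```

In this sketch the pair "Tencent QQ" / "QQ" is flagged for deduplication because its shared substring is absent from the table, while two names that share only a tabled fragment are left alone.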

Description

Technical field

[0001] The invention relates to the technical field of data statistics, in particular to a method and device for data deduplication.

Background

[0002] Applications in a mobile application store may be duplicated and need to be deduplicated; likewise, when analyzing data on applications across different mobile application stores, the same application must be deduplicated.

[0003] The same application can have multiple names. For example, it may adopt different names in different periods: video application software may add the title of the latest popular drama to the application name. The same application may also use different names in different app stores, such as Tencent QQ and QQ. Other situations can likewise cause the same application to have different names.

[0004] Different applications also have the problem of similar names (existing...

Application Information

Patent Type & Authority: Patents (China)
IPC(8): G06F16/215
CPC: G06F16/215
Inventor: 路博王跃方诗旭张育雄郭丽杨小燕刘艺
Owner: CHINA ACADEMY OF INFORMATION & COMM