
Method and system for removing duplicated files

A file identification technology, applied in the field of network communication, that addresses the low efficiency of deduplication and the heavy consumption and waste of server resources. It shortens the file identification information and improves deduplication efficiency.

Status: Inactive
Publication Date: 2011-09-14
BEIJING PEOPLE HAPPY INFORMATION TECH
3 Cites, 9 Cited by

AI Technical Summary

Problems solved by technology

[0003] However, in the prior art, deduplication is carried out by comparing the content of the file the user wants to store with the content of the files already in the storage space. This makes deduplication very inefficient, and comparing file contents requires a large amount of server resources, especially for large files with a lot of content, resulting in a serious waste of server resources.



Examples


Example 2

[0051] Example 2: the crawling server fetches a group of link addresses from the background server and downloads the corresponding files through those link addresses. The identification information generation module II 5, arranged on the crawling server, generates the corresponding file identification information from the content of each downloaded file. The identification information sending module 2, arranged on the crawling server, sends the generated file identification information to the application server. The detection module 3, arranged on the application server, checks the database server, that is, it compares the file identification information of "How to retrieve the database", which "Zhang San" will upload, with the other file identification information stored in the database server. The information feedback module 4, arranged on the application server, feeds the corresponding detection result information back to "Zhang San", that is, if there is a file in the database server that is related to "How to re...
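The crawling flow in Example 2 can be sketched roughly as follows. This is only an illustration under stated assumptions: the patent text shown here does not name the algorithm used to generate file identification information, so SHA-256 stands in for it, and `file_id_from_stream`, `filter_new_files`, and the example URLs are hypothetical names, not part of the patent.

```python
import hashlib
import io
from typing import BinaryIO


def file_id_from_stream(stream: BinaryIO, chunk_size: int = 65536) -> str:
    """Generate file identification information from content.

    Assumption: SHA-256 is used as the identifier. The content is read in
    chunks so that large downloads need not be held in memory at once.
    """
    digest = hashlib.sha256()
    for chunk in iter(lambda: stream.read(chunk_size), b""):
        digest.update(chunk)
    return digest.hexdigest()


def filter_new_files(downloads: dict[str, bytes], known_ids: set[str]) -> list[str]:
    """Return the links whose content has not been seen before.

    `known_ids` plays the role of the identifiers held by the database
    server; it is updated in place when a new file is found.
    """
    new_links = []
    for link, content in downloads.items():
        file_id = file_id_from_stream(io.BytesIO(content))
        if file_id not in known_ids:
            known_ids.add(file_id)
            new_links.append(link)
    return new_links


# Hypothetical downloads: two links point at identical content.
downloads = {
    "http://example.com/a": b"same content",
    "http://example.com/b": b"same content",
    "http://example.com/c": b"other content",
}
known: set[str] = set()
new_links = filter_new_files(downloads, known)
print(new_links)  # ['http://example.com/a', 'http://example.com/c']
```

Only the fixed-length identifiers are compared, so the cost of each check does not grow with the size of the files already stored.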



Abstract

The invention discloses a method and system for removing duplicated files. The method comprises the steps of: S1, a client generating unique file identification information according to the file contents; S2, sending the generated file identification information to an application server; S3, the application server detecting whether the file identification information exists in a database server; and S4, the application server feeding back detection result information. In the invention, an identification information generating module I generates the corresponding file identification information according to the file contents, and a detection module arranged on the application server detects whether the same file identification information exists in the database server. Because only the short file identification information is checked, and not the lengthy file contents, the efficiency of removing duplicated files is greatly improved and server resources are greatly saved. Mass data statistics indicate that the duplicated file removing efficiency is improved by 30 percent after the method and system are adopted.
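Steps S1 to S4 can be illustrated with a minimal sketch. It rests on several assumptions that are not in the abstract: SHA-256 stands in for the unspecified identifier algorithm, an in-memory set stands in for the database server, and the class names, method names, and the "stored"/"duplicate" result strings are hypothetical. Recording a new identifier when no match is found is likewise an assumption about what happens after S4.

```python
import hashlib
from dataclasses import dataclass, field


def generate_file_id(content: bytes) -> str:
    """S1: derive file identification information from the file contents.

    Assumption: a SHA-256 hex digest serves as the identifier.
    """
    return hashlib.sha256(content).hexdigest()


@dataclass
class DatabaseServer:
    """Hypothetical stand-in for the database server; holds identifiers only."""
    file_ids: set[str] = field(default_factory=set)


@dataclass
class ApplicationServer:
    """Hypothetical application server hosting the detection and feedback steps."""
    database: DatabaseServer

    def handle_identifier(self, file_id: str) -> str:
        """S3 and S4: detect whether the identifier exists, then feed back the result."""
        if file_id in self.database.file_ids:
            return "duplicate"
        # Assumption: an unseen identifier is recorded so later uploads match it.
        self.database.file_ids.add(file_id)
        return "stored"


app = ApplicationServer(DatabaseServer())

# S2 is modelled as a plain method call carrying only the identifier.
first = app.handle_identifier(generate_file_id(b"how to retrieve the database"))
second = app.handle_identifier(generate_file_id(b"how to retrieve the database"))
third = app.handle_identifier(generate_file_id(b"a different file"))
print(first, second, third)  # stored duplicate stored
```

In this sketch only the 64-character identifier crosses from client to server, which reflects the abstract's point that short identification information is checked instead of full file contents.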

Description

Technical field

[0001] The invention relates to a method and system for removing duplicated files, belonging to the field of network communication.

Background

[0002] Storing files is very important for computer networks. Useful files are kept in the storage space so that users can access and browse them. With the development of society, more and more useful files need to be stored, and the demand for storage space keeps increasing. An effective approach is therefore to deduplicate the files to be stored, that is, to detect whether the storage space already contains the file to be stored: if it does, the file is not stored; if it does not, the file is stored.

[0003] However, in the prior art, deduplication is carried out by comparing the content of the file the user wants to store with the content of the files already in the storage space. This makes deduplication very inefficient, and comparing file contents requires a large amount of server resources, especially files w...


Application Information

Patent Type & Authority: Applications (China)
IPC: IPC(8): G06F17/30
Inventor: 梁亮, 王剑清, 杨忠伟, 徐其斌
Owner: BEIJING PEOPLE HAPPY INFORMATION TECH