Data deduplication method and device, equipment and medium

A data deduplication technology in the field of computer technology that addresses two problems: samples being missed by deduplication processing, and restricted efficiency of sample storage.

Active Publication Date: 2019-11-15
TENCENT TECH (SHENZHEN) CO LTD
11 Cites, 6 Cited by

AI Technical Summary

Problems solved by technology

In the existing technology, new samples to be stored can only be received after deduplication between the current batch of samples to be stored and the sample library has completed, which greatly restricts the efficiency of sample storage.
This restriction exists because a sample received while deduplication is in progress would be missed: the previous batch is still being compared with the sample library, so the new sample is never compared with that batch. If the newly received samples and the previous batch contain similar sample data, similar samples are then likely to end up in the sample library, resulting in repeated content recommendations.
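
The failure mode described above can be illustrated with a minimal sketch. It is not taken from the patent: the function names are hypothetical, and exact string equality stands in for whatever similarity measure a real system would use. It only shows what could happen if a second batch were accepted while the first is still being compared with the sample library.

```python
from typing import List


def dedup_against_library(batch: List[str], library_snapshot: List[str]) -> List[str]:
    """Keep samples that are neither in the library snapshot nor earlier in the same batch.

    Assumption: exact equality is a stand-in for a real similarity test.
    """
    kept: List[str] = []
    for sample in batch:
        if sample not in library_snapshot and sample not in kept:
            kept.append(sample)
    return kept


def store_overlapping_batches(
    library: List[str], first_batch: List[str], second_batch: List[str]
) -> List[str]:
    """Hypothetical unsafe flow: the second batch arrives before the first is stored.

    Both batches are compared with the same library snapshot, so the two
    batches are never compared with each other.
    """
    snapshot = list(library)
    kept_first = dedup_against_library(first_batch, snapshot)
    kept_second = dedup_against_library(second_batch, snapshot)
    return library + kept_first + kept_second
```

Calling `store_overlapping_batches(["a"], ["b", "a"], ["b", "c"])` leaves `"b"` in the result twice, which is the kind of duplicate the existing technology avoids by refusing new samples until deduplication finishes.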

Examples


Embodiment Construction

[0033] At present, to keep a recommendation system from recommending duplicate content to users, the sample library in the recommendation system must contain no similar sample data. The existing solution is to collect a certain number of samples to be stored and then deduplicate them against the samples in the current sample library; no new samples to be stored may be received until this deduplication process is complete. This greatly restricts the efficiency of sample storage. In view of this, the present application provides a data deduplication scheme that can effectively improve the efficiency of sample storage while keeping similar sample data out of the sample library.

[0034] For ease of understanding, the system architecture to which the technical solution ...

Abstract

The invention discloses a data deduplication method, device, equipment and medium. The method comprises: obtaining a sample library sent by a server to obtain a local sample library; obtaining a target request and adding the target request to a request queue in a preset database; obtaining a target sub-queue sent by the preset database, wherein the requests in the target sub-queue are all the requests ahead of the target request in the current request queue; judging whether the target sample corresponding to the target request is similar to any sample corresponding to the target sub-queue and, if so, prohibiting the target sample from being written into the local sample library; if not, judging whether the target sample is similar to a sample in the local sample library; if it is, prohibiting the write, and if not, writing the target sample into the local sample library. With this scheme, write requests can be obtained in parallel and deduplicated accordingly, which effectively improves sample storage efficiency while preventing similar sample data from appearing in the sample library.
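
The flow in the abstract can be sketched roughly as follows. This is an illustrative, single-process sketch under several assumptions, not the patented implementation: `RequestQueue` is a hypothetical in-memory stand-in for the request queue in the preset database, `is_similar` uses a Jaccard word-overlap threshold chosen only for illustration (the abstract does not say how similarity is judged), and removing a request from the queue once it has been handled is likewise an assumption.

```python
from typing import Dict, List


def is_similar(a: str, b: str, threshold: float = 0.8) -> bool:
    """Hypothetical similarity test: Jaccard overlap of lower-cased word sets."""
    words_a, words_b = set(a.lower().split()), set(b.lower().split())
    if not words_a and not words_b:
        return True
    return len(words_a & words_b) / len(words_a | words_b) >= threshold


class RequestQueue:
    """In-memory stand-in for the request queue kept in the preset database."""

    def __init__(self) -> None:
        self._order: List[int] = []          # request ids, oldest first
        self._samples: Dict[int, str] = {}   # request id -> sample
        self._next_id = 0

    def enqueue(self, sample: str) -> int:
        request_id = self._next_id
        self._next_id += 1
        self._order.append(request_id)
        self._samples[request_id] = sample
        return request_id

    def sample_of(self, request_id: int) -> str:
        return self._samples[request_id]

    def sub_queue_before(self, request_id: int) -> List[str]:
        """Samples of all requests currently ahead of request_id (the target sub-queue)."""
        position = self._order.index(request_id)
        return [self._samples[r] for r in self._order[:position]]

    def complete(self, request_id: int) -> None:
        """Assumption: a handled request leaves the queue."""
        self._order.remove(request_id)
        del self._samples[request_id]


def handle_request(request_id: int, queue: RequestQueue, local_library: List[str]) -> bool:
    """Return True if the target sample was written to the local sample library."""
    sample = queue.sample_of(request_id)
    try:
        # Step 1: compare with the samples of requests ahead in the queue.
        if any(is_similar(sample, other) for other in queue.sub_queue_before(request_id)):
            return False
        # Step 2: compare with the local sample library.
        if any(is_similar(sample, stored) for stored in local_library):
            return False
        local_library.append(sample)
        return True
    finally:
        queue.complete(request_id)
```

Because a request is enqueued before it is checked, a later request can be compared with earlier requests that are still pending, so new requests do not have to wait for earlier ones to finish before being accepted.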

Description

Technical field

[0001] The present application relates to the field of computer technology, in particular to a data deduplication method, device, equipment and medium.

Background technique

[0002] In an existing recommendation system, a sample library provided by a content center is usually stored. The recommendation system recommends content to users based on the sample data stored in this sample library.

[0003] To prevent the recommendation system from recommending repeated content to users, it is necessary to ensure that the sample library contains no similar sample data. In the prior art, a commonly used solution is to obtain a certain number of samples to be stored and then deduplicate them against the samples in the current sample library; during this deduplication process, no new samples to be stored are allowed to be received, and n...

Application Information

Patent Type & Authority: Applications (China)
IPC(8): G06F16/215, G06F16/25
CPC: G06F16/215, G06F16/254
Inventors: 常郅博, 李阳
Owner: TENCENT TECH (SHENZHEN) CO LTD