Deduplication processing method and device

A deduplication processing method and device, applied in electrical digital data processing, special data processing applications, instruments, etc., that address the problem of increased computational complexity and achieve the effect of avoiding repeated calculations and reducing computational complexity.

Active Publication Date: 2016-04-27
BEIJING QIHOO TECH CO LTD

AI Technical Summary

Problems solved by technology

[0012] In view of the above problems, the present invention proposes a deduplication processing method and a corresponding deduplication processing device to solve the technical problem that a vector clustered into multiple classes causes repeated calculations across those classes, which increases computational complexity.


Examples

Detailed Description of Embodiments

[0020] Exemplary embodiments of the present disclosure will be described in more detail below with reference to the accompanying drawings. Although exemplary embodiments of the present disclosure are shown in the drawings, it should be understood that the present disclosure may be embodied in various forms and should not be limited by the embodiments set forth herein. Rather, these embodiments are provided for more thorough understanding of the present disclosure and to fully convey the scope of the present disclosure to those skilled in the art.

[0021] Figure 1 shows a flowchart of a deduplication processing method 100 according to an embodiment of the present invention. As shown in Figure 1, the method 100 starts at step S101, in which at least two classes, each containing multiple vectors, are first obtained; the at least two classes contain the same vector to be processed. Optionally, the method 100 is a subsequent processing method for the MinHash algorit...
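As a rough illustration of the input condition in step S101, the following Python sketch finds the vectors that appear in more than one class. The dictionary-of-lists representation, the tuple encoding of vectors, and the helper name `find_shared_vectors` are assumptions made for this example and do not come from the patent text.

```python
from collections import defaultdict

def find_shared_vectors(classes):
    """Return {vector: [class labels]} for vectors present in two or more classes.

    `classes` maps a class label to a list of vectors, each vector being a tuple.
    This representation is an assumption for illustration only.
    """
    membership = defaultdict(list)
    for label, vectors in classes.items():
        for vec in set(vectors):  # count each vector at most once per class
            membership[vec].append(label)
    return {vec: labels for vec, labels in membership.items() if len(labels) >= 2}

# Hypothetical clustering output in which (2.0, 2.0) landed in both classes.
classes = {
    "A": [(0.0, 0.0), (1.0, 0.0), (2.0, 2.0)],
    "B": [(2.0, 2.0), (5.0, 5.0), (6.0, 5.0)],
}
shared = find_shared_vectors(classes)
```

Each key of `shared` is a candidate "vector to be processed" in the sense of step S101.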

Abstract

The invention discloses a deduplication processing method and device. The method comprises: calculating the center vector of each of at least two classes; calculating the distance between the vector to be processed and each of the at least two center vectors; determining, according to those distances, the class to which the vector to be processed belongs, and deleting the vector to be processed from the classes to which it does not belong, thereby obtaining at least two deduplicated classes. With this method, each vector exists in only one class. After processing by the deduplication method and device, subsequent repeated calculation is avoided and computational complexity is reduced.
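The steps in the abstract can be sketched in Python as follows. This is a minimal sketch under stated assumptions, not the patented implementation: the excerpt does not say how the center vector or the distance is defined, so the arithmetic mean and Euclidean distance are used here, and the vector is assumed to stay in the class with the nearest center. The function names are hypothetical.

```python
import math

def centroid(vectors):
    """Center vector of a class, assumed here to be the component-wise mean."""
    n = len(vectors)
    return tuple(sum(component) / n for component in zip(*vectors))

def euclidean(u, v):
    """Euclidean distance; the source does not specify the metric (assumption)."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))

def deduplicate(classes, target):
    """Keep `target` only in the class whose center is nearest to it.

    `classes` maps a label to a list of tuple vectors. The input is not mutated.
    """
    holders = [label for label, vecs in classes.items() if target in vecs]
    if len(holders) < 2:
        return {label: list(vecs) for label, vecs in classes.items()}
    # Step 1: center vector of each class that contains the target.
    centers = {label: centroid(classes[label]) for label in holders}
    # Step 2: distance from the target to each center.
    distances = {label: euclidean(target, centers[label]) for label in holders}
    # Step 3: assumed rule, the nearest center wins; delete from the others.
    keep = min(holders, key=distances.get)
    return {
        label: [v for v in vecs if v != target or label == keep]
        for label, vecs in classes.items()
    }

classes = {
    "A": [(0.0, 0.0), (1.0, 0.0), (2.0, 2.0)],
    "B": [(2.0, 2.0), (5.0, 5.0), (6.0, 5.0)],
}
result = deduplicate(classes, (2.0, 2.0))
```

In this hypothetical data the shared vector `(2.0, 2.0)` is closer to the center of class `A`, so it is removed from class `B` only. Whether the center should be computed with or without the shared vector is a design choice the excerpt leaves open; the sketch includes it.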

Description

Technical field

[0001] The invention relates to the technical field of computer processing, and in particular to a deduplication processing method and device.

Background art

[0002] As the saying goes, "birds of a feather flock together, and people fall into groups." Classification problems abound in the natural and social sciences. A class, in layman's terms, is a collection of similar elements. Cluster analysis, also known as group analysis, is a statistical method for studying the classification of samples or indicators. Cluster analysis originated in taxonomy. In early taxonomy, people relied mainly on experience and professional knowledge to classify, and rarely used mathematical tools for quantitative classification. With the development of science and technology, the requirements for classification have become increasingly strict, so that sometimes it is difficult to classify accurately only ...

Application Information

Patent Type & Authority: Patents (China)
IPC(8): G06F17/30
Inventor: 齐路何锐邦唐会军
Owner: BEIJING QIHOO TECH CO LTD