Hierarchical clustering method and device

A hierarchical clustering technology, applied in special data processing applications, instruments, electronic digital data processing, etc. It addresses problems such as excessive consumption of time and resources and poor clustering results that hinder data analysis. It reduces the amount of computation, saves computing time and resources, and yields more reliable clustering results.

Active Publication Date: 2017-10-10
XIAOMI INC

AI Technical Summary

Problems solved by technology

[0005] In related technologies, implementing hierarchical clustering by calculating the distance between classes requires too much computation. When a class contains many data objects, this consumes too much time and too many resources. Moreover, each class may contain data objects that do not belong to it, that is, noise. When such data objects are used to calculate the inter-class distance and form a new class, more noise may be introduced, resulting in poor clustering results, which is not conducive to subsequent data analysis.



Embodiment Construction

[0052] Reference will now be made in detail to the exemplary embodiments, examples of which are illustrated in the accompanying drawings. When the following description refers to the accompanying drawings, the same numerals in different drawings refer to the same or similar elements unless otherwise indicated. The implementations described in the following exemplary examples do not represent all implementations consistent with the present invention. Rather, they are merely examples of apparatuses and methods consistent with aspects of the invention as recited in the appended claims.

[0053] Hierarchical clustering methods can be applied in various scenarios, such as clustering customer groups with different purchasing power in market analysis, or clustering organisms of different populations in biology. Specifically, the embodiments of the present disclosure take a face recognition scenario as an example to describe the hierarchical clustering met...



Abstract

The disclosure relates to a method and device for hierarchical clustering, belonging to the field of data mining. The method includes: obtaining a set of data objects to be clustered, where the set includes multiple classes and each class corresponds to at least one data object; clustering the data objects corresponding to a first class to obtain a clustering result, where the number of data objects corresponding to the first class exceeds a first preset threshold, the clustering result includes a plurality of clusters, and each cluster includes at least one data object; filtering the data objects corresponding to the first class according to the clustering result to obtain representative data objects of the first class; calculating the inter-class distance based on the representative data objects of the first class and the data objects corresponding to a second class; and performing hierarchical clustering on the set of data objects based on the inter-class distance. In the present disclosure, by clustering the data objects corresponding to the first class, the amount of calculation is reduced, calculation time and resources are saved, and the clustering result is more reliable, which benefits subsequent data analysis.
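The steps in the abstract can be sketched roughly as follows. This is only an illustrative sketch, not the patented implementation: the abstract does not say which inner clustering algorithm, filtering rule, or distance measure is used, so the greedy "leader" clustering, the "keep the largest cluster" filter, the average Euclidean inter-class distance, and every function and parameter name below (`size_threshold`, `radius`, `keep`, `merge_threshold`) are assumptions made for the example.

```python
import math


def leader_cluster(points, radius):
    """Assumed inner clustering: a point joins the first cluster whose
    leader (first member) lies within `radius`, else it starts a new one."""
    clusters = []
    for p in points:
        for cluster in clusters:
            if math.dist(p, cluster[0]) <= radius:
                cluster.append(p)
                break
        else:
            clusters.append([p])
    return clusters


def representatives(points, size_threshold, radius, keep):
    """Return representative data objects for one class.

    Only classes larger than `size_threshold` are clustered and filtered.
    Assumed filter: keep up to `keep` members of the largest cluster, so
    objects in small outlying clusters (likely noise) are dropped.
    """
    if len(points) <= size_threshold:
        return list(points)
    largest = max(leader_cluster(points, radius), key=len)
    return largest[:keep]


def inter_class_distance(a, b):
    """Assumed distance: average pairwise Euclidean distance."""
    return sum(math.dist(p, q) for p in a for q in b) / (len(a) * len(b))


def hierarchical_cluster(classes, size_threshold, radius, keep, merge_threshold):
    """Repeatedly merge the two closest classes, measuring distance on
    representatives only, until no pair is closer than `merge_threshold`."""
    classes = [list(c) for c in classes]
    while len(classes) > 1:
        reps = [representatives(c, size_threshold, radius, keep) for c in classes]
        distance, i, j = min(
            (inter_class_distance(reps[i], reps[j]), i, j)
            for i in range(len(classes))
            for j in range(i + 1, len(classes))
        )
        if distance >= merge_threshold:
            break
        classes[i] = classes[i] + classes[j]
        del classes[j]
    return classes


# Hypothetical 2-D data: class_a is "large" and contains one noise point.
class_a = [(0.0, 0.0), (0.1, 0.0), (0.0, 0.1), (0.1, 0.1), (5.0, 5.0)]
class_b = [(0.2, 0.2)]
class_c = [(10.0, 10.0), (10.1, 10.0)]
merged = hierarchical_cluster([class_a, class_b, class_c],
                              size_threshold=4, radius=0.5, keep=3,
                              merge_threshold=1.0)
print(len(merged), "classes after merging")
```

With these assumed settings, the distance between `class_a` and `class_b` is computed on three representatives rather than all five objects, and the outlier `(5.0, 5.0)` does not take part in the distance calculation.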

Description

Technical field

[0001] The present disclosure relates to the field of data mining, in particular to a method and device for hierarchical clustering.

Background

[0002] In the field of data mining, it is usually necessary to analyze a large amount of data in order to obtain valuable analysis results. Clustering is an important class of algorithm for analyzing data in this field. It groups a set of data objects by category, aggregating data objects of the same category into one class, to facilitate subsequent data analysis. Among clustering algorithms, hierarchical clustering is commonly used.

[0003] In related technologies, hierarchical clustering is implemented by calculating the distance between two classes, that is, the inter-class distance, so that two classes whose inter-class distance is less than a certain value are merged into a new class. Since each class may contain more than one data object, when ...
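As a rough illustration of the related-technology approach described in the background, the sketch below computes an inter-class distance over every pair of data objects and merges two classes when that distance falls below a threshold. The source does not name a distance measure, so the average pairwise Euclidean distance is an assumption, as are the function names and the sample points.

```python
import math


def inter_class_distance(class_a, class_b):
    """Return (average pairwise distance, number of distance evaluations).

    Every object in class_a is compared with every object in class_b, so
    the cost grows as len(class_a) * len(class_b).
    """
    total = 0.0
    evaluations = 0
    for p in class_a:
        for q in class_b:
            total += math.dist(p, q)
            evaluations += 1
    return total / evaluations, evaluations


def should_merge(class_a, class_b, threshold):
    """Merge rule from the background: distance below a given value."""
    distance, _ = inter_class_distance(class_a, class_b)
    return distance < threshold


# Hypothetical data: the same class with and without one noise object.
clean = [(0.0, 0.0), (0.0, 1.0)]
noisy = clean + [(100.0, 100.0)]
other = [(1.0, 0.0)]

print(should_merge(clean, other, threshold=2.0))
print(should_merge(noisy, other, threshold=2.0))
```

Two classes of 1,000 objects each would need 1,000,000 distance evaluations per comparison under this scheme, and in the sample data a single outlier is enough to change the merge decision.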


Application Information

Patent Type & Authority: Patents (China)
IPC(8): G06F17/30
CPC: G06F18/231
Inventor: 陈志军, 代阳, 杨松
Owner: XIAOMI INC