Multi-channel quantification and hierarchical clustering method based on multi-core parallel computation

A hierarchical clustering technology applied in the field of cluster analysis, aiming at more accurate clustering results and a wide range of applications

Status: Inactive; Publication Date: 2012-09-12
XI AN JIAOTONG UNIV

AI Technical Summary

Problems solved by technology

How to cluster large-scale data effectively presents both opportunities and challenges to researchers.



Embodiment Construction

[0037] The present invention will be further described in detail below in conjunction with specific embodiments, which are explanations of the present invention rather than limitations.

[0038] Figure 1 is a schematic flow diagram of the efficient clustering of large-scale data in the present invention. The method includes the following steps:

[0039] First, step 01 is executed: a random sampling algorithm is applied in the hierarchical sampling module to sample the original data hierarchically. The number of sampling levels is determined by the number of clustering levels, and the amount of sampled data increases from level to level;

[0040] The hierarchical sampling reads the original data scattered across the individual files and stores them in a single file to form a data set, and then applies a random sampling algo...
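The sampling in step 01 can be sketched as below. This is a minimal illustration, not the patented implementation: the function name `hierarchical_sample`, the `level_sizes` parameter, and the choice to draw each level independently from the full data set are assumptions, since the source only states that the number of levels follows the number of clustering levels and that the sample size grows from level to level.

```python
import random


def hierarchical_sample(data, level_sizes, seed=0):
    """Draw one random sample per level (hypothetical helper).

    data        -- list of points already merged into one data set
    level_sizes -- strictly increasing sample sizes, one per clustering level
    """
    # The source requires the amount of sampled data to grow level by level.
    if any(a >= b for a, b in zip(level_sizes, level_sizes[1:])):
        raise ValueError("level sizes must increase from level to level")
    if level_sizes and level_sizes[-1] > len(data):
        raise ValueError("largest level exceeds the data set size")

    rng = random.Random(seed)  # seeded so that runs are repeatable
    # Assumption: each level is sampled without replacement from the full set.
    return [rng.sample(data, size) for size in level_sizes]
```

With three clustering levels, for example, `hierarchical_sample(data, [10, 100, 1000])` would return three samples of 10, 100, and 1000 points.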


Abstract

The invention discloses a multi-channel quantification and hierarchical clustering method based on multi-core parallel computation, comprising the following steps: 1) hierarchically sampling the original data; 2) clustering the sampled data of the first level with a K-means clustering algorithm; 3) quantizing the sampled data of the next level to all centers of the current level with a multi-channel quantification method under a parallel framework; 4) clustering each group of quantized data at the current level separately with an adaptive K-means clustering algorithm under the parallel framework; and 5) merging the cluster centers of all groups at the current level into one large set of centers. Because the method adopts hierarchical sampling and parallel adaptive clustering, the clustering time is reduced; because it adopts the multi-channel quantification method, the clustering accuracy is improved.
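Steps 2) to 5) of the abstract can be sketched as follows. This is an illustrative reading under stated assumptions, not the patented algorithm: "multi-channel quantification" is assumed to mean assigning each point to its `m` nearest centers, the "adaptive" K-means is assumed to pick `k` from the group size (capped by a hypothetical `branch` parameter), and a thread pool stands in for the parallel framework so the sketch runs anywhere. In CPython, `concurrent.futures.ProcessPoolExecutor` would be closer to true multi-core execution. All function and parameter names are hypothetical.

```python
import random
from concurrent.futures import ThreadPoolExecutor


def dist2(p, q):
    """Squared Euclidean distance between two points (tuples of floats)."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def mean(points):
    """Component-wise mean of a non-empty list of points."""
    n = len(points)
    return tuple(sum(c) / n for c in zip(*points))


def kmeans(points, k, iters=10, seed=0):
    """Plain Lloyd-style K-means; returns at most k centers."""
    rng = random.Random(seed)
    centers = rng.sample(points, min(k, len(points)))
    for _ in range(iters):
        groups = [[] for _ in centers]
        for p in points:
            nearest = min(range(len(centers)), key=lambda i: dist2(p, centers[i]))
            groups[nearest].append(p)
        # Keep the old center when a group ends up empty.
        centers = [mean(g) if g else c for g, c in zip(groups, centers)]
    return centers


def multi_assign(points, centers, m=2):
    """Assumed 'multi-channel' step: send each point to its m nearest centers."""
    groups = [[] for _ in centers]
    for p in points:
        order = sorted(range(len(centers)), key=lambda i: dist2(p, centers[i]))
        for i in order[:m]:
            groups[i].append(p)
    return groups


def cluster_group(job):
    """Assumed 'adaptive' rule: k grows with group size, capped by branch."""
    group, branch, seed = job
    if not group:
        return []
    k = max(1, min(branch, len(group) // 2))
    return kmeans(group, k, seed=seed)


def next_level(sample, centers, branch=3, m=2, workers=4):
    """Steps 3-5: quantize, cluster each group in parallel, merge the centers."""
    groups = multi_assign(sample, centers, m)
    jobs = [(g, branch, i) for i, g in enumerate(groups)]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        per_group = list(pool.map(cluster_group, jobs))  # map keeps job order
    return [c for group_centers in per_group for c in group_centers]


def hierarchical_cluster(samples, k0=4, branch=3, m=2):
    """samples: one sample per level, smallest first (output of step 1)."""
    centers = kmeans(samples[0], k0)  # step 2: cluster the first level
    for sample in samples[1:]:        # steps 3-5 for every further level
        centers = next_level(sample, centers, branch, m)
    return centers
```

Because every group is clustered independently of the others, the per-group jobs are the natural unit of parallel work; the merged center list then serves as the quantization targets for the next, larger sample.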

Description

Technical field

[0001] The invention belongs to the technical field of cluster analysis and relates to a multi-channel quantification and hierarchical clustering method based on multi-core parallel computation.

Background technique

[0002] In recent years, the development of computer and communication technology has brought great changes to human civilization, and people can obtain data faster and more conveniently. This momentum will continue, so that the amount of data grows exponentially. However, while we possess large-scale data, we lack sufficient understanding and application of the information it contains, and traditional data analysis methods can no longer meet current data analysis and processing requirements. The rich information contained in databases has not been fully explored and utilized, which leaves us in the awkward situation of being rich in data but poor in information. People are...


Application Information

Patent Type & Authority: Applications (China)
IPC(8): G06F17/30, G06F9/38
Inventor: 廖开阳刘贵忠乔珍刘超腾肖莉
Owner: XI AN JIAOTONG UNIV