Selective clustering integration method based on data stability

A selective clustering integration (clustering ensemble) method based on data stability, applicable to fields such as instruments, character and pattern recognition, and computer components. It addresses problems such as dependence on prior knowledge, lack of adaptability, and low optimization efficiency.

Inactive Publication Date: 2018-09-25
SOUTH CHINA UNIV OF TECH

AI Technical Summary

Problems solved by technology

[0005] The purpose of the present invention is to overcome the problems of lack of adaptability, dependence on prior knowledge, and low optimization efficiency in traditional clustering integration selection methods, and propose a selective clustering integration method based on data stability, which can effectively improve clustering diversity, can automatica

Examples

Embodiment Construction

[0073] The present invention will be further described below in conjunction with specific examples.

[0074] As shown in Figure 1, the selective clustering integration method based on data stability provided in this embodiment uses a variety of clustering algorithms to generate clustering results and screens those results in two layers. It includes the following steps:

[0075] 1) Use the IRIS data set from the UCI Repository official website as the test data set, and apply min-max normalization to it:

$$\hat{X}_i^{(k)} = \frac{X_i^{(k)} - X_{\min}^{(k)}}{X_{\max}^{(k)} - X_{\min}^{(k)}}$$

[0076] where i∈{1,2,...,N}, k∈{1,2,...,F}.

[0077] Here, N=150 is the number of samples in the test data set, F=3 is the number of features of the test data set, $X_i^{(k)}$ is the value of the k-th feature of the i-th sample of the test data set, $X_{\min}^{(k)}$ is the minimum value of the k-th feature of the test data set, and $X_{\max}^{(k)}$ is the maximum value of the k-th feature of the test data set.
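The per-feature normalization in step 1) can be illustrated with a minimal sketch, assuming it is ordinary min-max scaling as the minimum and maximum definitions in [0077] suggest. The function name `min_max_normalize`, the three-row sample, and the handling of constant features are illustrative assumptions and do not come from the patent text.

```python
from typing import List


def min_max_normalize(data: List[List[float]]) -> List[List[float]]:
    """Scale every feature (column) of `data` into the range [0, 1]."""
    n_features = len(data[0])
    # Per-feature minimum and maximum over all samples.
    mins = [min(row[k] for row in data) for k in range(n_features)]
    maxs = [max(row[k] for row in data) for k in range(n_features)]

    normalized = []
    for row in data:
        new_row = []
        for k, value in enumerate(row):
            span = maxs[k] - mins[k]
            # Assumption: a constant feature is mapped to 0.0 to avoid
            # dividing by zero; the source does not cover this case.
            new_row.append((value - mins[k]) / span if span > 0 else 0.0)
        normalized.append(new_row)
    return normalized


# Hypothetical three-sample, three-feature input (not the real IRIS data).
sample = [[5.1, 3.5, 1.4], [4.9, 3.0, 1.4], [6.3, 3.3, 6.0]]
scaled = min_max_normalize(sample)
```

After scaling, every feature of `scaled` has minimum 0 and maximum 1, so features measured on different scales contribute comparably to the distance computations used by the clustering algorithms.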

[0078] 2) Collect random subspaces for the test data set, use different clustering algori...

Abstract

The invention discloses a selective clustering integration method based on data stability. The method comprises the following steps: 1) inputting a data set and preprocessing it; 2) generating a set of clustering results for the data set; 3) screening the clustering results to obtain a clustering subset; 4) dividing the samples, so that the data set is split into a stable subset and an unstable subset; 5) constructing an objective function based on the stable subset and the unstable subset, and further screening the clustering subset; and 6) fusing the final clustering subset to obtain the clustering result. Compared with traditional methods, the method has the following innovations: multi-view clustering is used to enhance diversity; suitable clustering algorithms are screened automatically, which avoids mismatches between an algorithm's data assumptions and the data; the objective function is designed around data stability and is therefore highly adaptable; and the degree of improvement of the indices is used to control the convergence direction of the multi-objective genetic algorithm, which improves convergence speed and accuracy.
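Step 4) of the abstract, splitting the samples into a stable and an unstable subset, can be pictured with the sketch below. The excerpt does not give the patent's stability formula, so this uses a common co-association heuristic as a stand-in: a sample counts as stable when the clusterings mostly agree on whether it belongs with each other sample. The function names, the 0.5 threshold, and the toy labelings are all hypothetical.

```python
from typing import List, Tuple


def co_association(labelings: List[List[int]]) -> List[List[float]]:
    """Fraction of clusterings in which samples i and j share a cluster."""
    n = len(labelings[0])
    counts = [[0] * n for _ in range(n)]
    for labels in labelings:
        for i in range(n):
            for j in range(n):
                if labels[i] == labels[j]:
                    counts[i][j] += 1
    m = len(labelings)
    return [[c / m for c in row] for row in counts]


def sample_stability(co: List[List[float]]) -> List[float]:
    """Assumed heuristic: mean of 2*|co[i][j] - 0.5| over all j != i.

    The score is 1.0 when every clustering agrees about sample i and
    0.0 when the clusterings are evenly split about it.
    """
    n = len(co)
    scores = []
    for i in range(n):
        terms = [abs(co[i][j] - 0.5) * 2 for j in range(n) if j != i]
        scores.append(sum(terms) / len(terms))
    return scores


def split_by_stability(
    stability: List[float], threshold: float = 0.5
) -> Tuple[List[int], List[int]]:
    """Return (stable indices, unstable indices) for a hypothetical threshold."""
    stable = [i for i, s in enumerate(stability) if s >= threshold]
    unstable = [i for i, s in enumerate(stability) if s < threshold]
    return stable, unstable


# Four toy clusterings of five samples; sample 4 switches groups.
labelings = [
    [0, 0, 1, 1, 1],
    [0, 0, 1, 1, 0],
    [1, 1, 0, 0, 0],
    [0, 0, 1, 1, 0],
]
co = co_association(labelings)
stability = sample_stability(co)
stable, unstable = split_by_stability(stability)
```

In this toy input, samples 0 to 3 are grouped consistently across the clusterings, while sample 4 joins each group in half of them and ends up in the unstable subset. An objective function as in step 5) could then treat the two subsets differently, although the excerpt does not say how.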

Description

Technical Field

[0001] The invention relates to the technical field of computer artificial intelligence, and in particular to a selective clustering integration method based on data stability.

Background Art

[0002] Cluster analysis is an important and challenging problem in machine learning and data mining. The goal of clustering is to group similar samples into the same class, but different clustering algorithms make different assumptions about the data, and a single algorithm has difficulty handling complex feature representations. Clustering integration addresses these problems well and is therefore widely used. Fusing multiple clustering results that are both diverse and accurate can often greatly improve clustering performance. However, the set of clustering results usually contains many noisy members which, if not removed, degrade the performance of the integration. The present invention mainly addresses this clustering integration selection problem.

[0003] In traditional c...

Application Information

IPC(8): G06K9/62
CPC: G06F18/23213, G06F18/25
Inventors: 余志文, 黄炜杰
Owner: SOUTH CHINA UNIV OF TECH