Shadow rough fuzzy clustering method based on Mahalanobis distance

The method combines fuzzy clustering with the Mahalanobis distance and is applied in the field of data processing. It addresses the uncertainty caused by abnormal data and the unsatisfactory division of boundary regions, improves the clustering of unevenly distributed samples, and produces more effective cluster divisions.

Pending Publication Date: 2019-12-24
NANJING UNIV OF INFORMATION SCI & TECH

AI Technical Summary

Problems solved by technology

[0006] The purpose of the present invention is to overcome the shortcomings of the FCM algorithm and similar methods: they handle clusters with uneven sample distributions poorly, divide boundary regions unsatisfactorily, and cope badly with the uncertainty caused by abnormal data. The invention combines the fuzzy set with the rough set, introduces the shadow set to address the threshold selection problem, and provides a shadow rough fuzzy clustering method based on Mahalanobis distance, which is specifically implemented by the following technical solutions:
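The technical solutions themselves are not reproduced in this text. As a hedged illustration of how a shadow set can choose a threshold automatically, the sketch below uses the classic balance criterion for shadowed sets: memberships at or below alpha are lowered to 0, memberships at or above 1 - alpha are raised to 1, and alpha is chosen so that the membership mass changed this way is as close as possible to the number of samples left in the shadow zone. The function names `shadow_balance`, `select_threshold` and `partition`, and the grid of candidate values, are assumptions made for illustration and may differ from the criterion used in the patent.

```python
def shadow_balance(memberships, alpha):
    """Imbalance of the shadow-set criterion for one threshold alpha."""
    # Membership mass removed when low values are reduced to 0.
    lowered = sum(u for u in memberships if u <= alpha)
    # Membership mass added when high values are elevated to 1.
    raised = sum(1.0 - u for u in memberships if u >= 1.0 - alpha)
    # Number of samples left in the uncertain (shadow) zone.
    shadow = sum(1 for u in memberships if alpha < u < 1.0 - alpha)
    return abs(lowered + raised - shadow)


def select_threshold(memberships, steps=49):
    """Grid search for alpha in (0, 0.5) with the smallest imbalance."""
    candidates = [0.5 * (k + 1) / (steps + 1) for k in range(steps)]
    return min(candidates, key=lambda a: shadow_balance(memberships, a))


def partition(memberships, alpha):
    """Split sample indices into core, shadow and excluded groups."""
    core, shadow, excluded = [], [], []
    for i, u in enumerate(memberships):
        if u >= 1.0 - alpha:
            core.append(i)
        elif u <= alpha:
            excluded.append(i)
        else:
            shadow.append(i)
    return core, shadow, excluded
```

In a rough-set reading, the core indices would correspond to a cluster's lower approximation and the shadow indices to its boundary region; that mapping is an interpretation, not a statement of the patented procedure.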


Examples

[0132] Four UCI datasets are used: the 9-dimensional Glass dataset, the 13-dimensional Wine dataset, the 8-dimensional Pima dataset and the 13-dimensional Heart disease dataset. The Glass dataset has 214 samples in 6 clusters of 70, 76, 17, 13, 9 and 29 samples. The Wine dataset has 178 samples in 3 clusters of 59, 71 and 48 samples. The Pima dataset has 768 samples in 2 clusters of 500 and 268 samples. The Heart disease dataset has 270 samples in 2 clusters of 150 and 120 samples. Simulation experiments are carried out on these four real datasets, and Table 2 compares the misclassification rates of the four algorithms on them.
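The text does not say how the misclassification rate is computed. One common convention, assumed here, is to match cluster labels to class labels one-to-one in the way that maximizes agreement and then report the fraction of samples that still disagree. The sketch below does this by brute force over label permutations, which is feasible for the small cluster counts above (at most 6); the function name `misclassification_rate` is hypothetical.

```python
from itertools import permutations


def misclassification_rate(true_labels, cluster_labels):
    """Fraction of samples misassigned under the best cluster-to-class matching."""
    if len(true_labels) != len(cluster_labels):
        raise ValueError("label sequences must have the same length")
    classes = sorted(set(true_labels))
    clusters = sorted(set(cluster_labels))
    if len(clusters) > len(classes):
        raise ValueError("more clusters than classes; no one-to-one matching")

    best_correct = 0
    # Try every assignment of distinct class labels to the cluster labels.
    for perm in permutations(classes, len(clusters)):
        mapping = dict(zip(clusters, perm))
        correct = sum(
            mapping[c] == t for t, c in zip(true_labels, cluster_labels)
        )
        best_correct = max(best_correct, correct)
    return 1.0 - best_correct / len(true_labels)
```

For larger numbers of clusters, an assignment solver such as the Hungarian algorithm would replace the permutation loop.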

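[0133] Table 2: misclassification rates of the four algorithms on the real datasets

[0134]-[0135] (table content not included in this text)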

[0136] The analysis shows that the MSRFCM algorithm has the lowest misclassification rate on the four real datasets compared with the MFCM and FCM algorithms. The MFCM algorith...

Abstract

The shadow rough fuzzy clustering method based on Mahalanobis distance uses the Mahalanobis distance as its similarity measure. This eliminates the correlation between attributes, reflects the differing importance of the attributes for clustering, and makes the method suitable for any cluster division. By combining a rough set with a shadow set, the method is suited to noisy and abnormal data, improves the handling of unbalanced sample distributions, and overcomes the weakness of fuzzy C-means in dealing with ambiguity. In addition, by dividing each cluster into a core region and a boundary region and exploiting the advantages of the Mahalanobis distance, it can produce more effective cluster divisions.
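As a minimal sketch of the similarity measure named in the abstract, the Mahalanobis distance between a sample x and a cluster center v with covariance matrix S is sqrt((x - v)^T S^{-1} (x - v)). The code assumes NumPy and an invertible covariance matrix supplied by the caller; how the patent estimates the covariance (for example, per cluster or globally) is not stated in this text, and the function name `mahalanobis` is illustrative.

```python
import numpy as np


def mahalanobis(x, center, cov):
    """Mahalanobis distance between sample x and a cluster center.

    cov is assumed to be a symmetric, invertible covariance matrix.
    """
    diff = np.asarray(x, dtype=float) - np.asarray(center, dtype=float)
    # Solve cov @ z = diff rather than forming the explicit inverse.
    z = np.linalg.solve(np.asarray(cov, dtype=float), diff)
    return float(np.sqrt(diff @ z))
```

With the identity matrix as covariance this reduces to the Euclidean distance. With a diagonal covariance of (4, 1), a step of 2 along the first attribute counts as distance 1 while the same step along the second attribute counts as distance 2, which is how the measure rescales attributes by their spread.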

Description

Technical field

[0001] The invention relates to the technical field of data processing, and in particular to a shadow rough fuzzy clustering method based on Mahalanobis distance.

Background

[0002] Cluster analysis is a statistical method that groups similar objects from a set under study into clusters, so that the differences between objects within a cluster are small and the differences between objects in different clusters are relatively large. With the development of the natural and social sciences, clustering methods have been widely used in data mining, pattern recognition and machine learning. Fuzzy clustering describes the uncertainty of each sample's category membership and expresses the ambiguity of sample categories, so it reflects the real world objectively and offers strong clustering performance and data processing capability. Fuzzy C-means algorithm (FCM) is the most classic fuzzy clustering algorithm based on objective functi...
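For reference, the standard FCM baseline mentioned in the background can be sketched as follows. This is the textbook algorithm with Euclidean distance, not the patented method: it alternates between updating the cluster centers as membership-weighted means and updating the memberships from the distances. The function name `fcm`, the fuzzifier default m = 2, the random initialization and the stopping tolerance are assumptions chosen for illustration.

```python
import numpy as np


def fcm(X, c, m=2.0, n_iter=100, tol=1e-6, seed=0):
    """Textbook fuzzy C-means with Euclidean distance.

    Returns (centers, U), where U[i, j] is the membership of sample i
    in cluster j and each row of U sums to 1.
    """
    X = np.asarray(X, dtype=float)
    rng = np.random.default_rng(seed)
    # Random initial memberships, normalized per sample.
    U = rng.random((X.shape[0], c))
    U /= U.sum(axis=1, keepdims=True)

    for _ in range(n_iter):
        W = U ** m
        # Centers are membership-weighted means of the samples.
        centers = (W.T @ X) / W.sum(axis=0)[:, None]
        # Pairwise sample-to-center distances, floored to avoid division by zero.
        d = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2)
        d = np.maximum(d, 1e-12)
        # u_ij is proportional to d_ij^(-2/(m-1)), normalized over clusters.
        inv = d ** (-2.0 / (m - 1.0))
        U_new = inv / inv.sum(axis=1, keepdims=True)
        converged = np.abs(U_new - U).max() < tol
        U = U_new
        if converged:
            break
    return centers, U
```

Because every sample receives nonzero membership in every cluster, a dense cluster can pull the center of a sparse neighbor toward itself, which is one way to see the uneven-distribution weakness that the background attributes to FCM.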

Application Information

Patent Type & Authority: Applications (China)
IPC(8): G06K9/62, G06N3/00
CPC: G06N3/006, G06F18/23213
Inventor: 王丽娜, 邢梓萌, 王杰, 邓乾
Owner: NANJING UNIV OF INFORMATION SCI & TECH