Incomplete data weighted clustering method of adaptive intervals

A clustering technique for incomplete data, applied in the fields of instruments, character and pattern recognition, and computer parts. It addresses problems such as the difficulty of clustering incomplete data, reduced accuracy of cluster analysis, and noise contamination.

Pending Publication Date: 2019-09-03
LIAONING UNIVERSITY

AI Technical Summary

Problems solved by technology

However, the FCM method has certain limitations, and the incomplete data set must be completed to perform cluster analysis.
In daily life, incomplete data sets will be generated due to collection errors in the process of



Examples


Embodiment 1

[0139] I. Theoretical basis of the scheme of the present invention:

[0140] 1. Fuzzy C-means algorithm (FCM)

[0141] The FCM algorithm has three basic components: the fuzzy membership function, the partition matrix, and the objective function. First, an objective function to be minimized is established; it is then minimized iteratively; finally, each sample is assigned to a category according to its membership degrees. For an s-dimensional data set, n denotes the number of samples and $x_k = [x_{1k}, x_{2k}, \dots, x_{sk}]^T$ is the kth sample, whose jth attribute is $x_{jk}$; c denotes the number of categories; and the element $u_{ij}$ of the membership matrix $U_{(c \times n)}$ denotes the membership degree of data $x_j$ in category i. A sample can neither belong completely to a given subcategory nor be completely excluded from it. For a certain s...
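The three steps named above (set up the objective, minimize it iteratively, assign by membership) can be illustrated with a minimal sketch of the standard textbook FCM iteration. This is not the weighted interval variant claimed by the invention. The fuzzifier m, the convergence tolerance, the random initialization, and the function name fcm are assumptions that do not appear in the text above.

```python
import numpy as np

def fcm(X, c, m=2.0, max_iter=100, tol=1e-5, seed=0):
    """Standard fuzzy C-means (illustrative sketch).

    X has shape (n, s). Returns U (c x n memberships) and V (c x s centers).
    m, max_iter, tol and seed are assumed parameters, not taken from the patent.
    """
    rng = np.random.default_rng(seed)
    n = X.shape[0]
    # Random initial memberships; each column (one sample) sums to 1.
    U = rng.random((c, n))
    U /= U.sum(axis=0, keepdims=True)
    for _ in range(max_iter):
        Um = U ** m
        # Cluster centers: membership-weighted means of the samples.
        V = (Um @ X) / Um.sum(axis=1, keepdims=True)
        # Squared Euclidean distance from every center to every sample, shape (c, n).
        d2 = ((V[:, None, :] - X[None, :, :]) ** 2).sum(axis=2)
        d2 = np.maximum(d2, 1e-12)  # guard against division by zero
        # Membership update: u_ik proportional to d_ik^(-2/(m-1)), normalized per sample.
        inv = d2 ** (-1.0 / (m - 1.0))
        U_new = inv / inv.sum(axis=0, keepdims=True)
        converged = np.abs(U_new - U).max() < tol
        U = U_new
        if converged:
            break
    return U, V

# Usage: two well-separated groups of two points each.
X = np.array([[0.0, 0.1], [0.2, 0.0], [5.0, 5.1], [5.2, 4.9]])
U, V = fcm(X, c=2)
labels = U.argmax(axis=0)  # hard assignment by largest membership
```

Because every column of U sums to 1 and all distances are kept strictly positive, no sample receives a membership of exactly 0 or 1, which matches the statement that a sample neither fully belongs to nor is fully excluded from a subcategory.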


Abstract

The invention relates to an adaptive-interval weighted clustering method for incomplete data. The method comprises the following steps: (1) determining the nearest-neighbor samples; (2) filling the missing data with adaptive intervals; (3) proposing an adaptive interval-type data weighted fuzzy C-means clustering; and (4) clustering the interval-type data set obtained in step (2) with the clustering method of step (3) to obtain a clustering result, and comparing that result with the experimental results of four classical methods and four methods proposed by other scholars in recent years, so as to verify the effectiveness of the disclosed method. Experiments use the biological data set Iris, the medical breast cancer data set Breast, and the medical adult liver disease data set Bupa from the UCI database. They are run against the four classical methods and the four recent methods under four missing rates, and show that the method achieves higher clustering accuracy.
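Steps (1) and (2) can be pictured with the following minimal sketch. It is not the rule defined by the invention, whose details are not given in this excerpt. It assumes that nearest neighbors are found with the partial distance (computed over the attributes observed in both samples) and that each missing value is replaced by the [min, max] interval of that attribute over a fixed number q of nearest neighbors. The names partial_distance, interval_fill and q are hypothetical, and the adaptive choice of the interval is not reproduced.

```python
import numpy as np

def partial_distance(a, b):
    """Distance over attributes observed in both a and b, rescaled to all s attributes."""
    both = ~np.isnan(a) & ~np.isnan(b)
    if not both.any():
        return np.inf  # no shared attributes: treat as infinitely far
    return np.sqrt(a.size / both.sum() * ((a[both] - b[both]) ** 2).sum())

def interval_fill(X, q=3):
    """Return (lower, upper) bounds for every entry of X (NaN marks a missing value).

    Observed entries become degenerate intervals [x, x]. A missing entry becomes
    [min, max] of that attribute over the q nearest neighbors that observe it
    (assumed rule). If no such neighbor exists, the entry stays NaN.
    """
    lower, upper = X.copy(), X.copy()
    n = X.shape[0]
    for k in range(n):
        missing = np.isnan(X[k])
        if not missing.any():
            continue
        dist = np.array([partial_distance(X[k], X[p]) if p != k else np.inf
                         for p in range(n)])
        order = np.argsort(dist)
        for j in np.flatnonzero(missing):
            # Closest samples that actually observe attribute j.
            neigh = [p for p in order
                     if np.isfinite(dist[p]) and not np.isnan(X[p, j])][:q]
            if neigh:
                vals = X[neigh, j]
                lower[k, j], upper[k, j] = vals.min(), vals.max()
    return lower, upper

# Usage: the second sample is missing its second attribute.
X = np.array([[1.0, 2.0], [1.1, np.nan], [1.2, 2.2], [9.0, 9.0]])
lower, upper = interval_fill(X, q=2)
```

Representing a missing value as an interval rather than a single imputed number keeps the uncertainty of the estimate visible to the clustering step that follows.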

Description

Technical field

[0001] The invention relates to a method for fuzzy clustering of incomplete data, in particular to an adaptive-interval weighted clustering method for incomplete data.

Background art

[0002] As society has entered the information age, it has also entered the age of data. To process huge and complex data accurately and efficiently, cluster analysis has become an increasingly important data processing method. Much real-world data has no clear boundaries and carries a certain degree of fuzziness. Traditional cluster analysis is a hard partition: each data sample either belongs to a given cluster or does not. The fuzzy C-means method (FCM) is the most widely used fuzzy clustering method among unsupervised classification methods. However, the FCM method has certain limitations: an incomplete data set must first be completed before cluster analysis can be performed. In daily life, incomplete data sets will be ...


Application Information

IPC(8): G06K9/62
CPC: G06F18/23, G06F18/22
Inventor: 张利, 肖雪冬, 牛明航, 王军, 张皓博, 邱存月
Owner LIAONING UNIVERSITY