
Noise suppression-oriented method for determining optimal clustering number according to clustering effectiveness index

A technology combining a clustering effectiveness index with noise suppression, applied in the fields of instruments, character and pattern recognition, computer components, etc. It addresses the reduced accuracy of clustering effectiveness indices in noisy environments, with the effects of optimizing clustering results and improving clustering accuracy.

Pending Publication Date: 2021-07-23
XIAN UNIV OF TECH

AI Technical Summary

Problems solved by technology

Many experts and scholars have proposed index functions to test the effectiveness of clustering, such as the Dunn index, the DB (Davies-Bouldin) index, and the CH (Calinski-Harabasz) index, and use these index functions to calculate the optimal number of clusters k_opt. However, because the internal structure of the data differs between noisy and noise-free environments, the accuracy of a clustering effectiveness index usually decreases in a noisy environment.
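To illustrate how an index function is used to pick the number of clusters, the following is a minimal sketch of the standard CH (Calinski-Harabasz) index, where a larger value indicates a better partition. It is not the noise suppression-oriented index of this invention, which this excerpt does not define. The toy data set, the hand-made candidate partitions, and the names `ch_index`, `candidates`, and `k_opt` are assumptions made for illustration only.

```python
def centroid(pts):
    """Coordinate-wise mean of a list of equal-length tuples."""
    return tuple(sum(c) / len(pts) for c in zip(*pts))


def sq_dist(a, b):
    """Squared Euclidean distance between two points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def ch_index(points, labels):
    """Calinski-Harabasz index: (B / (k - 1)) / (W / (n - k)).

    B is the between-cluster dispersion and W the within-cluster dispersion.
    Assumes W > 0, i.e. at least one cluster has some spread.
    """
    n = len(points)
    clusters = {}
    for p, lab in zip(points, labels):
        clusters.setdefault(lab, []).append(p)
    k = len(clusters)
    if k < 2 or k >= n:
        raise ValueError("CH index needs 2 <= k < n")
    overall = centroid(points)
    between = 0.0
    within = 0.0
    for members in clusters.values():
        c = centroid(members)
        between += len(members) * sq_dist(c, overall)
        within += sum(sq_dist(p, c) for p in members)
    return (between / (k - 1)) / (within / (n - k))


# Hypothetical toy data: three well-separated groups of four points each.
points = [(0, 0), (0, 1), (1, 0), (1, 1),
          (10, 0), (10, 1), (11, 0), (11, 1),
          (0, 10), (0, 11), (1, 10), (1, 11)]

# Hand-made candidate partitions for k = 2 and k = 3 (illustrative only).
candidates = {
    2: [0] * 8 + [1] * 4,            # first two groups merged
    3: [0] * 4 + [1] * 4 + [2] * 4,  # one cluster per group
}

scores = {k: ch_index(points, labels) for k, labels in candidates.items()}
k_opt = max(scores, key=scores.get)  # largest CH value wins
print(scores, k_opt)
```

In practice the candidate partitions would come from running a clustering algorithm such as K-means once per candidate k rather than being written by hand.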



Examples


Embodiment Construction

[0069] The present invention will be described in detail below in conjunction with the accompanying drawings and specific embodiments.

[0070] The method for determining the optimal number of clusters based on the noise suppression-oriented clustering effectiveness index of the present invention is specifically implemented according to the following steps:

[0071] Step 1. Determine the cluster number range of the actual problem to be clustered, and obtain k initial cluster center sets. The actual problem here can be, for example, the online learning data generated by current large-scale online education, the large amount of commodity transaction data generated by online shopping, or the large amount of traffic information generated by intelligent transportation (but is not limited to these);

[0072] Step 1 is specifically implemented according to the following steps:

[0073] Step 1.1. For the data set X formed by the actual problem of clustering, determine the range of t...



Abstract

The invention discloses a noise suppression-oriented method for determining the optimal number of clusters according to a clustering effectiveness index, which is implemented according to the following steps: step 1, determining the cluster number range of the actual problem to be clustered and obtaining k initial cluster center sets, where the actual problem can be, for example, online learning data generated by current large-scale online education, the large amount of commodity transaction data generated by online shopping, or the large amount of traffic information generated by intelligent transportation (but is not limited to these); step 2, recalculating the centroid of each cluster, updating the cluster center set, and then re-dividing the clusters; step 3, if the cluster centers no longer change, the clustering process is finished, and similarity factors are calculated according to the noise suppression-oriented clustering effectiveness index; and step 4, obtaining the optimal clustering result. With this method, the number of clusters can be determined more accurately in a noisy data environment.
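The iteration in steps 2 and 3 (recalculate centroids, re-divide the clusters, stop when the centers no longer change) can be sketched as a plain K-means loop. This is a minimal sketch under stated assumptions, not the patented procedure: the function name `kmeans`, the `max_iter` cap, the rule for empty clusters, and the toy data are all hypothetical, and the noise suppression-oriented index and similarity factors are not shown because this excerpt does not define them.

```python
from math import dist  # Euclidean distance, Python 3.8+


def kmeans(points, init_centers, max_iter=100):
    """Plain K-means: assign, recompute centroids, stop when centers are stable.

    `init_centers` plays the role of one initial cluster center set from step 1.
    """
    centers = [tuple(c) for c in init_centers]
    labels = []
    for _ in range(max_iter):
        # Re-divide the clusters: each point joins its nearest current center.
        labels = [min(range(len(centers)), key=lambda j: dist(p, centers[j]))
                  for p in points]
        # Recalculate the centroid of every cluster.
        new_centers = []
        for j, old in enumerate(centers):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:
                new_centers.append(
                    tuple(sum(c) / len(members) for c in zip(*members)))
            else:
                new_centers.append(old)  # assumption: an empty cluster keeps its center
        # The clustering process is finished once the centers stop changing.
        if new_centers == centers:
            break
        centers = new_centers
    return labels, centers


# Hypothetical toy data: three well-separated groups of four points each.
points = [(0, 0), (0, 1), (1, 0), (1, 1),
          (10, 0), (10, 1), (11, 0), (11, 1),
          (0, 10), (0, 11), (1, 10), (1, 11)]
labels, centers = kmeans(points, init_centers=[(0, 0), (10, 0), (0, 10)])
print(labels, centers)
```

To choose among candidate cluster numbers, a loop of this kind would be run once per k in the range from step 1, with each resulting partition scored by the effectiveness index.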

Description

Technical field

[0001] The invention belongs to the technical field of data mining, and in particular relates to a method for determining the optimal number of clusters based on a noise suppression-oriented clustering effectiveness index.

Background technique

[0002] Data mining is becoming more and more popular in big data analysis, and it can meet people's needs for in-depth information. Clustering is one of the hottest research directions in data mining. It refers to the process of grouping data into multiple clusters so that the data in the same cluster are as similar as possible, while the data in different clusters are as different as possible. As a traditional clustering algorithm, the K-means algorithm is widely used because of its simplicity, speed, and ease of implementation, as well as its ability to maintain good scalability and high efficiency when dealing with large amounts of data. Although the principle of the K-m...


Application Information

IPC(8): G06K9/62
CPC: G06F18/23213
Inventors: 张亚玲, 蔡忱
Owner: XIAN UNIV OF TECH