Improved K-means clustering algorithm based on density radius

A K-means clustering technique based on density radius, applied in the field of clustering algorithms, that addresses problems such as inaccurate selection of the k value and sensitivity to noise and outliers.

Status: Inactive. Publication Date: 2018-09-18

成都康乔电子有限责任公司 +1

Cites: 0; Cited by: 2


Problems solved by technology

[0003] The technical problem to be solved by the present invention is to provide an improved K-means clustering algorithm based on density radius that addresses three weaknesses of the existing K-means clustering algorithm: convergence to a local optimum, sensitivity to noise and outliers, and inaccurate selection of the k value.


Embodiment

[0055] This embodiment provides an improved K-means clustering algorithm based on density radius, comprising the following steps:

[0056] 1. Data set preparation: assume the data set contains m sample points, each of dimension v, where v ∈ Z*. The data set is denoted T = {n_1, n_2, ..., n_m}, where n_i is the i-th sample point and m is the number of sample points. The coordinates of sample point n_i are written (x_{i,1}, x_{i,2}, ..., x_{i,v}), where v is the dimension;

[0057] 2. Data preprocessing: use the LOF (local outlier factor) method to remove noise and outliers;

[0058] 3. Normalize the data: divide each coordinate of a sample point by the maximum value of that coordinate over all sample points, as shown in formula (1), so that each normalized coordinate x_{i,j} ∈ [0,1]:

[0059]

$$x_{i,j} \leftarrow \frac{x_{i,j}}{\max_{1 \le k \le m} x_{k,j}}, \qquad i = 1, \dots, m, \quad j = 1, \dots, v \tag{1}$$

[0060] 4. After normalization, calculate the Euclidean distance between all sample points, where the i-th samp...
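Steps 2 to 4 above can be sketched in plain Python. This is a minimal illustration under stated assumptions, not the patent's implementation: the function names, the neighbour count `k`, and the LOF cut-off `threshold` are hypothetical, since the source text gives no values for them. The LOF version here is simplified (exactly `k` neighbours per point, ties broken by index), and the normalization assumes non-negative coordinates with a positive maximum in every dimension.

```python
import math

def euclidean(a, b):
    """Euclidean distance between two points of equal dimension."""
    return math.sqrt(sum((p - q) ** 2 for p, q in zip(a, b)))

def pairwise_distances(points):
    """Symmetric m x m matrix of Euclidean distances (step 4)."""
    m = len(points)
    dist = [[0.0] * m for _ in range(m)]
    for i in range(m):
        for j in range(i + 1, m):
            dist[i][j] = dist[j][i] = euclidean(points[i], points[j])
    return dist

def lof_scores(points, k=3):
    """Simplified local outlier factor for every point (step 2).

    Assumption: k is a hypothetical parameter; requires len(points) > k.
    """
    m = len(points)
    dist = pairwise_distances(points)
    # k nearest neighbours of each point, excluding the point itself
    neighbours = [
        sorted((j for j in range(m) if j != i), key=lambda j: dist[i][j])[:k]
        for i in range(m)
    ]
    k_dist = [dist[i][neighbours[i][-1]] for i in range(m)]
    # local reachability density: inverse of the mean reachability distance
    lrd = []
    for i in range(m):
        reach = sum(max(k_dist[j], dist[i][j]) for j in neighbours[i])
        lrd.append(k / max(reach, 1e-12))  # guard against duplicate points
    return [sum(lrd[j] for j in neighbours[i]) / (k * lrd[i]) for i in range(m)]

def remove_outliers(points, k=3, threshold=1.5):
    """Keep points whose LOF is at most `threshold` (hypothetical cut-off)."""
    scores = lof_scores(points, k)
    return [p for p, s in zip(points, scores) if s <= threshold]

def normalize(points):
    """Divide each coordinate by the per-dimension maximum (step 3)."""
    v = len(points[0])
    maxima = [max(p[j] for p in points) for j in range(v)]
    return [[p[j] / maxima[j] for j in range(v)] for p in points]

if __name__ == "__main__":
    data = [[1.0, 1.0], [1.0, 2.0], [2.0, 1.0], [2.0, 2.0], [10.0, 10.0]]
    cleaned = remove_outliers(data)       # drops the isolated point
    scaled = normalize(cleaned)           # coordinates now lie in [0, 1]
    print(pairwise_distances(scaled)[0])  # distances from the first point
```

The order follows the numbered steps: outliers are removed first, so an extreme point does not dominate the per-dimension maximum used for normalization.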


Abstract

The invention relates to the field of clustering algorithms and discloses an improved K-means clustering algorithm based on density radius. It addresses three problems of the existing K-means clustering algorithm: convergence to a local optimum, high sensitivity to noise and outliers, and inaccurate selection of the k value. All sample points are ranked by density radius, and the sample point with the largest density radius is used as an initial value; these steps are repeated until all initial points and the number of categories k have been selected, and the clustering operation is then started. Among the resulting category centroids, the two nearest centroids are selected and their two categories are treated as a dichotomy, for which a Bayesian score is calculated. The two categories are then combined into one and the Bayesian score after combination is calculated. Whether the two categories should be combined is decided from the scores, and these steps are repeated until no further combination is needed. The clustering algorithm is suitable for big data clustering.
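The merging loop described in the abstract can be sketched as follows. The abstract does not define the Bayesian score, so this sketch takes the score as a caller-supplied function and assumes that a higher score is better; both the `score` parameter and that convention are assumptions. The `penalized_sse` function is a hypothetical stand-in used only to make the example runnable and is not the patent's Bayesian score.

```python
import math
from itertools import combinations

def centroid(cluster):
    """Mean of the points in one cluster."""
    return [sum(col) / len(cluster) for col in zip(*cluster)]

def closest_pair(centroids):
    """Indices (i, j) of the two centroids with the smallest distance."""
    return min(
        combinations(range(len(centroids)), 2),
        key=lambda ij: math.dist(centroids[ij[0]], centroids[ij[1]]),
    )

def merge_clusters(clusters, score):
    """Repeatedly merge the two nearest clusters while the score improves.

    `score` takes a list of clusters and returns a float; higher is
    assumed to be better (an assumption, not stated in the source).
    """
    clusters = [list(c) for c in clusters]
    while len(clusters) > 1:
        i, j = closest_pair([centroid(c) for c in clusters])
        split = [clusters[i], clusters[j]]    # the two categories kept apart
        merged = [clusters[i] + clusters[j]]  # the same points as one category
        if score(merged) <= score(split):
            break                             # no combination needed
        clusters = [c for t, c in enumerate(clusters) if t not in (i, j)] + merged
    return clusters

def penalized_sse(parts, penalty=1.0):
    """Hypothetical stand-in score: negative squared error minus a
    per-cluster penalty. NOT the patent's Bayesian score."""
    sse = sum(math.dist(p, centroid(c)) ** 2 for c in parts for p in c)
    return -sse - penalty * len(parts)

if __name__ == "__main__":
    start = [
        [[0.0, 0.0], [0.0, 1.0]],
        [[0.2, 0.0], [0.2, 1.0]],
        [[10.0, 10.0], [10.0, 11.0]],
    ]
    result = merge_clusters(start, penalized_sse)
    print(len(result), "clusters after merging")
```

Passing the score in as a function keeps the loop independent of how the score is actually defined, so the stand-in can be swapped for a proper Bayesian criterion.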

Description

Technical field

[0001] The invention relates to the field of clustering algorithms, in particular to an improved K-means clustering algorithm based on density radius.

Background technique

[0002] Clustering divides physical or abstract objects into several clusters according to the similarity between objects, so that data in the same cluster are highly similar and data in different clusters have low similarity. Clustering is an unsupervised learning method that groups unlabeled data without prior information. The K-means algorithm is the most commonly used partitioning algorithm in cluster analysis. It divides the data according to a chosen similarity measure so that the distance between each data point and the centroid of the cluster to which it belongs is as small as possible. The algorithm is widely used because of its simplicity and high efficiency. But at the same time, it also has some defects, such as the n...
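For reference, the standard K-means iteration described in the background can be sketched as below. This is the textbook assign-and-update loop, not the improved algorithm of the patent. The initial centroids are passed in explicitly because their selection is what the invention changes; the function name and the `max_iter` parameter are assumptions made for illustration.

```python
import math

def kmeans(points, centroids, max_iter=100):
    """Textbook K-means: assign each point to its nearest centroid,
    then move each centroid to the mean of its assigned points."""
    centroids = [list(c) for c in centroids]
    clusters = []
    for _ in range(max_iter):
        # assignment step: nearest centroid by Euclidean distance
        clusters = [[] for _ in centroids]
        for p in points:
            nearest = min(
                range(len(centroids)),
                key=lambda c: math.dist(p, centroids[c]),
            )
            clusters[nearest].append(p)
        # update step: an empty cluster keeps its previous centroid
        updated = [
            [sum(col) / len(cl) for col in zip(*cl)] if cl else centroids[c]
            for c, cl in enumerate(clusters)
        ]
        if updated == centroids:
            break  # assignments are stable
        centroids = updated
    return centroids, clusters

if __name__ == "__main__":
    data = [[0.0, 0.0], [0.0, 1.0], [10.0, 10.0], [10.0, 11.0]]
    centers, groups = kmeans(data, [[0.0, 0.0], [10.0, 10.0]])
    print(centers)
```

Because the result depends on the centroids supplied at the start, a poor initial choice can leave the loop in a local optimum, which is one of the defects the invention sets out to address.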


Application Information


Patent Type & Authority: Applications (China)

IPC(8): G06K9/62

CPC: G06F18/23213

Inventor: 万思思, 刘丹, 王永松, 伍功宇

Owner: 成都康乔电子有限责任公司
