
An Algorithm for Discovering Clusters and Outliers Based on Natural Shared Nearest Neighbor Search

A nearest-neighbor and outlier technology, applied in computing, computer components, character and pattern recognition, and related fields. It addresses the problems that the termination conditions of existing search algorithms are not well founded, that outlier detection accuracy is low, and that data clustering results are poor.

Status: Inactive. Publication Date: 2019-09-20
CHINA AGRI UNIV
4 Cites, 0 Cited by

AI Technical Summary

Problems solved by technology

However, in the existing natural nearest neighbor algorithm, neither the definition of natural neighbors nor the termination condition of the search algorithm is well founded, which leads to poor clustering results and low outlier detection accuracy. To address this, the present invention proposes an algorithm for discovering clusters and outliers based on natural shared nearest neighbor search. The definition of natural neighbors is refined into a definition of shared nearest neighbors, and the search termination condition is improved so that the neighbor relationships found are better founded. As a result, the clustering results agree more closely with the real distribution of the data, and the detected outliers are more accurate.

Embodiment Construction

[0025] The present invention is described in detail below with reference to the accompanying drawings. Figure 1 is a flowchart of the algorithm of the present invention for discovering clusters and outliers based on natural shared nearest neighbor search.

[0026] An algorithm for discovering clusters and outliers based on natural shared nearest neighbor search, characterized in that the specific steps of the algorithm are as follows:

[0027] Step 1. Search the data set for natural nearest neighbors. The search ends when the number of points in the data set that do not share nearest neighbors no longer changes, which yields the number n of nearest neighbors searched. Then, according to the proposed definition of natural shared neighbors, compute the natural shared nearest neighbor relationship of each object under its n nearest neighbors;

[0028] Step 2. The natural neighbor search algorithm based on the shared nearest neighbor determines the natu...
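The search loop described in Step 1 can be sketched roughly as follows. The exact definition of a natural shared nearest neighbor is not available in this excerpt, so the relation in `shared_pairs` is an assumption made only for illustration: two points are linked when each is among the other's k nearest neighbors and their k-neighbor sets overlap. The function names, the tie-breaking rule, and the choice to return the k at which the count is first seen unchanged are likewise hypothetical.

```python
import math


def neighbor_order(points):
    """For every point, list the other point indices from nearest to farthest."""
    order = []
    for i, p in enumerate(points):
        others = [j for j in range(len(points)) if j != i]
        # Ties in distance are broken by index so the result is deterministic.
        others.sort(key=lambda j: (math.dist(p, points[j]), j))
        order.append(others)
    return order


def shared_pairs(order, k):
    """ASSUMED relation (not the patent's definition): i and j are linked when
    each is among the other's k nearest neighbors and their k-neighbor sets
    have at least one member in common."""
    knn = [set(o[:k]) for o in order]
    pairs = set()
    for i, nbrs in enumerate(knn):
        for j in nbrs:
            if i < j and i in knn[j] and knn[i] & knn[j]:
                pairs.add((i, j))
    return pairs


def natural_shared_search(points):
    """Grow k until the number of points with no linked partner stops changing.

    Returns (k, pairs), where k is the neighbor count at which the count of
    unlinked points was first seen unchanged from the previous round.
    """
    order = neighbor_order(points)
    pairs, prev_unlinked = set(), None
    for k in range(1, len(points)):
        pairs = shared_pairs(order, k)
        linked = {idx for pair in pairs for idx in pair}
        unlinked = len(points) - len(linked)
        if unlinked == prev_unlinked:
            return k, pairs
        prev_unlinked = unlinked
    # Fallback when the count never stabilizes before k reaches its maximum.
    return max(len(points) - 1, 0), pairs
```

The brute-force distance sort keeps the sketch dependency-free; a spatial index such as a k-d tree would be the usual choice for larger data sets.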

Abstract

The invention belongs to the field of data mining, and in particular relates to an algorithm for discovering clusters and outliers based on natural shared nearest neighbor search. First, the data set is searched for natural nearest neighbors; the search ends when the number of points in the data set that do not share nearest neighbors no longer changes, which yields the number n of nearest neighbors searched. According to the proposed definition of natural shared neighbors, the natural shared nearest neighbor relationship of each object under its n nearest neighbors is then computed. Next, the natural shared nearest neighbor relationship of each object is determined by the natural neighbor search algorithm based on shared nearest neighbors, and the data are clustered and outliers are identified according to that relationship. The algorithm proposes a new shared nearest neighbor relationship and a new termination condition for the natural neighbor search, which addresses the poor clustering results and low outlier detection accuracy caused by the inadequate definition of the natural neighbor relationship and the poorly founded search conditions in existing algorithms.
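The abstract states that clustering and outlier identification follow from the natural shared nearest neighbor relationship, but this excerpt does not say how. One plausible reading, offered only as an assumption, is to treat the relationship as an undirected graph, take connected components as clusters, and flag points with no relationship at all as outliers. The function name `clusters_and_outliers` and the singleton-equals-outlier rule are hypothetical.

```python
def clusters_and_outliers(n_points, pairs):
    """Group points by connected components of the neighbor relation.

    ASSUMPTION: components with more than one point are clusters, and points
    that appear in no pair are outliers. The patent's rule may differ.
    """
    parent = list(range(n_points))

    def find(x):
        # Union-find lookup with path halving.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for a, b in pairs:
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_b] = root_a

    groups = {}
    for i in range(n_points):
        groups.setdefault(find(i), []).append(i)

    clusters = sorted(sorted(g) for g in groups.values() if len(g) > 1)
    outliers = sorted(g[0] for g in groups.values() if len(g) == 1)
    return clusters, outliers
```

For example, with seven points and the pairs `{(0, 1), (0, 2), (1, 2), (3, 4), (3, 5), (4, 5)}`, points 0 to 2 and 3 to 5 form two clusters and point 6 is reported as an outlier.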

Description

Technical field

[0001] The invention belongs to the field of data mining, and in particular relates to an algorithm for discovering clusters and outliers based on natural shared nearest neighbor search.

Background

[0002] With the explosive growth of data and the continuous development of big data technologies such as cloud computing, data mining is receiving more and more attention. Mining clusters and outliers is an important data mining technique that helps uncover valuable information so that the data can be analyzed effectively.

[0003] At present, there is a natural nearest neighbor algorithm that does not require the user to specify the number of nearest neighbors; the neighborhood relationship forms naturally and is used to cluster the data. There are also algorithms for outlier detection based on clustering. However, in the existing natural nearest neighbor algorithm, the definition of natural neighbors and the termination condition of the search...

Application Information

Patent Type & Authority: Patents (China)
IPC: IPC(8): G06K9/62
CPC: G06F18/23
Inventor: 高红菊, 刘艳哲, 储汪兵, 刘继文
Owner: CHINA AGRI UNIV