Fast approximate K neighbor method based on tree strategy and balanced K-means clustering

A technology of K-means and K-nearest neighbors, which is applied to instruments, character and pattern recognition, computer components, etc., can solve the problems of low algorithm efficiency and achieve the elimination of uncertainty, reduction of search time, strong robustness and practicability Effect

Active Publication Date: 2019-07-30
NORTHWESTERN POLYTECHNICAL UNIV
View PDF7 Cites 11 Cited by
  • Summary
  • Abstract
  • Description
  • Claims
  • Application Information

AI Technical Summary

Problems solved by technology

For example, K-nearest neighbor retrieval in large-scale high-dimensional data has always been one of the hot issues that are difficult to overcome. In the tree structure method, KD tree, KD random forest, etc. have good results, but in the KD tree algorithm, The retrieval process needs to go back to the previous node continuously. The higher

Method used

the structure of the environmentally friendly knitted fabric provided by the present invention; figure 2 Flow chart of the yarn wrapping machine for environmentally friendly knitted fabrics and storage devices; image 3 Is the parameter map of the yarn covering machine
View more

Image

Smart Image Click on the blue labels to locate them in the text.
Viewing Examples
Smart Image
  • Fast approximate K neighbor method based on tree strategy and balanced K-means clustering
  • Fast approximate K neighbor method based on tree strategy and balanced K-means clustering
  • Fast approximate K neighbor method based on tree strategy and balanced K-means clustering

Examples

Experimental program
Comparison scheme
Effect test

Embodiment Construction

[0012] The present invention will be further described below in conjunction with the accompanying drawings and embodiments, and the present invention includes but not limited to the following embodiments.

[0013] Such as figure 1 As shown, the present invention provides a fast approximate K-nearest neighbor method based on the tree strategy and balanced K-means clustering, which mainly consists of two main steps of building a balanced tree and finding K-nearest neighbors. The basic implementation process is as follows:

[0014] 1. Building a Balanced Tree

[0015] First, the input data set is clustered using the balanced K-means clustering algorithm, and the cluster centers of the two types of samples with equal sample numbers are obtained. Specifically:

[0016] The two types of balanced K-means clustering algorithm models are as follows:

[0017]

[0018] Among them, C is the center of the cluster, G is the index matrix, and X is the input data set, where the i-th row...

the structure of the environmentally friendly knitted fabric provided by the present invention; figure 2 Flow chart of the yarn wrapping machine for environmentally friendly knitted fabrics and storage devices; image 3 Is the parameter map of the yarn covering machine
Login to view more

PUM

No PUM Login to view more

Abstract

The invention provides a fast approximate K neighbor method based on a tree strategy and balanced K-means clustering, so as to improve the performance and speed of approximate K neighbor retrieval. The method comprises the following steps: firstly, constructing a balanced K-means tree through a balanced K-means clustering method, enabling data to be organized in a tree-shaped structure efficientlyand orderly, and realizing rapid positioning of any new sample data; then, utilizing an anchor positioning method and a neighbor cluster idea, and quickly finding a plurality of approximate neighborpoints, namely K neighbor points, of the new data sample through a balance tree. According to the method, the advantages of a tree-based K-nearest neighbor algorithm and a balanced K-means algorithm are taken into account at the same time, and the method can be applied to the fields of image recognition, data compression, mode recognition and classification, machine learning, document retrieval systems, statistics, data analysis and the like.

Description

technical field [0001] The invention belongs to the technical field of machine learning and data mining, and in particular relates to a fast approximate K nearest neighbor method based on tree strategy and balanced K-means clustering. Background technique [0002] In the era of mobile Internet, people's daily life is faced with the impact of massive data every day, such as personal information, video records, image collection, geographic information, log files, etc., in the face of such a large and growing data information, how to process all The effective storage, indexing and querying of the required information is a hot research topic both at home and abroad. [0003] Approximate K-nearest neighbor retrieval was initially applied to document retrieval systems as a method for finding similar document information, and then in geographic information systems, K-nearest neighbor retrieval was also widely used in location information, query, analysis and statistics of spatial d...

Claims

the structure of the environmentally friendly knitted fabric provided by the present invention; figure 2 Flow chart of the yarn wrapping machine for environmentally friendly knitted fabrics and storage devices; image 3 Is the parameter map of the yarn covering machine
Login to view more

Application Information

Patent Timeline
no application Login to view more
IPC IPC(8): G06K9/62
CPCG06F18/23213G06F18/24147G06F18/24323
Inventor 聂飞平车昊轩王宏王榕于为中李学龙
Owner NORTHWESTERN POLYTECHNICAL UNIV
Who we serve
  • R&D Engineer
  • R&D Manager
  • IP Professional
Why Eureka
  • Industry Leading Data Capabilities
  • Powerful AI technology
  • Patent DNA Extraction
Social media
Try Eureka
PatSnap group products