Reference point K-nearest neighbor classification method based on MPI (Message Passing Interface) parallelization

A k-nearest neighbor classification method, applied in inter-program communication, instruments, multi-program devices, etc., that addresses the high time complexity, poor performance, and growing tree-structure complexity of existing algorithms while achieving high classification accuracy.

Active Publication Date: 2018-06-01
CHONGQING UNIV OF POSTS & TELECOMM
5 Cites, 3 Cited by
AI Technical Summary

Problems solved by technology

The disadvantage of tree-structured nearest-neighbor algorithms is that their performance gradually deteriorates as the dimension of the data set increases. High-dimensional data increases the complexity of the tree structure, which in turn increases the time needed to build the tree, search for neighbor points, and calculate distances.
The kNN algorithm is a commonly used data mining algorithm, but its time complexity is high and its classification speed is slow.

Examples

Embodiment Construction

[0034] The preferred embodiments of the present invention will be described in detail below with reference to the accompanying drawings.

[0035] The reference-point-based k-nearest neighbor algorithm uses the distances between sample points and several reference points to measure the positional difference between samples. The idea is to set several reference points, calculate the similarity between each training sample and the reference points, and generate an ordered similarity sequence. Then, according to the similarity between the test sample and the reference points, approximate neighbor samples are looked up in the ordered sequence, and the exact similarity to the test sample is calculated only for these approximate neighbors, so as to find the k nearest neighbor samples and determine the category. The core strategy is to use the reference points to greatly reduce the portion of the training set that must be searched.
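
The search strategy in [0035] can be illustrated with a minimal Python sketch. It is not the patented procedure: the position difference factor is not available in this excerpt, so the sketch assumes plain Euclidean distance to each reference point as the similarity, and a fixed candidate window around the test sample's position in each ordered sequence. The names `build_index`, `candidates`, `classify`, and `window` are hypothetical. Because only a window of candidates is examined, the result is approximate and a true neighbor can be missed if the window is too small.

```python
import bisect
import math
from collections import Counter


def build_index(train_x, refs):
    """For each reference point, sort (distance, training index) pairs."""
    return [sorted((math.dist(x, r), i) for i, x in enumerate(train_x))
            for r in refs]


def candidates(index, refs, query, window):
    """Collect training indices whose reference distance is close to the query's.

    Assumption: a fixed window of entries on each side of the query's
    position in every ordered sequence serves as the approximate neighbors.
    """
    found = set()
    for r, pairs in zip(refs, index):
        d = math.dist(query, r)
        pos = bisect.bisect_left(pairs, (d, -1))
        lo, hi = max(0, pos - window), min(len(pairs), pos + window)
        found.update(i for _, i in pairs[lo:hi])
    return found


def classify(train_x, train_y, index, refs, query, k, window):
    """Compute exact distances only for the candidates, then vote among the k nearest."""
    cand = candidates(index, refs, query, window)
    nearest = sorted(cand, key=lambda i: math.dist(train_x[i], query))[:k]
    votes = Counter(train_y[i] for i in nearest)
    return votes.most_common(1)[0][0]


# Toy data: two well-separated clusters and two reference points (all assumed).
train_x = [(0, 0), (0, 1), (1, 0), (5, 5), (5, 6), (6, 5)]
train_y = ["a", "a", "a", "b", "b", "b"]
refs = [(0, 0), (6, 6)]
index = build_index(train_x, refs)
print(classify(train_x, train_y, index, refs, (0.2, 0.1), k=3, window=2))
```

The ordered sequences are built once from the training set; each query then costs a binary search per reference point plus exact distance calculations for the candidates only, which is where the reduction in search scope comes from.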

[0036] Define the Location Differenc...

Abstract

The invention relates to a reference point K-nearest neighbor classification method based on MPI (Message Passing Interface) parallelization, and belongs to the field of data classification. The method comprises the following steps: S1: on the basis of a reference point K-nearest neighbor algorithm, using the distances between a sample point and a plurality of reference points to measure the position difference between them, defining a position difference factor, calculating the similarity between each training sample and the reference points, and generating an ordered similarity sequence; S2: according to the similarity between the test sample and the reference points, searching the ordered sequence for approximate nearest neighbor samples in the training set; and S3: calculating the exact similarity between the test sample and the retrieved neighbor samples so as to find the k nearest neighbors and determine the category. With the help of reference points, the method speeds up the k-nearest-neighbor search, and MPI is used to parallelize it, thereby speeding up the classification of large-scale, high-dimensional data.
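
The abstract does not say how the work is divided among MPI processes. One plausible arrangement, shown below purely as an assumption, is data parallelism over the test set: every rank holds the training data, labels its own share of the test samples, and rank 0 gathers the results. The sketch assumes the `mpi4py` package and falls back to a single process if it is not installed. The names `knn_label` and `classify_parallel` are hypothetical, and a brute-force search stands in for the reference-point search to keep the example short.

```python
import math
from collections import Counter

try:
    from mpi4py import MPI  # assumption: mpi4py is installed
    comm = MPI.COMM_WORLD
    rank, size = comm.Get_rank(), comm.Get_size()
except ImportError:
    comm, rank, size = None, 0, 1  # single-process fallback


def knn_label(train_x, train_y, query, k):
    """Brute-force stand-in for the per-sample neighbor search."""
    order = sorted(range(len(train_x)),
                   key=lambda i: math.dist(train_x[i], query))
    return Counter(train_y[i] for i in order[:k]).most_common(1)[0][0]


def classify_parallel(train_x, train_y, test_x, k):
    """Each rank labels every size-th test sample; rank 0 merges the results."""
    local = [(j, knn_label(train_x, train_y, test_x[j], k))
             for j in range(rank, len(test_x), size)]
    parts = comm.gather(local, root=0) if comm is not None else [local]
    if rank != 0:
        return None  # only rank 0 holds the merged labels
    merged = sorted(pair for part in parts for pair in part)
    return [label for _, label in merged]


# Toy data (assumed), identical on every rank.
train_x = [(0, 0), (0, 1), (1, 0), (5, 5), (5, 6), (6, 5)]
train_y = ["a", "a", "a", "b", "b", "b"]
test_x = [(0.2, 0.1), (5.2, 5.1), (0.5, 0.5)]
labels = classify_parallel(train_x, train_y, test_x, k=3)
if rank == 0:
    print(labels)
```

With an MPI installation, a script like this would typically be launched with something like `mpiexec -n 4 python script.py`, where the script name is a placeholder. Test samples are independent of one another, so this split needs no communication until the final gather.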

Description

Technical field

[0001] The invention belongs to the field of data classification and relates to a reference point k-nearest neighbor classification method based on MPI parallelization.

Background

[0002] Classification is an important technique in the field of data mining. Its purpose is to construct a classification model (also known as a classification function or classifier) from the characteristics of a data set, which can map samples of unknown category to one or several of a given set of categories. The k-nearest neighbor algorithm was originally proposed by Cover and Hart in 1967. It is a non-parametric classification technique that is robust, conceptually clear, and easy to implement, and it can achieve high classification accuracy for unknown and non-normal distributions.

[0003] The traditional k-nearest neighbor algorithm has a high time complexity. At present, many scholars have proposed many improved alg...

Application Information

Patent Type & Authority: Applications (China)
IPC(8): G06F9/54, G06K9/00, G06K9/62
CPC: G06F9/546, G06V10/94, G06F18/24147
Inventor: 陈子忠, 梁聪, 夏书银
Owner: CHONGQING UNIV OF POSTS & TELECOMM