Position difference-based high-precision nearest neighbor search algorithm

A high-precision nearest neighbor search algorithm for the computer field. It addresses the sharply deteriorating time complexity and loss of stability that earlier methods show on high-dimensional data, and it improves accuracy by reducing precision errors.

Inactive Publication Date: 2017-04-05
四川外国语大学重庆南方翻译学院

AI Technical Summary

Problems solved by technology

Several improved algorithms reduce the cost of the full search algorithm (FSA). For example, Ra and Kim removed impossible data points by calculating the difference between the point to be classified and the average value of each known class point, and Lai used the triangle inequality and projection values to reduce the computational complexity. The disadvantage of these improved algorithms is that as the data dimension increases, their time complexity deteriorates sharply and they lose stability.
Xia et al. proposed the Location Difference of Multiple Distances Based Nearest Neighbors Searching Algorithm, referred to as the LDMDBA algorithm (Xia S, Xiong Z, Luo Y, et al. Location difference of multiple distances based k-nearest neighbors algorithm[J]. Knowledge-Based Systems, 2015, 90(C): 99-110). Its time complexity, O(log d · n log n), is much lower than that of the FSA algorithm and most other algorithms, and it is stable across different high-dimensional data sets. Its disadvantage is that its classification accuracy on some data sets is slightly lower than that of the FSA algorithm.


Examples


Example 1

[0049] Example 1: With the number of neighbor points set to 3, the running times of the three algorithms are shown in Table 2. As Table 2 shows, the running time of the position difference-based high-precision nearest neighbor search algorithm of the present invention (HPLDBA algorithm for short) is far less than that of the nearest neighbor search algorithm based on high-dimensional distance position differences (LDMDBA algorithm for short) and of the full search algorithm (FSA algorithm for short), indicating that the HPLDBA algorithm has an advantage in efficiency.

[0050] Table 2 Comparison of the running time of the three algorithms on the public dataset (in ms)

[0051]

Example 2

[0052] Example 2: The data set is generated from a Markov sequence. Its dimension is 10, its values lie between 0 and 1, and the number of data points varies from 100 to 12800. The Markov sequence {y_i} is generated by the following formula:

[0053] $y_{i+1} = a\,y_i + u$

[0054] Here u is a random vector whose components each lie between 0 and 1, with y_0 = 0 and a = 0.9. The experimental results are shown in Figure 4. As Figure 4 shows, the running time of the HPLDBA algorithm is very close to that of the LDMDBA algorithm and far smaller than that of the FSA algorithm.
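The recurrence in Example 2 can be illustrated with a minimal sketch. The function name `markov_sequence`, the `seed` parameter, and the choice to return y_1 through y_n (excluding y_0) are assumptions made for illustration and do not come from the patent. Note also that with a = 0.9 and components of u in [0, 1), the raw recurrence is bounded by 1 / (1 - a) = 10 rather than by 1, so the source presumably rescales the data; the sketch leaves the values unscaled.

```python
import random


def markov_sequence(n_points, dim=10, a=0.9, seed=0):
    """Generate n_points vectors with y_{i+1} = a * y_i + u, starting from y_0 = 0.

    Hypothetical helper: each component of u is drawn uniformly from [0, 1).
    """
    rng = random.Random(seed)
    y = [0.0] * dim  # y_0 = 0
    points = []
    for _ in range(n_points):
        u = [rng.random() for _ in range(dim)]
        y = [a * yi + ui for yi, ui in zip(y, u)]
        points.append(y)
    return points


# Smallest data set size mentioned in Example 2.
data = markov_sequence(100)
```

Successive points are strongly correlated, which makes such a sequence a convenient stress test for neighbor search at increasing data set sizes.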

[0055] Conclusion

[0056] By increasing the number of reference points, the present invention proposes a high-precision nearest neighbor search algorithm based on position difference (HPLDBA algorithm for short). Compared with other similar algorithms, its prediction accuracy has a clear advantage, and its time complexity does not increase.


Abstract

The present invention relates to a position difference-based high-precision nearest neighbor search algorithm. The algorithm comprises the following steps: in computing the high-dimensional distance position difference factors, setting the i-th component of the i-th reference point to -1 and all other components to 1; setting all unit vectors of length 1 as the reference points; calculating the distances Dis_i from the i-th reference point to all data points; sorting the distances Dis_i to generate an ordered sequence; calculating the exact Euclidean distances from a sample point A to the points of a subsequence of length 2k·ε, where ε is a length adjustment factor for the subsequence; applying a partial sorting algorithm to the resulting distances to obtain the k smallest Euclidean distances; and, if the nearest neighbors of all data points have been calculated for all reference points, calculating the high-dimensional distance position difference factors of all data points and terminating, otherwise setting i = i + 1 and returning to the first step. Without increasing the time complexity, the algorithm improves precision while retaining the advantages of being index-independent on high-dimensional data sets, efficient, and usable online.
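The steps in the abstract can be sketched roughly as follows. This is an illustrative reading, not the patented implementation, and several points are assumptions: the function names, the `n_refs` parameter (the abstract does not say how many reference points are used), taking the 2k·ε subsequence as the kε positions on either side of A in the ordered sequence, merging candidates across reference points before the final partial sort, and leaving the reference vectors unnormalized. The position difference factor calculation is omitted.

```python
import heapq
import math


def reference_point(i, dim):
    """i-th reference point: component i is -1, all other components are 1."""
    ref = [1.0] * dim
    ref[i] = -1.0
    return ref


def approx_knn(points, k, eps=2, n_refs=None):
    """Approximate k nearest neighbors (by index) of every point in `points`."""
    n, dim = len(points), len(points[0])
    n_refs = dim if n_refs is None else min(n_refs, dim)  # assumption: up to one per axis
    half = max(1, math.ceil(k * eps))  # k*eps positions on each side of A
    candidates = [set() for _ in range(n)]
    for i in range(n_refs):
        ref = reference_point(i, dim)
        dis = [math.dist(ref, p) for p in points]      # Dis_i for all data points
        order = sorted(range(n), key=dis.__getitem__)  # ordered sequence
        for pos, a in enumerate(order):
            lo, hi = max(0, pos - half), min(n, pos + half + 1)
            candidates[a].update(order[lo:hi])         # subsequence around A
    neighbors = []
    for a in range(n):
        candidates[a].discard(a)
        scored = ((math.dist(points[a], points[b]), b) for b in candidates[a])
        # Partial sort: keep only the k smallest exact Euclidean distances.
        neighbors.append([b for _, b in heapq.nsmallest(k, scored)])
    return neighbors


pts = [[0.0, 0.0], [0.0, 1.0], [5.0, 5.0], [5.0, 6.0]]
print(approx_knn(pts, k=1, eps=1))  # [[1], [0], [3], [2]]
```

Under these assumptions, exact distances are computed only for about 2k·ε candidates per point and per reference point, so the sorting of Dis_i dominates the cost. A larger ε widens the candidate window and trades running time for accuracy.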

Description

Technical field

[0001] The invention relates to the field of computer technology, in particular to a high-precision nearest neighbor search algorithm based on position differences.

Background technique

[0002] The k-nearest neighbor search algorithm finds the k points in a data set that are closest to a given point. It is widely used in many fields, such as feature selection, pattern recognition, clustering, noise detection, and classification. An early k-nearest neighbor search algorithm, the full search algorithm (Full Search Algorithm, referred to as the FSA algorithm), determines the k nearest neighbor points by calculating the Euclidean distance from the point to be classified to each known category point, so its time complexity is high (O(n²)) and its adaptability is poor. To address the defects of the FSA algorithm, many scholars have proposed algorithms that reduce its time complexity. These algorithms can be rou...
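As a point of comparison, the full search baseline described in the background can be sketched in a few lines. The function name `full_search_knn` is an assumption for illustration; the sketch computes every pairwise Euclidean distance, which is where the O(n²) cost for n points comes from.

```python
import heapq
import math


def full_search_knn(points, k):
    """Exact k nearest neighbors (by index) of every point, by brute force."""
    neighbors = []
    for a, pa in enumerate(points):
        # Distance from point a to every other point: n - 1 evaluations per point.
        scored = ((math.dist(pa, pb), b) for b, pb in enumerate(points) if b != a)
        neighbors.append([b for _, b in heapq.nsmallest(k, scored)])
    return neighbors


pts = [[0.0, 0.0], [0.0, 1.0], [5.0, 5.0], [5.0, 6.0]]
print(full_search_knn(pts, k=2))  # [[1, 2], [0, 2], [3, 1], [2, 1]]
```

The result is exact, which makes this a convenient accuracy reference when evaluating faster approximate methods.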


Application Information

IPC(8): G06K9/62
CPC: G06F18/24147
Inventor: 杨柳, 毕孝儒, 贾小林
Owner: 四川外国语大学重庆南方翻译学院