An online nearest neighbor query method for high-dimensional data based on hash learning

A high-dimensional data query technology, applied in database indexing, digital data processing, structured data retrieval, and related fields. It addresses the frequent hash function updates, high computational overhead, and weak stability of existing online hash models, and achieves fast queries with stably converging average accuracy.

Active Publication Date: 2019-02-01
NINGBO UNIV

AI Technical Summary

Problems solved by technology

However, recalculating the hash codes requires frequent update iterations, so the computational overhead grows too large as the data increases.
Moreover, the hash model of the above method still suffers from frequent hash function updates and weak model stability during online iterative learning.
The reasons are: (1) the loss function is designed with a single unified threshold for similar and dissimilar samples across the entire data set; (2) the hash function is updated only by keeping the difference between two adjacent projection vectors as small as possible, which cannot guarantee model stability.


Embodiment Construction

[0013] The present invention is described in further detail below with reference to an embodiment.

[0014] An online high-dimensional data nearest neighbor query method based on hash learning, comprising the following steps:

[0015] ① Image data acquisition and preprocessing: Obtain a data set of original two-dimensional images from a public image website, convert the data set into a numerical matrix that retains the original features according to the image pixel information, and then perform two operations on the numerical matrix: data cleaning and dimensionality reduction. The specific process is:

[0016] ①-1 Normalize the acquired image data to maintain the integrity of the overall data; identify outlier values using binning, clustering, and regression and handle them manually, replacing the outlier image pixel data with the mean value;
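
The cleaning step in ①-1 can be illustrated with a minimal sketch. It is not the patent's procedure: the text names binning, clustering, and regression for finding outliers, whereas this sketch assumes a simple z-score rule (the threshold `z_thresh` and both function names are hypothetical). It also assumes outliers are replaced before min-max normalization so that a single extreme pixel does not compress the remaining values.

```python
from statistics import mean, pstdev


def replace_outliers_with_mean(values, z_thresh=3.0):
    """Replace values far from the mean with the mean of the inliers.

    Assumption: a z-score rule stands in for the binning, clustering,
    and regression checks mentioned in the text.
    """
    mu = mean(values)
    sigma = pstdev(values)
    if sigma == 0:
        return list(values)  # all values identical, nothing to replace
    inliers = [v for v in values if abs(v - mu) <= z_thresh * sigma]
    fill = mean(inliers) if inliers else mu
    return [v if abs(v - mu) <= z_thresh * sigma else fill for v in values]


def min_max_normalize(values):
    """Scale values linearly into the range [0, 1]."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


# Toy row of pixel intensities with one obvious outlier (255).
pixels = [12, 15, 14, 13, 255, 16, 11, 14]
cleaned = replace_outliers_with_mean(pixels, z_thresh=2.0)
normalized = min_max_normalize(cleaned)
```

In this toy row the value 255 is replaced by the mean of the other seven pixels, after which the row is scaled to [0, 1].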

[0017] ①-2 Use the SIFT algorithm to extract the local features in the ori...

Abstract

The invention discloses an online high-dimensional data nearest neighbor query method based on hash learning. First, a loss function is designed according to the similarity or dissimilarity of samples, and the range of the loss function is enlarged. Then, a new objective function is proposed based on the principle that the hash model should retain historical information while minimizing the loss on the current data. By analyzing the convergence of the online hash algorithm, the optimal value of the objective function is found. On this basis, the nearest data points to a query point can be retrieved quickly, the average accuracy converges stably, and the number of hash function updates during iterative learning is greatly reduced.
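
The abstract does not give the loss function itself, so the following is only a sketch of one possible reading: a pairwise loss on Hamming distance that uses separate thresholds for similar and dissimilar pairs, with the hash function updated only when the loss is nonzero. The thresholds `alpha` and `beta` and the function names are hypothetical and do not come from the patent.

```python
def hamming(a, b):
    """Number of positions at which two equal-length binary codes differ."""
    return sum(u != v for u, v in zip(a, b))


def pair_loss(code_a, code_b, similar, alpha=1, beta=3):
    """Hinge-style loss on the Hamming distance of one pair of codes.

    Assumption: similar pairs should lie within `alpha` bits of each other,
    dissimilar pairs at least `beta` bits apart.
    """
    d = hamming(code_a, code_b)
    if similar:
        return max(0, d - alpha)
    return max(0, beta - d)


def needs_update(code_a, code_b, similar, alpha=1, beta=3):
    """The model is left untouched whenever the current pair has zero loss."""
    return pair_loss(code_a, code_b, similar, alpha, beta) > 0


a = (0, 1, 1, 0)
b = (0, 1, 1, 1)  # one bit away from a
c = (1, 0, 0, 1)  # four bits away from a

loss_similar_close = pair_loss(a, b, similar=True)      # within alpha
loss_dissimilar_close = pair_loss(a, b, similar=False)  # closer than beta
loss_dissimilar_far = pair_loss(a, c, similar=False)    # at least beta apart
```

Skipping the update for zero-loss pairs is one simple way the number of hash function updates could drop during online learning; the patent's actual objective may differ.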

Description

Technical field

[0001] The invention relates to an online nearest neighbor query method, in particular to an online high-dimensional data nearest neighbor query method based on hash learning.

Background

[0002] Nearest neighbor search is an important research direction in the field of information retrieval, and it is widely used in image retrieval and data mining. The commonly used techniques for nearest neighbor query are mainly tree-based and hash-based methods. When the data dimension grows, however, the efficiency of tree-based neighbor retrieval is greatly limited. The hash-based method compresses the original data into low-dimensional binary codes through a hash function, and then ranks and retrieves them by Hamming distance, so it is fast, efficient, and insensitive to dimensionality. Currently, the most studied hashing method is a batch processing method that trains on all data at once. This method cann...
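
The hash-and-rank idea described in the background can be sketched as follows. This is a minimal illustration that assumes random sign projections in place of a learned hash function, so it shows only the encoding and Hamming ranking steps, not the patent's online learning. All function names and parameters here are hypothetical.

```python
import random


def make_projections(dim, n_bits, seed=0):
    """Draw one random Gaussian projection vector per output bit (assumption:
    random projections stand in for a learned hash function)."""
    rng = random.Random(seed)
    return [[rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(n_bits)]


def hash_code(x, projections):
    """Binary code: the sign of each projection of x, stored as 0 or 1."""
    return tuple(
        1 if sum(w_i * x_i for w_i, x_i in zip(w, x)) >= 0 else 0
        for w in projections
    )


def hamming(a, b):
    """Number of differing bits between two codes."""
    return sum(u != v for u, v in zip(a, b))


def nearest_by_hamming(query, database, projections, k=1):
    """Indices of the k database items whose codes are closest to the query's."""
    q = hash_code(query, projections)
    codes = [hash_code(x, projections) for x in database]
    order = sorted(range(len(database)), key=lambda i: hamming(q, codes[i]))
    return order[:k]


P = make_projections(dim=4, n_bits=16, seed=42)
database = [
    [1.0, 0.2, -0.5, 0.3],
    [-0.8, 1.1, 0.4, -0.2],
    [0.1, -0.9, 0.7, 1.5],
    [-1.2, -0.3, -0.6, 0.9],
]
query = [0.1, -0.9, 0.7, 1.5]  # identical to database[2]
top = nearest_by_hamming(query, database, P, k=2)
```

Because the query equals one of the stored points, the best-ranked result has Hamming distance zero. In practice the codes would be packed into machine words so that distances reduce to XOR and popcount operations.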

Application Information

IPC(8): G06F16/22, G06F16/2455
Inventor: 胡伟, 钱江波, 任艳多, 孙瑶
Owner: NINGBO UNIV