
Retrieval method and system based on Laplacian operator and LSH technology

Applied in the field of machine learning and large-scale high-dimensional data retrieval, the invention addresses problems that restrict locality-sensitive hashing retrieval methods, such as difficult space division and poor adaptation to diverse data distributions. It aims to achieve accurate approximate nearest neighbor queries, to overcome the restriction to a single distribution form, and to obtain a good recall rate.

Active Publication Date: 2021-10-22
FUJIAN NORMAL UNIV

AI Technical Summary

Problems solved by technology

However, current retrieval solutions based on locality-sensitive hashing (LSH) still have four problems that restrict their further application:
[0004] (1) Algorithm parameters are difficult to set: most methods need parameters chosen for the specific data, usually with manual intervention, such as the number of clusters in DSH, w in E2LSH, and σ in GLDH.
[0005] (2) It is difficult to adapt to diverse data distributions: most LSH-related algorithms suit only data with specific distribution characteristics, and this weak adaptability restricts the application of retrieval methods based on locality-sensitive hashing.
[0006] (3) Performance needs further improvement before the methods are practical: LSH retrieval algorithms based on deep learning improve the query accuracy for neighboring data, but their preprocessing time greatly limits their application; traditional LSH-based retrieval algorithms have a performance advantage, but their efficiency is still uneven and needs further improvement.
[0007] (4) Space division rarely takes a global view: some algorithms make large errors when dividing the space. For example, PCAH divides data along the principal component directions and RHPLSH divides data randomly, and both cause large segmentation errors. DSH reduces the segmentation error to some extent, but its solution is local and lacks a global perspective.
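To make problem (1) concrete, the following is a minimal sketch of the widely used E2LSH-style hash h(v) = floor((a·v + b) / w). It is not taken from the patent; the function name `e2lsh_hash`, the synthetic data, and the chosen values of w are assumptions for illustration. It shows how strongly the bucket structure depends on the manually chosen width w.

```python
import numpy as np

def e2lsh_hash(points, a, b, w):
    """E2LSH-style bucket index: floor((a . v + b) / w) for every row v."""
    return np.floor((points @ a + b) / w).astype(np.int64)

rng = np.random.default_rng(0)
points = rng.standard_normal((1000, 16))   # synthetic data (assumption)
a = rng.standard_normal(16)                # Gaussian projection vector

bucket_counts = {}
for w in (0.5, 4.0, 32.0):                 # illustrative widths (assumption)
    b = rng.uniform(0.0, w)                # offset drawn uniformly from [0, w)
    buckets = e2lsh_hash(points, a, b, w)
    bucket_counts[w] = np.unique(buckets).size
    print(f"w={w}: {bucket_counts[w]} non-empty buckets")
```

A small w scatters neighbors across many buckets, while a large w merges most points into a few buckets, which is why w usually has to be tuned per dataset.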



Examples


Embodiment Construction

[0044] To make the purpose, technical solutions, and advantages of the embodiments of the present application clearer, the technical solutions in the embodiments are described clearly and completely below with reference to the drawings of the embodiments.

[0045] As shown in Figure 1, the retrieval method of the present invention, based on the Laplacian operator and LSH technology, comprises the following steps:

[0046] Step 1. Generate k hash functions to form a hash function family. Each hash function is generated by projecting the data onto a random vector drawn from a Gaussian distribution; the offset is then determined from the Gaussian kernel probability density distribution of the projection and from the second derivative of that density, computed with the Gaussian kernel Laplacian operator.

[0047] Step 2. In the data storage process, the hash function family is used to calculate the hash codes...
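Step 1 and the hash-code computation of Step 2 can be sketched as follows. This is a minimal illustration under stated assumptions, not the patented implementation: the function names, the evaluation grid, the Silverman rule-of-thumb bandwidth, the rule of taking the grid point with the largest absolute second derivative as the offset, and the thresholding of projections into 0/1 bits are all assumptions, since this excerpt does not specify them.

```python
import numpy as np

def kde_second_derivative(proj, grid, bandwidth):
    """Second derivative of a Gaussian kernel density estimate on `grid`."""
    u = (grid[:, None] - proj[None, :]) / bandwidth          # shape (G, N)
    kernel = np.exp(-0.5 * u**2) / np.sqrt(2.0 * np.pi)
    # d^2/dt^2 of K((t - p) / h) / h equals (u^2 - 1) K(u) / h^3
    return ((u**2 - 1.0) * kernel).sum(axis=1) / (proj.size * bandwidth**3)

def make_hash_function(data, rng, grid_size=256):
    """Return (direction, offset) for one hyperplane hash function."""
    direction = rng.standard_normal(data.shape[1])           # Gaussian random vector
    direction /= np.linalg.norm(direction)
    proj = data @ direction                                   # 1-D projection
    # Assumption: Silverman's rule of thumb for the kernel bandwidth.
    bandwidth = 1.06 * proj.std() * proj.size ** (-0.2)
    grid = np.linspace(proj.min(), proj.max(), grid_size)
    d2 = kde_second_derivative(proj, grid, bandwidth)
    # Assumption: use the sharpest change in density as the offset.
    offset = grid[np.argmax(np.abs(d2))]
    return direction, offset

def hash_codes(data, functions):
    """One bit per hash function: 1 if the projection exceeds the offset."""
    bits = [(data @ d > b) for d, b in functions]
    return np.stack(bits, axis=1).astype(np.uint8)

rng = np.random.default_rng(0)
data = rng.standard_normal((200, 8))                          # synthetic data (assumption)
functions = [make_hash_function(data, rng) for _ in range(4)] # k = 4
codes = hash_codes(data, functions)                           # shape (200, 4)
```

Unlike a fixed or purely random offset, the offset here is derived from the projected data itself, which is the data-adaptive idea the step describes; the specific selection rule could be swapped for another without changing the overall structure.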


Abstract

The invention discloses a retrieval method and system based on the Laplacian operator and LSH technology. The method exploits the fact that the Laplacian operator is especially sensitive to abrupt changes in a function. Data are first projected onto a randomly generated normal vector; the projection is converted into a probability density distribution of the data using a Gaussian kernel density function; and a Gaussian kernel Laplacian operator is applied to the projected data to obtain the second derivative of the density distribution. A position where the distribution of the projected data changes sharply is thereby found and used as the offset of a hyperplane. The method and system balance efficiency, precision, and recall rate, adapt well to different data, extend locality-sensitive hashing to the varied distributions found in large-scale high-dimensional data retrieval, and can meet the application requirements of data with various distribution characteristics.
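The quantities named in the abstract can be written out as follows. These are the standard Gaussian kernel density estimate and its second derivative, not formulas quoted from the patent; the symbols a, x_i, p_i, h, n, and t are notation assumed here for illustration.

```latex
p_i = a^{\top} x_i, \qquad a \sim \mathcal{N}(0, I_d)

\hat{f}_h(t) = \frac{1}{n h} \sum_{i=1}^{n} K\!\left(\frac{t - p_i}{h}\right),
\qquad K(u) = \frac{1}{\sqrt{2\pi}}\, e^{-u^{2}/2}

\hat{f}_h''(t) = \frac{1}{n h^{3}} \sum_{i=1}^{n} \left(u_i^{2} - 1\right) K(u_i),
\qquad u_i = \frac{t - p_i}{h}
```

In one dimension the Laplacian reduces to this second derivative, so a large value of |f''(t)| marks a sharp change in the projected density. The offset of the hyperplane is taken from such a position; how the patent chooses among candidate positions is not stated in this excerpt.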

Description

Technical field

[0001] The invention relates to the application field of machine learning and large-scale high-dimensional data retrieval, and in particular to a retrieval method and system based on the Laplacian operator and LSH technology.

Background

[0002] With the development of data collection and network technology, all walks of life generate massive amounts of data all the time. These data differ in source and meaning, and they are high-dimensional and diverse. For example, Environmental Wireless Sensor Networks (EWSN) are widely used in environmental monitoring and collect many kinds of data at the same time, forming a massive, high-dimensional data environment with varied characteristics. To make full use of these data and support decision-making, higher requirements are placed on the fast and accurate retrieval of high-dimensional, massive data. Therefore, building a large-scale high-dimensional data index structure with good performance will be ...


Application Information

Patent Type & Authority: Applications (China)
IPC(8): G06F16/22, G06F16/2458, G06F16/248
CPC: G06F16/2255, G06F16/2458, G06F16/248, Y02D10/00
Inventor: 张仕, 赖会霞
Owner: FUJIAN NORMAL UNIV