Fast database search systems and methods

A database and computer-system technology, applied to fast database search systems and methods. It addresses the unaffordable cost of computing inner products by linear scan and aims for improved recall and high performance.

Active Publication Date: 2019-08-23
GOOGLE LLC

AI Technical Summary

Problems solved by technology

However, computing the inner product via a linear scan requires O(nd) time and memory, which is unaffordable when the number (n) and dimensionality (d) of database vectors are large
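
As a point of reference, the linear scan described above can be sketched as follows. This is a minimal illustration rather than code from the patent; the function names `inner_product` and `linear_scan_mips` and the toy data are assumptions made for the example.

```python
def inner_product(u, v):
    """Inner product of two equal-length vectors: d multiply-adds."""
    return sum(a * b for a, b in zip(u, v))


def linear_scan_mips(database, query):
    """Exact maximum inner product search by scanning every item.

    Touches all n items and all d dimensions, hence O(nd) time.
    """
    best_index, best_score = -1, float("-inf")
    for i, item in enumerate(database):
        score = inner_product(item, query)
        if score > best_score:
            best_index, best_score = i, score
    return best_index, best_score


# Toy database (hypothetical values): n = 3 items, d = 2 dimensions.
database = [[1.0, 0.0], [0.0, 2.0], [1.0, 1.0]]
query = [1.0, 3.0]
best_index, best_score = linear_scan_mips(database, query)
print(best_index, best_score)  # 1 6.0
```

Every query pays the full cost of visiting each stored vector, which is the expense the quantization approach is meant to avoid.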

Embodiment Construction

[0028] FIG. 1 is a block diagram of a scalable inference system in accordance with an example implementation. The system 100 can be used to hierarchically quantize a database of items and compute inner products with query vectors to find related database items for use in applications such as recommendation systems, classification in machine learning algorithms, and other systems that use nearest neighbor computation. The system 100 jointly learns codebooks for the hierarchical levels and reduces the processing time required to perform inner product searches while still maintaining high-quality results. The depiction of system 100 in FIG. 1 is described as a server-based search system. However, other configurations and applications may be used. For example, some operations can be performed on the client device. Furthermore, while system 100 is described as a search system, the disclosed methods and techniques can be used for any task that uses maximum inner product, such as in neural net...
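
To illustrate how inner products with a query vector might be computed against hierarchically quantized items, here is a hedged sketch. It assumes each item is stored as one first-layer codebook index plus one second-layer index per subspace, and that the subspaces are contiguous coordinate chunks; the patent text shown here does not confirm either detail. The names `build_tables` and `approx_score` and the hand-written codebooks are hypothetical.

```python
def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def build_tables(query, first_layer, second_layer):
    """Precompute, once per query, its inner product with every codebook entry."""
    sub_dim = len(query) // len(second_layer)
    coarse = [dot(query, center) for center in first_layer]
    fine = []
    for k, codebook in enumerate(second_layer):
        q_k = query[k * sub_dim:(k + 1) * sub_dim]  # query chunk for subspace k
        fine.append([dot(q_k, entry) for entry in codebook])
    return coarse, fine


def approx_score(code, coarse, fine):
    """Approximate q . x as q . center + sum over subspaces of q_k . residual_k."""
    center_id, sub_ids = code
    return coarse[center_id] + sum(fine[k][j] for k, j in enumerate(sub_ids))


# Hand-written toy codebooks (hypothetical): 4 dimensions, 2 subspaces of 2 dims.
first_layer = [[0.0, 0.0, 0.0, 0.0], [4.0, 4.0, 4.0, 4.0]]
second_layer = [
    [[-1.0, 0.0], [1.0, 0.0]],  # subspace 0 covers dimensions 0-1
    [[0.0, -1.0], [0.0, 1.0]],  # subspace 1 covers dimensions 2-3
]
# Each item: (first-layer index, (index in subspace 0, index in subspace 1)).
codes = [(0, (1, 1)), (1, (0, 0)), (1, (1, 1))]

query = [1.0, 0.0, 0.0, 2.0]
coarse, fine = build_tables(query, first_layer, second_layer)
scores = [approx_score(code, coarse, fine) for code in codes]
best = max(range(len(codes)), key=scores.__getitem__)
print(scores, best)  # [3.0, 9.0, 15.0] 2
```

Under these assumptions, scoring an item costs one table lookup per codebook layer and subspace instead of d multiplications, which is where the time saving over a linear scan would come from.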

Abstract

Implementations provide an efficient system for calculating inner products between high-dimensionality vectors. An example method includes clustering database items represented as vectors, selecting a cluster center for each cluster, and storing the cluster center as an entry in a first layer codebook. The method also includes, for each database item, calculating a residual based on the cluster center for the cluster the database item is assigned to and projecting the residual into subspaces. The method also includes determining, for each of the subspaces, an entry in a second layer codebook for the subspace, and storing the entry in the first layer codebook and the respective entry in the second layer codebook for each of the subspaces as a quantized vector for the database item. The entry can be used to categorize an item represented by a query vector or to provide database items responsive to a query vector.
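
The indexing steps in the abstract can be sketched roughly as below. This is an illustrative approximation, not the patented procedure: it uses plain Lloyd's k-means seeded with the first k points, treats "projecting into subspaces" as slicing contiguous coordinate chunks, and learns each layer separately rather than jointly. All function names and the toy data are assumptions.

```python
def sq_dist(u, v):
    return sum((a - b) ** 2 for a, b in zip(u, v))


def nearest(vec, centers):
    """Index of the center closest to vec (ties go to the lowest index)."""
    return min(range(len(centers)), key=lambda i: sq_dist(vec, centers[i]))


def kmeans(points, k, iters=10):
    """Plain Lloyd's k-means, seeded with the first k points (deterministic)."""
    centers = [list(p) for p in points[:k]]
    for _ in range(iters):
        groups = [[] for _ in range(k)]
        for p in points:
            groups[nearest(p, centers)].append(p)
        for i, group in enumerate(groups):
            if group:  # keep the old center if a cluster is empty
                centers[i] = [sum(col) / len(group) for col in zip(*group)]
    return centers


def build_index(items, n_clusters, n_subspaces, n_sub_entries):
    sub_dim = len(items[0]) // n_subspaces
    # First layer: cluster the items; the centers form the first layer codebook.
    first_layer = kmeans(items, n_clusters)
    assign = [nearest(x, first_layer) for x in items]
    # Residual of each item relative to its assigned cluster center.
    residuals = [[a - b for a, b in zip(x, first_layer[c])]
                 for x, c in zip(items, assign)]
    # Second layer: one codebook per subspace, learned on residual chunks.
    second_layer, sub_codes = [], []
    for k in range(n_subspaces):
        chunks = [r[k * sub_dim:(k + 1) * sub_dim] for r in residuals]
        codebook = kmeans(chunks, n_sub_entries)
        second_layer.append(codebook)
        sub_codes.append([nearest(c, codebook) for c in chunks])
    # Quantized vector: first-layer index plus one index per subspace.
    codes = [(assign[i], tuple(sub_codes[k][i] for k in range(n_subspaces)))
             for i in range(len(items))]
    return first_layer, second_layer, codes


def reconstruct(code, first_layer, second_layer):
    """Rebuild the approximate vector: center plus concatenated residual entries."""
    center_id, sub_ids = code
    approx = list(first_layer[center_id])
    offset = 0
    for k, j in enumerate(sub_ids):
        for value in second_layer[k][j]:
            approx[offset] += value
            offset += 1
    return approx


# Toy data (hypothetical): 4 items, 4 dimensions.
items = [
    [0.0, 0.0, 0.0, 0.0],
    [11.0, 10.0, 10.0, 11.0],
    [1.0, 0.0, 0.0, 1.0],
    [10.0, 10.0, 10.0, 10.0],
]
first_layer, second_layer, codes = build_index(
    items, n_clusters=2, n_subspaces=2, n_sub_entries=2)
print(codes)  # [(0, (0, 0)), (1, (1, 1)), (0, (1, 1)), (1, (0, 0))]
```

On this toy data the reconstruction happens to be exact; with realistic data the quantized vector is only an approximation of the original item, and the codebook sizes trade accuracy against memory and speed.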

Description

[0001] Cross-Reference to Related Applications

[0002] This application is a continuation of, and claims priority from, U.S. Application No. 15/290,198, filed October 11, 2016, entitled "HIERARCHICAL QUANTIZATION FOR FAST INNER PRODUCT SEARCH," the disclosure of which is incorporated herein by reference in its entirety.

Background

[0003] Searching very large high-dimensional databases is a challenging task that can involve significant processing and memory resources. Many search tasks involve computing the inner product of a query vector and a set of database vectors to find the database instance with the largest or maximum inner product (e.g., highest similarity). This is the Maximum Inner Product Search (MIPS) problem. However, computing the inner product via a linear scan requires O(nd) time and memory, which is unaffordable when the number (n) and dimensionality (d) of database vectors are large.

Summary

[0004] The implementation provides...

Application Information

Patent Type & Authority: Applications (China)
IPC(8): G06F16/33, G06F16/35, G06F17/10, G06K9/62
CPC: G06F17/10, G06F16/3347, G06F16/35, G06F16/24537, G06F16/285, G06F16/2237, G06F16/24578, G06F18/231, G06F18/24137, G06F18/23213
Inventor: S. Kumar, D. M. Simcha, A. T. Suresh, R. Guo, X. Yu, D. Holtmann-Rice
Owner: GOOGLE LLC