Vector transformation for indexing, similarity search and classification

A vector transformation and indexing technique for manipulating high-dimensional vector space data.

Active Publication Date: 2013-09-04
GOOGLE LLC

AI Technical Summary

Problems solved by technology

In a high-dimensional vector space, the magnitude of the combined noise in a distance determination may exceed the magnitude of the change in that determination caused by a change to a single vector dimension.
This is problematic when it is desirable to measure the change in distance between vectors caused by a change in a small number of vector elements.
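
The effect described above can be illustrated with a small numerical sketch. It is not taken from the patent: the function name `distance_shifts`, the choice of Euclidean distance, the uniform random vectors, and the parameter values (`dim`, `noise_sigma`, `delta`) are all assumptions made for illustration.

```python
import math
import random


def euclidean(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def distance_shifts(dim=1000, noise_sigma=0.1, delta=0.1, seed=0):
    """Return (single_dim_shift, noise_shift) for one random vector pair.

    single_dim_shift: |change in distance| when one element of b moves by delta.
    noise_shift:      |change in distance| when every element of b receives
                      small Gaussian noise (an assumed noise model).
    """
    rng = random.Random(seed)
    a = [rng.random() for _ in range(dim)]
    b = [rng.random() for _ in range(dim)]
    base = euclidean(a, b)

    # Change a single dimension of b.
    b_changed = list(b)
    b_changed[0] += delta
    single_dim_shift = abs(euclidean(a, b_changed) - base)

    # Add independent noise to every dimension of b.
    b_noisy = [x + rng.gauss(0.0, noise_sigma) for x in b]
    noise_shift = abs(euclidean(a, b_noisy) - base)

    return single_dim_shift, noise_shift


if __name__ == "__main__":
    single, noise = distance_shifts()
    print(f"shift from one changed dimension: {single:.4f}")
    print(f"shift from per-dimension noise:   {noise:.4f}")
```

Under these assumed settings the noise-induced shift is typically much larger than the shift from the single changed element, which is the situation the summary describes.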



Embodiment Construction

[0020] Figure 1 is a block diagram of a media hosting service in which data processing operations are performed on vector data representing media objects, according to one embodiment. The media hosting service 100 represents a system, such as the YOUTUBE™ system, that stores and provides videos and other media (such as images and audio) to clients, such as client 135. The media hosting service 100 communicates with a plurality of content providers 130 and clients 135 via a network 140 to support sharing of media content among these entities. The media hosting service 100 may be implemented in a cloud computing network accessible to content providers 130 and clients 135 over the network 140. For clarity, Figure 1 depicts only one content provider 130 and one client 135, although in practice there will be a large number of each. It should be noted that although the description herein primarily refers to media objects only for simplicity, the vector ...


Abstract

A feature vector is encoded into a sparse binary vector. The feature vector is retrieved, for example from storage or from a feature vector generator, and represents a media object or other data object. One or more permutations are generated, each with a dimensionality equal to that of the feature vector. The permutations may be generated randomly or formulaically. The feature vector is permuted with the one or more permutations, creating one or more permuted feature vectors. The permuted feature vectors are truncated according to a selected window size. The indexes of the maximum values of the truncated permuted feature vectors are identified and encoded using one-hot encoding, producing one or more sparse binary vectors. The sparse binary vectors may be concatenated into a single sparse binary vector and stored. The sparse binary vector may be used in the similarity search, indexing, or categorization of media objects.
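
The encoding steps in the abstract can be sketched as follows. This is a minimal illustration and not the patent's implementation: the function names (`make_permutations`, `encode`, `overlap`), the seeded random shuffle, the tie-breaking rule, and the overlap count used for comparison are all assumptions.

```python
import random


def make_permutations(dim, count, seed=0):
    """Generate `count` random permutations of the indexes 0..dim-1.

    Random generation is one of the two options the abstract mentions;
    the seed parameter is an assumption added for reproducibility.
    """
    rng = random.Random(seed)
    perms = []
    for _ in range(count):
        p = list(range(dim))
        rng.shuffle(p)
        perms.append(p)
    return perms


def encode(feature, perms, window):
    """Encode `feature` as a concatenation of one-hot vectors.

    For each permutation: permute the feature vector, keep the first
    `window` elements, find the index of the maximum value, and emit a
    one-hot vector of length `window` marking that index.
    """
    code = []
    for p in perms:
        permuted = [feature[i] for i in p]
        head = permuted[:window]
        # Ties go to the earliest index (an assumed tie-breaking rule).
        winner = max(range(len(head)), key=head.__getitem__)
        one_hot = [0] * len(head)
        one_hot[winner] = 1
        code.extend(one_hot)
    return code


def overlap(code_a, code_b):
    """Count positions where both sparse binary vectors are 1.

    A simple, assumed similarity measure: the number of permutations
    on which the two feature vectors pick the same winning index.
    """
    return sum(x & y for x, y in zip(code_a, code_b))


if __name__ == "__main__":
    feature = [0.2, 0.9, 0.1, 0.5]
    perms = [[2, 0, 1, 3], [1, 3, 0, 2]]  # explicit permutations for clarity
    print(encode(feature, perms, window=2))  # [0, 1, 1, 0]
```

Because only the position of the maximum within each window is recorded, the code depends on the relative ordering of the feature values rather than on their magnitudes; scaling every element by the same positive factor leaves the code unchanged in this sketch.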

Description

[0001] Cross-References to Applications

[0002] This application claims priority to Provisional Application No. 61/412,711, filed November 11, 2010, which is hereby incorporated by reference.

Technical Field

[0003] The present disclosure relates generally to the fields of data indexing, similarity searching, and classification, and more specifically to manipulating high-dimensional vector space data.

Background

[0004] Vectors are often used to represent the feature space of various phenomena. For example, vectors are used to characterize images, videos, audio clips, and other media. It should be noted that the utility of vector space operations is not limited to digital media, but can also be applied to other data, to physical objects, or to any other entity capable of feature representation. In the media space, features include color distribution (e.g., using 4x4 pixel hue and saturation histograms), mean and variance of color density across color channels, colo...


Application Information

Patent Type & Authority: Applications (China)
IPC(8): H04N21/234, G06F17/00
CPC: G06F16/41, G06V10/513
Inventor: J. Yagnik (J·耶格尼克)
Owner: GOOGLE LLC