Locality-sensitive-hashing-based high-dimensional indexing method for large-scale multimedia data

Locality-sensitive hashing and multimedia data technology, applied in the field of multimedia indexing and retrieval. It addresses the small supported data volume, limited applicability and disk I/O performance problems of existing methods, and achieves more efficient indexing and faster retrieval.

Active Publication Date: 2014-12-10
PEKING UNIV

AI Technical Summary

Problems solved by technology

However, one problem with multi-probe locality-sensitive hashing is that the algorithm can only store the index table in memory, so the amount of data it can support is small and it cannot be widely used in current large-scale ...

Embodiment Construction

[0042] The present invention will be further described in detail below in conjunction with the accompanying drawings and specific embodiments.

[0043] The flow of the locality-sensitive-hashing-based high-dimensional indexing method for large-scale multimedia data of the present invention is shown in Figure 1. It specifically includes the following steps:

[0044] (1) Extract high-dimensional features of multimedia data

[0045] Extract one or more features from the multimedia data to be indexed, which may be images, audio, video, etc. Each feature is a high-dimensional vector: for example, color, texture or shape features for images; short-time average energy, zero-crossing rate and Mel-frequency cepstral coefficients (MFCC) for audio; and key points, objects or motion features for video.
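As an illustration of two of the audio features named above, the following is a minimal sketch of short-time average energy and zero-crossing rate computed per frame. The function names, the frame length and the hop size are assumptions made for this example and do not come from the patent; a real feature vector would also be far higher-dimensional (for instance by adding MFCCs).

```python
def frame_signal(samples, frame_len, hop):
    """Split a sample sequence into overlapping frames; a trailing partial frame is dropped."""
    return [samples[i:i + frame_len]
            for i in range(0, len(samples) - frame_len + 1, hop)]


def short_time_energy(frame):
    """Mean squared amplitude of one frame."""
    return sum(x * x for x in frame) / len(frame)


def zero_crossing_rate(frame):
    """Fraction of adjacent sample pairs whose signs differ."""
    crossings = sum(1 for a, b in zip(frame, frame[1:]) if (a >= 0) != (b >= 0))
    return crossings / (len(frame) - 1)


def audio_features(samples, frame_len=4, hop=2):
    """Return one [energy, zero-crossing rate] vector per frame (hypothetical helper)."""
    return [[short_time_energy(f), zero_crossing_rate(f)]
            for f in frame_signal(samples, frame_len, hop)]
```

The default frame length of 4 samples is chosen only to keep the example easy to follow; practical systems typically use frames of a few tens of milliseconds.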

[0046] (2) Establish memory index

[0047] After extracting the features of the image, video or audio, we first create...


Abstract

The invention relates to a locality-sensitive-hashing-based high-dimensional indexing method for large-scale multimedia data. In the offline indexing stage, the method comprises the following steps: extracting high-dimensional features of the multimedia data; establishing an in-memory index comprising a feature storage area and a hash table storage area, storing the high-dimensional features in the feature storage area, calculating the locality-sensitive hashing vector of each feature, and storing the feature numbers and their corresponding locality-sensitive hashing vectors in the hash table storage area; establishing a first-stage disk index comprising a feature storage area, an index storage area and a plurality of hash table storage areas; establishing a second-stage disk index comprising a hash bucket storage area; and repeating these steps until all multimedia input has been indexed. In the online query stage, features of the query multimedia data are extracted, the query is run against the established indexes, and similar results are returned. The method improves the scheduling of memory and disk and increases the indexing and retrieval speed for multimedia data.
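The in-memory index described above can be sketched roughly as follows. This is not the patent's implementation: the hash family (random Gaussian projections quantized into buckets of fixed width, as in p-stable LSH), the class name `MemoryIndex` and all parameters are assumptions chosen for illustration, and the two disk index stages are not modeled.

```python
import math
import random


class MemoryIndex:
    """Toy in-memory index with a feature storage area and a hash table storage area."""

    def __init__(self, dim, num_hashes=4, bucket_width=4.0, seed=0):
        rng = random.Random(seed)  # fixed seed so hashing is reproducible
        self.w = bucket_width
        # One random Gaussian projection and one random offset per hash function (assumed family).
        self.projections = [[rng.gauss(0.0, 1.0) for _ in range(dim)]
                            for _ in range(num_hashes)]
        self.offsets = [rng.uniform(0.0, bucket_width) for _ in range(num_hashes)]
        self.features = []  # feature storage area: feature number -> vector
        self.table = {}     # hash table storage area: LSH vector -> feature numbers

    def lsh_vector(self, v):
        """Compute floor((a . v + b) / w) for each hash function."""
        return tuple(
            math.floor((sum(a * x for a, x in zip(proj, v)) + b) / self.w)
            for proj, b in zip(self.projections, self.offsets))

    def add(self, v):
        """Store the feature, file its number under its LSH vector, return the number."""
        fid = len(self.features)
        self.features.append(list(v))
        self.table.setdefault(self.lsh_vector(v), []).append(fid)
        return fid

    def query(self, q, k=1):
        """Return up to k feature numbers from the query's bucket, nearest first."""
        candidates = self.table.get(self.lsh_vector(q), [])
        return sorted(candidates,
                      key=lambda fid: math.dist(q, self.features[fid]))[:k]
```

A query only inspects the single bucket its own LSH vector falls into, so near neighbors that land in an adjacent bucket are missed; multi-probe variants reduce this by also checking nearby buckets.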

Description

Technical field

[0001] The invention belongs to the technical field of multimedia indexing and retrieval, and in particular relates to a high-dimensional indexing method for large-scale multimedia data based on locality-sensitive hashing.

Background

[0002] In recent years, with the rapid development and popularization of Internet technology, and especially the growing use of social networking sites and image and video sharing sites, the number of images, audio clips and videos on the Internet has grown rapidly. How to quickly and accurately retrieve the information users need from massive multimedia data has become an urgent problem. Traditional text-based multimedia retrieval methods rely on the text in web pages, which may not describe the multimedia content itself, so their accuracy is limited. Content-based image, audio and video retrieval can effectively overcome the above deficiencie...


Application Information

IPC(8): G06F17/30
CPC: G06F3/0638, G06F16/41
Inventors: 彭宇新, 彭云波, 张健
Owner PEKING UNIV