High-dimension vector rapid searching algorithm based on block distance

A block-distance-based retrieval algorithm, applied in the field of data processing

Inactive Publication Date: 2012-01-04
COMMUNICATION UNIVERSITY OF CHINA

AI Technical Summary

Problems solved by technology

[0007] Block distance is one of the most commonly used similarity measures in high-dimensional vector matching. It is simple to compute and gives high retrieval efficiency. However, most of the ...


Examples

Embodiment Construction

[0031] Specific embodiments of the invention are described further below with reference to the accompanying drawings:

[0032] The technical scheme of this embodiment is shown in Figure 1(a):

[0033] First, a reference point is selected from the set of high-dimensional vectors. The block distance between each vector in the set and the reference point is then computed, one vector at a time, which gives the key value for that vector. Each vector is then inserted together with its key value to build the Block B-tree. As shown in Figure 1(b), the upper layer is a B+-tree, and each key value in the leaf node layer is bound to a pointer to the corresponding high-dimensional vector. To search, the block distance between the query vector and the reference point is computed to obtain the query key value, and the position at which the query key value would be inserted is located in the BlockB-t...
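The build and lookup steps in [0033] can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the patented implementation: a sorted list searched with `bisect` stands in for the B+-tree, the names `block_distance`, `BlockIndex` and `range_search` are hypothetical, and the reference point is simply passed in by the caller. Because the paragraph is cut off before the search step is fully described, the candidate window used here is an assumption. It relies on the triangle inequality: if a vector lies within block distance `radius` of the query, its key differs from the query key by at most `radius`.

```python
import bisect


def block_distance(a, b):
    """Block (L1, city-block) distance between two equal-length vectors."""
    return sum(abs(x - y) for x, y in zip(a, b))


class BlockIndex:
    """Sorted-list stand-in for the B+-tree over one-dimensional keys."""

    def __init__(self, vectors, reference):
        self.vectors = vectors
        self.reference = reference
        # Key of each vector = its block distance to the reference point.
        entries = sorted(
            (block_distance(v, reference), i) for i, v in enumerate(vectors)
        )
        self.keys = [key for key, _ in entries]
        # ids[pos] plays the role of the leaf-level pointer to the vector.
        self.ids = [i for _, i in entries]

    def range_search(self, query, radius):
        """Return sorted (distance, vector_index) pairs within `radius`."""
        query_key = block_distance(query, self.reference)
        # Assumed filter: only keys in [query_key - radius, query_key + radius]
        # can belong to vectors within `radius` of the query.
        lo = bisect.bisect_left(self.keys, query_key - radius)
        hi = bisect.bisect_right(self.keys, query_key + radius)
        hits = []
        for pos in range(lo, hi):
            i = self.ids[pos]
            d = block_distance(query, self.vectors[i])  # full comparison
            if d <= radius:
                hits.append((d, i))
        return sorted(hits)


if __name__ == "__main__":
    data = [[0, 0], [1, 1], [5, 5], [2, 0], [10, 10]]
    index = BlockIndex(data, reference=[0, 0])
    print(index.range_search([1, 0], radius=1))
```

Only the vectors whose keys fall inside the window are compared in full, which is where the saving over a linear scan comes from. A real B+-tree would add page-oriented storage and efficient inserts, which the sorted list does not model.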

Abstract

The invention provides a fast retrieval algorithm for high-dimensional vectors based on block distance. It belongs to the field of data processing, including multimedia information retrieval, intelligent information processing, and data mining. The invention proposes the Block B-tree, an index structure based on block distance that converts high-dimensional data to one dimension: each high-dimensional vector is mapped to a one-dimensional key value given by its block distance to a reference point. A B+-tree manages the key values, and each key value in the leaf node layer is bound to a pointer to the corresponding high-dimensional vector. During a search, the query vector is mapped to a one-dimensional query key value by the same method, and similarity is then computed only for the high-dimensional vectors whose key values are close to the query key value. This reduces the amount of computation and greatly increases search speed. Block distance is a frequently used measure in similarity matching of high-dimensional vectors: it is simple to compute and gives high search efficiency. Most current index structures, however, are designed around Euclidean distance matching. The index structure provided by the invention supports searching based on Euclidean distance and also directly supports searching based on block distance.
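The abstract says the index also supports Euclidean-distance search, but this excerpt does not explain how. One way it could work, given here as an assumption and not as the patent's method, uses the standard norm inequality ||v||_1 <= sqrt(n) * ||v||_2 for n-dimensional v. Any vector within Euclidean distance `radius` of the query is then within block distance sqrt(n) * `radius`, so by the triangle inequality its key lies within sqrt(n) * `radius` of the query key. The function names below are hypothetical, and a sorted list again stands in for the B+-tree.

```python
import bisect
import math


def block_distance(a, b):
    """Block (L1) distance between two equal-length vectors."""
    return sum(abs(x - y) for x, y in zip(a, b))


def build_entries(vectors, reference):
    """Return (key, vector_index) pairs sorted by key."""
    return sorted((block_distance(v, reference), i) for i, v in enumerate(vectors))


def euclidean_range_search(vectors, entries, reference, query, radius):
    """Indices of vectors within Euclidean distance `radius` of `query`."""
    n = len(query)
    # Assumed window: an L2 ball of `radius` fits inside an L1 ball of
    # sqrt(n) * radius, so no true match has a key outside this window.
    window = math.sqrt(n) * radius
    query_key = block_distance(query, reference)
    keys = [key for key, _ in entries]
    lo = bisect.bisect_left(keys, query_key - window)
    hi = bisect.bisect_right(keys, query_key + window)
    hits = []
    for _, i in entries[lo:hi]:
        # Full Euclidean comparison only for the surviving candidates.
        if math.dist(query, vectors[i]) <= radius:
            hits.append(i)
    return sorted(hits)


if __name__ == "__main__":
    data = [[0, 0], [3, 4], [1, 1], [6, 8], [0, 2]]
    entries = build_entries(data, [0, 0])
    print(euclidean_range_search(data, entries, [0, 0], [5, 5], 3))
```

The window grows with sqrt(n), so under this assumption the filter prunes less in the Euclidean case than in the block-distance case, where the window is just `radius`. A production version would also want a small tolerance on the window bounds to guard against floating-point rounding at the exact boundary.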

Description

Technical field

[0001] The invention belongs to the field of data processing, including multimedia information retrieval, intelligent information processing, and data mining, and specifically relates to a fast retrieval algorithm for high-dimensional vectors based on block distance.

Background

[0002] The development of computer and information technology has produced a large amount of multimedia data. How to quickly find the required information in a massive multimedia database is a key issue in multimedia database research. The traditional method is to annotate the multimedia data manually and then retrieve it through text search. Manual annotation, however, involves a heavy workload and is strongly subjective, and it cannot keep up with the explosive growth of multimedia data. Content-based multimedia information retrieval therefore needs to be studied.

[0003] The technical ...


Application Information

IPC(8): G06F17/30
Inventors: 黄祥林, 杨丽芳, 吕锐, 吕慧
Owner: COMMUNICATION UNIVERSITY OF CHINA