Method for searching high-dimensional vector combining clustering and city block distances

A search method based on city block distance, applied in the field of data processing, that addresses problems such as high query cost, heavy computation, and loss of data information, and that speeds up queries and reduces the number of ...

Inactive Publication Date: 2014-01-15
XINHUA NEWS AGENCY +1

AI Technical Summary

Problems solved by technology

[0007] Index structures that convert high-dimensional data to one dimension, such as the Pyramid technique, NB-tree, iDistance, and iMinMax, filter and prune by simply comparing a single key value. Although they need no complex distance calculations and retrieve efficiently, the conversion from high-dimensional to one-dimensional space loses a large amount of information: different vectors may share the same one-dimensional key value. As a result, a single key value filters out only a small proportion of the data, the final similarity matching step still requires a large amount of computation, and the query overhead remains considerable.
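The key-collision problem described above can be illustrated with a minimal sketch. It assumes, purely for illustration, that the one-dimensional key of a vector is its city block distance to a fixed reference point; the reference point, the sample vectors, and the helper name `city_block` are hypothetical and not taken from the patent text. Pruning on the key is safe because of the triangle inequality: if the keys of two vectors differ by more than the query radius, the vectors themselves must be farther apart than the radius.

```python
def city_block(a, b):
    """City block (L1) distance between two equal-length vectors."""
    return sum(abs(x - y) for x, y in zip(a, b))

# Hypothetical reference point and data set, chosen for illustration only.
reference = (0, 0, 0)
vectors = [(1, 2, 3), (3, 2, 1), (6, 0, 0), (0, 0, 1)]

# One-dimensional key: distance of each vector to the reference point.
keys = [city_block(v, reference) for v in vectors]

query, radius = (1, 2, 3), 1
key_q = city_block(query, reference)

# Key filter: |key_v - key_q| <= radius is necessary (not sufficient)
# for city_block(v, query) <= radius.
candidates = [v for v, k in zip(vectors, keys) if abs(k - key_q) <= radius]

# Exact matching still has to run on every surviving candidate.
matches = [v for v in candidates if city_block(v, query) <= radius]

print(keys)        # three different vectors share the key 6
print(candidates)  # all three survive the key filter
print(matches)     # only one is a true match
```

Three of the four vectors collapse onto the same key, so the single-key filter discards only one vector and the exact comparison still runs on the rest.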

Examples

Embodiment Construction

[0020] To make the technical problems addressed by the present invention and its technical solutions clearer, specific implementations are further described below with reference to the accompanying drawings and implementation examples.

[0021] The flow chart for constructing the index structure of the high-dimensional vector search method combining clustering and city block distance, as provided by this implementation example, is shown in Figure 1(a):

[0022] First, a clustering algorithm partitions the high-dimensional vector set into spatial clusters, yielding the high-dimensional data of each cluster. Second, the cluster center and radius of each cluster are calculated, and a reference point is selected for each cluster. Third, for each cluster in turn, the city block distance between each high-dimensional vector in the cluster and the reference point of the clus...
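The construction steps that are visible in the truncated paragraph above can be sketched roughly as follows. This is not the patent's implementation, and several choices are assumptions made only to keep the example runnable: clustering is reduced to a single nearest-seed assignment pass, the radius is measured with the city block distance, the cluster center doubles as the reference point, and a sorted list of (key, vector) pairs stands in for the per-cluster BlockB-tree. The names `ClusterIndex`, `assign_to_seeds`, and `build_index` are hypothetical.

```python
from dataclasses import dataclass

def city_block(a, b):
    """City block (L1) distance between two equal-length vectors."""
    return sum(abs(x - y) for x, y in zip(a, b))

@dataclass
class ClusterIndex:
    center: tuple     # mean of the cluster members
    radius: float     # largest member distance to the center (assumed L1)
    reference: tuple  # reference point used to compute keys
    entries: list     # (key, vector) pairs sorted by key; stand-in for a B-tree

def assign_to_seeds(vectors, seeds):
    """Stand-in for a real clustering algorithm: nearest-seed assignment."""
    groups = [[] for _ in seeds]
    for v in vectors:
        nearest = min(range(len(seeds)), key=lambda i: city_block(v, seeds[i]))
        groups[nearest].append(v)
    return groups

def build_index(vectors, seeds):
    index = []
    for members in assign_to_seeds(vectors, seeds):
        if not members:
            continue
        dim = len(members[0])
        center = tuple(sum(v[d] for v in members) / len(members) for d in range(dim))
        radius = max(city_block(v, center) for v in members)
        reference = center  # assumption: the center serves as the reference point
        entries = sorted((city_block(v, reference), v) for v in members)
        index.append(ClusterIndex(center, radius, reference, entries))
    return index

# Hypothetical two-dimensional data with two well-separated groups.
data = [(0, 0), (2, 0), (0, 2), (2, 2), (10, 10), (12, 10)]
index = build_index(data, seeds=[(0, 0), (10, 10)])
for cluster in index:
    print(cluster.center, cluster.radius, [k for k, _ in cluster.entries])
```

A production index would replace the sorted list with a B-tree keyed on the distance value, so that key ranges can be scanned without loading a whole cluster into memory.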

Abstract

The invention discloses a method for searching high-dimensional vectors that combines clustering with city block distances. The method provides an index structure, CBlockB-tree, that combines clustering and city block distances: a clustering algorithm first partitions the high-dimensional vector set into clusters, and a BlockB-tree is then constructed for each cluster's data to form the CBlockB-tree. During a search, clustering filters out the clusters that do not intersect the query region, and comparing the key values obtained by converting high-dimensional vectors to one dimension further reduces the amount of computation needed for the final vector similarity matching, so high-dimensional vectors can be searched faster. The index structure also effectively supports matching searches that use the simple and efficient city block distance.
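The two filtering stages named in the abstract can be sketched as a range query. This is an illustrative sketch under stated assumptions rather than the CBlockB-tree itself: each cluster is a plain dictionary, the cluster center is also used as the reference point, and sorted Python lists searched with `bisect` stand in for the BlockB-tree. The names `make_cluster` and `range_search` are hypothetical. Both filters rely on the triangle inequality of the city block distance, so they never discard a true match.

```python
from bisect import bisect_left, bisect_right

def city_block(a, b):
    """City block (L1) distance between two equal-length vectors."""
    return sum(abs(x - y) for x, y in zip(a, b))

def make_cluster(center, members):
    """Hypothetical cluster record; the center doubles as the reference point."""
    pairs = sorted((city_block(v, center), v) for v in members)
    return {
        "center": center,
        "reference": center,
        "radius": max(k for k, _ in pairs),
        "keys": [k for k, _ in pairs],      # sorted one-dimensional keys
        "vectors": [v for _, v in pairs],   # vectors in the same order
    }

def range_search(index, query, r):
    """Return all vectors within city block distance r of the query."""
    results = []
    for cluster in index:
        # Stage 1: skip clusters whose ball cannot intersect the query ball.
        if city_block(query, cluster["center"]) > cluster["radius"] + r:
            continue
        # Stage 2: keep only keys inside [key_q - r, key_q + r].
        key_q = city_block(query, cluster["reference"])
        lo = bisect_left(cluster["keys"], key_q - r)
        hi = bisect_right(cluster["keys"], key_q + r)
        # Stage 3: exact similarity matching on the surviving candidates.
        for v in cluster["vectors"][lo:hi]:
            if city_block(v, query) <= r:
                results.append(v)
    return results

index = [
    make_cluster((1, 1), [(0, 0), (2, 0), (0, 2), (2, 2)]),
    make_cluster((11, 10), [(10, 10), (12, 10)]),
]
found = range_search(index, query=(2, 1), r=1)
print(found)
```

In this example the second cluster is rejected by the cluster-level test alone, while the first cluster shows the limit of key filtering: all four members share one key, so the exact distance check still decides which of them match.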

Description

Technical field

[0001] The invention belongs to data processing fields such as multimedia information retrieval, intelligent information processing, and data mining, and specifically relates to a high-dimensional vector search method combining clustering and city block distance.

Background

[0002] With the development of computer and information technology, a large amount of multimedia data has been produced. How to quickly find the required information in a massive multimedia database is a key issue in multimedia database research. The traditional approach is to label the multimedia data manually and then retrieve multimedia information through text retrieval. However, manual labeling involves a heavy workload and is highly subjective, and it cannot keep up with the explosive growth of multimedia data. Content-based multimedia information retrieval technology therefore needs to be studied.

[0003] The techn...

Application Information

Patent Type & Authority: Applications (China)
IPC: IPC(8): G06F17/30
CPC: G06F16/435, G06F16/41
Inventor: 吕锐, 黄祥林, 陈明祥, 杨丽芳, 储达峰, 高庆, 魏海涛
Owner: XINHUA NEWS AGENCY