High-dimensional data indexing method based on maximum gap space mapping

A spatial-mapping and maximum-gap technique, applied in the database field, that addresses problems such as low query-processing efficiency

Status: Inactive
Publication Date: 2008-09-17
NORTHEASTERN UNIV

AI Technical Summary

Problems solved by technology

Although the space partitioning of the MB+-tree is disjoint, the MB+-tree does not use a single multi-dimensional index to handle queries in high-dimensional space, so its query processing efficiency is very low.



Examples


Embodiment Construction

[0091] The present invention is further described in conjunction with accompanying drawing:

[0092] Figures 5 and 6 plot the test results for I/O overhead and false active subtree accesses when the search radius is 0.1; Figures 7 and 8 give the corresponding results when the search radius is 0.15. The data set used in the experiments was generated as follows: 20,000 real images were selected, and the MPEG-7 feature extraction tool was used to extract their Color Layout features, forming a high-dimensional data set. The Color Layout feature has 12 dimensions. The experimental environment is a Pentium IV 2.5 GHz PC with 256 MB of memory. All data is stored in Fish, an object database system. In all experiments, the page size was set to 4096 bytes.

[0093] It is easy to see from these figures that as the number of dimensions increases, the number of I/O operations and the number of false active subtree accesses begin...


Abstract

The invention relates to a high-dimensional data indexing method based on maximum gap space mapping and belongs to the database field. The method comprises the following steps. Step 1, maximum gap space mapping: calculate the gap value of each dimension of the given data space, select the K dimensions with the largest gap values, and project the actual data points of the given data space into the resulting K-dimensional space. Step 2, construction of the MS-tree: first find a suitable insertion node M; if the node is not full, insert the object directly into it, and if the node is full, split it; then check whether the inserted object lies within the MBR of node M, and if it does not, update the MBR of node M and map the original space into the low-dimensional space. Step 3, similarity query processing. The invention improves the performance of index-based similarity queries by reducing the number of accesses to false active subtrees.
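The following is a minimal Python sketch of Step 1 and of the MBR check in Step 2, not the patented procedure itself. The abstract does not define how a dimension's gap value is computed, so the sketch assumes it is the largest difference between consecutive sorted coordinate values along that dimension. All function names (`dimension_gap`, `top_k_gap_dims`, `project`, `mbr_contains`, `mbr_extend`) are hypothetical, and node selection and splitting in the MS-tree are omitted.

```python
from typing import List, Sequence, Tuple

Point = Sequence[float]


def dimension_gap(points: Sequence[Point], dim: int) -> float:
    """Assumed gap definition: the largest difference between
    consecutive sorted coordinate values along one dimension."""
    values = sorted(p[dim] for p in points)
    return max((b - a for a, b in zip(values, values[1:])), default=0.0)


def top_k_gap_dims(points: Sequence[Point], k: int) -> List[int]:
    """Return the indices of the K dimensions with the largest gap values."""
    n_dims = len(points[0])
    gaps = [dimension_gap(points, d) for d in range(n_dims)]
    # Rank by gap, largest first; ties go to the lower dimension index.
    ranked = sorted(range(n_dims), key=lambda d: (-gaps[d], d))
    return sorted(ranked[:k])


def project(point: Point, dims: Sequence[int]) -> Tuple[float, ...]:
    """Project a point onto the selected dimensions."""
    return tuple(point[d] for d in dims)


def mbr_contains(low: Point, high: Point, point: Point) -> bool:
    """Check whether a point lies inside a minimum bounding rectangle."""
    return all(l <= x <= h for l, x, h in zip(low, point, high))


def mbr_extend(low: Point, high: Point, point: Point):
    """Return the smallest MBR that covers the old MBR and the point."""
    new_low = tuple(min(l, x) for l, x in zip(low, point))
    new_high = tuple(max(h, x) for h, x in zip(high, point))
    return new_low, new_high


# Illustrative data only: four 3-dimensional points.
points = [
    (0.0, 0.0, 0.0),
    (1.0, 0.0, 5.0),
    (2.0, 10.0, 6.0),
    (3.0, 10.0, 7.0),
]
dims = top_k_gap_dims(points, k=2)
projected = [project(p, dims) for p in points]
```

With this toy data, dimension 0 is evenly spread and is dropped, while the two dimensions that contain a large empty interval are kept. Under the assumed gap definition, projecting onto a subset of the original coordinates can never increase the Euclidean distance between two points, so a range filter in the projected space does not miss true results.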

Description

Technical field

[0001] The invention belongs to the field of databases and relates to a data indexing method, in particular to a high-dimensional data indexing method based on maximum gap space mapping.

Background

[0002] With the continuous growth of multimedia data sources in various application domains, fast processing of content-based similarity search in large databases is becoming more and more important. To speed up similarity search in high-dimensional space, a common approach is to design a high-dimensional index that supports this type of query. High-dimensional indexing methods can be divided into two categories: index structures based on vector spaces and index structures based on metric spaces. The R-tree and its variants are representatives of the former; they manage data based on relative positions in vector space. Other types of index structures, including VP-tree, MVP-tree, M-tree, MB+-tree, Slim-tree, and M+-tree are indexing tec...


Application Information

Patent Type & Authority: Applications (China)
IPC: IPC(8): G06F17/30
Inventors: 王国仁, 王波涛, 王斌, 赵相国, 乔百友, 韩东红, 于亚新, 赵宇海, 信俊昌, 张恩德
Owner: NORTHEASTERN UNIV