Method for two-dimensional predicate selection rate estimation using wavelet-based compressed histograms

A selectivity and histogram technology, applied in the field of estimating the distribution of stored data. It addresses problems such as optimization errors and corrected estimates that deviate from actual results, and achieves reduced data loss, low storage and construction costs, and accurate estimation.

Active Publication Date: 2008-01-16
北京神舟航天软件技术股份有限公司

AI Technical Summary

Problems solved by technology

One existing method takes the multidimensional selection rate obtained from one-dimensional histograms under the independence assumption and corrects it using the number of distinct values of the multidimensional data, in order to obtain a more accurate multidimensional selection rate. The correction always amplifies the independence-based estimate by some factor. As a result, the method sometimes gives better results, but at other times the corrected estimate deviates even further from the actual result and causes more serious optimization errors.
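The weakness of the independence assumption can be illustrated with a minimal Python sketch. The function names and the toy data are assumptions made for illustration only; the sketch shows the uncorrected independence-based estimate, not the distinct-value correction described above.

```python
from collections import Counter

def independent_estimate(rows, a, b):
    """Estimate P(x == a and y == b) as P(x == a) * P(y == b)."""
    n = len(rows)
    px = Counter(x for x, _ in rows)[a] / n
    py = Counter(y for _, y in rows)[b] / n
    return px * py

def true_selectivity(rows, a, b):
    """Exact fraction of rows that satisfy both predicates."""
    return sum(1 for x, y in rows if x == a and y == b) / len(rows)

# Hypothetical, perfectly correlated data: y always equals x.
rows = [(i % 4, i % 4) for i in range(100)]

est = independent_estimate(rows, 1, 1)  # 0.25 * 0.25 = 0.0625
act = true_selectivity(rows, 1, 1)      # 0.25
```

On this data the independence-based estimate is four times too small, which is the kind of error that stored joint-distribution statistics are meant to avoid.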

Examples

Embodiment Construction

[0036] As shown in Fig. 1, the present invention is divided into two stages. The first stage collects statistics on the data in the database and stores them as statistical information for later query optimization. The second stage estimates the selection rate during user queries.

[0037] The specific steps of the first stage are as follows:

[0038] Step 1: Data sampling

[0039] Sampling obtains a subset of the whole such that the sample describes the characteristics of the whole. The relation for which two-dimensional statistical information is to be created is randomly sampled, and the values of the attributes involved in the two-dimensional statistical information are read from the sampled rows, forming the two-dimensional data set on which the statistical information is built.
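Step 1 could be sketched as follows. This is a minimal illustration that assumes the relation fits in memory as a list of dictionaries; `sample_two_columns`, the column names, and the toy table are hypothetical and not taken from the patent.

```python
import random

def sample_two_columns(rows, col_a, col_b, sample_size, seed=None):
    """Randomly sample rows and keep only the two attributes of interest."""
    rng = random.Random(seed)
    k = min(sample_size, len(rows))
    picked = rng.sample(rows, k)  # simple random sampling without replacement
    return [(r[col_a], r[col_b]) for r in picked]

# Hypothetical relation with three attributes.
table = [{"age": 20 + i % 50, "dept": i % 7, "name": f"u{i}"} for i in range(1000)]

# Two-dimensional data set on which the statistics would be built.
pairs = sample_two_columns(table, "age", "dept", 100, seed=42)
```

A real database would more likely sample pages or use reservoir sampling during a scan; uniform sampling of an in-memory list is used here only to keep the sketch short.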

[0040] Step 2: Extract the most frequent value MCV (Most Common Value)

[0041] First, fix a dimension order for the two-dimensional attributes of the statistica...
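Because the description of Step 2 is cut off above, the following is only a generic sketch of most-common-value extraction, not the patent's exact procedure. It assumes that an MCV is a frequently occurring (x, y) value pair and that the `k` most frequent pairs are stored separately before the remaining data is summarized; `extract_mcvs` and `k` are hypothetical names.

```python
from collections import Counter

def extract_mcvs(pairs, k):
    """Split a 2-D sample into its k most common value pairs and the rest.

    Returns (mcvs, rest): mcvs maps each frequent pair to its fraction of
    the sample, rest holds the remaining pairs.
    """
    counts = Counter(pairs)
    total = len(pairs)
    mcvs = {pair: c / total for pair, c in counts.most_common(k)}
    rest = [p for p in pairs if p not in mcvs]
    return mcvs, rest

# Hypothetical sample: two dominant pairs plus a few rare ones.
pairs = [(1, 1)] * 50 + [(2, 3)] * 30 + [(4, 5), (6, 7)] * 10
mcvs, rest = extract_mcvs(pairs, 2)
```

Storing the dominant pairs exactly keeps large spikes out of the data that is later compressed, which is consistent with the abstract's statement that extracting the most frequent values reduces the loss caused by wavelet compression.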

Abstract

The invention relates to a method for estimating the two-dimensional predicate selection rate using wavelet-based compressed histograms. The method is divided into two stages, data statistics in the database and selection rate estimation. The first stage includes the following steps: 1) data sampling; 2) extracting the most frequent values; 3) building the data distribution matrix; 4) wavelet decomposition; and 5) filtering and storage. The second stage includes the following steps: 6) rebuilding the data distribution matrix; and 7) selection rate estimation. The invention uses wavelet techniques to apply lossy compression to the original data distribution matrix, which makes it practical to store the joint distribution of two-dimensional data. At query time, the compressed data distribution matrix is restored and used to estimate the two-dimensional selection rate. Furthermore, the invention extracts the most frequent values and stores them separately before the wavelet decomposition, which greatly reduces the data loss caused by wavelet compression. The invention is a time-space trade-off: without increasing time consumption, it stores the joint distribution of two-dimensional data in less storage space and thus provides accurate selection rate estimates for two-dimensional queries.
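Steps 3 to 7 can be sketched in plain Python as follows. This is an illustration under stated assumptions, not the patented implementation: it assumes a square matrix whose side is a power of two, attribute values that are already bucket indices, an unnormalized Haar wavelet, and filtering that keeps the `keep` largest-magnitude coefficients. All function names are hypothetical.

```python
def haar_1d(v):
    """Full 1-D Haar decomposition using pairwise averages and differences."""
    out, n = list(v), len(v)
    while n > 1:
        half = n // 2
        avg = [(out[2 * i] + out[2 * i + 1]) / 2 for i in range(half)]
        diff = [(out[2 * i] - out[2 * i + 1]) / 2 for i in range(half)]
        out[:n] = avg + diff
        n = half
    return out

def inv_haar_1d(c):
    """Inverse of haar_1d."""
    out, n = list(c), 1
    while n < len(out):
        merged = []
        for a, d in zip(out[:n], out[n:2 * n]):
            merged += [a + d, a - d]
        out[:2 * n] = merged
        n *= 2
    return out

def transform_2d(m, f):
    """Apply the 1-D transform f to every row, then to every column."""
    rows = [f(r) for r in m]
    cols = [f(list(c)) for c in zip(*rows)]
    return [list(r) for r in zip(*cols)]

def build_matrix(pairs, size):
    """Step 3: cell (x, y) holds the fraction of sampled rows with that pair."""
    m = [[0.0] * size for _ in range(size)]
    for x, y in pairs:
        m[x][y] += 1 / len(pairs)
    return m

def compress(matrix, keep):
    """Steps 4-5: decompose, then store only the largest coefficients."""
    coeffs = transform_2d(matrix, haar_1d)
    flat = [((i, j), c) for i, row in enumerate(coeffs)
            for j, c in enumerate(row) if c != 0]
    flat.sort(key=lambda t: abs(t[1]), reverse=True)
    return len(matrix), dict(flat[:keep])

def reconstruct(size, sparse):
    """Step 6: rebuild an approximate distribution matrix."""
    coeffs = [[sparse.get((i, j), 0.0) for j in range(size)] for i in range(size)]
    return transform_2d(coeffs, inv_haar_1d)

def estimate_selectivity(matrix, x_range, y_range):
    """Step 7: sum the cells covered by inclusive ranges on x and y."""
    (x0, x1), (y0, y1) = x_range, y_range
    return sum(matrix[i][j] for i in range(x0, x1 + 1) for j in range(y0, y1 + 1))

# Hypothetical correlated sample on a 4 x 4 grid.
pairs = [(i % 4, i % 4) for i in range(100)]
matrix = build_matrix(pairs, 4)
size, sparse = compress(matrix, keep=16)
recon = reconstruct(size, sparse)
sel = estimate_selectivity(recon, (1, 1), (1, 1))
```

Keeping every coefficient reproduces the matrix up to rounding, while a smaller `keep` trades accuracy for storage. An orthonormal Haar basis would make magnitude-based filtering optimal in the least-squares sense; the unnormalized form is used here only for brevity.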

Description

Technical field

[0001] The present invention relates to a technique for estimating the distribution of stored data, and in particular to a method for implementing a two-dimensional predicate selection rate estimation using a wavelet-based compressed histogram.

Background technique

[0002] Many functions of the database require accurate estimation of the predicate selection rate, especially the query optimizer, which needs to use the selection rate of the predicate to estimate the cost, so as to select the plan with the lowest cost.

[0003] Starting from the earliest relational database management system (RDBMS), query optimization has been a problem that plagued databases. The selection rate is usually used to estimate the number of result rows that meet the query conditions, and the predicate selection rate can usually be obtained from the histogram of statistical information. The statistical information of the database records information such as the number of rows, sizes, an...

Application Information

Patent Type & Authority: Applications (China)
IPC: IPC(8): G06F17/30
Inventor: 李阳
Owner: 北京神舟航天软件技术股份有限公司