Column-storage-oriented hash join method with in-bucket indexes

A join method for column storage, applied in column-store database management systems based on binary tables. It addresses the memory waste of hash joins and the difficulty of choosing the number of buckets and the hash function, shortening lookup and matching time and improving join efficiency.

Inactive Publication Date: 2014-04-02
DONGHUA UNIV

AI Technical Summary

Problems solved by technology

Hash joins have two main drawbacks: it is difficult to choose an appropriate number of buckets, and it is difficult to choose an appropriate hash function.
Because analytical applications on column stores handle massive data volumes, both choices become harder and the problems more pronounced.
First, if there are too few buckets, each bucket may hold too many entries, which lowers join efficiency; if there are too many buckets, memory is wasted and management overhead grows.
Second, because the data handled by each hash join has different characteristics, it is hard to find a general-purpose hash function that distributes every data set evenly.
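
The two problems above can be illustrated with a small sketch. It is not taken from the patent: the modulo hash, the key set, and the helper name `bucket_sizes` are all assumptions chosen only to make the effect visible.

```python
from collections import Counter

def bucket_sizes(keys, num_buckets):
    """Count how many keys land in each bucket under a modulo hash (assumed hash)."""
    counts = Counter(key % num_buckets for key in keys)
    return [counts.get(b, 0) for b in range(num_buckets)]

# Hypothetical data: 100 keys, all multiples of 10.
keys = list(range(0, 1000, 10))

few = bucket_sizes(keys, 4)      # too few buckets: long chains per bucket
many = bucket_sizes(keys, 1000)  # too many buckets: most slots stay empty
skewed = bucket_sizes(keys, 10)  # hash ill-suited to the data: one hot bucket

print(max(few), many.count(0), max(skewed))  # prints "50 900 100"
```

With 4 buckets the longest chain holds 50 entries, with 1000 buckets 900 slots are never used, and with 10 buckets every key collides in a single bucket because the modulo hash matches the pattern in the data.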

Detailed Description of Embodiments

[0018] In order to make the present invention more comprehensible, a preferred embodiment is described in detail as follows.

[0019] The present invention provides a column-storage-oriented hash join method with in-bucket indexes, the steps of which are as follows:

[0020] Step 1. Initialization: analyze the hash join information of the two tables, determine the hash object table S, determine the hash key, initialize the hash table HT, set the number of buckets to B, and set the hash function to f(x);

[0021] Step 2. Create the bucket nodes first. Then, for each data item Si in the small hash object table S, compute its hash value with the hash function f(x) and insert Si, in order, at the appropriate position in the corresponding bucket node. If the data is not ordered by the hash key, the data in the bucket is stored in a linked list; if the data is in order according to the hash ...
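
Steps 1 and 2 might look roughly like the following sketch for the unordered case. It is an illustration under stated assumptions, not the patent's implementation: the function names, the identity hash `f`, and the sample table are hypothetical, and a Python list stands in for the linked list. The ordered case is cut off in the source and is therefore left out.

```python
def init_hash_table(num_buckets):
    """Step 1: create B empty bucket nodes for the hash table HT."""
    return [[] for _ in range(num_buckets)]

def fill_buckets(table_s, num_buckets, f):
    """Step 2 (unordered case): append each <row number, column value> pair
    of S to the bucket selected by f(value)."""
    ht = init_hash_table(num_buckets)
    for row_no, value in table_s:
        ht[f(value) % num_buckets].append((row_no, value))
    return ht

def f(x):
    """Placeholder hash function (assumption): identity on integer keys."""
    return x

B = 4
# Hypothetical hash object table S as a binary table <row number, column value>.
S = [(0, 17), (1, 4), (2, 9), (3, 12), (4, 5)]
HT = fill_buckets(S, B, f)
```

Here `HT[0]` holds the pairs with values 4 and 12, `HT[1]` holds those with values 17, 9, and 5, and the remaining two buckets stay empty.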

Abstract

The invention relates to a column-storage-oriented hash join method with in-bucket indexes. The method comprises: a first step, initialization; a second step, filling each data item Si into the proper position of the corresponding bucket node according to its value; a third step, judging whether the number of elements in a bucket is larger than a tolerance value T, turning to the fourth step to build an in-bucket index if it is, and otherwise scattering the elements into the bucket according to the ordinary hash algorithm and turning to the fifth step; a fourth step, building the in-bucket indexes; a fifth step, building an array of the bucket indexes; and a sixth step, performing the matching join. By building indexes inside the buckets, the method overcomes the shortcomings of the traditional hash join, shortens lookup and matching time, and improves hash join efficiency.
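
The six steps can be approximated by the sketch below. The abstract does not say what structure the in-bucket index uses, so a sorted array searched with binary search is swapped in as a stand-in; the per-bucket flag list, the function names, the identity hash, and the sample tables are likewise assumptions.

```python
from bisect import bisect_left

def identity(x):
    """Placeholder hash function (assumption)."""
    return x

def build(table_s, num_buckets, tolerance, f=identity):
    """Steps 1-5: fill the buckets, then index those larger than the tolerance T."""
    buckets = [[] for _ in range(num_buckets)]
    for row_no, value in table_s:
        buckets[f(value) % num_buckets].append((value, row_no))
    # Stand-in for the array of step 5: one flag per bucket, True if indexed.
    indexed = [False] * num_buckets
    for b, entries in enumerate(buckets):
        if len(entries) > tolerance:   # step 3: compare bucket size with T
            entries.sort()             # step 4 stand-in: sorted array as index
            indexed[b] = True
    return buckets, indexed

def probe(buckets, indexed, table_r, f=identity):
    """Step 6: match each row of R against the bucket its value hashes to."""
    matches = []
    for r_row, value in table_r:
        b = f(value) % len(buckets)
        entries = buckets[b]
        if indexed[b]:
            # Binary search for the first entry whose value is >= the probe value.
            i = bisect_left(entries, (value,))
            while i < len(entries) and entries[i][0] == value:
                matches.append((r_row, entries[i][1]))
                i += 1
        else:
            # Small bucket: plain linear scan, as in an ordinary hash join.
            matches.extend((r_row, s_row) for v, s_row in entries if v == value)
    return matches

# Hypothetical binary tables <row number, column value>.
S = [(0, 8), (1, 3), (2, 8), (3, 16), (4, 5)]
R = [(0, 8), (1, 5), (2, 7), (3, 16)]
buckets, indexed = build(S, num_buckets=4, tolerance=2)
result = probe(buckets, indexed, R)
```

In this run only bucket 0 exceeds T = 2 and gets the index, and `result` is the list of matching (R row, S row) pairs `[(0, 0), (0, 2), (1, 4), (3, 3)]`. The point of the design is that oversized buckets no longer need a full scan on every probe, so a poor choice of B or f(x) costs less.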

Description

Technical field

[0001] The invention relates to a column-storage-oriented hash join method with in-bucket indexes, suitable for column-store database management systems based on binary tables of the form <row number, column value>.

Background

[0002] Joins account for a large share of data query work, especially in column-store data systems. Besides the joins that user queries require, additional joins across a table's columns are often needed to recombine the data in each column into rows. The classic join algorithms in databases are the nested loop join, the merge join, and the hash join. Of these, the hash join is an efficient algorithm whose performance is better than that of the traditional nested loop join and merge join in most cases.

[0003] However, the hash join also has some drawbacks, mainly in two respects. One is that it is difficult to choose an appropriate number of...
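
The column recombination mentioned in [0002] can be pictured with a short sketch that joins binary tables on row number. The helper `reconstruct_rows` and the sample columns are hypothetical, and a Python dictionary is used as a simple stand-in for the join's hash table.

```python
def reconstruct_rows(*columns):
    """Inner-join binary tables <row number, column value> on row number."""
    first, *rest = columns
    rows = {row_no: [value] for row_no, value in first}
    for column in rest:
        lookup = dict(column)  # hash table keyed by row number
        for row_no in list(rows):
            if row_no in lookup:
                rows[row_no].append(lookup[row_no])
            else:
                del rows[row_no]  # row number missing in this column
    return [(row_no, *values) for row_no, values in sorted(rows.items())]

# Hypothetical columns of one table, stored separately.
name = [(0, "ann"), (1, "bob"), (2, "cy")]
age = [(2, 41), (0, 30), (1, 25)]
rows = reconstruct_rows(name, age)
```

Each extra column costs one more join, which is why join performance matters so much in a column store.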

Application Information

Patent Type & Authority: Patents (China)
IPC(8): G06F17/30
Inventor: 王梅, 乐嘉锦, 夏小玲, 郝大腾
Owner: DONGHUA UNIV