Orthogonal multilateral Hash mapping indexing method for improving massive data inquiring performance

Massive-data hash mapping technology, applied to complex queries. It addresses problems such as dependence, system resource consumption, and less data, and improves performance and query efficiency.

Inactive Publication Date: 2016-05-25
GUANGXI NORMAL UNIV

AI Technical Summary

Problems solved by technology

However, the hash index also has a weakness: the number of buckets in the hash map and the amount of data in each bucket depend on the choice of hash function.
In the big data scenario, there is a clear skew phenomenon in the distrib

Examples

Embodiment

[0020] Referring to Figures 1 and 2, an orthogonal multi-hash mapping index method for improving the query performance of massive data comprises the following steps:

[0021] 1) Apply the first-layer hash mapping to the query attributes of the massive data. The method applies to many kinds of data sources, such as relational databases, data files in a file system, and massive data stored as key-value pairs. After mapping by the first-layer hash function, every data record is assigned to a specific hash bucket;

[0022] 2) Build a B+ tree over the value space of the first-layer hash (Figure 2 shows a B+ tree of order m). The original linear search over hash values, with time complexity O(n), becomes a tree search with time complexity O(log n), which optimizes the lookup of hash-mapped values;

[0023] 3) Hash-map the first-layer hash buckets again through the second-layer hash function, that is, partition the first-layer hash buckets again to reduce the data capacity...
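The three steps above can be illustrated with a minimal Python sketch. It is not the patent's implementation: the class `TwoLayerHashIndex`, the helper `stable_hash`, the bucket counts, and the use of SHA-256 are all assumptions made for illustration. A sorted list searched with `bisect` stands in for the B+ tree of step 2; it gives the same O(log n) lookup over first-layer hash values but is not a B+ tree.

```python
import bisect
import hashlib
from collections import defaultdict


def stable_hash(value, salt, modulus):
    """Deterministically hash `value` into range(modulus); `salt` picks the layer."""
    digest = hashlib.sha256(f"{salt}:{value}".encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % modulus


class TwoLayerHashIndex:
    """Hypothetical two-layer index over one query attribute."""

    def __init__(self, first_buckets=64, second_buckets=8):
        self.first_buckets = first_buckets
        self.second_buckets = second_buckets
        # first-layer hash value -> second-layer hash value -> index entries
        self.buckets = defaultdict(lambda: defaultdict(list))
        # Stand-in for the B+ tree: occupied first-layer hash values, kept sorted.
        self.sorted_hashes = []

    def insert(self, attribute_value, record_id):
        # Step 1: the first-layer hash assigns the record to a bucket.
        h1 = stable_hash(attribute_value, "layer1", self.first_buckets)
        # Step 3: the second-layer hash partitions that bucket again.
        h2 = stable_hash(attribute_value, "layer2", self.second_buckets)
        if h1 not in self.buckets:
            bisect.insort(self.sorted_hashes, h1)
        # Only a record id is stored, so the underlying data is not moved.
        self.buckets[h1][h2].append((attribute_value, record_id))

    def lookup(self, attribute_value):
        h1 = stable_hash(attribute_value, "layer1", self.first_buckets)
        # Step 2: binary search over the ordered first-layer hash values.
        pos = bisect.bisect_left(self.sorted_hashes, h1)
        if pos == len(self.sorted_hashes) or self.sorted_hashes[pos] != h1:
            return []
        h2 = stable_hash(attribute_value, "layer2", self.second_buckets)
        sub_bucket = self.buckets[h1].get(h2, [])
        return [rid for value, rid in sub_bucket if value == attribute_value]


index = TwoLayerHashIndex(first_buckets=4, second_buckets=2)
for record_id, city in enumerate(["Guilin", "Nanning", "Guilin", "Liuzhou"]):
    index.insert(city, record_id)
print(index.lookup("Guilin"))  # [0, 2]
```

Because the index stores record ids rather than records, it can sit on top of a relational table, a file, or a key-value store without changing how the data is stored.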

Abstract

The invention discloses an orthogonal multilateral hash mapping index method for improving the query performance of massive data. The method comprises the following steps: first, hash mapping is applied to the query attributes of the data records; second, a B+ tree is built on the first-layer hash value space to optimize the linear search over hash mapping values; third, the first-layer hash buckets are hash-mapped again through a second-layer hash function to reduce the amount of data in each bucket; fourth, pointer connections are established between hash buckets in adjacent layers, partitioning stops when the number of hash buckets exceeds a specified early-warning threshold, and the orthogonal multilateral hash mapping index is complete. The method is compatible with various data formats and does not change the physical storage of the data. It provides orthogonal multilateral hash optimization strategies for the severe imbalance and overflow of hash buckets found in orthogonal hash indexing, thereby improving complex query performance on big data and data query efficiency.
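The fourth step, linking adjacent layers by pointers and stopping at a threshold, can be sketched as follows. This is one possible reading, not the patent's implementation. The names `Bucket`, `MultiLayerHashIndex`, `fanout`, and `capacity` are hypothetical. The `max_buckets` parameter is an assumed interpretation of the early-warning threshold: splitting stops once another split would push the bucket count past it. The per-bucket `capacity` that triggers a split is also an assumption, since the abstract does not state when a bucket is split.

```python
import hashlib


def layer_hash(value, layer, fanout):
    """Deterministic per-layer hash of `value` into range(fanout)."""
    digest = hashlib.sha256(f"{layer}:{value}".encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % fanout


class Bucket:
    def __init__(self, layer):
        self.layer = layer
        self.entries = []     # (attribute value, record id) pairs
        self.children = None  # pointers to next-layer buckets once split


class MultiLayerHashIndex:
    """Hypothetical multi-layer index that splits oversized buckets."""

    def __init__(self, entries, fanout=4, capacity=2, max_buckets=64):
        self.fanout = fanout
        self.root = Bucket(layer=0)
        self.root.entries = list(entries)
        self.bucket_count = 1
        pending = [self.root]
        while pending:
            bucket = pending.pop()
            distinct = {value for value, _ in bucket.entries}
            # Small buckets, or buckets holding one repeated value, are left alone:
            # identical values hash identically at every layer.
            if len(bucket.entries) <= capacity or len(distinct) < 2:
                continue
            # Assumed threshold rule: stop before exceeding max_buckets.
            if self.bucket_count + fanout > max_buckets:
                break
            # Split: create next-layer buckets and link them from the parent.
            bucket.children = [Bucket(bucket.layer + 1) for _ in range(fanout)]
            for value, rid in bucket.entries:
                slot = layer_hash(value, bucket.layer + 1, fanout)
                bucket.children[slot].entries.append((value, rid))
            bucket.entries = []
            self.bucket_count += fanout
            pending.extend(bucket.children)

    def lookup(self, value):
        # Follow the inter-layer pointers down to an unsplit bucket.
        bucket = self.root
        while bucket.children is not None:
            bucket = bucket.children[layer_hash(value, bucket.layer + 1, self.fanout)]
        return [rid for v, rid in bucket.entries if v == value]


index = MultiLayerHashIndex([(f"key{i}", i) for i in range(50)])
print(index.lookup("key7"))  # [7]
```

A lookup only follows pointers from a split bucket to one child per layer, so its cost depends on the depth of the split chain rather than on the size of the original skewed bucket.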

Description

Technical field

[0001] The invention relates to complex query technologies on big data, such as non-primary-key queries, multi-condition queries, fuzzy queries, and multi-keyword joint queries, and specifically to an orthogonal multi-hash mapping index method that improves the query performance of massive data.

Background art

[0002] Querying big data is challenging. Since Google proposed the troika of big data processing in 2006 (the GFS file system, the MapReduce parallel computing framework, and BigTable data storage), the storage and processing of data has gradually transitioned from traditional relational databases based on the relational model to unstructured data storage systems based on the key-value data model. Key-value data stores built on a distributed file system provide scalability and fault tolerance unmatched by relational databases, through the distributed file system and multiple data backups. In theory, a data cluster center built on a general...

Application Information

IPC(8): G06F17/30
CPC: G06F16/2255, G06F16/2246
Inventor: 葛微, 李先贤, 王利娥
Owner: GUANGXI NORMAL UNIV