Data storage and query method based on matrix hash

A matrix-hash data storage technology, applied in the field of in-memory databases, that addresses the high collision rate, slow queries under collisions, and low memory efficiency of existing hash tables, achieving a low collision rate while remaining easy to implement.

Active Publication Date: 2018-07-17
PEKING UNIV

AI Technical Summary

Problems solved by technology

[0007] To reduce hash table collisions and query time, and to overcome the high collision rate, low memory efficiency, and low loading rate of existing hash tables, the present invention provides "matrix hash", a new hash table design that combines multi-subtable hashing, Bloom filters, and bitmaps.



Examples


Embodiment Construction

[0025] The present invention will be further described below through specific embodiments and accompanying drawings.

[0026] 1. Data structure

[0027] The data structure of the "matrix hash" of the present invention comprehensively uses multi-level sub-tables, Bloom filters and bitmaps. The data structure consists of two parts: a hash table data structure and an auxiliary data structure.

[0028] 1. Hash table data structure

[0029] The size of each sub-table, that is, the maximum number of elements it can store, decreases in arithmetic progression, so the sizes of the Bloom filters corresponding to the sub-tables also decrease in arithmetic progression. A relatively simple balancing strategy is used when inserting elements: each new key-value pair is inserted into the sub-table with the smallest loading rate, so that the number of elements in each sub-table also decreases approximately in arithmetic progression.
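The balancing strategy in [0029] can be illustrated with a minimal Python sketch. It models only sub-table capacities and the "smallest loading rate" rule; the names `make_capacities` and `SubTables` are hypothetical, each sub-table is represented by a plain `dict`, and the per-sub-table hash functions, Bloom filters, and bitmaps of the actual design are not modelled here.

```python
def make_capacities(z, largest, step):
    """Capacities of z sub-tables, decreasing by a constant step (assumed parameters)."""
    caps = [largest - i * step for i in range(z)]
    if caps[-1] <= 0:
        raise ValueError("step too large: the smallest sub-table would be empty")
    return caps


class SubTables:
    """Toy model: each sub-table is a dict bounded by its capacity."""

    def __init__(self, capacities):
        self.capacities = list(capacities)
        self.tables = [dict() for _ in self.capacities]

    def load(self, i):
        """Loading rate of sub-table i: stored elements divided by capacity."""
        return len(self.tables[i]) / self.capacities[i]

    def insert(self, key, value):
        """Insert into the non-full sub-table with the smallest loading rate.

        Returns the index of the sub-table that now holds the key.
        """
        # If the key is already stored, update it in place.
        for i, table in enumerate(self.tables):
            if key in table:
                table[key] = value
                return i
        candidates = [
            i for i in range(len(self.tables))
            if len(self.tables[i]) < self.capacities[i]
        ]
        if not candidates:
            raise OverflowError("all sub-tables are full")
        target = min(candidates, key=self.load)  # ties go to the larger sub-table
        self.tables[target][key] = value
        return target

    def counts(self):
        return [len(t) for t in self.tables]
```

With capacities 8, 6, 4, 2 (that is, `make_capacities(4, 8, 2)`), ten insertions of distinct keys leave 4, 3, 2, and 1 elements in the four sub-tables, so the element counts follow the same decreasing pattern as the capacities.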

[0030] Suppose there are a total of z sub-tables, and z is ...


Abstract

The invention relates to a data storage and query method based on matrix hash. The method comprises the following steps: 1) a hash table data structure is established, comprising z sub-tables, where z is an even number and the sizes of the sub-tables decrease in arithmetic progression; for 1 <= i <= z/2, the i-th sub-table is combined with the (z-i+1)-th sub-table to obtain combined sub-tables of the same size; 2) an auxiliary data structure is established, comprising z Bloom filters corresponding to the z sub-tables, where the sizes of the Bloom filters decrease in arithmetic progression; for 1 <= i <= z/2, the i-th Bloom filter is combined with the (z-i+1)-th Bloom filter to obtain Bloom filters of the same size, and these Bloom filters are added together to form a multi-bit Bloom filter; 3) a key-value pair is inserted by using the hash table data structure and the auxiliary data structure to achieve data storage. The method enables fast updates and fast queries.
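The pairing step in the abstract works because the sizes form an arithmetic sequence: if the i-th size is a - (i-1)d, then the i-th and (z-i+1)-th sizes always sum to 2a - (z-1)d, independent of i. The short Python sketch below checks this on example sizes; the function name `pair_sub_tables` and the sizes used are illustrative assumptions, not values from the patent.

```python
def pair_sub_tables(sizes):
    """Pair the i-th and (z-i+1)-th sub-tables (1-based) and return combined sizes.

    `sizes` is assumed to be an arithmetic, decreasing sequence of even length z.
    Returns a list of (i, z-i+1, combined_size) tuples for 1 <= i <= z/2.
    """
    z = len(sizes)
    if z % 2 != 0:
        raise ValueError("z must be even")
    pairs = []
    for i in range(1, z // 2 + 1):
        j = z - i + 1
        pairs.append((i, j, sizes[i - 1] + sizes[j - 1]))
    return pairs


if __name__ == "__main__":
    # Example sizes (hypothetical): first term 8, common difference 2, z = 4.
    for i, j, combined in pair_sub_tables([8, 6, 4, 2]):
        print(f"sub-table {i} + sub-table {j}: combined size {combined}")
```

For the sizes 8, 6, 4, 2, both pairs (1, 4) and (2, 3) have combined size 10, which is why the combined sub-tables can be laid out as rows of equal length.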

Description

Technical field

[0001] The invention belongs to the technical field of in-memory databases, and in particular relates to a method for organizing, indexing, and storing data based on a matrix hash algorithm.

Background

[0002] Compared with disk databases, in-memory databases have higher flexibility and ease of use. In terms of paradigm, in-memory databases can be divided into relational in-memory databases and key-value in-memory databases. The key-value in-memory database (key-value store) is flexible, simple, memory-efficient, and fast to query. These advantages over relational in-memory databases have led to its wide use at major Internet companies such as Amazon, Facebook, YouTube, Baidu, Sina, and Sohu. The data in a key-value storage system exists in the form of key-value pairs and is stored in a hash table. Therefore, the hash algorithm, as the core technology of the key-value storage system, is a key factor that directl...


Application Information

Patent Type & Authority: Applications (China)
IPC(8): G06F17/30
CPC: G06F16/2255, G06F16/2282
Inventor: 杨仝, 张梦瑜, 李晓明
Owner: PEKING UNIV