
Perfect hash function construction method and system for dictionaries of any scale

A construction method and dictionary technology, applied in the field of perfect hash function construction. It addresses problems such as overly long association tables, the inability to construct hash functions for strings, and low fill factors, with the effects of reducing space complexity, increasing space utilization, and keeping memory usage small.

Inactive Publication Date: 2010-08-18
北京金远见电脑技术有限公司

AI Technical Summary

Problems solved by technology

An earlier method relies on a prime number generator that can produce only about 40 primes and cannot construct hash functions for strings, so it is very limited and of essentially only mathematical interest.
The Mincycle algorithm proposed by Sager can in principle handle large dictionaries, but in some cases it is not effective for small ones.
gperf, the well-known perfect hash function generator on Linux, cannot guarantee that it finds a perfect hash function for a dictionary on every run, and its fill factor becomes very low when the dictionary has too many entries (more than 10,000).
In the related art, constructing a perfect hash function depends mainly on constructing a smoothing function for the dictionary, but the coefficients of that smoothing function are hard to choose, and when there are too many entries the association table becomes too long and the fill factor very low.
Constructing a perfect hash function for a dictionary is therefore a very difficult task.
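
To make the terms above concrete: a hash function is perfect for a dictionary when it maps every entry to a distinct table position, and the fill factor is the fraction of table positions that end up occupied. The Python sketch below checks both properties for a candidate function. The helper names `is_perfect` and `fill_factor` and the toy hash are illustrative assumptions and are not taken from the patent.

```python
from typing import Callable, Iterable


def is_perfect(words: Iterable[str], h: Callable[[str], int], table_size: int) -> bool:
    """Return True if h maps every word to a distinct slot in [0, table_size)."""
    seen = set()
    for w in words:
        slot = h(w)
        if not 0 <= slot < table_size or slot in seen:
            return False  # out of range, or collides with an earlier word
        seen.add(slot)
    return True


def fill_factor(num_words: int, table_size: int) -> float:
    """Fraction of table slots occupied when num_words entries are placed without collision."""
    return num_words / table_size


# Hypothetical toy dictionary and hash, chosen only for illustration.
words = ["if", "for", "while", "return"]


def toy_hash(w: str) -> int:
    return (ord(w[0]) + len(w)) % 8


print(is_perfect(words, toy_hash, 8))  # collision-free for these four words
print(fill_factor(len(words), 8))      # half of the eight slots are used
```

A low fill factor, as in this toy case, means the table wastes space; the problems listed above are about keeping the mapping collision-free without letting the table grow far beyond the dictionary size.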



Embodiment Construction

[0029] Exemplary embodiments of the present invention will be specifically described below with reference to the accompanying drawings.

[0030] To make the present invention easier to understand, the following definitions and illustrations are provided:

[0031] Definition 1: In the present invention, a position in the hash table is represented visually as a rectangle, which has two states, idle (free) and occupied:

[0032] Idle: indicates that this position has not yet been mapped to a word in the dictionary;

[0033] Occupied: indicates that this position has a mapping relationship with a word in the dictionary.
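
Definition 1 can be sketched in Python as a table whose positions are either idle or occupied. This is a minimal illustration only; the names `SlotState`, `HashTable`, `state`, and `place` are hypothetical and do not appear in the patent.

```python
from enum import Enum
from typing import List, Optional


class SlotState(Enum):
    IDLE = "idle"          # not yet mapped to any word
    OCCUPIED = "occupied"  # mapped to some word in the dictionary


class HashTable:
    def __init__(self, size: int) -> None:
        # None marks an idle position; a string marks an occupied one.
        self.slots: List[Optional[str]] = [None] * size

    def state(self, pos: int) -> SlotState:
        return SlotState.IDLE if self.slots[pos] is None else SlotState.OCCUPIED

    def place(self, pos: int, word: str) -> bool:
        """Map word to pos; refuse if the position is already occupied."""
        if self.state(pos) is SlotState.OCCUPIED:
            return False
        self.slots[pos] = word
        return True


table = HashTable(4)
table.place(2, "cat")
print(table.state(2), table.state(3))
```

Refusing to place a word on an occupied position is what keeps the mapping collision-free, which is the property a perfect hash function must guarantee.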

[0034] Definition 2: Item: it can be represented visually as a block, as shown in Figure 1. It is divided into free and occupied units, which correspond to the free and occupied positions in the hash table, that is, whether there is a mapping relationship with a certain...


Abstract

The invention provides a perfect hash function construction method for dictionaries of any scale, comprising the following steps: (1) constructing a TRIE tree from the dictionary or keyword set; (2) determining the correlation length k of the words; (3) starting from layer k of the TRIE tree and moving up to layer 0, calculating layer by layer, from bottom to top, the correlation value of each character of the dictionary on each layer; and (4) realizing a perfect mapping from the dictionary to a hash table according to the calculated correlation length of the words and the correlation value of each character on each layer.
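
The trie construction in step (1) and the bottom-up layer traversal order in step (3) can be sketched in Python as follows. This excerpt does not say how the correlation length or the correlation values are computed, so the sketch does not attempt them. The names `TrieNode`, `build_trie`, and `chars_by_layer` are hypothetical, and treating layer i as the characters at index i of the words is an assumption.

```python
from typing import Dict, Iterable, List


class TrieNode:
    def __init__(self) -> None:
        self.children: Dict[str, "TrieNode"] = {}
        self.is_word = False


def build_trie(words: Iterable[str]) -> TrieNode:
    """Step (1): insert every dictionary word into a trie, one character per edge."""
    root = TrieNode()
    for word in words:
        node = root
        for ch in word:
            node = node.children.setdefault(ch, TrieNode())
        node.is_word = True
    return root


def chars_by_layer(root: TrieNode, k: int) -> List[List[str]]:
    """Collect the distinct characters on layers 0..k (assumed: layer i = i-th character)."""
    layers: List[List[str]] = []
    frontier = [root]
    for _ in range(k + 1):
        chars = set()
        next_frontier = []
        for node in frontier:
            for ch, child in node.children.items():
                chars.add(ch)
                next_frontier.append(child)
        layers.append(sorted(chars))
        frontier = next_frontier
    return layers


root = build_trie(["cat", "car", "do"])
layers = chars_by_layer(root, 2)
# Step (3) visits layers from layer k up to layer 0; the per-character
# correlation values would be computed here, which this sketch omits.
for i in range(2, -1, -1):
    print(i, layers[i])
```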

Description

Technical field

[0001] The invention relates to the comparison and retrieval of words and keywords in a dictionary, and more particularly to a perfect hash function construction method and system for dictionaries of any scale.

Background

[0002] Dictionary indexing and lookup are among the most basic problems in natural language processing. They are widely used in areas such as recognizing keywords and predefined identifiers in compilers, keyword coloring in IDEs, spell checking in editors, locating keywords in a search engine Postlist, identifying stop words, and Chinese word segmentation.

[0003] The study of dictionary indexing and lookup has a long history. Commonly used indexing methods include binary (half-interval) search over a linear table, various search trees, and hashing. Because binary search over a linear table requires the table to be ordered, and the search efficiency is related to the length of the...


Application Information

IPC(8): G06F17/30
Inventor 王晓春, 王亚军
Owner 北京金远见电脑技术有限公司