Method and system for code compression and decoding for word library

A compression encoding technique for word libraries, applied in the field of word-library compression encoding and decoding. It addresses problems such as long average word code length, insufficient word compression rate, and a large amount of computation, and achieves a simple algorithm, an improved compression rate, and a small amount of computation.
CN101520771A · Inactive · Publication Date: 2009-09-02 · GUANGDONG GUOBI TECH

Patent Information

Authority / Receiving Office
CN · China
Current Assignee / Owner
GUANGDONG GUOBI TECH
Publication Date
2009-09-02
Estimated Expiration
Not applicable · inactive patent


Abstract

The invention discloses a method for compression encoding of a word library, comprising the following steps: A, the words in the word library are counted to generate a first frequency table, which comprises a first-letter frequency data group and a plurality of subsequent-letter frequency data groups; B, each group of frequency data in the first frequency table is sorted in order, and the frequency data occupying the same rank position across the groups are added together to obtain a second frequency table comprising a plurality of sum frequencies; C, the sum frequencies are Huffman coded to obtain corresponding binary codes, and each binary code is assigned to the rank position corresponding to its sum frequency in the second frequency table, generating a coding table; and D, the letters of each word in the word library are replaced by the binary codes that the coding table assigns to the rank positions of the word's first letter and of each of its subsequent letters, generating the binary code for that word. The invention also provides a system for compression encoding of the word library, and a method and a system for decoding the codes of the word library. The invention improves the compression rate of word codes in the word library and keeps decoding simple.
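Steps A to D of the abstract can be illustrated with a minimal Python sketch. It is an interpretation under stated assumptions, not the patented implementation: it assumes there is one subsequent-letter frequency group per preceding letter, that each group is sorted by descending frequency, and that each word is decoded from its own bit string (the abstract does not say how word boundaries are marked). All names, including the `START` context key, are hypothetical.

```python
import heapq
from collections import Counter, defaultdict
from itertools import count

START = "^"  # hypothetical context key standing for the first-letter group


def build_groups(words):
    """Step A: count letters per group (first letter, or letter after `prev`)."""
    groups = defaultdict(Counter)
    for word in words:
        prev = START
        for ch in word:
            groups[prev][ch] += 1
            prev = ch
    return groups


def rank_groups(groups):
    """Step B, part 1: order each group's letters by descending frequency."""
    return {
        ctx: [ch for ch, _ in sorted(c.items(), key=lambda kv: (-kv[1], kv[0]))]
        for ctx, c in groups.items()
    }


def sum_by_rank(groups, ranked):
    """Step B, part 2: add the frequencies found at the same rank in every group."""
    sums = Counter()
    for ctx, letters in ranked.items():
        for rank, ch in enumerate(letters):
            sums[rank] += groups[ctx][ch]
    return sums


def huffman_codes(freqs):
    """Step C: ordinary Huffman coding, applied to rank positions."""
    tie = count()  # unique tie-breaker so the dicts are never compared
    heap = [(f, next(tie), {sym: ""}) for sym, f in freqs.items()]
    heapq.heapify(heap)
    if len(heap) == 1:  # a single symbol still needs a one-bit code
        return {sym: "0" for sym in heap[0][2]}
    while len(heap) > 1:
        f1, _, c1 = heapq.heappop(heap)
        f2, _, c2 = heapq.heappop(heap)
        merged = {s: "0" + b for s, b in c1.items()}
        merged.update({s: "1" + b for s, b in c2.items()})
        heapq.heappush(heap, (f1 + f2, next(tie), merged))
    return heap[0][2]


def encode(word, ranked, codes):
    """Step D: replace each letter by the code of its rank within its group."""
    bits, prev = [], START
    for ch in word:
        bits.append(codes[ranked[prev].index(ch)])
        prev = ch
    return "".join(bits)


def decode(bits, ranked, codes):
    """Inverse of encode: read a prefix code, map rank back to a letter."""
    rank_of = {code: rank for rank, code in codes.items()}
    out, prev, buf = [], START, ""
    for b in bits:
        buf += b
        if buf in rank_of:
            ch = ranked[prev][rank_of[buf]]
            out.append(ch)
            prev, buf = ch, ""
    return "".join(out)
```

In this sketch the coding table is built once with `groups = build_groups(words)`, `ranked = rank_groups(groups)` and `codes = huffman_codes(sum_by_rank(groups, ranked))`. Because the most likely letter in every group shares rank 0, a single small Huffman table over ranks serves all groups, which is one plausible reading of why the abstract describes decoding as simple.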

Description

Technical Field

[0001] The invention relates to compression coding technology, in particular to a method and system for compression encoding and decoding of a word library.

Background Art

[0002] Most traditional compression codes for word libraries use Huffman coding. Huffman coding constructs a Huffman tree according to the number of occurrences of each letter in the words: the more often a letter occurs, the shorter the binary code assigned to it, so that the average code length of all the words in the word library is as short as possible. However, the compression rate that Huffman coding achieves for words is insufficient. According to statistics, the compression rate of Huffman coding is 48.84% for an English word library, 48.64% for a Russian word library, 51.68% for a Turkish word library, 56.50% for an Arabic word library, and 46.45% for a Portuguese word library. It can be seen that th...

Claims
