Method and system for code compression and decoding for word library
Patent Information
- Authority / Receiving Office
- CN · China
- Current Assignee / Owner
- GUANGDONG GUOBI TECH
- Publication Date
- 2009-09-02
- Estimated Expiration
- Not applicable · inactive patent
Abstract
Description
Technical field
[0001] The invention relates to compression coding technology, and in particular to a method and system for the compression coding and decoding of a word library (lexicon).
Background art
[0002] Most traditional compression coding schemes for lexicons use Huffman coding. Huffman coding constructs a Huffman tree according to the number of occurrences of each letter in the words: the more often a letter occurs, the shorter the binary code assigned to it, so that the average code length over all the words in the lexicon is as short as possible. However, the compression rate that Huffman coding achieves for words is not good enough. According to statistics, the compression rate with Huffman coding is 48.84% for the English lexicon, 48.64% for the Russian lexicon, 51.68% for the Turkish lexicon, 56.50% for the Arabic lexicon, and 46.45% for the Portuguese lexicon. It can be seen that th...
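For illustration only, the letter-level Huffman coding described in paragraph [0002] can be sketched as follows. This is a minimal sketch and not the method of the invention. The function names `huffman_code_lengths` and `compression_rate` are hypothetical. It also assumes that "compression rate" means compressed size divided by original size and that each uncompressed letter occupies 8 bits; the source states neither.

```python
import heapq
from collections import Counter
from itertools import count


def huffman_code_lengths(freqs):
    """Return {letter: code length in bits} for a letter-frequency table."""
    if len(freqs) == 1:
        # A single distinct letter still needs one bit per occurrence.
        return {sym: 1 for sym in freqs}
    tie = count()  # unique tie-breaker so the heap never compares dicts
    heap = [(f, next(tie), {sym: 0}) for sym, f in freqs.items()]
    heapq.heapify(heap)
    while len(heap) > 1:
        # Merge the two least frequent subtrees; every letter inside
        # them moves one level deeper, so its code grows by one bit.
        f1, _, d1 = heapq.heappop(heap)
        f2, _, d2 = heapq.heappop(heap)
        merged = {s: n + 1 for s, n in d1.items()}
        merged.update({s: n + 1 for s, n in d2.items()})
        heapq.heappush(heap, (f1 + f2, next(tie), merged))
    return heap[0][2]


def compression_rate(words, bits_per_letter=8):
    """Compressed size / original size (assumed definition, see lead-in)."""
    freqs = Counter("".join(words))
    lengths = huffman_code_lengths(freqs)
    compressed_bits = sum(freqs[s] * lengths[s] for s in freqs)
    original_bits = bits_per_letter * sum(freqs.values())
    return compressed_bits / original_bits


if __name__ == "__main__":
    # Toy word list, not data from the patent.
    print(compression_rate(["aaaa", "ab", "c"]))
```

In this sketch frequent letters end up with shorter codes, which is the property the paragraph describes. The size of the code table itself is ignored, so the rate for a real lexicon would differ.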