Generic character encoding and decoding method and system
An encoding algorithm and encoding technology applied in the field of encoding and decoding. It addresses the problems that existing encodings leave little room for customization and cannot accommodate custom mixed binary data, and it achieves strong versatility and good space efficiency.
Examples
Embodiment 1
[0023] As shown in Figure 1, an embodiment of the present invention provides a pan-text character encoding method, including:
[0024] 100. Decompose the character's code point into an area code, a language number, and a font size. The details are as follows:
[0025] 101. If the character to be encoded is ASCII, it is assigned to the single-byte area, and the code element is the ASCII value. For example, the binary form of "A" is 01000001, and it is output unchanged.
[0026] 102. If the character to be encoded is a common character, it is assigned to the double-byte area, and font sizes correspond one-to-one with code points. The font size is inserted into the free bits of the pattern 1xxxxxxx 0xxxxxxx to obtain the code element. For example, "的" is a commonly used Chinese character with font size 0, and its code is 10100000 00000000.
[0027] 103. If the character to be encoded is a rare character, it is assigned to the three-byte area, and the font size is the code point, which co...
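The byte framing used in steps 101 and 102 can be sketched in Python as follows. This is a minimal illustration, not the patent's implementation: the names `frame` and `bits` are hypothetical, and the sketch assumes the data bits of a code element are already known as a single integer. How the area code, language number, and font size are combined into those data bits is not shown in this excerpt and is not modeled here.

```python
def frame(payload: int, length: int) -> bytes:
    """Pack `payload` into `length` bytes carrying 7 data bits each.

    Every byte except the last gets a leading 1 bit; the last byte gets
    a leading 0 bit, matching the 1xxxxxxx ... 0xxxxxxx pattern.
    (Hypothetical helper; the payload layout is an assumption.)
    """
    if not 0 <= payload < 1 << (7 * length):
        raise ValueError("payload does not fit in the requested length")
    out = bytearray()
    for i in range(length):
        shift = 7 * (length - 1 - i)          # most significant group first
        group = (payload >> shift) & 0x7F     # next 7 data bits
        marker = 0x00 if i == length - 1 else 0x80
        out.append(marker | group)
    return bytes(out)


def bits(data: bytes) -> str:
    """Render bytes as space-separated 8-bit binary strings."""
    return " ".join(f"{b:08b}" for b in data)


# Single-byte area: the ASCII value is output unchanged.
print(bits(frame(ord("A"), 1)))            # 01000001

# Double-byte area: the 14 data bits of the "的" example, taken as one integer.
print(bits(frame(0b0100000_0000000, 2)))   # 10100000 00000000
```

The second call reproduces the code given for "的" only because its data bits were copied from that example; the sketch does not derive them from the font size.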
Embodiment 2
[0033] As shown in Figure 2, an embodiment of the present invention provides a pan-text character decoding method, including:
[0034] 200. In the encoded byte sequence, a byte whose first bit is 0 is the final byte of a code element, and the sequence is split into code elements on that basis. Each code element is then processed according to its length. For example, 01000001 10100000 00000000 10000010 11000000 00100111 10000000 10000000 10000010 00000010 is split, at each byte with a leading 0, into 01000001; 10100000 00000000; 10000010 11000000 00100111; and 10000000 10000000 10000010 00000010, with lengths 1, 2, 3, and 4 respectively.
[0035] 201. If the code element is one byte long, it belongs to the single-byte area, and its value is the ASCII value. For example, 01000001 is "A".
[0036] 202. If the code element is two bytes long, it belongs to the double-byte area, and its value is mapped one-to-one to a character code point. In a byte of the form 1xxxxxxx, the x part holds the data bits, which are bijected to the cha...
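The segmentation in step 200 and the first part of steps 201 and 202 can be sketched in Python as follows. The function names `split_code_elements` and `data_bits` are hypothetical, and the sketch stops at extracting the raw data bits: the mapping from data bits to code points for multi-byte areas is not given in this excerpt, so it is not attempted.

```python
def split_code_elements(data: bytes) -> list[bytes]:
    """Split a byte stream into code elements.

    A byte with leading bit 0 ends the current code element; bytes with
    leading bit 1 continue it. (Hypothetical helper name.)
    """
    elements = []
    current = bytearray()
    for byte in data:
        current.append(byte)
        if byte & 0x80 == 0:                  # leading 0: final byte
            elements.append(bytes(current))
            current = bytearray()
    if current:
        raise ValueError("input ends inside a code element")
    return elements


def data_bits(element: bytes) -> int:
    """Drop the leading marker bit of each byte and join the 7-bit groups."""
    value = 0
    for byte in element:
        value = (value << 7) | (byte & 0x7F)
    return value


# The byte sequence from the example in step 200.
stream = bytes(int(b, 2) for b in (
    "01000001 10100000 00000000 10000010 11000000 00100111 "
    "10000000 10000000 10000010 00000010"
).split())

elements = split_code_elements(stream)
print([len(e) for e in elements])   # [1, 2, 3, 4]

# Step 201: a one-byte code element is its ASCII value.
print(chr(elements[0][0]))          # A

# Step 202 (partial): raw data bits of the two-byte code element.
print(data_bits(elements[1]))       # 4096
```

Because the end of each code element is marked in the bytes themselves, the splitter needs no length table and can run in a single pass over the input.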
Embodiment 3
[0045] As shown in Figure 3, an embodiment of the present invention provides a pan-text character encoding system, including:
[0046] 301. A decomposition module, including a decomposer, which decomposes the character sequence or binary sequence to be encoded, character by character, into area code, language number, and font size;
[0047] 302. A synthesis module, including a single-byte synthesizer, a double-byte synthesizer, a three-byte synthesizer, a four-section three-word synthesizer, a four-section double-word synthesizer, and a four-section binary synthesizer. It synthesizes code elements according to the area code, language number, and font size. If two adjacent characters have the same area code and language number, and the previous code element has free space, the font size of the next character can be packed into the free space of the previous code element. The final byte of a code element has a leading bit of 0, and the other bytes have a leading bit of 1; this serves as a separator, which is spliced...