Chinese character encoding method

A Chinese character encoding technology, applied in the field of computer text information processing, that addresses the heavy consumption of computer memory resources and the slow retrieval speed of Chinese characters.

Status: Inactive
Publication date: 2011-11-16
Owner: SIYANG TIANQIN SOFTWARE TECH

AI Technical Summary

Problems solved by technology

[0010] The fundamental disadvantage of the above encodings is that they consume a large amount of computer memory and lack scalability. If newly discovered ancient characters or other Chinese characters are added, the current encoding cannot cope with these newly a

Examples

Embodiment Construction

[0021] Select more than 3,000 Chinese characters from the Level 1 and Level 2 Chinese characters, then select about 1,000 characters from the current total stock of Chinese characters, and sort them by their Chinese pinyin. In principle, each character corresponds to 30-100 Chinese characters. Each character is then encoded on the basis of GB8213.
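
One possible reading of the sorting step in [0021] can be sketched in Python. This is an illustration under stated assumptions, not the patent's procedure: the sample characters, their pinyin keys (written with tone numbers), the starting code 0x2000 (borrowed from the range the abstract gives for standard characters), and the consecutive assignment of codes are all hypothetical, since the source does not say how individual codes are allotted.

```python
# Hypothetical sample of (character, pinyin with tone number); not from the source.
SAMPLE = [("中", "zhong1"), ("阿", "a1"), ("八", "ba1"), ("白", "bai2")]

# Assumed starting code, borrowed from the abstract's range for standard characters.
FIRST_CODE = 0x2000


def assign_codes(chars, first_code=FIRST_CODE):
    """Sort characters by their pinyin key and give them consecutive codes.

    Consecutive assignment is an assumption made for this sketch only.
    """
    ordered = sorted(chars, key=lambda item: item[1])
    return {char: first_code + i for i, (char, _pinyin) in enumerate(ordered)}


codes = assign_codes(SAMPLE)
```

Sorting on the plain pinyin string is a simplification; a real collation would need rules for tones and for homophones, which the excerpt does not give.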

Abstract

The invention discloses a Chinese character encoding method that fundamentally revises the encoding of Chinese characters on the basis of GB (GuoBiao, the Chinese national standard) 2312-80 and exploits the structural characteristics of Chinese characters. All Chinese characters, including those used in Japanese and Korean, as well as Chinese symbols, are encoded as four-digit hexadecimal values in a code space that runs from 1000 to FFFF. Within that space, 1000 to 1999 is assigned to non-character components (radicals), 2000 to EFFF is assigned to the Level 1 Chinese characters and some of the Level 2 Chinese characters of the Chinese standard, and F000 to FFFF is left vacant as an expansion region. In principle the method can encode millions of Chinese characters: a character formed as 'non-character component (radical) + Chinese character', 'Chinese character + Chinese character', 'non-character component (radical) + non-character component (radical)', and so on is encoded by taking the codes of its constituent parts. The method is described as simple and fast, as saving computer memory and improving the retrieval efficiency of Chinese characters, as having significant advantages over conventional Unicode encoding, and as providing a theoretical basis for establishing an international standard.
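
The code regions named in the abstract can be illustrated with a short Python sketch. The region boundaries come from the abstract; reading "1999" as a hexadecimal value, labelling anything outside the three stated regions as "unassigned", and joining two part codes by simple concatenation in the hypothetical `compose` helper are assumptions made for illustration, because the excerpt does not specify how the codes of the parts are combined.

```python
# Code regions as stated in the abstract (four-digit hexadecimal values).
# Assumption: "1999" is read as hexadecimal 0x1999.
COMPONENT_RANGE = (0x1000, 0x1999)   # non-character components (radicals)
CHARACTER_RANGE = (0x2000, 0xEFFF)   # Level 1 and some Level 2 characters
EXPANSION_RANGE = (0xF000, 0xFFFF)   # vacant, reserved for expansion


def classify(code: int) -> str:
    """Return the name of the region a four-digit hexadecimal code falls in."""
    if COMPONENT_RANGE[0] <= code <= COMPONENT_RANGE[1]:
        return "component"
    if CHARACTER_RANGE[0] <= code <= CHARACTER_RANGE[1]:
        return "character"
    if EXPANSION_RANGE[0] <= code <= EXPANSION_RANGE[1]:
        return "expansion"
    # Anything else is not covered by the ranges the abstract lists.
    return "unassigned"


def compose(left: int, right: int) -> str:
    """Hypothetical helper: build a composite code from two part codes.

    Concatenating the two four-digit codes is an assumption of this sketch.
    """
    for part in (left, right):
        if classify(part) not in ("component", "character"):
            raise ValueError(f"{part:04X} is not a component or character code")
    return f"{left:04X}{right:04X}"
```

For example, `compose(0x1001, 0x2000)` pairs a code from the component region with a code from the character region, matching the 'non-character component (radical) + Chinese character' pattern.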

Description

Technical field

[0001] The invention belongs to the field of computer text information processing.

Background technique

[0002] The computer encoding of Chinese characters has had a tortuous development history. The earliest standard adopted was GB 2312-80. Because Chinese characters number in the tens or even hundreds of thousands, the encoding of that standard could not meet the need, which led to GBK and then to the ISO 10646 / Unicode standard. The content of these standards is briefly described below.

[0003] GB2312 contains 6763 Chinese characters, including all the Level 1 Chinese characters and the common part of the Level 2 Chinese characters. The Level 1 Chinese characters (the characters in areas 16-55) are arranged in pinyin order; homophones are arranged by their first stroke in the order horizontal, vertical, left-falling, right-falling, turning; characters with the same first stroke are ordered by the second stroke, and so on; second-leve...


Application Information

IPC(8): G06F17/22
Inventor: 潘文林
Owner: SIYANG TIANQIN SOFTWARE TECH