Chinese character automatic verification and error correction system and method for GBK encoding

An error correction method and automatic verification technology, applied in the fields of electrical digital data processing, special data processing applications, instruments, etc. It addresses the poor fault tolerance of GBK encoding, in which a small error in text transmission is amplified into scrambled text, and thereby overcomes the avalanche effect.

Active Publication Date: 2016-03-16
SHANGHAI ZHANGMEN TECH
Cites: 2. Cited by: 0.

AI Technical Summary

Problems solved by technology

[0004] However, GBK encoding has poor error tolerance. When a single byte is dropped from the encoded stream, every subsequent Chinese character is decoded incorrectly. For example, if the first byte "CD" of the code "CDF2B8D6B3E4B7D6BFCFB6A8BFC6BCBC" is lost, the system still interprets the remaining bytes two at a time as Chinese characters. The phrase is then displayed as "蚋絞浞zhi派田吉", with every character scrambled. This phenomenon is called the "avalanche effect" of Chinese encoding.
This problem often occurs when text is transmitted over the Chinese Internet: a small transmission error is greatly amplified, and the entire text becomes unreadable.
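The avalanche effect described in [0004] can be reproduced with a minimal sketch. It assumes Python's built-in "gbk" codec as a stand-in for the displaying system; the helper name decode_pairs is hypothetical, and the hex string is the one quoted in the patent's background section.

```python
# Hex string quoted in the patent's background section.
ORIGINAL_HEX = "CDF2B8D6B3E4B7D6BFCFB6A8BFC6BCBC"


def decode_pairs(data: bytes) -> str:
    """Decode as GBK, substituting U+FFFD wherever the bytes are not valid."""
    return data.decode("gbk", errors="replace")


original = bytes.fromhex(ORIGINAL_HEX)
damaged = original[1:]  # simulate losing the first byte, "CD"

# The intact stream decodes to the eight-character phrase.
print(decode_pairs(original))
# After the loss, the decoder pairs bytes across the original character
# boundaries, so the characters that follow no longer match the original.
print(decode_pairs(damaged))
```

Only one byte differs between the two inputs, yet the decoded strings share no leading character, which is the amplification the patent sets out to correct.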

Examples

Embodiment Construction

[0025] To give a more specific understanding of the technical content, features, and effects of the present invention, it is described in detail below with reference to the illustrated embodiment:

[0026] As shown in Figure 1, the automatic Chinese character verification and error correction system of this embodiment includes:

[0027] An encoding anomaly detection module, which identifies whether a GBK-encoded Chinese character string input to the system contains an encoding anomaly;
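This excerpt does not say how the detection module decides that a string is anomalous. As one possible illustration only, the sketch below performs a purely structural check using the GBK byte ranges (lead byte 0x81-0xFE, trail byte 0x40-0xFE, with 0x7F excluded). The function name find_structural_anomalies is hypothetical, and a check of this kind cannot catch a shifted stream whose re-paired bytes all happen to fall inside the valid ranges.

```python
from typing import List


def find_structural_anomalies(data: bytes) -> List[int]:
    """Return byte offsets where GBK two-byte pairing breaks down."""
    anomalies = []
    i = 0
    while i < len(data):
        lead = data[i]
        if lead < 0x80:              # single-byte ASCII, nothing to pair
            i += 1
            continue
        if not 0x81 <= lead <= 0xFE:
            anomalies.append(i)      # 0x80 and 0xFF cannot start a character
            i += 1
            continue
        if i + 1 >= len(data):
            anomalies.append(i)      # lead byte with no trail byte after it
            break
        trail = data[i + 1]
        if not 0x40 <= trail <= 0xFE or trail == 0x7F:
            anomalies.append(i)      # trail byte outside the allowed range
            i += 1
            continue
        i += 2                       # well-formed pair, move to the next one
    return anomalies


intact = bytes.fromhex("CDF2B8D6B3E4B7D6BFCFB6A8BFC6BCBC")
print(find_structural_anomalies(intact))      # [] (all eight pairs are well formed)
print(find_structural_anomalies(intact[1:]))  # [14] (the last byte has no partner)
```

In the patent's own example the only structural symptom is the orphaned final byte, which suggests a real detector would also need some measure of how plausible the decoded text is.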

[0028] An error correction attempt module, which recombines the high and low bytes of the GBK encoding of any Chinese character string that the encoding anomaly detection module has identified as anomalous, and passes the result of each recombination attempt to the error correction judgment module;
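The recombination strategy itself is not spelled out in this excerpt. A minimal sketch of one plausible approach, offered as an assumption rather than the patented method, is to drop one byte at a time so that the orphaned half of the damaged character is discarded and the remaining bytes pair up again. The name realignment_candidates is hypothetical.

```python
from typing import Iterator, Tuple


def realignment_candidates(data: bytes) -> Iterator[Tuple[int, str]]:
    """Yield (dropped_offset, text) for each single-byte drop that decodes cleanly."""
    for k in range(len(data)):
        trial = data[:k] + data[k + 1:]  # re-pair the bytes without byte k
        try:
            text = trial.decode("gbk")   # strict decoding rejects bad pairings
        except UnicodeDecodeError:
            continue                     # this re-pairing is still invalid
        yield k, text


intact = bytes.fromhex("CDF2B8D6B3E4B7D6BFCFB6A8BFC6BCBC")
damaged = intact[1:]  # first byte lost, as in the patent's example

for offset, text in realignment_candidates(damaged):
    print(offset, text)
```

Several candidates usually decode without error, which is why a separate judgment step is needed to choose among them. Dropping offset 0 here restores the last seven characters of the phrase; the first character cannot be recovered because half of it is missing.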

[0029] The error correction judgment module is used ...

Abstract

The invention discloses an automatic Chinese character verification and error correction system for GBK (Chinese Character Internal Code Extension Specification) encoding, comprising an encoding anomaly detection module, an error correction attempt module, and an error correction judgment module. The invention also discloses an automatic Chinese character verification and error correction method implemented on this system. The system and method identify and correct abnormal GBK encoding and recover the correct text. In use, the encoding anomaly detection module first identifies abnormally encoded text; the error correction attempt module then tries various error correction schemes on that text; finally, the error correction judgment module selects the optimal scheme and restores the text to normal encoding so that it displays correctly.
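The three-stage flow in the abstract (detect, try, judge) can be sketched end to end. Everything below is an assumption for illustration: detection is approximated by strict decoding with Python's "gbk" codec, the attempts are single-byte drops, and the judgment heuristic counts characters in the GB2312 level-1 (commonly used) block, whose lead bytes are 0xB0-0xD7. The excerpt does not state how the patent ranks candidates, and the names is_cleanly_encoded, common_char_score, and repair are hypothetical.

```python
from typing import Optional


def is_cleanly_encoded(data: bytes) -> bool:
    """Detection step (assumed): does the byte string decode as strict GBK?"""
    try:
        data.decode("gbk")
        return True
    except UnicodeDecodeError:
        return False


def common_char_score(text: str) -> int:
    """Judgment heuristic (assumed): count GB2312 level-1 characters."""
    score = 0
    for ch in text:
        encoded = ch.encode("gbk")
        if len(encoded) == 2 and 0xB0 <= encoded[0] <= 0xD7 and encoded[1] >= 0xA1:
            score += 1
    return score


def repair(data: bytes) -> Optional[str]:
    """Detect an anomaly, try single-byte drops, keep the best-scoring text."""
    if is_cleanly_encoded(data):
        return data.decode("gbk")        # nothing to correct
    best, best_score = None, -1
    for k in range(len(data)):
        trial = data[:k] + data[k + 1:]  # candidate re-pairing without byte k
        if not is_cleanly_encoded(trial):
            continue
        text = trial.decode("gbk")
        score = common_char_score(text)
        if score > best_score:           # first candidate wins on ties
            best, best_score = text, score
    return best


intact = bytes.fromhex("CDF2B8D6B3E4B7D6BFCFB6A8BFC6BCBC")
print(repair(intact[1:]))  # the phrase minus its unrecoverable first character
```

Scoring by character frequency class is a crude proxy for text plausibility; a dictionary or language model would be a natural replacement for common_char_score in a more serious implementation.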

Description

Technical field

[0001] The invention relates to an automatic verification and error correction system for Chinese characters in GBK encoding. The invention also relates to a method of applying the system.

Background

[0002] GBK, in full the "Chinese Character Internal Code Extension Specification", is a Chinese character encoding standard formulated by the National Information Technology Standardization Technical Committee. It is an internal code extension specification based on the GB2312 standard, so it is fully compatible with GB2312.

[0003] The GBK encoding system contains a total of 21003 Chinese characters, each represented by 2 bytes. The first byte ranges from 0x81 to 0xFE, and the second byte ranges from 0x40 to 0xFE (excluding 0x7F). For example, the GBK code for the phrase "Wan Gang fully affirms science and technology" is expressed in hexadecimal as "CDF2B8D6B3E4B7D6BFCFB6A8BFC6BCBC", where "CDF2" cor...

Application Information

Patent Type & Authority: Patents (China)
IPC(8): G06F17/22, G06F17/27
Inventor: 陈运文
Owner: SHANGHAI ZHANGMEN TECH