A method and system for detecting garbled characters in a text document

A garbled character detection technology for text documents, applied in instruments, computing, electrical digital data processing and related fields. It addresses the problem that the real cause of garbled characters cannot be determined, with the effect of improving speed and reducing scope.

Inactive Publication Date: 2018-05-25
NEW FOUNDER HLDG DEV LLC +1

AI Technical Summary

Problems solved by technology

[0006] The technical problem to be solved by the present invention is that the prior art detects garbled characters only by judging the encoding format. Garbled characters have many possible causes, however, and the real cause sometimes cannot be determined in this way. The invention therefore provides a method and system for detecting garbled characters in a text document that comprehensively considers the various causes of garbled characters.

Examples

Embodiment 1

[0052] The flowchart of the text document garbled character detection method described in this embodiment is shown in Figure 1. The method includes the following steps:

[0053] A step of establishing a first coding range library. The first coding range library includes the coding ranges of all regular characters in the encoding format of the characters of the text document being detected.

[0054] A sampling step: codes corresponding to M characters are selected from the text document being detected, where M is an integer greater than or equal to 1.

[0055] A first comparison step: the codes corresponding to the M characters selected in the sampling step are each compared with the codes in the first coding range library. Characters whose codes have a match in the first coding range library are determined to be non-garbled characters; characters whose codes have no match in the first coding range library ...
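The sampling and first comparison steps can be sketched roughly as follows. This is a minimal illustration, not the patent's implementation: it assumes that "codes" are Unicode code points, and the ranges in `FIRST_RANGE_LIBRARY`, the function names, and the `seed` parameter are all hypothetical choices made for the example.

```python
import random

# Hypothetical first coding range library: inclusive code ranges of the
# "regular" characters. These Unicode ranges are illustrative assumptions,
# not values taken from the patent.
FIRST_RANGE_LIBRARY = [
    (0x0020, 0x007E),  # printable ASCII
    (0x3000, 0x303F),  # CJK symbols and punctuation
    (0x4E00, 0x9FFF),  # CJK Unified Ideographs
]


def in_library(code, library):
    """Return True if the code lies inside any inclusive range."""
    return any(low <= code <= high for low, high in library)


def sample_codes(text, m, seed=None):
    """Sampling step: select the codes of up to M characters (M >= 1)."""
    if m < 1:
        raise ValueError("M must be an integer >= 1")
    rng = random.Random(seed)
    # Sample without replacement; cap M at the document length.
    picked = rng.sample(list(text), min(m, len(text)))
    return [ord(ch) for ch in picked]


def first_comparison(codes, library=FIRST_RANGE_LIBRARY):
    """Split codes into (non_garbled, suspected_garbled) lists."""
    non_garbled, suspected = [], []
    for code in codes:
        if in_library(code, library):
            non_garbled.append(code)
        else:
            suspected.append(code)
    return non_garbled, suspected
```

For instance, `first_comparison([ord("A"), ord("中"), 0xFFFD])` classifies the first two codes as non-garbled and flags `0xFFFD`, which falls outside every range assumed here.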

Embodiment 2

[0062] On the basis of Embodiment 1, the text document garbled character detection method described in this embodiment, as shown in Figure 2, also includes the following steps. A step of establishing a second coding range library: the second coding range library includes the coding ranges of all characters in all existing encoding formats. A second comparison step: the codes corresponding to the characters determined to be garbled in the first comparison step are each compared with the codes in the second coding range library. If such a code has a match in the second coding range library, the corresponding character is restored to non-garbled status. If no match is found in the second coding range library, the corresponding character is determined to be a garbled character.
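A rough sketch of the second comparison step is given below. It again assumes Unicode code points stand in for "codes". The contents of `SECOND_RANGE_LIBRARY` are a small hypothetical sample; a real library covering all existing encoding formats would be far larger.

```python
# Hypothetical second coding range library: a broader set of inclusive
# ranges standing in for "all characters in all existing encoding formats".
# The ranges below are illustrative assumptions only.
SECOND_RANGE_LIBRARY = [
    (0x0020, 0x007E),  # printable ASCII
    (0x00A0, 0x00FF),  # Latin-1 Supplement
    (0x3040, 0x30FF),  # Hiragana and Katakana
    (0x4E00, 0x9FFF),  # CJK Unified Ideographs
    (0xAC00, 0xD7A3),  # Hangul Syllables
]


def second_comparison(suspected_codes, library=SECOND_RANGE_LIBRARY):
    """Re-check codes flagged by the first comparison step.

    Returns (restored, garbled): codes found in the second library are
    restored to non-garbled status; the rest remain garbled.
    """
    restored, garbled = [], []
    for code in suspected_codes:
        if any(low <= code <= high for low, high in library):
            restored.append(code)
        else:
            garbled.append(code)
    return restored, garbled
```

With these assumed ranges, `second_comparison([0x3042, 0xE9, 0xFFFD])` restores the Hiragana and Latin-1 codes and leaves `0xFFFD` as garbled.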

[...

Embodiment 3

[0069] The text document garbled character detection system described in this embodiment includes the following components. A sampling module 1 is configured to select codes corresponding to M characters from the text document being detected, where M is an integer greater than or equal to 1. A first coding range library 2 stores the coding ranges of all regular characters in the encoding format of the characters of the text document being detected. A first comparison module 3 compares the codes corresponding to the M characters selected by the sampling module 1 with the codes in the first coding range library 2. Characters whose codes have a match in the first coding range library 2 are determined to be non-garbled characters; characters whose codes have no match in the first coding range library 2 are determined to be garbled characters.

[0070] In this embodiment, through the joint action of the sampling module 1, t...

Abstract

The invention relates to a method and system for detecting garbled characters in text documents. When the encoding format is correct, the method establishes a first coding range library containing the coding ranges of all regular characters in the encoding format of the characters of the text document being detected, selects the codes corresponding to M characters from that document, and compares those codes with the codes in the first coding range library to determine whether the characters are really garbled, and in turn whether the text document is garbled. Compared with the prior art, which judges whether a character is garbled only by checking whether the encoding format is correct, the present invention comprehensively considers multiple causes when judging whether a character is really garbled. It can therefore determine whether the characters are really garbled and then process them according to the real cause, which improves the user experience.

Description

Technical field

[0001] The invention relates to a method and system for detecting garbled characters, in particular to a method and system for detecting garbled characters in text documents, and belongs to the field of word processing.

Background art

[0002] Garbled characters occur when a terminal device cannot display the correct characters and instead displays other meaningless characters or blanks. In fields such as graphical user interfaces and word processing on terminal devices, garbled text documents often appear, leaving users unable to understand the content of the text documents and thereby limiting their next operation. Therefore, it is necessary to detect garbled characters in the content of the text document currently in use, find out the garbled characters in the text document and determine the real cause of the garbled characters, and effectively correct the garbled characters according to the real cause of the garbled characters, s...

Application Information

Patent Type & Authority: Patents (China)
IPC: IPC(8): G06F17/22
Inventors: 张鹏, 李睿, 马静山
Owner: NEW FOUNDER HLDG DEV LLC