Method and device for preprocessing printed Arabic characters

A preprocessing method and device for printed Arabic characters, applied in the field of optical character recognition. It addresses the problem that elongation strokes deform character shapes and make the characters hard to recognize, and thereby improves the recognition effect.

Active Publication Date: 2012-05-16
HANVON CORP

AI Technical Summary

Problems solved by technology

As shown in Figure 1, when a character contains a long elongation stroke, its shape is inevitably deformed, which makes the character difficult to recognize. The existing method generally first breaks the elongation stroke apart by segmentation and then recognizes the resulting segmentation blocks...



Detailed Description of the Embodiments

[0051] The technical solutions of the present invention will be described in further detail below with reference to the accompanying drawings and embodiments.

[0052] The invention discloses a method and device for preprocessing printed Arabic characters. The characters are preprocessed before recognition: a confidence frame is selected through center-of-gravity analysis, and the character image is adjusted according to that frame. This reduces the impact of Arabic elongation strokes on the recognition core and improves the recognition effect.

[0053] As shown in Figure 2, long elongation strokes appear between Arabic characters because of typesetting and other reasons, and one of the blocks obtained after character segmentation is the block shown in Figure 3. Because this block contains an elongation stroke, its shape matches the normal character "??" poorly. If the bl...


Abstract

The invention discloses a method and a device for preprocessing printed Arabic characters, belonging to the field of optical character recognition. The method comprises the following steps. Step 1: perform a center-of-gravity analysis on the character image obtained after segmentation and calculate the deviation degree of its center of gravity; if the deviation degree is less than a specified threshold, go to step 3; otherwise, go to step 2. Step 2: adjust a confidence frame according to the obtained deviation degree and determine the character image inside the confidence frame. Step 3: pass the character image inside the confidence frame to a recognition core for recognition. Because the confidence frame is selected by center-of-gravity analysis before recognition and the character image is adjusted to that frame, only the part of the image around the center of gravity is recognized. The result is therefore not affected by poorly chosen segmentation points, which avoids the influence of Arabic elongation strokes and improves the recognition effect.
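The three steps in the abstract can be sketched in a few lines of Python. This is only an illustrative sketch under stated assumptions, not the patented implementation: the visible text does not define the deviation degree or the rule for adjusting the confidence frame, so the horizontal-centroid offset normalized by frame width, the column-trimming rule, the threshold value 0.1, and all function names below are hypothetical. The image is assumed to be a binary list of rows in which 1 marks an ink pixel.

```python
def horizontal_centroid(image, left, right):
    """Mean x-coordinate of ink pixels in columns [left, right), or None if empty."""
    count = 0
    weighted = 0
    for row in image:
        for x in range(left, right):
            if row[x]:
                count += 1
                weighted += x
    return weighted / count if count else None


def deviation_degree(image, left, right):
    """Assumed measure: centroid offset from the frame center, divided by frame width."""
    cx = horizontal_centroid(image, left, right)
    if cx is None:
        return 0.0
    mid = (left + right - 1) / 2
    return abs(cx - mid) / (right - left)


def adjust_confidence_frame(image, threshold=0.1, min_width=1):
    """Assumed step 2: trim columns on the side away from the ink mass
    until the centroid is close enough to the frame center."""
    left, right = 0, len(image[0])
    while right - left > min_width:
        if deviation_degree(image, left, right) < threshold:
            break
        cx = horizontal_centroid(image, left, right)
        mid = (left + right - 1) / 2
        if cx < mid:
            right -= 1  # ink mass sits on the left, so drop the rightmost column
        else:
            left += 1   # ink mass sits on the right, so drop the leftmost column
    return left, right


def preprocess(image, threshold=0.1):
    """Steps 1 to 3: return the crop that would be handed to the recognition core."""
    width = len(image[0])
    if deviation_degree(image, 0, width) < threshold:
        left, right = 0, width  # step 1: deviation is small, keep the full block
    else:
        left, right = adjust_confidence_frame(image, threshold)  # step 2
    return [row[left:right] for row in image]  # step 3 input


# A 3-pixel-wide character body with a long baseline stroke extending to the right.
block = [
    [1, 1, 1, 0, 0, 0, 0, 0, 0, 0],
    [1, 1, 1, 0, 0, 0, 0, 0, 0, 0],
    [1, 1, 1, 1, 1, 1, 1, 1, 1, 1],
]
cropped = preprocess(block)
print(len(cropped[0]), "columns kept out of", len(block[0]))
```

Trimming one column at a time keeps the sketch easy to follow; a real system would more likely compute the frame in a single pass and would also need to choose the threshold empirically.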

Description

[0001] Technical field

[0002] The invention belongs to the field of optical character recognition and relates to a character preprocessing method and device, in particular to a preprocessing method and device for printed Arabic characters.

[0003] Background

[0004] In character recognition, the character image must first be located in the original image, and single-character recognition is then performed based on the located coordinates. In printed Arabic, the letters of a word are joined along the baseline, and in typesetting the elongation stroke in some words of a line is often lengthened so that each line of text stays complete. As shown in Figure 1, when a character contains a long elongation stroke, its shape is inevitably deformed, which makes the character difficult to recognize. The existing method generally first breaks the elongation stroke apart by segmentation and then recognizes the obtained segmentation bloc...


Application Information

IPC(8): G06K9/36, G06T7/60
Inventors: 王琛, 刘正珍, 钮兴昱
Owner: HANVON CORP