Text enhancement of a textual image undergoing optical character recognition

An optical character recognition and text-image technology, applied in character recognition, character and pattern recognition, instruments, etc., that can solve problems such as deterioration of OCR engine performance.

Active Publication Date: 2012-11-14
ZHIGU HLDG

AI Technical Summary

Problems solved by technology

Even if the scanning process itself performs well, the performance of the OCR engine can deteriorate when text pages of relatively poor quality are scanned.

Embodiment Construction

[0031] Figure 1 shows an illustrative example of a system 5 for optical character recognition (OCR) on an image, including a data capture device (e.g., scanner 10) that generates an image of a document 15. Scanner 10 may be an image-based scanner that uses a charge-coupled device as an image sensor to generate the image. Scanner 10 processes the image to generate input data and transmits the input data to a processing device (e.g., OCR engine 20) for character recognition within the image. In this particular example, OCR engine 20 is incorporated into scanner 10. In other examples, however, OCR engine 20 may be a separate unit, such as a stand-alone unit, or a unit incorporated into another device such as a PC, server, or the like.

[0032] The accuracy of the OCR process can be greatly improved if the background of the original image is detected and filtered out, while at the same time the remaining text pixels are enhanced. As detailed below, the background is ...

Abstract

A method for enhancing a textual image that is to undergo optical character recognition begins by receiving an image that includes native lines of text. A background line profile is determined, which represents the average background intensity along the native lines in the image. Likewise, a foreground line profile is determined, which represents the average foreground intensity along the native lines in the image. The pixels in the image are assigned to either a background or a foreground portion of the image based at least in part on the background line profile and the foreground line profile. The intensity of the pixels assigned to the background portion of the image is adjusted to maximum brightness so that they represent a portion of the image that does not include text.
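The steps in the abstract can be illustrated with a minimal Python sketch. It is not the patented implementation, and it rests on several assumptions that are not in the source: the image is an 8-bit grayscale list of rows, the native text lines are already known as (top, bottom) row ranges, each line profile is reduced to a single average value per line, and the initial dark/bright split uses the midpoint of the line's intensity range. The names line_profiles and enhance are hypothetical.

```python
from statistics import mean

WHITE = 255  # maximum brightness for 8-bit grayscale (assumed pixel format)


def line_profiles(image, top, bottom):
    """Return (background, foreground) average intensities for rows top..bottom-1.

    Assumption: pixels brighter than the midpoint of the line's intensity
    range are treated as background, the rest as foreground (text).
    """
    pixels = [p for row in image[top:bottom] for p in row]
    lo, hi = min(pixels), max(pixels)
    if lo == hi:
        # No contrast in this line, so both profiles coincide.
        return float(hi), float(lo)
    split = (lo + hi) / 2
    bright = [p for p in pixels if p > split]
    dark = [p for p in pixels if p <= split]
    return mean(bright), mean(dark)


def enhance(image, lines):
    """Set background pixels of each text line to maximum brightness.

    `lines` is a list of (top, bottom) row ranges, bottom exclusive.
    Rows outside every range are left unchanged in this sketch.
    """
    out = [list(row) for row in image]
    for top, bottom in lines:
        bg, fg = line_profiles(image, top, bottom)
        threshold = (bg + fg) / 2  # halfway between the two profiles
        for y in range(top, bottom):
            for x, p in enumerate(image[y]):
                # A line without contrast is treated as containing no text.
                if bg == fg or p > threshold:
                    out[y][x] = WHITE
    return out


if __name__ == "__main__":
    # Two text lines; the second has a darker background than the first.
    sample = [
        [200, 60, 210, 50],
        [190, 55, 205, 220],
        [120, 30, 125, 118],
    ]
    for row in enhance(sample, [(0, 2), (2, 3)]):
        print(row)
```

Because the profiles are computed per line, the darker background of the second sample line (around 120) is still classified as background, which a single global threshold chosen for the first line would miss.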

Description

Background

[0001] Optical character recognition (OCR) is the computer-based conversion of an image of text into a digital form such as machine-editable text, usually according to a standard encoding scheme. This process eliminates the need to manually type documents into a computer system. Many different problems can arise from poor image quality, defects introduced by the scanning process, and so on. For example, a conventional OCR engine may be coupled to a flatbed scanner used to scan pages of text. Because the page is placed flush against the scanning surface of the scanner, images generated by the scanner typically exhibit uniform contrast and illumination, reduced skew and distortion, and high resolution. The OCR engine can therefore easily convert the text in the image into machine-editable text. However, when an image has poor quality in terms of contrast, illumination, distortion, etc., the performance of the OCR engine may deteriorate and processing ...

Application Information

Patent Type & Authority: Applications (China)
IPC(8): G06K9/20, G06K7/10, G06V30/162, G06V30/10
CPC: G06K9/38, G06K9/4638, G06V2201/01, G06V30/10, G06V30/162, G06V30/18076, G06V10/28, G06V10/457
Inventor: S. Galic, D. Nijemcevic, B. Dresevic
Owner: ZHIGU HLDG