User correction of errors that occur in text documents that have undergone the optical character recognition (OCR) process

An error-correction technique for optical character recognition that addresses the time-consuming nature of manual correction, among other problems

Active Publication Date: 2011-12-21
MICROSOFT TECH LICENSING LLC
Cites: 3 | Cited by: 6

AI Technical Summary

Problems solved by technology

Manually correcting each individual error can be time-consuming and tiresome for the user.

Method used

A graphical user interface lets the user supply input that corrects mischaracterized items in the document. The OCR stage that produced the initial error corrects it based on that input, and the stages that follow then correct the subsequent errors they made because of it.

Examples

Embodiment Construction

[0012] Figure 1 shows one illustrative example of a system 5 for performing optical character recognition (OCR) on images of text. System 5 includes a data capture device (e.g., scanner 10) that generates an image of document 15. Scanner 10 may be an image-based scanner that uses charge-coupled devices as image sensors for generating images. Scanner 10 processes the image to generate input data and sends the input data to a processing device, such as OCR engine 20, for character recognition within the image. In this particular example, OCR engine 20 is incorporated into scanner 10. In other examples, however, OCR engine 20 may be a separate unit, such as a standalone unit or a unit incorporated into another device, such as a PC, server, or the like.
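The data flow in paragraph [0012] can be pictured with a minimal Python sketch. Everything named here (`Scanner`, `StubEngine`, `capture`, `recognize`, the `location` field) is hypothetical and chosen only for illustration; the stub engine simply decodes bytes and performs no real character recognition.

```python
from dataclasses import dataclass
from typing import Protocol


class OcrEngine(Protocol):
    """Anything that can turn scanner input data into text (hypothetical interface)."""

    def recognize(self, input_data: bytes) -> str: ...


@dataclass
class StubEngine:
    """Placeholder engine; a real one would recognize characters in the image."""

    location: str  # "embedded" in the scanner, or "standalone" (PC, server, ...)

    def recognize(self, input_data: bytes) -> str:
        return input_data.decode("ascii")


@dataclass
class Scanner:
    """Data capture device that hands its input data to an OCR engine."""

    engine: OcrEngine

    def capture(self, document: str) -> bytes:
        # Stand-in for imaging the page with a CCD sensor.
        return document.encode("ascii")

    def scan(self, document: str) -> str:
        input_data = self.capture(document)  # image processing elided
        return self.engine.recognize(input_data)


embedded = Scanner(engine=StubEngine(location="embedded"))
standalone = Scanner(engine=StubEngine(location="standalone"))
```

Because the scanner depends only on the `recognize` interface, the same capture code works whether the engine is built into the scanner or runs on a separate device.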

[0013] Figure 2 is a high-level logic diagram of one particular example of OCR engine 20. In this example, the OCR engine is configured as an application with the following components: an image capture componen...

Abstract

The present invention relates to user correction of errors that occur in text documents that undergo an optical character recognition (OCR) process. An electronic model of an image document is created as the document goes through the OCR process. The electronic model includes elements of the image document (e.g., words, lines of text, paragraphs, images) that have been determined by each of the multiple sequentially performed stages in the OCR process. The electronic model serves as the input that each stage receives from the previous stage processing the image document. A graphical user interface is presented to the user so that the user can provide input data that corrects mischaracterized items in the document. Based on the user input data, the initial error is corrected by the processing stage that produced the initial error and thereby caused the mischaracterized item. The stages of the OCR process following this stage then correct the subsequent errors made in their respective stages because of the initial error.
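The staged correction described in the abstract can be sketched as a toy pipeline. This is a minimal illustration under stated assumptions, not the patented implementation: the stage names (`lines`, `words`, `text`), the merge-two-words correction format, and all function names are hypothetical, and plain text stands in for the scanned image.

```python
from dataclasses import dataclass, field


@dataclass
class DocumentModel:
    raw: str                                         # stand-in for the image document
    elements: dict = field(default_factory=dict)     # output of each stage, by stage name
    corrections: dict = field(default_factory=dict)  # user input data, by stage name
    log: list = field(default_factory=list)          # names of stages executed, in order


def detect_lines(model):
    return model.raw.splitlines()


def detect_words(model):
    words = [line.split() for line in model.elements["lines"]]
    # Hypothetical correction format: (line_no, i) merges word i with word i + 1.
    # Applying in reverse order keeps earlier indices valid.
    for line_no, i in sorted(model.corrections.get("words", []), reverse=True):
        words[line_no][i:i + 2] = [words[line_no][i] + words[line_no][i + 1]]
    return words


def build_text(model):
    return "\n".join(" ".join(ws) for ws in model.elements["words"])


# Sequentially performed stages; each reads what earlier stages wrote to the model.
STAGES = [("lines", detect_lines), ("words", detect_words), ("text", build_text)]


def run_from(model, start=0):
    """Run the stage at index `start` and every stage after it."""
    for name, stage in STAGES[start:]:
        model.elements[name] = stage(model)
        model.log.append(name)
    return model


def apply_user_correction(model, stage_name, correction):
    """Record the correction, then re-run the responsible stage and its successors."""
    index = [name for name, _ in STAGES].index(stage_name)
    model.corrections.setdefault(stage_name, []).append(correction)
    return run_from(model, index)


# "bro wn" simulates a word wrongly split in two during word detection.
model = run_from(DocumentModel(raw="the quick\nbro wn fox"))
apply_user_correction(model, "words", (1, 0))
print(model.elements["text"])
```

In this sketch the correction is handed to the stage that made the mistake, and only that stage and the ones after it run again; the earlier `lines` stage keeps its original output.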

Description

Technical field

[0001] The present invention relates to an optical character recognition process, and more particularly to error correction in an optical character recognition process.

Background

[0002] Optical character recognition (OCR) is the computer-based conversion of images of text into digital form as machine-editable text, typically in a standard encoding scheme. This process eliminates the need to manually type documents into computer systems. A number of different problems can arise due to poor image quality, imperfections caused by the scanning process, and the like. For example, a conventional OCR engine may be coupled to a flatbed scanner that scans pages of text. Since the page is placed flush with the scanning surface of the scanner, images generated by the scanner typically exhibit even contrast and illumination, reduced skew and distortion, and high resolution. Thus, the OCR engine can easily convert the text in the image into machine-editable text...

Application Information

Patent Type & Authority: Applications (China)
IPC(8): G06K9/20, G06F17/22
CPC: G06K9/033, G06V10/987
Inventor: B. Radakovic, M. Vugdelija, N. Todic, A. Uzelac, B. Dresevic
Owner: MICROSOFT TECH LICENSING LLC