Segmentation of a word bitmap into individual characters or glyphs during an OCR process
A technique for dividing word bitmaps into individual characters or glyphs during the OCR process. It addresses the difficulty of segmenting a word into single symbols, which is made harder by poor image quality, font weight, italic text, and character shapes.
Embodiment Construction
[0016] Figure 1 shows an illustrative example of a system 5 for performing optical character recognition (OCR) of text images. System 5 includes a data capture device (e.g., scanner 10) that generates an image of document 15. Scanner 10 may be an imager-based scanner that uses a charge-coupled device as an image sensor to generate the image. Scanner 10 processes the image to generate input data and sends the input data to a processing device (e.g., OCR engine 20) for use in recognizing characters within the image. In this particular example, OCR engine 20 is incorporated into scanner 10. In other examples, however, OCR engine 20 may be a separate unit, such as a stand-alone unit or a unit incorporated into another device such as a PC or a server.
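The data flow described in paragraph [0016] can be sketched as follows. This is only an illustration of the arrangement, not the patented implementation: the names `Bitmap`, `OcrEngine`, `StubOcrEngine`, and `Scanner` are hypothetical, the bitmap is assumed to be a list of pixel rows, and the stub engine performs no recognition at all. Because the scanner only depends on an engine interface, the same sketch covers both the incorporated and the stand-alone engine.

```python
from dataclasses import dataclass
from typing import List, Protocol

# Assumption: a bitmap is a list of pixel rows, 1 = ink, 0 = background.
Bitmap = List[List[int]]


class OcrEngine(Protocol):
    """Hypothetical interface for the processing device (OCR engine 20)."""

    def recognize(self, image: Bitmap) -> str:
        ...


@dataclass
class StubOcrEngine:
    """Placeholder engine: reports the image size instead of recognizing text."""

    def recognize(self, image: Bitmap) -> str:
        rows = len(image)
        cols = len(image[0]) if image else 0
        return f"<unrecognized {rows}x{cols} bitmap>"


@dataclass
class Scanner:
    """Hypothetical data capture device (scanner 10) holding a reference to an engine."""

    engine: OcrEngine

    def capture(self, document: Bitmap) -> Bitmap:
        # Stand-in for the image sensor: return an independent copy of the page.
        return [row[:] for row in document]

    def scan_and_recognize(self, document: Bitmap) -> str:
        # Generate the input data, then hand it to the processing device.
        return self.engine.recognize(self.capture(document))


if __name__ == "__main__":
    page = [[0, 1, 0], [1, 1, 1]]
    scanner = Scanner(engine=StubOcrEngine())
    print(scanner.scan_and_recognize(page))
```

Swapping `StubOcrEngine` for an object that forwards the bitmap to a PC or server would model the separate-unit variant without changing `Scanner`.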
[0017] The OCR engine 20 receives the text image as a bitmap of lines of text. The image may be a scanned image of text or a digital document such as a PDF or Microsoft Word document where input data is already...