Systems and methods for digitized document image text contouring

Document image text contouring technology, applied in the field of digitized document image processing, addresses the problems of reduced character recognition quality, increased processing time, and increased associated costs.

Active Publication Date: 2021-07-22
CAPITAL ONE SERVICES

AI Technical Summary

Benefits of technology

This patent describes a system and method for digitizing documents and extracting important information from them using optical character recognition. The system uses a processor and memory to capture and preprocess the document image, creating a set of contours and analyzing them to identify specific text or content. The system then generates clips from these contours and dynamically thresholds them to improve the accuracy of optical character recognition. The system can also receive and process textual output from the optical character recognition. Overall, this patent provides a technical solution for extracting information from digitized documents quickly and accurately.

Problems solved by technology

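Current optical character recognition systems read an entire document or image without accurately segmenting and identifying the desired regions. This can cause reduced quality of character recognition, increased processing time, and increased costs associated therewith.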

Embodiment Construction

[0011] According to the various embodiments of the present disclosure, systems and methods are provided for digitized document image text contouring. Optical character recognition (OCR) works well for documents that are laid out in a conventional book orientation and captured in a good-quality image. However, a shortcoming of OCR is that it converts an entire image; when the text is not arranged in a storybook fashion (for example, left-to-right, top-down reading order), the resulting text becomes scrambled and difficult to parse. Static templates truncate data or include irrelevant data from other sections of the document image, thereby resulting in inaccurate output. As disclosed herein, rather than using an OCR engine to electronically read an entire document, only one or more portions of the document are selectively transmitted for OCR to account for printed data shifting in any direction, such as a collection of contours that are dynamically thr...
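The contrast between a static template and a content-derived region can be illustrated with a small sketch. This is not the patented implementation: the character-grid image, the `content_box`, `crop`, and `shift` helpers, and the sample glyph are all hypothetical, chosen only to show why a fixed crop truncates shifted print while a box computed from the content follows it.

```python
from typing import List, Tuple

# (top, left, bottom, right); bottom and right are exclusive. Hypothetical layout.
Box = Tuple[int, int, int, int]

def crop(image: List[str], box: Box) -> List[str]:
    """Cut a rectangular clip out of a character-grid image."""
    top, left, bottom, right = box
    return [row[left:right] for row in image[top:bottom]]

def content_box(image: List[str], ink: str = "#") -> Box:
    """Tightest box around every ink cell (a stand-in for a contour's bounding box)."""
    rows = [r for r, row in enumerate(image) if ink in row]
    cols = [c for row in image for c, ch in enumerate(row) if ch == ink]
    if not rows:
        raise ValueError("image contains no ink")
    return (min(rows), min(cols), max(rows) + 1, max(cols) + 1)

def shift(image: List[str], dy: int, dx: int, blank: str = ".") -> List[str]:
    """Move all ink by (dy, dx), simulating print that drifted on the page."""
    h, w = len(image), len(image[0])
    out = [[blank] * w for _ in range(h)]
    for r in range(h):
        for c in range(w):
            nr, nc = r + dy, c + dx
            if image[r][c] != blank and 0 <= nr < h and 0 <= nc < w:
                out[nr][nc] = image[r][c]
    return ["".join(row) for row in out]

original = [
    "........",
    ".###....",
    ".#.#....",
    ".###....",
    "........",
    "........",
]

# A static template records where the field sat on one reference document.
template = content_box(original)

# The same field printed one row lower and one column to the right.
shifted = shift(original, dy=1, dx=1)

static_clip = crop(shifted, template)               # fixed coordinates: truncated
dynamic_clip = crop(shifted, content_box(shifted))  # follows the content: intact
```

Here `static_clip` keeps only a corner of the glyph, whereas `dynamic_clip` recovers the whole of it, which is the behaviour the paragraph attributes to selecting regions from the document's own contours.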


Abstract

Systems and methods for digitized document image text contouring are provided. One or more memories may be coupled to one or more processors, the one or more memories including instructions operable to be executed by the one or more processors. The one or more processors may be configured to receive a digitized document image. The one or more processors may be configured to preprocess the digitized document image to generate a plurality of contours. The one or more processors may be configured to adjust a plurality of bounding boxes of the plurality of contours; analyze the adjusted plurality of bounding boxes; create one or more clips based on the analysis; dynamically threshold the one or more clips; perform optical character recognition of the one or more clips; and receive output responsive to the optical character recognition.
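The steps listed in the abstract can be sketched end to end on a toy grayscale grid. This is a minimal illustration under stated assumptions, not the disclosed method: connected-component search stands in for contour generation, box adjustment is modelled as fixed padding, the dynamic threshold is assumed to be Otsu's method computed per clip, and the bounding-box analysis and OCR call are omitted. All function names and the `ink_below` and `pad` parameters are hypothetical.

```python
from collections import deque
from typing import List, Tuple

Gray = List[List[int]]            # rows of 0..255 intensities
Box = Tuple[int, int, int, int]   # (top, left, bottom, right), exclusive ends

def find_boxes(img: Gray, ink_below: int) -> List[Box]:
    """Bounding boxes of 4-connected dark regions (stand-in for contours)."""
    h, w = len(img), len(img[0])
    seen = [[False] * w for _ in range(h)]
    boxes = []
    for r in range(h):
        for c in range(w):
            if seen[r][c] or img[r][c] >= ink_below:
                continue
            top, left, bottom, right = r, c, r, c
            queue = deque([(r, c)])
            seen[r][c] = True
            while queue:
                y, x = queue.popleft()
                top, left = min(top, y), min(left, x)
                bottom, right = max(bottom, y), max(right, x)
                for ny, nx in ((y + 1, x), (y - 1, x), (y, x + 1), (y, x - 1)):
                    if (0 <= ny < h and 0 <= nx < w
                            and not seen[ny][nx] and img[ny][nx] < ink_below):
                        seen[ny][nx] = True
                        queue.append((ny, nx))
            boxes.append((top, left, bottom + 1, right + 1))
    return boxes

def pad_box(box: Box, pad: int, h: int, w: int) -> Box:
    """Adjust a box by growing it `pad` pixels, clamped to the image."""
    top, left, bottom, right = box
    return (max(0, top - pad), max(0, left - pad),
            min(h, bottom + pad), min(w, right + pad))

def otsu_threshold(pixels: List[int]) -> int:
    """Otsu's method: the level that maximises between-class variance."""
    hist = [0] * 256
    for p in pixels:
        hist[p] += 1
    total = len(pixels)
    sum_all = sum(i * n for i, n in enumerate(hist))
    best_t, best_var = 0, -1.0
    w_dark, sum_dark = 0, 0
    for t in range(256):
        w_dark += hist[t]
        sum_dark += t * hist[t]
        if w_dark == 0:
            continue
        w_light = total - w_dark
        if w_light == 0:
            break
        diff = sum_dark / w_dark - (sum_all - sum_dark) / w_light
        var = w_dark * w_light * diff * diff
        if var > best_var:
            best_t, best_var = t, var
    return best_t

def contour_clips(img: Gray, ink_below: int = 128, pad: int = 1):
    """Return (box, threshold, binarized clip) for each detected region."""
    h, w = len(img), len(img[0])
    results = []
    for box in find_boxes(img, ink_below):
        top, left, bottom, right = pad_box(box, pad, h, w)
        clip = [row[left:right] for row in img[top:bottom]]
        t = otsu_threshold([p for row in clip for p in row])
        binary = [[1 if p <= t else 0 for p in row] for row in clip]
        results.append(((top, left, bottom, right), t, binary))
    return results

# Two marks on a light page: one printed dark (20), one faint (90).
page = [
    [200] * 9,
    [200, 20, 20, 200, 200, 200, 90, 90, 200],
    [200, 20, 20, 200, 200, 200, 90, 90, 200],
    [200] * 9,
]
clips = contour_clips(page)
```

Because the threshold is recomputed for each clip, the dark and the faint marks receive different levels yet binarize to the same clean shape. In a full system each binarized clip, rather than the whole page, would then be passed to an OCR engine.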

Description

FIELD OF THE DISCLOSURE

[0001] The present disclosure relates to systems and methods for digitized document image text contouring.

BACKGROUND OF THE DISCLOSURE

[0002] Current solutions for optical character recognition systems read an entire document or image without particular attention to accurately segmenting and identifying desired regions or portions of data of the document or image. This can cause reduced quality of character recognition, increased processing time, and increased costs associated therewith. These and other drawbacks exist.

[0003] Accordingly, there is a need to accurately recognize text in a manner that efficiently uses system and other resources.

SUMMARY OF THE DISCLOSURE

[0004] In an exemplary embodiment, a document digitizing system may include one or more processors. The system may include one or more memories coupled to the one or more processors. The one or more memories may include instructions operable to be executed by the one or more processors. The one or more...

Application Information

IPC(8): G06K9/34, G06K9/00, G06K9/32, G06V10/44
CPC: G06K9/344, G06K9/00449, G06K2209/01, G06K9/00463, G06K9/3275, G06V30/412, G06V30/414, G06V20/62, G06V10/44, G06V30/153
Inventor: SLATTERY, DOUGLAS
Owner: CAPITAL ONE SERVICES