Method and system for outputting text line content after document image checkbox state recognition

A checkbox state recognition technology for document images, applied in the field of image processing. It addresses the complexity of document recognition caused by complex collection methods and complex image content, and reduces both that complexity and the difficulty of implementation.

Active Publication Date: 2020-01-07
BEIJING YIDAO BOSHI TECH

AI Technical Summary

Problems solved by technology

[0005] 2. Complicated collection methods: With the popularization of collection devices such as mobile phones, tablets, cameras, and scanners, and of mobile phones in particular, document image acquisition has shifted from traditional scanning to photographing. At present, more than 90% of document images are photographed rather than scanned. Because photographed images have complex backgrounds, they are not as good as scanned images in various conditions such as


Examples

Embodiment

[0061] A method for outputting the content of a text line after recognizing the state of a check box in a document image. The overall flow is shown in Figure 2.

[0062] The specific steps are as follows:

[0063] Module 1: Image Preprocessing

[0064] 1. Image preprocessing: The image size is adjusted to meet the boundary conditions of the convolutional neural network.
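
As a rough illustration of the size adjustment in step 1, the sketch below computes a target size whose sides are multiples of a network stride and whose longer side stays under a cap. The stride of 32, the 1280-pixel cap, and the function name `fit_to_network` are assumptions for illustration; the source only states that the size must meet the boundary conditions of the convolutional neural network.

```python
def fit_to_network(width, height, stride=32, max_side=1280):
    """Return (new_width, new_height) for a hypothetical FCN input.

    Assumptions (not from the source): each side must be a multiple of
    `stride`, and the longer side is capped at `max_side`.
    """
    if width <= 0 or height <= 0:
        raise ValueError("image dimensions must be positive")

    # Shrink (never enlarge) so the longer side fits under the cap.
    scale = min(1.0, max_side / max(width, height))

    def snap(side):
        # Round to the nearest multiple of stride, never below one stride.
        return max(stride, int(round(side * scale / stride)) * stride)

    return snap(width), snap(height)
```

The returned size would then be passed to whatever image library performs the actual resampling; that step is left out here so the sketch has no external dependencies.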

[0065] Module 2: Printed text line localization and recognition

[0066] 2. Use a fully convolutional network (FCN) to predict the text positions and text angles in the input image, then average the main text directions of all predicted text regions to obtain the image direction angle. All printed characters are targets of this localization. A check mark within a printed text line is treated as a character, whether or not it is selected (the same applies to the recognition that follows: it is first recognized as a character, although √ and × may have a large deviation and cause...
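
The averaging of per-region text directions into one image direction angle can be sketched as follows. This is a minimal sketch that uses a circular mean, which is an assumption: the source only says the main directions are averaged, and a circular mean is chosen here so that angles near the 0/360 wrap-around do not cancel incorrectly. The function name `image_direction_angle` is hypothetical.

```python
import math


def image_direction_angle(region_angles_deg):
    """Average per-region text angles (degrees) into one image angle.

    Uses a circular mean (an assumption, not stated in the source).
    The result lies between -180 and 180 degrees.
    """
    if not region_angles_deg:
        raise ValueError("at least one text region angle is required")
    # Sum the unit vectors of all angles, then take the direction of the sum.
    sin_sum = sum(math.sin(math.radians(a)) for a in region_angles_deg)
    cos_sum = sum(math.cos(math.radians(a)) for a in region_angles_deg)
    return math.degrees(math.atan2(sin_sum, cos_sum))
```

A plain arithmetic mean of 350 and 10 degrees would give 180, pointing the wrong way; the circular mean gives a value near 0 instead.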

Abstract

The invention discloses a method and system for outputting text line content after document image checkbox state recognition, and relates to the field of image processing. The method comprises: inputting and preprocessing an original image; obtaining a direction angle of the original image; performing direction correction on the original image and the textboxes according to that direction angle, and outputting the direction-corrected original image and textboxes; identifying the text line content in the direction-corrected textboxes, and locating checkbox areas through the textboxes; performing boundary expansion on each textbox containing a checkbox area to obtain a mark search box, cropping out the mark search box image, and identifying the state of the checkbox from that image; and correcting the identified text line content based on the identified checkbox state, and outputting the corrected text line content. The method can greatly reduce the complexity of document recognition and the difficulty of engineering implementation.
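
The last two steps of the abstract, boundary expansion and text correction, can be sketched as follows. Everything specific here is an assumption for illustration: the function names `expand_box` and `correct_line`, the margin rule (a fraction of the box height), and the glyphs used for the placeholder and for the checked and unchecked states. The source does not specify any of these.

```python
def expand_box(box, image_size, margin_ratio=0.5):
    """Grow a text box (left, top, right, bottom) into a mark search box.

    The margin is `margin_ratio` times the box height on every side
    (an assumed rule), clamped to the image bounds.
    """
    left, top, right, bottom = box
    width, height = image_size
    margin = int((bottom - top) * margin_ratio)
    return (
        max(0, left - margin),
        max(0, top - margin),
        min(width, right + margin),
        min(height, bottom + margin),
    )


def correct_line(text, states, placeholder="□", checked="☑", unchecked="☐"):
    """Replace each checkbox placeholder with its recognized state.

    `states` holds one boolean per placeholder, in reading order.
    """
    parts = text.split(placeholder)
    if len(parts) - 1 != len(states):
        raise ValueError("number of states must match number of checkboxes")
    out = [parts[0]]
    for state, tail in zip(states, parts[1:]):
        out.append(checked if state else unchecked)
        out.append(tail)
    return "".join(out)
```

In this sketch the mark search box would be cropped from the direction-corrected image and passed to a separate state classifier, whose boolean outputs feed `correct_line`.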

Description

Technical field

[0001] The invention relates to the field of image processing, in particular to a method and a system for outputting the content of a text line after identifying the state of a check box in a document image.

Background art

[0002] In all walks of life, many paper documents still need to be saved, processed, and retrieved, especially in financial industries such as banking, securities, insurance, mutual funds, finance, and taxation. In the past, these paper documents were generally digitized by manual entry. With the spread of OCR technology, many industries have gradually adopted OCR in place of manual entry, which has greatly improved work efficiency. Bank and insurance paper documents contain a large amount of check-box information, which must be filled in according to the fixed format of the document. However, the extraction of this checked information is currently a diff...

Application Information

IPC(8): G06K9/00, G06K9/34, G06K9/62, G06N3/08
CPC: G06N3/08, G06V30/40, G06V30/413, G06V30/153, G06V30/10, G06F18/214
Inventors: 朱军民, 王勇, 康铁钢
Owner: BEIJING YIDAO BOSHI TECH