Image recognition and information extraction method and device for standardized document

Document image recognition technology, applied in character and pattern recognition, instruments, and computer parts. It addresses matching rules that fail easily, difficulty extracting correct results, and poor generality, and it aims for convenient use, higher accuracy, and a higher recognition rate.

Pending Publication Date: 2020-06-30
SHANGHAI HEHE INFORMATION TECH DEV +3
AI Technical Summary

Problems solved by technology

The existing keyword-matching approach requires rules to be rebuilt for each type of document and therefore generalizes poorly. When keywords are misrecognized, the matching rules tend to fail, and correct results are hard to extract from skewed documents.

Detailed Description of Embodiments

[0030] Referring to Figure 1, the method for image recognition and information extraction of standardized documents proposed by this application includes the following steps.

[0031] Step S10: Construct a template of the standardized document.

[0032] Step S20: Match the standardized document to be recognized to the most suitable template based on text and images, and perform a perspective transformation on the image of the standardized document to be recognized.

[0033] Step S30: Perform offset correction for the text in the standardized document image to be recognized.

[0034] Step S40: Extract key field information from the standardized document image to be recognized.

[0035] Step S50: Perform post-processing on key fields to obtain a final output result.
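
The perspective transformation in step S20 can be illustrated with a small, dependency-free sketch. It assumes that four corner points of the photographed document have already been located and paired with the corners of the matched template; how the application obtains these correspondences is not stated in this excerpt. The function names (`solve_linear`, `homography`, `warp_point`) and the sample coordinates are hypothetical.

```python
from typing import List, Sequence, Tuple

Point = Tuple[float, float]


def solve_linear(a: List[List[float]], b: List[float]) -> List[float]:
    """Solve a @ x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    # Build the augmented matrix [a | b] so the inputs are not modified.
    m = [list(map(float, row)) + [float(b[i])] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("degenerate point configuration")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    # Back substitution.
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - s) / m[r][r]
    return x


def homography(src: Sequence[Point], dst: Sequence[Point]) -> List[List[float]]:
    """Return the 3x3 matrix mapping four src points to four dst points.

    The bottom-right entry is fixed to 1, which leaves eight unknowns
    and two linear equations per point pair.
    """
    if len(src) != 4 or len(dst) != 4:
        raise ValueError("exactly four point pairs are required")
    a: List[List[float]] = []
    b: List[float] = []
    for (x, y), (u, v) in zip(src, dst):
        a.append([x, y, 1, 0, 0, 0, -u * x, -u * y])
        b.append(u)
        a.append([0, 0, 0, x, y, 1, -v * x, -v * y])
        b.append(v)
    h = solve_linear(a, b)
    return [h[0:3], h[3:6], [h[6], h[7], 1.0]]


def warp_point(h: List[List[float]], p: Point) -> Point:
    """Apply the homography to one point (with the projective division)."""
    x, y = p
    w = h[2][0] * x + h[2][1] * y + h[2][2]
    return (
        (h[0][0] * x + h[0][1] * y + h[0][2]) / w,
        (h[1][0] * x + h[1][1] * y + h[1][2]) / w,
    )


# Hypothetical corners of a skewed photo, paired with a 300x200 template.
photo_corners = [(12, 8), (310, 25), (298, 215), (5, 190)]
template_corners = [(0, 0), (300, 0), (300, 200), (0, 200)]
H = homography(photo_corners, template_corners)
```

Once `H` is known, any pixel position in the photo can be mapped into template coordinates with `warp_point(H, p)`, so field regions defined on the template line up with the photographed document. An image library would normally resample the whole image with the same matrix.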

[0036] In step S10, a clear image is first selected for each type of standardized document and annotated, and the annotations are then used to generate a template file. For example, there ar...

Abstract

The invention discloses an image recognition and information extraction method for a standardized document, where a standardized document is a document in a fixed format. The method comprises the following steps: S10, constructing a template of the standardized document; S20, matching the standardized document to be recognized to the most suitable template based on text and images, and performing a perspective transformation on the image of the standardized document to be recognized; S30, performing offset correction on the text in the standardized document image to be recognized; S40, extracting key field information from the standardized document image to be recognized; S50, post-processing the key fields to obtain the final output result. The method achieves template matching through text and image matching, and improves the accuracy of image recognition and information extraction through perspective transformation, text offset correction, and post-processing.
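
The text-based part of template matching can be sketched as follows. This is only one assumed way to do it: each template is represented by a list of anchor keywords, and the OCR output is compared against them with fuzzy string similarity so that a misrecognized character does not break the match. The template names, keywords, the `min_score` threshold, and the function names are hypothetical and do not come from the application, which also uses image matching that is not shown here.

```python
from difflib import SequenceMatcher
from typing import Dict, List, Optional


def keyword_score(keyword: str, ocr_lines: List[str]) -> float:
    """Best similarity in [0, 1] between a keyword and any OCR line."""
    return max(
        (
            SequenceMatcher(None, keyword.lower(), line.lower()).ratio()
            for line in ocr_lines
        ),
        default=0.0,
    )


def select_template(
    ocr_lines: List[str],
    templates: Dict[str, List[str]],
    min_score: float = 0.6,  # assumed threshold, would need tuning
) -> Optional[str]:
    """Return the template whose keywords best match the OCR text.

    Returns None when no template reaches min_score on average.
    """
    best_name: Optional[str] = None
    best_score = 0.0
    for name, keywords in templates.items():
        if not keywords:
            continue
        score = sum(keyword_score(k, ocr_lines) for k in keywords) / len(keywords)
        if score > best_score:
            best_name, best_score = name, score
    return best_name if best_score >= min_score else None


# Hypothetical templates and OCR output containing recognition errors.
templates = {
    "invoice": ["Invoice No", "Total Amount", "Tax ID"],
    "id_card": ["Name", "Date of Birth", "ID Number"],
}
ocr_lines = ["lnvoice N0", "Tota1 Amount", "Tax 1D"]
chosen = select_template(ocr_lines, templates)
```

An exact keyword lookup would find none of the three invoice keywords in `ocr_lines`, which is the failure mode attributed to rule-based keyword matching; averaging fuzzy scores over several keywords tolerates such errors.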

Description

Technical field

[0001] This application relates to image processing and OCR (Optical Character Recognition) technology, referred to collectively as image recognition technology, and in particular to image recognition and key information extraction for standardized documents.

Background

[0002] At present, image recognition and key information extraction for documents mainly rely on keyword matching, with manually constructed rules used to extract the key information. This approach requires rules to be rebuilt for each type of document and therefore generalizes poorly. When keywords are misrecognized, the matching rules tend to fail, and correct results are hard to extract from skewed documents.

Summary of the invention

[0003] The technical problem to be solved in this application is to propose a method for image recognition and information extraction of standardized documents based on templ...

Application Information

Patent Type & Authority: Applications (China)
IPC(8): G06K9/32, G06K9/62
CPC: G06V30/1478, G06V10/751
Inventor: 段晗敏张彬李平新丁凯龙腾
Owner: SHANGHAI HEHE INFORMATION TECH DEV