Discrete picture file information extraction system and method based on deep learning

A technology for extracting information from document images, applied in character and pattern recognition, instruments, computer parts, and related fields. It addresses insufficient robustness, impracticality, and the difficulty of distinguishing printed from handwritten text, with the effect of improving accuracy, reducing information loss, and improving calculation speed and recognition performance.

Inactive Publication Date: 2019-11-01

朱跃飞 +1

5 Cites, 29 Cited by


AI Technical Summary

Problems solved by technology

[0014] However, the FOTS model has several problems when extracting sparse text: 1. The algorithm relies mainly on a CNN for detection, and the LSTM encoder is used only for recognition, so when the training set is small it is difficult to distinguish printed text from handwritten text on the same line. 2. FOTS is designed for text detection and text recognition in natural pictures rather than scanned documents; because it uses the EAST algorithm, which can detect text at any angle, its accuracy on long text decreases. 3. Because its recognition algorithm uses LSTM + CTC, the recognition rate drops on noisy pictures, so it is not robust enough.
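To make point 3 more concrete, here is a minimal sketch of greedy CTC decoding (take the best label per frame, collapse repeats, drop blanks). It is an illustration only, not the FOTS implementation: the alphabet, the `one_hot` helper, and both frame sequences are made up for this example.

```python
def ctc_greedy_decode(frame_scores, alphabet, blank=0):
    """Greedy CTC decoding: best label per frame, collapse repeats, drop blanks."""
    best_path = [max(range(len(f)), key=f.__getitem__) for f in frame_scores]
    decoded = []
    previous = None
    for index in best_path:
        # Emit a character only when the label changes and is not the blank.
        if index != blank and index != previous:
            decoded.append(alphabet[index])
        previous = index
    return "".join(decoded)


def one_hot(index, size):
    """Hypothetical helper: a score vector that puts all weight on one label."""
    return [1.0 if i == index else 0.0 for i in range(size)]


# Index 0 is the CTC blank; the remaining entries are the (made-up) alphabet.
ALPHABET = ["-", "c", "a", "t"]

clean_path = [1, 1, 0, 2, 2, 0, 3]   # c c - a a - t
noisy_path = [1, 1, 3, 2, 2, 0, 3]   # one blank frame flipped to 't' by noise

clean_text = ctc_greedy_decode([one_hot(i, 4) for i in clean_path], ALPHABET)
noisy_text = ctc_greedy_decode([one_hot(i, 4) for i in noisy_path], ALPHABET)
print(clean_text, noisy_text)
```

In this toy case a single corrupted frame is enough to insert a spurious character, which is one way per-frame noise can lower the recognition rate of a CTC-based recognizer.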

Existing OCR technology mainly has the following problems: 1. Users need different information from different documents, and a non-customized, general-purpose system cannot accurately extract the required information. For example, for employee information forms, users may need to extract the employer, the employee's name, salary, position, and so on. For invoices, users may need to extract the issuing unit and its address, the product name, the price, and the tax amount.

2. Current OCR technology does not have the ability to extract handwritten characters. Current handwriting recognition based on deep learning can only deal with simple text, such as a line or a paragraph of dense handwritten text, whereas documents in real environments often contain complex tables, handwritten fill-in-the-blank entries, signatures, and so on. Handwritten information is generally important information that needs to be extracted, yet there is currently no mature solution for locating, classifying, and recognizing it.

3. In most current OCR algorithms, text detection and text recognition are separate stages, so detection errors and recognition errors compound, which lowers the overall recognition rate.

4. Current end-to-end technical solutions are limited to character extraction or ordinary handwriting recognition, and require that the input provided by the user has already been pre-processed; that is, what the user sends to the system must be an image of exactly the information to be extracted. For example, if a user wants to import salary data from 5,000 scanned employee forms from different companies into a database, they must first crop the salary regions out of the 5,000 scanned pictures and then send those crops to the OCR system, which is clearly unrealistic.
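Problem 3 in the list above can be illustrated with a short calculation. The sketch below assumes, purely for illustration, that detection and recognition errors are independent, so the overall rate is the product of the two stage rates; the 0.9 figures are hypothetical, not measurements from the patent.

```python
def cascaded_accuracy(detection_accuracy, recognition_accuracy):
    """Overall accuracy of a two-stage pipeline, assuming independent stage errors."""
    return detection_accuracy * recognition_accuracy


def all_fields_correct(per_field_accuracy, field_count):
    """Probability that every field on a form is right, assuming independent fields."""
    return per_field_accuracy ** field_count


# Hypothetical stage accuracies chosen only to show the compounding effect.
overall = cascaded_accuracy(0.9, 0.9)
whole_form = all_fields_correct(overall, 4)
print(f"per-field: {overall:.2f}, four-field form: {whole_form:.2f}")
```

Under these assumptions two stages that are each 90% accurate give about 81% per field, and the chance of getting a four-field form entirely right is lower still, which is the motivation for sharing computation and training the stages jointly.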


Embodiment Construction

[0065] The technical solutions in the embodiments of the present application will be clearly and completely described below in conjunction with the accompanying drawings in the embodiments of the present application. Obviously, the described embodiments are only examples and not intended to limit the present application.

[0066] An embodiment of the present invention provides a discrete image file information extraction system based on deep learning which, as shown in Figure 2, includes: preprocessing module 1, end-to-end OCR (Optical Character Recognition) model 2, and information extraction model 3; wherein,

[0067] Preprocessing module 1, used to preprocess the input image;

[0068] The end-to-end OCR model 2 is used to input the preprocessed image into the first layer of the deep residual network (ResNet) and pass it layer by layer to the first bottleneck module of the deep residual network; from the first bottleneck module of the deep residual network...
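The three components named in [0066] can be pictured as a simple chain. The following is a rough sketch of that data flow only: every function body is a stand-in (the OCR step returns canned regions rather than running a network), and the names `TextRegion`, `preprocess`, `run_ocr`, `extract_information`, and the sample form values are hypothetical, not taken from the patent.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple

Image = List[List[int]]  # grayscale pixels, row-major


@dataclass
class TextRegion:
    text: str
    box: Tuple[int, int, int, int]  # (x, y, width, height)
    handwritten: bool


def preprocess(image: Image) -> Image:
    """Stand-in for preprocessing module 1: clamp pixels to the 0-255 range."""
    return [[min(255, max(0, p)) for p in row] for row in image]


def run_ocr(image: Image) -> List[TextRegion]:
    """Stand-in for end-to-end OCR model 2: returns canned regions, not real OCR."""
    return [
        TextRegion("Name:", (10, 10, 60, 20), handwritten=False),
        TextRegion("A. Example", (80, 10, 90, 20), handwritten=True),
        TextRegion("Salary:", (10, 40, 60, 20), handwritten=False),
        TextRegion("8000", (80, 40, 50, 20), handwritten=True),
    ]


def extract_information(regions: List[TextRegion], wanted: List[str]) -> Dict[str, str]:
    """Stand-in for information extraction model 3: pair each wanted printed
    label with the nearest handwritten region to its right on the same row."""
    result = {}
    for label in regions:
        key = label.text.rstrip(":")
        if label.handwritten or key not in wanted:
            continue
        same_row = [r for r in regions
                    if r.handwritten and r.box[1] == label.box[1] and r.box[0] > label.box[0]]
        if same_row:
            result[key] = min(same_row, key=lambda r: r.box[0]).text
    return result


def extract_from_image(image: Image, wanted: List[str]) -> Dict[str, str]:
    """Chain the three stages: preprocess, OCR, then information extraction."""
    return extract_information(run_ocr(preprocess(image)), wanted)


fields = extract_from_image([[0, 300], [-5, 128]], ["Salary"])
print(fields)
```

The point of the chain is that the user supplies whole scanned pages plus a list of wanted fields, rather than pre-cropped snippets.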


Abstract

The invention discloses a discrete picture file information extraction system and method based on deep learning. An end-to-end OCR model combines text detection with text recognition so that the image processing layers of a convolutional neural network are shared, which improves calculation speed and recognition performance. For text detection, a bidirectional LSTM is used, and text areas are accurately segmented according to the file layout when characters are recognized; this addresses the problems of existing OCR technology in which noise is recognized as characters, charts produce garbled characters, and incorrect line breaks are generated. For text recognition, a ResNet is combined with a Transformer encoder-decoder so that printed and handwritten text can be accurately recognized. Combining the end-to-end OCR model with an information extraction model reduces information loss and improves the accuracy of information extraction. A Bayesian optimization algorithm automates the training and parameter tuning of the information extraction model, so that a user can customize the desired information extraction model without understanding machine learning.
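The abstract does not spell out how the Bayesian optimization is set up. As a generic, hedged sketch of the idea, the code below tunes one normalized hyperparameter with a Gaussian-process surrogate (RBF kernel, zero prior mean) and the expected-improvement rule over a fixed grid. The objective `validation_score`, the grid, the kernel length scale, and the round count are all assumptions made for this example, not details from the patent.

```python
import math


def rbf(a, b, length=0.2):
    """Squared-exponential kernel between two scalar inputs."""
    return math.exp(-0.5 * ((a - b) / length) ** 2)


def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(M[r][c]))
        M[c], M[p] = M[p], M[c]
        for r in range(c + 1, n):
            f = M[r][c] / M[c][c]
            for k in range(c, n + 1):
                M[r][k] -= f * M[c][k]
    x = [0.0] * n
    for r in reversed(range(n)):
        x[r] = (M[r][n] - sum(M[r][k] * x[k] for k in range(r + 1, n))) / M[r][r]
    return x


def gp_posterior(xs, ys, x, noise=1e-4):
    """Posterior mean and standard deviation of a zero-mean GP at point x."""
    K = [[rbf(a, b) + (noise if i == j else 0.0) for j, b in enumerate(xs)]
         for i, a in enumerate(xs)]
    k = [rbf(a, x) for a in xs]
    mean = sum(ki * ai for ki, ai in zip(k, solve(K, ys)))
    var = rbf(x, x) - sum(ki * vi for ki, vi in zip(k, solve(K, k)))
    return mean, math.sqrt(max(var, 1e-12))


def expected_improvement(mean, std, best):
    """Expected improvement over the best score seen so far (maximization)."""
    z = (mean - best) / std
    cdf = 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))
    pdf = math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)
    return (mean - best) * cdf + std * pdf


def bayes_optimize(objective, candidates, init, rounds=10):
    """Evaluate the candidate with the highest expected improvement each round."""
    xs = list(init)
    ys = [objective(x) for x in xs]
    for _ in range(rounds):
        remaining = [c for c in candidates if c not in xs]
        if not remaining:
            break
        best = max(ys)
        nxt = max(remaining,
                  key=lambda c: expected_improvement(*gp_posterior(xs, ys, c), best))
        xs.append(nxt)
        ys.append(objective(nxt))
    i = max(range(len(ys)), key=ys.__getitem__)
    return xs[i], ys[i]


def validation_score(x):
    # Hypothetical stand-in for "train the extraction model with setting x and score it".
    return -(x - 0.3) ** 2


grid = [i / 20 for i in range(21)]
best_x, best_y = bayes_optimize(validation_score, grid, init=[grid[0], grid[-1]])
print(best_x, best_y)
```

The appeal for non-expert users is that each round chooses the next setting automatically from past scores, so no manual tuning is needed; a production system would more likely rely on an established optimization library than on a hand-written surrogate like this one.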

Description

Technical field

[0001] The invention relates to the technical fields of artificial intelligence, optical recognition, and machine reading, and in particular to a system and method for extracting discrete image file information based on deep learning.

Background technique

[0002] Optical Character Recognition (OCR) refers to the process of analyzing and recognizing image files of text materials to obtain text and layout information; that is, recognizing the text in an image and returning it in the form of text.

[0003] As shown in Figure 1, existing OCR technology mainly includes image preprocessing, text detection, and text recognition. Image preprocessing usually corrects imaging problems in the image. Common preprocessing steps include geometric transformation (perspective, distortion, rotation, etc.), distortion correction, blur removal, image enhancement, and light correction. Text detection is to detect the location, range and layout of text, a...


Application Information


Patent Type & Authority: Applications (China)

IPC(8): G06K9/00, G06K9/32, G06K9/62

CPC: G06V30/413, G06V20/62, G06V30/10, G06F18/214

Inventor: 万波, 朱跃飞, 屈晓磊

Owner: 朱跃飞
