Certificate image text recognition method and system based on deep learning

A deep learning based text recognition technique for document images. It addresses irregular text distribution, heavy background noise around text, and the unsatisfactory detection rate of OCR technology.

Inactive Publication Date: 2019-10-22
JINAN INSPUR HIGH TECH TECH DEV CO LTD
4 Cites, 20 Cited by

AI Technical Summary

Problems solved by technology

In natural scenes, document images suffer from heavy background noise around text, irregular text distribution, and the influence of natural light sources. As a result, the detection rate of OCR technology in actual natural scenes is not ideal.


Examples

Embodiment 1

[0058] The deep learning based document image text recognition method of the present invention comprises the following steps:

[0059] S100. Preprocess the document image to remove noise and obtain a preprocessed image;

[0060] S200. Perform text detection on the preprocessed image using the CTPN algorithm to obtain the text areas of the document image;

[0061] S300. Because the relative positions of the characters in a document image are fixed, build an image position template on that principle, and filter the text areas of the document image through the template to obtain the target text areas of the document image;

[0062] S400. Reconstruct the VGG16 model according to the categories of Chinese characters to obtain a text recognition model, then train it with TensorFlow Slim (TF-Slim), using the target text areas of the document image as input, to obtain a trained text recognition model;

...
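The template screening in step S300 can be sketched as follows. This is a minimal illustration, not the patent's implementation: the source only states that a position template is used to filter detected text areas, so the overlap measure (intersection over union), the threshold `min_iou`, the normalized box format, and the field names in `TEMPLATE` are all assumptions made for the example.

```python
from typing import Dict, List, Tuple

# Assumed box format: (x_min, y_min, x_max, y_max), normalized to [0, 1].
Box = Tuple[float, float, float, float]


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    ix = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    iy = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = ix * iy
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def filter_by_template(
    detected: List[Box], template: Dict[str, Box], min_iou: float = 0.5
) -> Dict[str, Box]:
    """For each template field, keep the best-overlapping detected box, if any."""
    targets: Dict[str, Box] = {}
    for field, slot in template.items():
        best = max(detected, key=lambda box: iou(box, slot), default=None)
        if best is not None and iou(best, slot) >= min_iou:
            targets[field] = best
    return targets


# Hypothetical template: where each field is expected on this document type.
TEMPLATE: Dict[str, Box] = {
    "name": (0.10, 0.10, 0.50, 0.20),
    "id_number": (0.10, 0.80, 0.90, 0.90),
}

# Hypothetical detector output (for example, boxes from a CTPN-style detector).
detected: List[Box] = [
    (0.11, 0.11, 0.49, 0.19),  # overlaps the "name" slot
    (0.60, 0.40, 0.95, 0.60),  # background text, matches no slot
    (0.12, 0.79, 0.88, 0.91),  # overlaps the "id_number" slot
]

targets = filter_by_template(detected, TEMPLATE)
```

Matching by overlap rather than by exact coordinates tolerates small shifts between the captured image and the template, which seems a reasonable choice for photographs taken in natural scenes.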

Embodiment 2

[0080] The deep learning based document image text recognition system of the present invention includes a preprocessing module, a text detection module, a text area module, a model training module, and a testing module. The preprocessing module preprocesses the document image to remove noise and outputs the preprocessed image; the text detection module performs text detection on the preprocessed image using the CTPN algorithm and outputs the text areas of the document image; the text area module builds an image position template on the principle that the relative positions of the characters in the document image are fixed, filters the text areas of the document image through the template, and outputs the target text areas of the document image; the model training module reconstructs the VGG16 model according to the categories of Chinese characters to obtain a text recognition model, takes the target text areas of the document image as input, and trains the text recognition model with TensorFlow Slim (TF-Slim) ...
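One way to picture how these modules hand data to each other is the sketch below. It is an assumption-laden illustration: the class name `RecognitionSystem`, the type aliases, and every stub stage are hypothetical stand-ins (a fixed threshold instead of real denoising, a fixed box list instead of CTPN, a placeholder string instead of a trained VGG16 model). Model training is treated as an offline step and left out of the inference path.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple

Image = List[List[int]]          # grayscale pixels, a stand-in for a real image type
Box = Tuple[int, int, int, int]  # (x_min, y_min, x_max, y_max) in pixels


@dataclass
class RecognitionSystem:
    """Chains the modules in the order the description lists them."""
    preprocess: Callable[[Image], Image]              # preprocessing module
    detect_text: Callable[[Image], List[Box]]         # text detection module
    select_targets: Callable[[List[Box]], List[Box]]  # text area module
    recognize: Callable[[Image, Box], str]            # testing module (trained model)

    def run(self, image: Image) -> List[str]:
        clean = self.preprocess(image)
        boxes = self.detect_text(clean)
        targets = self.select_targets(boxes)
        return [self.recognize(clean, box) for box in targets]


# Stub stages, for illustration only.
system = RecognitionSystem(
    preprocess=lambda img: [[0 if p < 128 else 255 for p in row] for row in img],
    detect_text=lambda img: [(0, 0, 1, 1), (1, 1, 2, 2)],
    select_targets=lambda boxes: [b for b in boxes if b[1] == 0],
    recognize=lambda img, box: f"{box}:{img[box[1]][box[0]]}",
)

result = system.run([[10, 200], [130, 90]])
```

Keeping each module behind a plain callable makes it possible to swap a stub for a real detector or recognizer without touching the rest of the chain.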

Abstract

The invention discloses a certificate image text recognition method and system based on deep learning. It belongs to the field of certificate image recognition and aims to solve the technical problem of how to effectively recognize text in a certificate image in a natural scene. The method comprises the following steps: preprocessing a certificate image to remove noise and obtain a preprocessed image; performing text detection on the preprocessed image based on the CTPN algorithm to obtain the text area of the certificate image; screening the text area of the certificate image through the image position template to obtain a target text area of the certificate image; taking the target text area of the certificate image as input and training the character recognition model with TensorFlow Slim to obtain a trained character recognition model; and recognizing the characters to be recognized with the trained character recognition model. The system comprises a preprocessing module, a text detection module, a text region module, a model training module, and a test module.

Description

Technical field

[0001] The invention relates to the field of document image recognition, in particular to a method and system for document image text recognition based on deep learning.

Background art

[0002] With the rise of artificial intelligence, image recognition technology is gradually being applied to security, military, medical, intelligent transportation, and other fields, and technologies such as face recognition and fingerprint recognition are increasingly used in public security, finance, aerospace, and other security fields. In the military field, image recognition is mainly used to detect and identify targets, with automatic image recognition used to identify and strike enemy targets; in the medical field, image recognition technology can be used for various kinds of medical image analysis and diagnosis. On the one hand, it can greatly reduce the cost of medical care, and on the other hand, it can also help to improve the quality and efficiency o...

Application Information

Patent Type & Authority: Applications (China)
IPC(8): G06K9/32, G06K9/66, G06K9/68, G06N3/04, G06N3/08
CPC: G06N3/08, G06V20/63, G06V30/1478, G06V30/194, G06V30/248, G06V30/10, G06N3/044
Inventor: 尹青山, 李锐, 于治楼, 王相成, 宗云兵
Owner: JINAN INSPUR HIGH TECH TECH DEV CO LTD