A method for complex text recognition based on deep learning

A text recognition and deep learning technology, applied in the field of image recognition, which solves problems such as the need for large amounts of manual labeling and the loss of information, and achieves the effect of avoiding information loss during recognition.

Active Publication Date: 2019-01-18
成都数联铭品科技有限公司

AI Technical Summary

Problems solved by technology

By analyzing the causes of the complexity of the text, a random sample generator is designed that automatically produces a large number of training samples, covering the various ranges of noise and distortion features, for use by the deep neural network. This solves the prior-art problem that recognizing text with a deep neural network requires a large amount of manual labeling, and thus significantly saves labor costs. The invention also uses a state-of-the-art deep neural network classifier to recognize pictures automatically while the training set retains the complexity of the original pictures, such as noise and distortion; this solves the problem of information loss caused by the denoising required in prior-art image and text recognition, and improves the recognition accuracy.



Examples


Embodiment 1

[0086] As shown in Figure 8, first prepare a sample set of the same type as the picture to be recognized: for example, select 500 sample pictures whose noise and fonts are similar to those of the picture to be recognized shown in Figure 9, and label them manually; 150 of these samples are taken as the development set and the other 350 as the first training sample set. The character strings in the pictures are segmented out, and each string is split into sub-pictures that contain only a single character. The font of the pictures to be recognized in the first training sample set is analyzed and the closest font, Times New Roman, is selected; Times New Roman is therefore chosen as the basic font library of the random sample generator. If the characters contained in the pictures to be recognized are only digits, the Times New Roman digit set is chosen as the basis for sample generation. Then, according to the noise and distortion features contained in the manually labeled samples (with Figure ...
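The sample-generation step above can be illustrated with a minimal sketch, assuming Pillow and NumPy are available; the function name, the font path and the noise and distortion parameters below are illustrative placeholders, not values taken from the patent. The sketch renders single characters in the chosen base font, then adds a random shear and Gaussian noise so that the generated second training sample set resembles the manually labeled first set.

import random
import numpy as np
from PIL import Image, ImageDraw, ImageFont

def generate_sample(char, font_path="times.ttf", size=32,
                    noise_sigma=12.0, max_shear=0.2):
    """Render one character in the base font, then add shear distortion and Gaussian noise."""
    img = Image.new("L", (size, size), color=255)
    ImageDraw.Draw(img).text((4, 2), char, fill=0,
                             font=ImageFont.truetype(font_path, int(size * 0.8)))
    # Random shear imitates the distortion features observed in the labeled samples.
    shear = random.uniform(-max_shear, max_shear)
    img = img.transform((size, size), Image.AFFINE, (1, shear, 0, 0, 1, 0), fillcolor=255)
    # Additive Gaussian noise imitates the noise model of the pictures to be recognized.
    arr = np.asarray(img, dtype=np.float32) + np.random.normal(0.0, noise_sigma, (size, size))
    return np.clip(arr, 0, 255).astype(np.uint8)

# Digits-only case from the embodiment: generate a large second training sample set.
second_training_set = [(generate_sample(d), d) for d in "0123456789" for _ in range(1000)]

Shuffling these generated samples together with the 350 manually labeled samples gives the mixed training input described in the abstract.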

Embodiment 2

[0089] As shown in the process of Figure 11, when the character string has obvious characteristics of a certain language model, the recognition result of the deep neural network in step (2-5) is optimized through the language model, and the language-model-optimized result is output as the final recognition result. For example, the target image to be recognized is shown in Figure 12; the character string identified by the deep neural network is "Zhang San [eats/steams] rice", where "Zhang San" and "rice" are identified with a probability of 100%, while the middle character is recognized as "eats" with a probability of 50% and as "steams" with a probability of 50%. In this case, according to the subject-verb-object structure of the language model, the subject "Zhang San" and the object "rice" have already been determined, so the middle character, serving as the predicate, should most probably be the verb "eats", while "steams", being a noun, obviously cannot appear in the position of the predicate verb, so ...
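The rescoring idea can be sketched as follows, using the probabilities from the example above. The toy scoring function and the assumption that the language model is queried once per candidate string are illustrative, not the patent's actual model; the Chinese characters 张三 / 吃 / 汽 / 饭 correspond to "Zhang San" / "eat" / "steam" / "rice".

def language_model_score(sentence):
    """Toy stand-in for the language model: the character between the determined
    subject and object must be usable as a predicate verb."""
    verbs = {"吃"}                        # assumed verb vocabulary
    return 1.0 if sentence[2] in verbs else 1e-6

def rescore(prefix, candidates, suffix):
    """Combine each candidate's network probability with the language-model score
    of the resulting full string, and return the best string."""
    best_char, _ = max(candidates.items(),
                       key=lambda kv: kv[1] * language_model_score(prefix + kv[0] + suffix))
    return prefix + best_char + suffix

# Network output: "张三" and "饭" at 100%, middle character 50% "吃" / 50% "汽".
print(rescore("张三", {"吃": 0.5, "汽": 0.5}, "饭"))    # -> 张三吃饭 ("Zhang San eats rice")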

Embodiment 3

[0092] When the string to be recognized matches a specific language template, as shown in Figure 13, language templates can be used to optimize the recognition results of the neural network. For example, the recognition results for Figure 13 are "foolish", "valley", "moving" and "mountain": the first, third and fourth characters are recognized as "foolish", "moving" and "mountain" with the highest probability (for example, 80%), while the second character is recognized as "valley" with a probability of only 60%. In this case, according to the fixed language template of the idiom, the final recognition result is corrected to "Yugong Yishan" (the foolish old man moves the mountain). A result corrected in this way conforms better to correct language habits, and the recognition result is more accurate and reasonable.
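A minimal sketch of this template-based correction follows, assuming the network also reports lower-probability candidates for each position; the idiom list, the confidence threshold and the candidate probabilities are illustrative assumptions, not values from the patent.

IDIOMS = {"愚公移山"}      # "Yugong Yishan", the four-character idiom used as a fixed template

def correct_with_idioms(candidates, confident=0.8):
    """candidates[i] maps each candidate character at position i to its network probability."""
    top = "".join(max(c, key=c.get) for c in candidates)
    for idiom in IDIOMS:
        if len(idiom) != len(candidates):
            continue
        # Accept the idiom if it agrees with every high-confidence position and its
        # remaining characters are at least among the candidates for the other positions.
        if all((candidates[i].get(top[i], 0.0) < confident or idiom[i] == top[i])
               and idiom[i] in candidates[i] for i in range(len(idiom))):
            return idiom
    return top

# First, third and fourth characters recognized at 80%; the second is "谷" (valley) at 60%,
# with "公" assumed to appear as a lower-probability candidate.
cands = [{"愚": 0.8}, {"谷": 0.6, "公": 0.4}, {"移": 0.8}, {"山": 0.8}]
print(correct_with_idioms(cands))        # -> 愚公移山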


Abstract

The invention relates to the field of image recognition, in particular to a complex character recognition method based on deep learning. By analyzing the causes of the complexity of the text, the deep neural network is trained with samples produced by a random sample generator that incorporates a noise model and a distortion feature model of the images to be recognized; such training samples contain complex noise and distortion deformation and can therefore meet the needs of various kinds of complex text recognition. A small, manually labeled first training sample set and a large, randomly generated second training sample set are mixed and input into the deep neural network, which solves the problem that recognizing characters with a deep neural network otherwise requires a large amount of manually labeled training samples. Moreover, because the complexity of the noise and distortion of the images to be recognized is retained while a state-of-the-art deep neural network learns automatically, the information loss caused by the denoising required in existing OCR methods is avoided and the recognition accuracy is improved.
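As a rough illustration of how the two sample sets might be mixed before training (PyTorch is an assumed framework choice; the tensor shapes and the placeholder random data below are not from the patent):

import torch
from torch.utils.data import TensorDataset, ConcatDataset, DataLoader

# Placeholders: ~350 manually labeled single-character images (first set) and a much
# larger randomly generated second set; 1x32x32 grayscale images and 10 digit classes assumed.
first_x,  first_y  = torch.rand(350, 1, 32, 32),   torch.randint(0, 10, (350,))
second_x, second_y = torch.rand(10000, 1, 32, 32), torch.randint(0, 10, (10000,))

mixed = ConcatDataset([TensorDataset(first_x, first_y),
                       TensorDataset(second_x, second_y)])
# Shuffling interleaves manually labeled and generated samples within every batch.
loader = DataLoader(mixed, batch_size=128, shuffle=True)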

Description

Technical field

[0001] The invention relates to the field of image recognition, in particular to a complex character recognition method based on deep learning.

Background technique

[0002] Image recognition is of great significance in the field of intelligent recognition. With the advancement of technology and the development of society, the demand for automatic recognition of text in pictures is growing rapidly. Traditional Optical Character Recognition (OCR) systems are typically used to recognize documents scanned with optical devices, such as digitized ancient books, business cards, invoices and forms. Such scanned documents usually have relatively high resolution and contrast, and the printed fonts are generally simple and regular, so it is comparatively easy to extract individual characters for recognition. The core of this type of document recognition is therefore noise elimination, for which there are many methods: for example, using Gaussian smoothing...
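For reference, the kind of traditional denoising step mentioned above can be sketched with OpenCV (an assumed library choice; the file names are placeholders). The patent's argument is that such smoothing also discards useful detail, which the proposed method avoids by letting the network learn directly from noisy, distorted samples.

import cv2

img = cv2.imread("scanned_page.png", cv2.IMREAD_GRAYSCALE)        # placeholder input image
denoised = cv2.GaussianBlur(img, (5, 5), 1.0)                      # 5x5 Gaussian smoothing
_, binary = cv2.threshold(denoised, 0, 255,
                          cv2.THRESH_BINARY + cv2.THRESH_OTSU)     # binarize before character extraction
cv2.imwrite("denoised_binary.png", binary)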


Application Information

Patent Type & Authority: Patent (China)
IPC(8): G06K9/62, G06N3/08
CPC: G06N3/088, G06V30/10, G06F18/214
Inventors: 刘世林, 何宏靖, 吴雨浓
Owner: 成都数联铭品科技有限公司