Eureka AIR delivers breakthrough ideas for toughest innovation challenges, trusted by R&D personnel around the world.

Attention mechanism text recognition method based on deep learning

A text recognition and deep learning technology, applied in the field of deep learning text recognition, can solve problems such as average text recognition accuracy, RNN gradient explosion, difficult backpropagation, etc.

Pending Publication Date: 2020-08-18
FOSHAN NANHAI GUANGDONG TECH UNIV CNC EQUIP COOP INNOVATION INST +1
View PDF1 Cites 8 Cited by
  • Summary
  • Abstract
  • Description
  • Claims
  • Application Information

AI Technical Summary

Problems solved by technology

[0003] The traditional method is to use the neural OCR technology based on the CTC model. The RNN used in the encoding and decoding structure can handle certain short-term dependencies, but it cannot handle long-term dependencies, because when the sequence is long, the gradient at the back of the sequence is very large. It is difficult to backpropagate to the previous sequence. Similarly, RNN may also have a gradient explosion problem. Its model is relatively accurate for text recognition that is more complex (such as complex formula symbols)

Method used

the structure of the environmentally friendly knitted fabric provided by the present invention; figure 2 Flow chart of the yarn wrapping machine for environmentally friendly knitted fabrics and storage devices; image 3 Is the parameter map of the yarn covering machine
View more

Image

Smart Image Click on the blue labels to locate them in the text.
Viewing Examples
Smart Image
  • Attention mechanism text recognition method based on deep learning
  • Attention mechanism text recognition method based on deep learning
  • Attention mechanism text recognition method based on deep learning

Examples

Experimental program
Comparison scheme
Effect test

Embodiment approach

[0064] As a preferred embodiment of the present invention, the step S2 specifically includes:

[0065] S21. Utilize the Python script to crop the blank area of ​​the formula picture in the data set, such as Image 6 As shown, the formula in the blank paper is used to detect white and most white unimportant areas, and extract important pixels;

[0066] S22. Insert null characters into each formula mark in the data set IM2LATEX-100K for indexing, and then generate the data set IM_2_LATEX-100K;

[0067] S23. Remove about 1 / 4 of the image index corresponding to the oversized formula image from the data set IM_2_LATEX-100K, and then generate the bag-of-words text file latex.t of the latex code.

[0068] As a preferred implementation of the present invention, in said step S3, CNN comprises 6 layers, and the first layer outputs 512 features, mainly because the word bag file latex.txt includes 499 (the first layer output must be greater than this number, Otherwise, all elements cann...

the structure of the environmentally friendly knitted fabric provided by the present invention; figure 2 Flow chart of the yarn wrapping machine for environmentally friendly knitted fabrics and storage devices; image 3 Is the parameter map of the yarn covering machine
Login to View More

PUM

No PUM Login to View More

Abstract

The invention discloses an attention mechanism text recognition method based on deep learning, wherein the method comprises the following steps: S1, acquiring a model training data set; S2, preprocessing the model training data set; S3, constructing a model convolution layer; inputting the model training data set for feature extraction to obtain a feature map of an image, outputting to a subsequent recurrent neural network structure, wherein the visual features of the method are extracted through a multilayer convolutional neural network in which a convolutional layer and a maximum pool layerare staggered, the CNN accepts raw input, a feature grid V with the size of D * H * W is generated, wherein D represents the number of channels, and H and W represent the height and width of the result feature map. The method overcomes the problem of low text recognition accuracy of a neural OCR technology using a CTC-based model, can significantly reduce the amount of calculation of the network,and can ensure that the prediction precision of the model to the formula is not greatly reduced.

Description

technical field [0001] The present invention relates to the technical field of deep learning text recognition, in particular to a deep learning-based attention mechanism text recognition method. Background technique [0002] In the era of very large information, PDF and pictures account for a large part of the information, which has caused a large number of users' demand for picture and PDF text recognition, that is, optical character recognition (OCR, most commonly used to recognize natural language in images) , including various languages, handwriting, numbers, etc. Among them, there will be special signs in a large number of academic related texts, such as mathematical formulas, which are more complicated than text recognition. The recognition of mathematical formulas has become a special recognition field, and there are many difficulties. We use a real-world The formula recognition of the deep learning attention mechanism of the data set where the rendered mathematical ...

Claims

the structure of the environmentally friendly knitted fabric provided by the present invention; figure 2 Flow chart of the yarn wrapping machine for environmentally friendly knitted fabrics and storage devices; image 3 Is the parameter map of the yarn covering machine
Login to View More

Application Information

Patent Timeline
no application Login to View More
Patent Type & Authority Applications(China)
IPC IPC(8): G06K9/32G06N3/04
CPCG06V20/62G06V30/10G06N3/045Y02D10/00
Inventor 杨海东黄坤山李俊宇彭文瑜林玉山魏登明
Owner FOSHAN NANHAI GUANGDONG TECH UNIV CNC EQUIP COOP INNOVATION INST
Who we serve
  • R&D Engineer
  • R&D Manager
  • IP Professional
Why Eureka
  • Industry Leading Data Capabilities
  • Powerful AI technology
  • Patent DNA Extraction
Social media
Eureka Blog
Learn More
PatSnap group products