Text Recognition Method Fused with Multi-layer Feature Enhanced Attention Mechanism

A text recognition and feature enhancement technique applied in the field of optical character recognition. It addresses problems such as inconsistent recognition results across network structures, network complexity, and slow recognition speed, and it increases the relevance between features and improves feature performance.

Active Publication Date: 2022-06-28
UNIV OF ELECTRONICS SCI & TECH OF CHINA

AI Technical Summary

Problems solved by technology

The advantage of this method is its high recognition accuracy, but its complex network structure makes recognition slow.
[0006] Although deep learning has steadily improved the accuracy of neural-network text recognition, recognition quality is closely tied to the structure of the network: different network structures extract quite different features from the same picture and therefore give different results.
In particular, when the network structure is very complex and has many layers, the extracted features become too abstract, and the accuracy of the final prediction is actually lower than that of other methods.

Examples

Embodiment

[0082] For convenience of description, the technical terms used in this embodiment are explained first:

[0083] reshape: converts a matrix to a new shape;

[0084] LSTM (long short-term memory): a special type of recurrent neural network;

[0085] CTCLoss (Connectionist Temporal Classification loss): a loss function that aligns the output sequence with the label in text recognition;

[0086] argmax: a function that returns the argument (or set of arguments) at which a function attains its maximum;

[0087] softmax: a mapping function that maps the outputs of multiple neurons to the interval (0, 1) so that they sum to 1;

[0088] synthtext: a synthetic dataset for text recognition;

[0089] mjsynth: a synthetic dataset for text recognition;

[0090] ICDAR2013: a public real-scene text recognition dataset;

[0091] ICDAR2015: a public real-scene text recognition dataset;

[0092] IIIT: a public real-scene text recognition dataset;

[0093] SVT: a public real-scene text recognition dataset.
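
To make the glossary terms concrete, the following is a minimal pure-Python sketch of reshape, softmax, and argmax, combined in a greedy (best-path) CTC decoder. The decoder is an assumption added for illustration: the source names CTCLoss for training but does not state how predictions are decoded. The function names, the toy scores, and the three-symbol alphabet are all hypothetical.

```python
import math

def reshape(flat, rows, cols):
    """Reinterpret a flat list as a rows x cols matrix (row-major order)."""
    if len(flat) != rows * cols:
        raise ValueError("size mismatch")
    return [flat[r * cols:(r + 1) * cols] for r in range(rows)]

def softmax(scores):
    """Map raw scores to probabilities in (0, 1) that sum to 1."""
    m = max(scores)  # subtract the maximum for numerical stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def argmax(values):
    """Return the index of the largest value (the first one on ties)."""
    return max(range(len(values)), key=values.__getitem__)

def ctc_greedy_decode(logits, alphabet, blank=0):
    """Best-path decoding: argmax per time step, merge repeats, drop blanks."""
    path = [argmax(softmax(step)) for step in logits]
    chars = []
    prev = None
    for idx in path:
        if idx != prev and idx != blank:
            chars.append(alphabet[idx])
        prev = idx
    return "".join(chars)

# Toy scores (hypothetical): 6 time steps, 3 classes (blank, 'a', 'b').
flat = [0.1, 2.0, 0.3,   # step 1 favors 'a'
        0.1, 2.0, 0.3,   # step 2 favors 'a' (repeat, merged)
        3.0, 0.2, 0.1,   # step 3 favors blank (separates the two 'a's)
        0.1, 2.0, 0.3,   # step 4 favors 'a'
        0.2, 0.1, 2.5,   # step 5 favors 'b'
        0.2, 0.1, 2.5]   # step 6 favors 'b' (repeat, merged)
logits = reshape(flat, 6, 3)
alphabet = ["-", "a", "b"]  # index 0 is the CTC blank
decoded = ctc_greedy_decode(logits, alphabet)
print(decoded)  # aab
```

The blank between the second and third time-step groups is what lets the decoder emit two consecutive 'a' characters instead of merging them into one.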

[0094] see Figure ...

Abstract

The present invention relates to the technical field of optical character recognition in computer vision and provides a text recognition method that integrates multi-layer features to enhance the attention mechanism. The method includes: selecting training pictures; extracting picture features; constructing a feature fusion matrix and fusing multi-layer features; using associated features for feature fusion to enhance feature performance; performing sequence modeling on the fused features; performing probability prediction on the sequence-modeled features; in the training phase, using backpropagation to update the parameter weights of the network model to obtain a standard network model that can be used for text recognition; and in the test phase, inputting the picture to be recognized into the trained network model, which recognizes and outputs the text in the picture. The invention improves the expressive ability of the features by mapping the features extracted at each level of the neural network to each other, thereby improving the accuracy of character recognition.
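
One way to picture the "feature fusion matrix" step is the sketch below. It is not the patented construction, whose details are not given in this excerpt. As a stand-in it uses scaled dot-product attention: a softmax-normalized similarity matrix between a deep-layer feature sequence and a shallow-layer one, followed by a residual addition. The function `fuse`, the variable names, and the toy feature values are all hypothetical.

```python
import math

def softmax(row):
    """Normalize one row of scores into weights that sum to 1."""
    m = max(row)
    exps = [math.exp(x - m) for x in row]
    s = sum(exps)
    return [e / s for e in exps]

def transpose(a):
    """Swap rows and columns of a list-of-lists matrix."""
    return [list(col) for col in zip(*a)]

def matmul(a, b):
    """Plain-list matrix product of a (n x k) and b (k x m)."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)]
            for row in a]

def fuse(shallow, deep):
    """Enhance `deep` with `shallow` through an assumed fusion matrix.

    Both inputs are T x d sequences (T positions, d channels).
    Returns the fused T x d features and the T x T fusion matrix.
    """
    d = len(deep[0])
    scores = matmul(deep, transpose(shallow))  # T x T pairwise similarity
    fusion = [softmax([s / math.sqrt(d) for s in row]) for row in scores]
    attended = matmul(fusion, shallow)         # shallow features, re-weighted
    fused = [[x + y for x, y in zip(deep_row, att_row)]  # residual addition
             for deep_row, att_row in zip(deep, attended)]
    return fused, fusion

# Toy features (hypothetical): 3 positions, 2 channels per layer.
shallow = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
deep = [[0.5, 0.1], [0.2, 0.9], [0.7, 0.6]]
fused, fusion = fuse(shallow, deep)
print(len(fused), len(fused[0]))
```

In a full pipeline along the lines of the abstract, the fused sequence would then go to a sequence model such as an LSTM and a per-step probability prediction. Those stages are omitted here because they need a deep learning framework.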

Description

Technical field

[0001] The invention relates to the technical field of optical character recognition in computer vision, and in particular to a character recognition method that integrates multi-layer features to enhance the attention mechanism.

Background

[0002] In the era of the mobile Internet, a large amount of picture data is sent and received every day. Many of these pictures contain text, so being able to accurately extract the text information in pictures is particularly important. People may need to convert manuscripts photographed with mobile phones into electronic versions, or to save the text in pictures they come across, and so on. As the number of pictures increases, so does the amount of text in them, and accurately recognizing the text in pictures has gradually become a new trend. Text recognition mainly processes the part of the picture that contains the text area, converting the color information in the pictu...

Application Information

Patent Type & Authority: Patents (China)
IPC(8): G06V30/14, G06V30/18, G06V30/19, G06V10/82, G06K9/62, G06N3/04, G06N3/08
CPC: G06N3/049, G06N3/084, G06V10/22, G06V10/40, G06F18/2415
Inventor: 徐行, 赖逸, 沈复民, 邵杰, 申恒涛
Owner: UNIV OF ELECTRONICS SCI & TECH OF CHINA