Character recognition method for enhancing attention mechanism by fusing multilayer features

A text recognition and feature enhancement technology, applied in the field of optical character recognition, that addresses inconsistent recognition results across network structures, network complexity, and slow recognition speed

Active Publication Date: 2021-05-11
UNIV OF ELECTRONIC SCI & TECH OF CHINA

AI Technical Summary

Problems solved by technology

The advantage of this method is that the recognition accuracy is high; however, due to its complex network structure, the recognition speed is slow.
[0006] With the development of deep learning, the accuracy of neural-network-based text recognition keeps increasing. However, because the recognition result is closely related to the structure of the network, diff


Examples

Embodiment

[0081] For the convenience of description, first explain the relevant technical terms that appear in this embodiment:

[0082] reshape: converts a matrix to a new shape without changing its elements;

[0083] LSTM (Long short-term memory): long short-term memory, a special recurrent neural network;

[0084] CTCLoss (Connectionist Temporal Classification loss): a loss function that aligns the output sequence with the label sequence in text recognition;

[0085] argmax: a function that returns the argument (or set of arguments) at which a function attains its maximum value;

[0086] softmax: a mapping function that maps the outputs of multiple neurons to values in the interval (0, 1) that sum to 1;

[0087] synthtext: a synthetic dataset for text recognition;

[0088] mjsynth: a synthetic dataset for text recognition;

[0089] ICDAR2013: A public real scene text recognition dataset;

[0090] ICDAR2015: A public real scene text recognition dataset;

[0091] IIIT: A publicly available real-scene text recognition dataset;

[0092] SVT: A publicly available real-world text recognition dataset.
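The terms reshape, softmax, argmax, and CTC can be illustrated together with a small, hedged sketch. It assumes greedy (best-path) CTC decoding, which is a common companion to CTCLoss but is not confirmed by this excerpt as the decoder used in the invention. The alphabet, the blank index, and the toy scores are all hypothetical.

```python
import math

def reshape(flat, rows, cols):
    """Reshape a flat list into a rows x cols matrix (row-major order)."""
    if len(flat) != rows * cols:
        raise ValueError("size mismatch")
    return [flat[r * cols:(r + 1) * cols] for r in range(rows)]

def softmax(scores):
    """Map raw scores to probabilities in (0, 1) that sum to 1."""
    m = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def argmax(values):
    """Return the index of the largest value."""
    return max(range(len(values)), key=lambda i: values[i])

def ctc_greedy_decode(logits, alphabet, blank=0):
    """Take the per-step argmax, then collapse repeats and drop blanks."""
    best = [argmax(softmax(step)) for step in logits]
    chars = []
    prev = None
    for idx in best:
        if idx != prev and idx != blank:
            chars.append(alphabet[idx])
        prev = idx
    return "".join(chars)

# Hypothetical toy input: 5 time steps, 3 classes (blank, "a", "b").
ALPHABET = ["-", "a", "b"]
flat = [0.1, 2.0, 0.3,   # step 1: "a"
        0.1, 2.0, 0.3,   # step 2: "a" again (repeat, collapsed)
        3.0, 0.2, 0.1,   # step 3: blank (separates the two "a" runs)
        0.1, 2.0, 0.3,   # step 4: "a"
        0.2, 0.1, 2.5]   # step 5: "b"
logits = reshape(flat, 5, 3)
decoded = ctc_greedy_decode(logits, ALPHABET)
```

The blank symbol is what lets CTC represent doubled letters: the two runs of "a" on either side of the blank survive as two characters, while the repeat inside the first run is collapsed.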

[0093] see Figur...

Abstract

The invention relates to the technical field of optical character recognition in computer vision, and provides a character recognition method that enhances an attention mechanism by fusing multilayer features. The method comprises the following steps: selecting a training picture; extracting picture features; constructing a feature fusion matrix and fusing multilayer features; fusing features by using the associated features to enhance the feature expression ability; performing sequence modeling on the fused features; performing probability prediction on the features after sequence modeling; in the training stage, using back propagation to update the parameter weights of the network model, thereby obtaining a standard network model that can be used for character recognition; and in the test stage, inputting a picture to be recognized into the trained network model, which performs recognition and outputs the characters in the picture. In this method, the features extracted at each level of the neural network are mapped to one another, which improves the expression ability of the features and the accuracy of character recognition.
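The fusion step can be pictured with a minimal sketch. This excerpt does not specify how the feature fusion matrix is built, so the sketch assumes scaled dot-product attention between per-layer feature vectors; the function `fuse_layers`, the vectors, and their dimensions are hypothetical and are not taken from the patent.

```python
import math

def softmax(scores):
    """Map raw scores to probabilities in (0, 1) that sum to 1."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def dot(u, v):
    """Dot product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))

def fuse_layers(layer_feats):
    """Fuse per-layer feature vectors for one sequence position.

    layer_feats: list of L vectors of the same dimension D, one per layer.
    Returns L fused vectors; each is an attention-weighted sum of all layers.
    """
    dim = len(layer_feats[0])
    scale = math.sqrt(dim)
    fused = []
    for query in layer_feats:
        # One row of the (assumed) fusion matrix: the affinity of this
        # layer's feature to the feature of every layer, normalized.
        weights = softmax([dot(query, key) / scale for key in layer_feats])
        fused.append([
            sum(w * feat[d] for w, feat in zip(weights, layer_feats))
            for d in range(dim)
        ])
    return fused

# Hypothetical features from a shallow, a middle, and a deep layer.
shallow = [1.0, 0.0, 0.0, 0.0]
middle = [0.0, 1.0, 0.0, 0.0]
deep = [0.0, 0.0, 1.0, 1.0]
fused = fuse_layers([shallow, middle, deep])
```

In this sketch each fused vector still leans toward its own layer but also carries a share of the other layers, which is one way to read the statement that features from each level are mapped to one another. In a full model the fused sequence would then go to the sequence model (for example an LSTM) and the probability prediction step.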

Description

Technical field

[0001] The invention relates to the technical field of optical character recognition in computer vision, and in particular to a character recognition method that fuses multi-layer features to enhance the attention mechanism.

Background art

[0002] In the era of the mobile Internet, large amounts of picture data are sent and received every day. Many of these pictures contain text, so accurately extracting that text is particularly important. People may need to convert manuscripts photographed with mobile phones into electronic versions, or save the text in pictures they come across, and so on. As the number of pictures grows, so does the amount of text they contain, and accurately recognizing the text in pictures has gradually become a new trend. Text recognition mainly processes the part of the picture that contains the text area and converts the color information in the pictu...

Application Information

IPC(8): G06K9/20, G06K9/46, G06K9/62, G06N3/04, G06N3/08
CPC: G06N3/049, G06N3/084, G06V10/22, G06V10/40, G06F18/2415
Inventor: 徐行, 赖逸, 沈复民, 邵杰, 申恒涛
Owner UNIV OF ELECTRONIC SCI & TECH OF CHINA