Character recognition method based on an attention mechanism and connectionist temporal classification (CTC) loss

A text recognition technology based on an attention mechanism, applied in the field of optical character recognition. It addresses the problems of single-character recognition losing context, low versatility, and time-consuming feature production.

Inactive Publication Date: 2019-03-19
HANGZHOU DIANZI UNIV
Cites: 6 · Cited by: 38

AI Technical Summary

Problems solved by technology

Existing methods still have the following problems:
1) The recognition features rely on manual definition. Hand-crafted features struggle to capture the deep semantics of an image, are time-consuming to produce, and are not very versatile.
2) Recognition based on a single character loses its context, which can easily lead to ambiguity.
3) Text structure is complex and its semantics vary, so the characters must be segmented and preprocessed, and forced segmentation destroys the character structure.
The choice of classification dictionary directly affects the recognition results, which leads to poor generalization of the recognition model.


Examples

Embodiment Construction

[0080] The present invention is further described below in conjunction with the accompanying drawings.

[0081] As shown in Figure 1, the specific implementation steps of the text recognition method of the present invention, based on the attention mechanism and connectionist temporal classification (CTC) loss, are as follows:

[0082] S1: Collect the data set. Collect text samples from various natural scenes and combine them. Divide the data set into three parts: a training set, a validation set, and a test set. To ensure that the three sets have the same sample distribution, the original data set is first shuffled and then split proportionally in a 7:2:1 ratio. The training set is used to optimize the model parameters, the validation set is used for model selection, and the test set is used for the final evaluation of the model. Denote the selected data set as T = {(x_1, y_1), (x_2, y_2), …, (x_N, y_N)}...
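
The shuffle-then-split procedure of step S1 can be sketched as follows. This is a minimal illustration, not the patent's implementation: the function name `shuffle_and_split`, the fixed random seed, and the choice to give any rounding remainder to the test set are all assumptions made for the example.

```python
import random
from typing import List, Sequence, Tuple, TypeVar

Sample = TypeVar("Sample")


def shuffle_and_split(
    samples: Sequence[Sample],
    ratios: Tuple[int, int, int] = (7, 2, 1),  # train : validation : test, as in S1
    seed: int = 0,  # assumed fixed seed so the split is reproducible
) -> Tuple[List[Sample], List[Sample], List[Sample]]:
    """Shuffle a copy of the samples, then cut it into three consecutive parts."""
    shuffled = list(samples)
    random.Random(seed).shuffle(shuffled)

    total = sum(ratios)
    n_train = len(shuffled) * ratios[0] // total
    n_val = len(shuffled) * ratios[1] // total

    train = shuffled[:n_train]
    val = shuffled[n_train:n_train + n_val]
    # Assumption: whatever is left after integer division goes to the test set.
    test = shuffled[n_train + n_val:]
    return train, val, test
```

Calling `shuffle_and_split(pairs)` on a list of (image, label) pairs returns the three subsets. Shuffling before cutting is what gives each subset approximately the same sample distribution.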

Abstract

The invention discloses a character recognition method based on an attention mechanism and connectionist temporal classification (CTC) loss. The method comprises the following steps: S1, collecting a data set; S2, preprocessing the image samples, including rescaling, grayscale conversion, and pixel normalization; S3, processing the label sequences of the samples, including padding, encoding, and word embedding; S4, constructing a convolutional neural network and performing feature extraction on the text images processed in step S3; S5, encoding the features extracted in step S4 with a stacked bidirectional recurrent neural network to obtain encoded features; S6, inputting the encoded features obtained in step S5 into a connectionist temporal classification model to calculate a prediction probability; and S7, calculating the weights of the different encoded features with an attention mechanism to obtain encoded semantic vectors.
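
Parts of steps S2 and S3 can be illustrated with a small pure-Python sketch covering grayscale conversion, pixel normalization, label encoding, and padding. Several details are assumptions, since the abstract does not specify them: the ITU-R BT.601 luma weights, the `<pad>` symbol with id 0, and all function names are hypothetical. Rescaling and word embedding are left out.

```python
from typing import Dict, List, Sequence, Tuple

PAD = "<pad>"  # hypothetical padding symbol; the abstract does not name one

Pixel = Tuple[int, int, int]


def to_gray_normalized(rgb_image: Sequence[Sequence[Pixel]]) -> List[List[float]]:
    """Convert rows of 0-255 (r, g, b) pixels to grayscale values in [0, 1]."""
    gray = []
    for row in rgb_image:
        # Assumed ITU-R BT.601 luma weights; dividing by 255 normalizes the pixel.
        gray.append([(0.299 * r + 0.587 * g + 0.114 * b) / 255.0 for r, g, b in row])
    return gray


def build_vocab(labels: Sequence[str]) -> Dict[str, int]:
    """Assign an integer id to every character seen; id 0 is reserved for padding."""
    vocab = {PAD: 0}
    for text in labels:
        for ch in text:
            vocab.setdefault(ch, len(vocab))
    return vocab


def encode_and_pad(
    labels: Sequence[str], vocab: Dict[str, int], max_len: int
) -> List[List[int]]:
    """Encode each label as character ids, truncate to max_len, and right-pad."""
    encoded = []
    for text in labels:
        ids = [vocab[ch] for ch in text][:max_len]
        encoded.append(ids + [vocab[PAD]] * (max_len - len(ids)))
    return encoded
```

Padding every label sequence to a common length lets a batch be stored as one rectangular array, which is the usual reason for this step before embedding.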

Description

Technical field

[0001] The invention belongs to the field of optical character recognition, and in particular relates to a character recognition method based on an attention mechanism and connectionist temporal classification (CTC).

Background

[0002] With the spread of intelligent and mobile terminals, the semantic information in natural scene images plays an increasingly important role in fields such as autonomous driving, intelligent transportation, and visual assistance.

[0003] To address text image recognition in natural scenes, Strokelets obtains stroke features in text by clustering image patches, uses HOG features to detect characters, and combines random forest classifiers to classify them. The PhotoOCR system uses a HOG feature classifier to score the candidate results obtained by segmentation, then applies beam search with an N-gram language model to obtain the candidate character set, and finally uses the language model and shape m...

Application Information

IPC(8): G06K9/62, G06K9/34, G06N3/04
CPC: G06V30/153, G06V30/10, G06N3/045, G06F18/241, G06F18/214
Inventors: 和文杰, 潘勉
Owner: HANGZHOU DIANZI UNIV