Segmentation-free off-line handwritten Chinese character text recognition method

A segmentation-free recognition technique for handwritten Chinese character text, applied in the field of text recognition. It addresses the complex positional relationships between characters and the over-segmentation and wrong-segmentation problems of segmentation-based methods, improving recognition accuracy and robustness and reducing the losses caused by over-segmentation.

Inactive
Publication Date: 2018-09-07
WUYI UNIV
6 Cites, 27 Cited by

AI Technical Summary

Problems solved by technology

[0002] Off-line handwritten text recognition is one of the open problems in the field of text recognition. Compared with online handwriting recognition, it lacks the necessary character position and trajectory information. The trajectory can be approximated from character position and writing experience, so determining character positions has a great influence on the recognition efficiency of offline handwritten text. Because handwriting is irregular, the positional relationship between adjacent characters is complicated, which makes locating characters in offline handwritten text much more difficult than in printed text. This is especially true of character position determination in text with skewed lines, irregular line fragments, and touching characters.
[0003] At present, handwritten text line recognition is traditionally handled by character segmentation followed by single-character recognition. Character segmentation divides a handwritten Chinese character text line into individual handwritten characters, which are sent to a single-character classifier to obtain the recognition result for the whole line. Commonly used segmentation techniques for Chinese characters include statistical segmentation, font-structure segmentation, and recognition-based statistical segmentation. Statistical segmentation determines the boundaries between characters from the overall statistical characteristics of the characters, using the average character width as an auxiliary criterion. The representativeness and stability of the statistical distribution characteristics are very important to the correctness and convergence of the segmentation. This method suits text with wide character spacing and no touching characters. Because the strokes of handwritten Chinese characters are scattered, or consecutive strokes join adjacent characters, it easily produces over-segmentation or wrong segmentation, which makes handwritten Chinese character recognition more difficult. Single-character handwritten Chinese character recognition is itself difficult, owing to the large number of Chinese character categories and the diversity of handwriting.
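To make the limitation concrete, the following is a minimal sketch of one simple statistics-based segmentation, cutting a line at empty columns of a vertical projection profile. The source does not say which statistics are used, so the projection profile, the function name `segment_by_projection`, and the 0/1 image layout are all assumptions for illustration. The average-width criterion mentioned above is omitted.

```python
def segment_by_projection(binary_image):
    """Split a text line at columns that contain no ink.

    binary_image: list of rows, each a list of 0/1 values (1 = ink).
    Returns a list of (start, end) column spans, with end exclusive.
    This is an illustrative assumption, not the method of the source.
    """
    width = len(binary_image[0])
    # Vertical projection profile: amount of ink in each column.
    profile = [sum(row[c] for row in binary_image) for c in range(width)]

    spans = []
    start = None
    for c, ink in enumerate(profile):
        if ink > 0 and start is None:
            start = c                  # a character span begins
        elif ink == 0 and start is not None:
            spans.append((start, c))   # an empty column ends the span
            start = None
    if start is not None:
        spans.append((start, width))   # span runs to the right edge
    return spans


# Two characters separated by an empty column are split correctly.
separated = [[1, 1, 0, 1, 1],
             [1, 0, 0, 0, 1]]
# A connecting stroke removes the empty column, so the cut is lost.
touching = [[1, 1, 1, 1, 1],
            [1, 0, 0, 0, 1]]
print(segment_by_projection(separated))
print(segment_by_projection(touching))
```

In this toy case a single connecting stroke merges two characters into one span, which is the kind of wrong segmentation the paragraph describes.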

Examples

Embodiment Construction

[0034] The specific embodiment of the present invention is further described below with reference to the accompanying drawings:

[0035] As shown in Figure 1, a segmentation-free offline handwritten Chinese character text recognition method includes the following steps:

[0036] S1) Preprocess the offline handwritten Chinese character text image. Preprocessing consists of image size normalization and image brightness inversion. Size normalization sets the text image width to 128. Because the background of the collected offline handwritten Chinese character text images is white, with a brightness value of 255, the brightness of the background and of the Chinese characters is reversed by inverting the image brightness values in order to reduce the amount of calculation, specifically: I(i,j) = 255 - X(i,j), where X(i,j) is the brightness value corresponding to the pixel position in row i, column j of ...
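Step S1 can be sketched in plain Python as follows. The inversion follows the formula I(i,j) = 255 - X(i,j) given above. The resize is an assumption: the source states only that the width is normalized to 128, so the nearest-neighbour interpolation, the preserved aspect ratio, and the function names are hypothetical choices made for illustration.

```python
def invert_brightness(image):
    """Apply I(i, j) = 255 - X(i, j) to an 8-bit grayscale image (list of rows)."""
    return [[255 - value for value in row] for row in image]


def resize_nearest(image, new_width, new_height):
    """Nearest-neighbour resize; a stand-in for any size normalization method."""
    old_height = len(image)
    old_width = len(image[0])
    return [
        [image[(r * old_height) // new_height][(c * old_width) // new_width]
         for c in range(new_width)]
        for r in range(new_height)
    ]


def preprocess(image, target_width=128):
    """Normalize the width to target_width, then invert brightness.

    Keeping the aspect ratio is an assumption; the source does not say
    how the height is handled.
    """
    old_height = len(image)
    old_width = len(image[0])
    new_height = max(1, round(old_height * target_width / old_width))
    resized = resize_nearest(image, target_width, new_height)
    return invert_brightness(resized)


# A tiny white image with one black ink pixel, normalized to width 8.
sample = [[255, 255, 255, 255],
          [255, 0, 255, 255]]
result = preprocess(sample, target_width=8)
print(len(result), len(result[0]))
```

After inversion the white background becomes 0 and the ink becomes 255, so most pixel values in the image are zero.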

Abstract

The invention relates to a segmentation-free off-line handwritten Chinese character text recognition method. The method comprises the following steps: (S1) an off-line handwritten Chinese character text image is preprocessed; (S2) a spatial transformation network model is constructed; (S3) a deep convolutional neural network model is constructed; (S4) a recurrent neural network model is constructed from the deep features extracted by the deep convolutional neural network model; (S5) the probability distribution of the sequence labels is output by a CTC (connectionist temporal classification) classifier; and (S6) greedy search and search based on dictionary rules are used to obtain the final text recognition result. By combining a spatial transformation network, a deep convolutional neural network, and a recurrent neural network in one model, the method can correct and recognize, without segmentation, text lines with large offsets, and it improves the accuracy and robustness of recognition on complicated text lines. The whole model framework is solved with an iterative algorithm and needs no complicated over-segmentation preprocessing, so the losses introduced by over-segmentation methods are reduced, all model parameters can be optimized jointly, and recognition accuracy is improved.
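The abstract does not detail the greedy search of step S6. A common way to decode CTC output greedily is best-path decoding: take the most probable label in each frame, collapse consecutive repeats, and drop the blank label. The sketch below assumes that approach. The function name `ctc_greedy_decode`, the blank index 0, and the toy alphabet are hypothetical, and the dictionary-rule search is not covered.

```python
def ctc_greedy_decode(probs, alphabet, blank=0):
    """Best-path CTC decoding (an assumed reading of the greedy search in S6).

    probs: list of frames, each a list of label probabilities.
    alphabet: list mapping label index to character; alphabet[blank] is unused.
    """
    # Most probable label index in every frame.
    best_path = [max(range(len(frame)), key=frame.__getitem__) for frame in probs]

    decoded = []
    previous = None
    for label in best_path:
        # Keep a label only when it starts a new run and is not the blank.
        if label != previous and label != blank:
            decoded.append(alphabet[label])
        previous = label
    return "".join(decoded)


# Toy example: index 0 is the CTC blank.
alphabet = ["", "中", "文"]
frames = [
    [0.1, 0.8, 0.1],    # 中
    [0.2, 0.7, 0.1],    # 中 again, collapsed with the previous frame
    [0.9, 0.05, 0.05],  # blank, separates two genuine occurrences
    [0.1, 0.8, 0.1],    # 中
    [0.1, 0.2, 0.7],    # 文
]
print(ctc_greedy_decode(frames, alphabet))  # prints 中中文
```

The blank between the second and fourth frames is what lets the decoder output the same character twice in a row, which is why no explicit character segmentation is needed.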

Description

Technical field

[0001] The invention relates to the technical field of text recognition, in particular to a segmentation-free off-line handwritten Chinese character text recognition method.

Background technique

[0002] Off-line handwritten text recognition is one of the open problems in the field of text recognition. Compared with online handwriting recognition, it lacks the necessary character position and trajectory information. The trajectory can be approximated from character position and writing experience, so determining character positions has a great influence on the recognition efficiency of offline handwritten text. Because handwriting is irregular, the positional relationship between adjacent characters is complicated, which makes locating characters in offline handwritten text much more difficult than in printed text. In particular, character position determination in text with skewed lines, irregular l...

Application Information

Patent Type & Authority: Applications (China)
IPC(8): G06K9/00, G06N3/04, G06N3/08
CPC: G06N3/08, G06V30/32, G06V30/287, G06N3/045
Inventor: 应自炉, 陈鹏飞, 朱健菲, 陈俊娟, 甘俊英, 翟懿奎
Owner: WUYI UNIV