Method and device for touching character segmentation in character recognition

A touching-character segmentation technique, applied in the field of text recognition, that addresses the problems of accuracy dependence, overlapping strokes, and unreliable recognition of handwritten characters.

Inactive Publication Date: 2011-08-31
HANVON CORP

AI Technical Summary

Problems solved by technology

[0003] There are many ways to detect segmentation points. Detecting segmentation points in printed text is relatively simple, but in handwritten text the ways in which strokes touch are more complicated. The contour contains many false peaks and troughs, and at a true segmentation point the upper and lower contours may themselves be joined, which makes the contour change relatively smoothly. It is therefore unreliable to determine segmentation points from contour information alone.
[0004] At present, there are mainly two methods to find the segmentation points of offline handwritten characters. The first method is based on the connected domain, and judges the connected domain t

Examples

Embodiment Construction

[0048] In order to make the above objects, features and advantages of the present invention more comprehensible, the present invention will be further described in detail below in conjunction with the accompanying drawings and specific embodiments.

[0049] A method for segmenting touching characters in text recognition, as shown in Figure 1, comprises the following steps:

[0050] Step 1: Preprocess the input line image to obtain its connected domains, average character width, and average character height. Figure 2 shows an input line image, and Figure 3 shows the connected-domain blocks obtained after preprocessing that image.

[0051] The preprocessing includes denoising the line image, obtaining the connected domains of the line image, smoothing the width histogram and the height histogram of the connected domains separately, and taking the peak of each smoothed histogram as the average character width and the average character height.
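
The preprocessing in paragraph [0051] can be sketched roughly as follows. This is a minimal illustration, not the patented implementation: the binary image is assumed to be a list of rows of 0/1 values, denoising is omitted, 8-connectivity is assumed, and the 1-2-1 smoothing kernel is an arbitrary choice because the source does not say how the histograms are smoothed. The function names `connected_components`, `smoothed_peak`, and `average_char_size` are hypothetical.

```python
from collections import deque


def connected_components(img):
    """Return bounding boxes (x0, y0, x1, y1) of 8-connected foreground regions.

    Assumption: img is a list of equal-length rows holding 0 (background) or 1 (ink).
    """
    h, w = len(img), len(img[0])
    seen = [[False] * w for _ in range(h)]
    boxes = []
    for y in range(h):
        for x in range(w):
            if not img[y][x] or seen[y][x]:
                continue
            seen[y][x] = True
            queue = deque([(x, y)])
            x0 = x1 = x
            y0 = y1 = y
            while queue:  # breadth-first flood fill of one region
                cx, cy = queue.popleft()
                x0, x1 = min(x0, cx), max(x1, cx)
                y0, y1 = min(y0, cy), max(y1, cy)
                for dy in (-1, 0, 1):
                    for dx in (-1, 0, 1):
                        nx, ny = cx + dx, cy + dy
                        if (0 <= nx < w and 0 <= ny < h
                                and img[ny][nx] and not seen[ny][nx]):
                            seen[ny][nx] = True
                            queue.append((nx, ny))
            boxes.append((x0, y0, x1, y1))
    return boxes


def smoothed_peak(values):
    """Histogram integer values, smooth with a 1-2-1 kernel, return the peak bin.

    Assumption: the 1-2-1 kernel stands in for the unspecified smoothing step.
    """
    hist = [0] * (max(values) + 2)
    for v in values:
        hist[v] += 1

    def at(i):
        return hist[i] if 0 <= i < len(hist) else 0

    smooth = [at(i - 1) + 2 * at(i) + at(i + 1) for i in range(len(hist))]
    return max(range(len(smooth)), key=smooth.__getitem__)


def average_char_size(img):
    """Estimate (average width, average height) from connected-domain boxes."""
    boxes = connected_components(img)
    widths = [x1 - x0 + 1 for x0, _, x1, _ in boxes]
    heights = [y1 - y0 + 1 for _, y0, _, y1 in boxes]
    return smoothed_peak(widths), smoothed_peak(heights)
```

Taking the histogram peak rather than the mean keeps a few wide, touching blobs from inflating the estimated character width.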

[0052] After Figure 2 is denoised...

Abstract

The present invention discloses a method and a device for touching character segmentation in character recognition, and belongs to the field of character recognition. The method comprises the following steps: preprocessing a line image to obtain its connected domains, average character width, and average character height; analyzing the connected domains and marking those that contain touching characters; for each touching connected domain, extracting strokes and detecting segmentation points to obtain pre-segmentation points, and for each non-touching connected domain, saving it directly as a character block; merging the extracted strokes according to the pre-segmentation points to obtain character blocks; saving the character blocks and moving on to the next connected domain for touching determination, and outputting a character block sequence once all connected domains have been traversed; and merging the character blocks according to reference information and outputting the recognition result. Because strokes are merged according to the pre-segmentation points to obtain the character blocks, segmentation points can be detected over a wider range. Because the segmentation points pre-detected from contour information serve as a parameter during merging, merging errors caused by merging across correct segmentation points are avoided.
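
The traversal described in the abstract can be sketched as below. This is only an illustration of the control flow under several assumptions that the source does not confirm: a connected domain is treated as touching when its width exceeds a hypothetical ratio of the average character width, strokes are reduced to horizontal extents `(x0, x1)`, and strokes are grouped by which gap between pre-segmentation points contains their centre. `extract_strokes` and `detect_cuts` are placeholder callbacks for the stroke-extraction and segmentation-point detection steps, which are not specified in this excerpt.

```python
from bisect import bisect_right


def is_touching(box, avg_width, ratio=1.5):
    """Assumed test: a domain much wider than an average character is touching."""
    x0, _, x1, _ = box
    return (x1 - x0 + 1) > ratio * avg_width


def merge_strokes(strokes, cut_points):
    """Group strokes (x0, x1) by the gap between sorted cut points holding their centre.

    Returns one merged horizontal extent per non-empty gap, left to right.
    """
    cuts = sorted(cut_points)
    groups = {}
    for x0, x1 in strokes:
        slot = bisect_right(cuts, (x0 + x1) / 2)
        groups.setdefault(slot, []).append((x0, x1))
    return [
        (min(a for a, _ in group), max(b for _, b in group))
        for _, group in sorted(groups.items())
    ]


def segment_line(boxes, avg_width, extract_strokes, detect_cuts):
    """Traverse connected domains left to right and emit character blocks."""
    blocks = []
    for box in sorted(boxes):
        if is_touching(box, avg_width):
            strokes = extract_strokes(box)   # hypothetical stroke extractor
            cuts = detect_cuts(box)          # hypothetical contour-based detector
            blocks.extend(merge_strokes(strokes, cuts))
        else:
            blocks.append((box[0], box[2]))  # non-touching: keep as one block
    return blocks
```

For example, with `avg_width=3`, a domain spanning columns 5 to 11 whose strokes are `[(5, 6), (7, 8), (9, 11)]` and whose single pre-segmentation point is at column 8 would be split into the blocks `(5, 8)` and `(9, 11)`.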

Description

Technical Field

[0001] The invention belongs to the field of character recognition and relates to a character segmentation method and device, in particular to a character segmentation method and device for character recognition.

Background

[0002] Character segmentation is a very important part of the text recognition process. Text recognition, especially offline handwritten text recognition, generally adopts an over-segmentation approach: multiple possible segmentation points are detected first, and the resulting segments are then merged using geometric, recognition, or semantic information to search for the optimal segmentation path.

[0003] There are many ways to detect segmentation points. Detecting segmentation points in printed text is relatively simple, but in handwritten text the ways in which strokes touch are more complicated. The contour contains many false peaks and troughs, and at a true segmentation point there may be adhesio...
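
The over-segmentation strategy in paragraph [0002] (detect many candidate points, then merge to find the best path) can be illustrated with a small dynamic program. This is a generic sketch, not the method claimed in the patent: blocks are horizontal extents `(x0, x1)`, at most `max_merge` consecutive blocks may be merged, and the `cost` callback is a hypothetical stand-in for whatever geometric, recognition, or semantic score is actually used.

```python
def best_merge_path(blocks, cost, max_merge=3):
    """Merge consecutive over-segmented blocks so that the total cost is minimal.

    blocks: list of (x0, x1) extents ordered left to right.
    cost:   callable scoring one merged (x0, x1) extent; lower is better.
    """
    n = len(blocks)
    best = [float("inf")] * (n + 1)  # best[i]: minimal cost of covering blocks[:i]
    back = [0] * (n + 1)             # back[i]: start index of the last merged group
    best[0] = 0.0
    for end in range(1, n + 1):
        for start in range(max(0, end - max_merge), end):
            merged = (blocks[start][0], blocks[end - 1][1])
            total = best[start] + cost(merged)
            if total < best[end]:
                best[end], back[end] = total, start
    # Walk the back-pointers to recover the chosen groups.
    path, end = [], n
    while end > 0:
        start = back[end]
        path.append((blocks[start][0], blocks[end - 1][1]))
        end = start
    return path[::-1]
```

A purely geometric cost, such as the absolute difference between a merged block's width and the average character width, is one simple choice for `cost`. A recognizer confidence could be substituted without changing the search.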

Application Information

IPC(8): G06K9/34
Inventor: 王琛
Owner: HANVON CORP