Method and device for determining average character width and method and equipment of character segmentation

A technology of character width and character aspect ratio, applied in character and pattern recognition, instrumentation, computing, etc.

Active Publication Date: 2013-05-08
CANON KK
6 Cites, 5 Cited by

AI Technical Summary

Problems solved by technology

[0013] The present invention aims to solve the problems described above

Examples

First embodiment

[0043] The first embodiment will be described in detail with reference to Figure 3. Figure 3 is a flowchart showing a procedure for implementing the method for determining the average character width of a character group according to the first embodiment of the present invention. Figure 3 is described in detail below.

[0044] In step S301, the average character width (ACW) of the current character group (referred to as the "first average character width", abbreviated as ACW1) is obtained through an ACW calculation method (referred to as the first ACW calculation method). Here, the obtained ACW1 may be a rough average character width.

[0045] For example, the rough average character width can be obtained by averaging the character widths of the initial segmentation results obtained by, for example, an image-based segmentation method. For another example, the rough average character width can also be calculated by one of the above-mentioned method 1 and method 2 or any other method.
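The averaging described in [0045] can be sketched as follows. This is a minimal illustration, not the patent's implementation: the function name `rough_average_character_width` and the representation of the initial segmentation as `(left, right)` column spans are assumptions made for the example.

```python
from statistics import mean


def rough_average_character_width(boxes):
    """Average the widths of an initial segmentation result.

    `boxes` is a hypothetical representation: a list of (left, right)
    x-coordinates, with `right` exclusive, one pair per segmented character.
    """
    widths = [right - left for left, right in boxes]
    if not widths:
        raise ValueError("the initial segmentation contains no characters")
    return mean(widths)


# Four characters with widths 10, 10, 6 and 12 pixels.
boxes = [(0, 10), (12, 22), (24, 30), (32, 44)]
acw1 = rough_average_character_width(boxes)
print(acw1)  # 9.5
```

A plain mean is sensitive to wrongly merged or wrongly split characters, which is consistent with the text calling this value a rough average character width.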

[0046] More sp...

Second embodiment

[0056] The second embodiment according to the present invention will show a confidence degree calculation method, which can be used in the first embodiment or any other situation where the confidence degree needs to be calculated. Figure 4 is a flowchart illustrating an exemplary procedure of a confidence calculation method according to the present invention. The confidence calculation method will be described in detail below.

[0057] First, a specific ACW such as ACW1 or ACW2 described above has been obtained by using any one of the ACW calculation methods.

[0058] In step 401, a character width range (referred to as a first range) is set around the specific ACW.

[0059] For example, the first range may be represented as follows:

[0060] [minCharWidth, maxCharWidth],

[0061] where

[0062] minCharWidth=coeffl*sAveCharWidth,

[0063] maxCharWidth=coeff*sAveCharWidth.

[0064] Among them, "minCharWidth" and "maxCharWidth" are the lower limit and upper limit of the fir...
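The range in [0060] to [0063] can be sketched in a few lines. The parameters `coeff_low` and `coeff_high` stand in for the two coefficients in the formulas above, and their default values (0.5 and 1.5) are hypothetical, chosen only for illustration. The text above is cut off before it says how the confidence is derived from the range, so the `fraction_in_range` helper is an assumption showing one plausible use of the range, not the patent's stated method.

```python
def character_width_range(ave_char_width, coeff_low=0.5, coeff_high=1.5):
    """Return (minCharWidth, maxCharWidth) around a given average width.

    The coefficient defaults are hypothetical; the source gives no values.
    """
    if not 0 < coeff_low <= coeff_high:
        raise ValueError("expected 0 < coeff_low <= coeff_high")
    return coeff_low * ave_char_width, coeff_high * ave_char_width


def fraction_in_range(widths, width_range):
    """Assumed helper: share of character widths that fall inside the range."""
    low, high = width_range
    if not widths:
        return 0.0
    inside = sum(1 for w in widths if low <= w <= high)
    return inside / len(widths)


first_range = character_width_range(10.0)
print(first_range)  # (5.0, 15.0)

widths = [10, 10, 6, 12, 25]
print(fraction_in_range(widths, first_range))  # 0.8
```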

Third embodiment

[0069] A third embodiment of the present invention will be described with reference to Figure 5 to illustrate another confidence calculation method, which can be used in the method for determining the average character width described above. Figure 5 is a flowchart showing an exemplary procedure of another confidence calculation method according to the present invention. The confidence calculation method is described in detail below.

[0070] First, as in the second embodiment, a specific average character width has been obtained. In step 501, the characters in the initial segmentation result described above are clustered into different groups according to their widths by using a clustering method. Here, the clustering method is not limited to a specific method, and any clustering method that can divide characters into different groups is applicable to the present invention. For ease of understanding, an example of a clustering method is sho...
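Since the text states that any clustering method is applicable, the following is one simple possibility rather than the example the patent goes on to describe (which is cut off above). It is a one-dimensional gap-based grouping: widths are sorted, and a new group starts whenever the gap to the previous width exceeds `max_gap`. The function name and the `max_gap` parameter are assumptions made for this sketch.

```python
def cluster_widths(widths, max_gap=2):
    """Group character widths so that neighbours within a group differ
    by at most `max_gap` pixels (a hypothetical tuning parameter)."""
    if not widths:
        return []
    ordered = sorted(widths)
    clusters = [[ordered[0]]]
    for w in ordered[1:]:
        if w - clusters[-1][-1] <= max_gap:
            clusters[-1].append(w)   # close enough: same group
        else:
            clusters.append([w])     # large gap: start a new group
    return clusters


widths = [10, 11, 9, 20, 21, 5, 10]
clusters = cluster_widths(widths)
print(clusters)  # [[5], [9, 10, 10, 11], [20, 21]]
```

In this toy input the middle group plausibly holds normally segmented characters, while the narrow and wide groups could correspond to split and merged characters respectively; that interpretation is illustrative only.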

Abstract

The invention provides a method and a device for determining an average character width, and a method and equipment for character segmentation. The method for determining the average character width of a character set comprises the following steps: obtaining a first average character width of the character set; obtaining a confidence coefficient of the first average character width by a confidence calculation method, wherein the confidence coefficient represents how close the first average character width is to the real average character width of the character set; and determining, according to the confidence coefficient of the first average character width, whether the average character width of the character set is the first average character width or a second average character width.
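The selection step in the abstract can be pictured with the short sketch below. The abstract only says the choice is made "according to the confidence coefficient"; comparing against a fixed `threshold`, the value 0.5, and the names `select_average_character_width` and `compute_acw2` are all assumptions made for illustration.

```python
def select_average_character_width(acw1, confidence1, compute_acw2, threshold=0.5):
    """Keep ACW1 when its confidence is high enough, otherwise fall back
    to a second average character width.

    `compute_acw2` is a hypothetical callable that produces the second
    width; the threshold rule is an assumption, not taken from the source.
    """
    if confidence1 >= threshold:
        return acw1
    return compute_acw2()


# High confidence keeps the first width; low confidence uses the second.
kept = select_average_character_width(9.5, 0.8, lambda: 12.0)
replaced = select_average_character_width(9.5, 0.2, lambda: 12.0)
print(kept, replaced)  # 9.5 12.0
```

Passing the second method as a callable means it only runs when the first width is rejected, which is a design choice of this sketch.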

Description

Technical field

[0001] The present invention relates to a character segmentation method and apparatus for segmenting characters in a document image (in particular, in text lines or text columns), and more particularly to a method and apparatus for performing character segmentation by using an average character width, where the average character width is obtained by the method and apparatus for determining the average character width of a text line or a text column in a document image.

Background art

[0002] In an Optical Character Recognition (OCR) system, character segmentation is usually performed, for example, by using the "black pixel projection" method. However, two kinds of segmentation errors occur with this method. One segmentation error is "unable to detect segmentation points of connected characters", and the other is "a character consisting of at least two parts with whitespace between these parts (hereinafter, for brevity, such will b...
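For readers unfamiliar with black pixel projection, here is a minimal sketch of the general technique for a horizontal text line. It assumes a binarized image given as rows of 0/1 values (1 meaning a black pixel); the function name and data layout are illustrative and are not taken from the patent.

```python
def projection_segments(image):
    """Split a binarized text line into column spans using the vertical
    black pixel projection.

    `image` is a list of rows, each a list of 0/1 values (1 = black).
    Returns (start, end) column spans with `end` exclusive.
    """
    if not image:
        return []
    width = len(image[0])
    # Number of black pixels in each column.
    profile = [sum(row[x] for row in image) for x in range(width)]

    segments = []
    start = None
    for x, count in enumerate(profile):
        if count > 0 and start is None:
            start = x                      # a run of inked columns begins
        elif count == 0 and start is not None:
            segments.append((start, x))    # an empty column ends the run
            start = None
    if start is not None:
        segments.append((start, width))
    return segments


image = [
    [1, 1, 0, 1, 0, 1, 1, 1],
    [1, 1, 0, 0, 0, 1, 0, 1],
]
print(projection_segments(image))  # [(0, 2), (3, 4), (5, 8)]
```

The sketch also makes the first kind of error easy to see: connected characters have no empty column between them, so they come out as a single span.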

Application Information

Patent Type & Authority: Applications (China)
IPC: IPC(8): G06K9/34
Inventor: 许梅芳, 罗兆海
Owner: CANON KK