Character information identification device and method

A character information recognition technology, applied in character and pattern recognition, instrumentation, computing, etc. It addresses the low recognition rate for entire Email address strings, the large variation in the deformation of handwritten characters, and low confidence in character output.

Inactive Publication Date: 2009-09-02
FUJITSU LTD
1 Cites, 48 Cited by

AI Technical Summary

Problems solved by technology

This method is often called recognition-based segmentation. It depends heavily on the performance of the classifier: the classifier must output high confidence for complete characters and low confidence for characters with incomplete or redundant strokes.
Unlike machine-printed characters, handwritten characters often show widely varying deformations, so it is difficult for a classifier to meet these requirements. As a result, this method does not achieve a high recognition rate for an entire Email address string.
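The dependence on classifier confidence described above can be illustrated with a minimal sketch. Nothing here is taken from the patent: the pieces are plain strings standing in for stroke fragments, `make_toy_classifier` and `best_segmentation` are hypothetical names, and the confidence values are invented for illustration. The sketch tries every way of merging up to `max_merge` adjacent pieces into one character and keeps the path with the highest summed log-confidence.

```python
import math
from typing import Callable, Dict, List, Tuple

Classifier = Callable[[str], Tuple[str, float]]


def make_toy_classifier(table: Dict[str, Tuple[str, float]]) -> Classifier:
    """Hypothetical classifier: maps a merged piece to (label, confidence)."""
    def classify(shape: str) -> Tuple[str, float]:
        # Unknown shapes get a tiny confidence instead of zero so log() is defined.
        return table.get(shape, ("?", 1e-6))
    return classify


def best_segmentation(pieces: List[str], classify: Classifier,
                      max_merge: int = 2) -> Tuple[str, float]:
    """Dynamic programming over all ways of merging adjacent pieces."""
    n = len(pieces)
    # best[i] = (summed log-confidence, recognized text) for the first i pieces
    best = [(-math.inf, "")] * (n + 1)
    best[0] = (0.0, "")
    for end in range(1, n + 1):
        for start in range(max(0, end - max_merge), end):
            label, conf = classify("".join(pieces[start:end]))
            score = best[start][0] + math.log(conf)
            if score > best[end][0]:
                best[end] = (score, best[start][1] + label)
    return best[n][1], best[n][0]


# Assumed scenario: a "d" written with a gap, so its halves look like "c" and "l".
pieces = ["c", "l", "o"]
confident = make_toy_classifier(
    {"c": ("c", 0.6), "l": ("l", 0.6), "o": ("o", 0.9), "cl": ("d", 0.9)})
unsure = make_toy_classifier(
    {"c": ("c", 0.6), "l": ("l", 0.6), "o": ("o", 0.9), "cl": ("d", 0.3)})

print(best_segmentation(pieces, confident)[0])
print(best_segmentation(pieces, unsure)[0])
```

With the confident classifier the merged pieces win and the string is read as "do"; when the classifier's confidence for the complete character drops, the over-segmented reading "clo" scores higher. This is the sensitivity the passage attributes to handwritten input.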



Examples


First embodiment

[0081] The following takes the identification of an Email address as an example to describe the first embodiment of the present invention in detail.

[0082] Figure 1 is a schematic structural block diagram of the character recognition device 1 according to the first embodiment of the present invention. The input of the character recognition device 1 is image data of a handwritten character string, and the output is the recognized character string. As shown in Figure 1, the character recognition device 1 includes a segmentation unit 10, a delimiter recognition unit 20, a character recognition unit 30, and a dictionary database 40. The character recognition device 1 can be externally connected to equipment such as a digital camera, a scanner, a PDA, or a mobile phone, from which a scanned or handwritten Email address character string image is input. The segmentation unit 10 segments the character string image into a plurality of independent segments. The delimiter identifyi...
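As a rough illustration of how the four components named in [0082] could fit together, the following sketch models the data flow only. It makes strong simplifying assumptions that are not in the source: the "image" is a plain string, each symbol is treated as exactly one segment, the delimiter set is `@` and `.`, and the dictionary is a short list of domain suffixes. The function names are hypothetical, and `difflib.get_close_matches` is a stand-in for whatever matching the device actually performs.

```python
import difflib
from dataclasses import dataclass
from typing import List, Tuple

DELIMITERS = {"@", "."}               # assumed delimiter set for an Email address
DICTIONARY = ["com", "net", "org"]    # assumed contents of dictionary database 40


@dataclass
class Segment:
    glyph: str  # stand-in for the pixel data of one segment


def segment_image(image: str) -> List[Segment]:
    """Segmentation unit 10 (stand-in): one segment per symbol."""
    return [Segment(ch) for ch in image]


def split_at_delimiters(segments: List[Segment]) -> Tuple[List[List[Segment]], List[str]]:
    """Delimiter recognition unit 20 (stand-in): group segments into segment sets."""
    groups: List[List[Segment]] = []
    found: List[str] = []
    current: List[Segment] = []
    for seg in segments:
        if seg.glyph in DELIMITERS:
            groups.append(current)
            found.append(seg.glyph)
            current = []
        else:
            current.append(seg)
    groups.append(current)
    return groups, found


def recognize_field(group: List[Segment], dictionary: List[str]) -> str:
    """Character recognition unit 30 (stand-in): snap a field to a close dictionary entry."""
    raw = "".join(seg.glyph for seg in group)
    close = difflib.get_close_matches(raw, dictionary, n=1, cutoff=0.6)
    return close[0] if close else raw


def recognize_string(image: str, dictionary: List[str] = DICTIONARY) -> str:
    groups, found = split_at_delimiters(segment_image(image))
    fields = [recognize_field(group, dictionary) for group in groups]
    # Re-interleave the recognized fields with the delimiters that separated them.
    out = fields[0]
    for delim, field in zip(found, fields[1:]):
        out += delim + field
    return out


print(recognize_string("bob@example.c0m"))
```

Here the misread suffix `c0m` is close enough to the dictionary entry `com` to be corrected, while the free-form fields pass through unchanged. In the actual device a segment may be only part of a character, which this sketch does not model.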

Second embodiment

[0136] An exemplary second embodiment of the present invention will be described below.

[0137] The basic structure of the character recognition device 1 of the second embodiment is the same as that of the first embodiment described above: it includes a segmentation unit 10, a delimiter recognition unit 20, a character recognition unit 30, and a dictionary database 40. The difference lies in the processing performed by the character recognition unit 30, which is described in detail below. In the following description, the same reference numerals are assigned to parts that are the same as or correspond to those of the first embodiment, and repeated descriptions are omitted.

[0138] In the first embodiment described above, as shown in Figure 9, the character recognition unit 30 recognizes each segment set separated by a delimiter along different segmentation paths, that is, it recognizes in units of segmented segments...

Third embodiment

[0144] An exemplary third embodiment of the present invention will be described below.

[0145] The character recognition device 13 of the third embodiment is an improvement of the first or second embodiment described above. It includes the same segmentation unit 10, delimiter recognition unit 20, character recognition unit 30, and dictionary database 40 as the first or second embodiment, and differs in that it also includes a correction unit 50 and a post-processing unit 60. The character recognition device 13 of the third embodiment is described in detail below. In the following description, the same reference numerals are given to parts that are the same as or correspond to those of the first and second embodiments, and repeated descriptions are omitted.

[0146] In the third embodiment, before the recognition process, the input character string image is first corrected by the correction...


Abstract

The invention provides a character information identification device and a character information identification method that identify, from an input character string image, a character string divided into two or more fields by a separator. The character information identification device comprises: a segmentation unit that divides the character string image into a plurality of segments; a separator identification unit that identifies the separator based on the divided segments and thereby divides the plurality of segments into a plurality of segment sets; a dictionary database in which a plurality of predetermined character sets are stored; and an identification unit that, for each segment set, identifies at least part of the segment set as one of the character sets in the dictionary database, thereby identifying each field. According to the invention, the identification precision for information such as handwritten Email addresses and network addresses can be greatly improved, and identification can be carried out with satisfactory precision even when stroke splicing is present.
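One common way to identify a segment set "as" a dictionary entry is to score every entry against per-position classifier candidates and keep the best one. The sketch below shows that idea only; the abstract does not say how the identification unit scores entries, so the function `match_field`, the candidate confidences, and the equal-length restriction are all assumptions made for illustration.

```python
import math
from typing import Dict, List, Optional, Tuple


def match_field(candidates: List[Dict[str, float]],
                dictionary: List[str]) -> Optional[Tuple[str, float]]:
    """Return the dictionary entry with the highest summed log-confidence.

    candidates[i] maps each plausible character at position i to a
    hypothetical classifier confidence. Entries whose length differs from
    the number of positions are skipped (a simplification).
    """
    best: Optional[Tuple[str, float]] = None
    for word in dictionary:
        if len(word) != len(candidates):
            continue
        score = 0.0
        for ch, cand in zip(word, candidates):
            # Characters the classifier never proposed get a tiny confidence.
            score += math.log(cand.get(ch, 1e-6))
        if best is None or score > best[1]:
            best = (word, score)
    return best


# Invented confidences: the middle character looks slightly more like "0" than "o".
candidates = [
    {"c": 0.9, "e": 0.1},
    {"0": 0.5, "o": 0.4},
    {"m": 0.8, "n": 0.2},
]
dictionary = ["com", "net", "org", "cn"]

print(match_field(candidates, dictionary))
```

Taking the top candidate at each position independently would read `c0m`; scoring whole dictionary entries selects `com` instead, because the second-best candidate at the middle position is still far more plausible than any character of the competing entries.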

Description

Technical field

[0001] The present invention relates to a character information recognition device and method, that is, a device and method for recognizing character strings from character images. More specifically, the present invention relates to a device and method for recognizing a character string that is divided into a plurality of fields by a delimiter, where at least some of the fields have a fixed pattern.

Background art

[0002] Nowadays, it is very common to recognize various kinds of character information through OCR technology. For example, a user writes a string of characters on paper or a touch screen; the writing is converted into a character string image by scanning, photographing, or sensing; and the image is then input into a recognition system, which recognizes it and outputs the string value.

[0003] Some information, such as email addresses and network addresses, is segmented or hierarchical. Such a string is separated into more than two fields by a delimiter, and so...


Application Information

Patent Type & Authority: Applications (China)
IPC(8): G06K9/72, G06K9/00
Inventor: 郑大念, 孙俊, 直井聪, 堀田悦伸, 皆川明洋, 藤本克仁
Owner: FUJITSU LTD