Character post processing method and device based on regular expression

A regular-expression-based text post-processing technique, applied in the field of text recognition. It addresses problems such as missing first and last characters in text recognition results and failure to obtain any recognition result, and it offers flexible and convenient criterion settings, avoidance of segmentation errors, and good versatility.

Active Publication Date: 2012-08-15
HANVON CORP

AI Technical Summary

Problems solved by technology

When an existing exact-matching approach is applied to text post-processing and none of the candidate characters matches exactly, no recognition result can be obtained.
[0004] In addition, none of the current text post-processing methods can recover first and last characters that are missing from the text recognition results because of incorrect text segmentation.



Embodiment Construction

[0047] Specific embodiments of the present invention are described in detail below with reference to the accompanying drawings.

[0048] The method of the present invention defines a grammar, based on regular expressions, for describing post-processing criteria; this grammar can meet the requirements of most text post-processing tasks. A corresponding post-processing matching method is designed for each grammatical element of the grammar. The text recognition results are matched and scored against the criteria, and the result that best satisfies them is extracted, thereby correcting the text recognition results and improving the accuracy of text recognition.
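The match-and-score idea in paragraph [0048] can be illustrated with a minimal sketch. It is not the patent's grammar or scoring rule: the criterion is reduced to one allowed character set per position, and the names `best_match`, `DIGITS`, and `mismatch_penalty` are hypothetical. It assumes the recognizer supplies, for each character position, a list of candidates with confidence scores.

```python
from typing import List, Set, Tuple

# Hypothetical simplified criterion element: the set of decimal digits.
DIGITS: Set[str] = set("0123456789")

Candidate = Tuple[str, float]  # (character, recognizer confidence)


def best_match(candidates: List[List[Candidate]],
               criterion: List[Set[str]],
               mismatch_penalty: float = 0.5) -> Tuple[str, float]:
    """Pick, per position, the best-scoring candidate allowed by the criterion.

    If no candidate at a position satisfies the criterion, keep the top
    candidate anyway but subtract a penalty, so a result is still produced
    (an assumption of this sketch, not a rule taken from the source).
    """
    if len(candidates) != len(criterion):
        raise ValueError("this sketch assumes one criterion slot per position")
    chars: List[str] = []
    total = 0.0
    for options, allowed in zip(candidates, criterion):
        matching = [(c, s) for c, s in options if c in allowed]
        if matching:
            char, score = max(matching, key=lambda cs: cs[1])
        else:
            char, score = max(options, key=lambda cs: cs[1])
            score -= mismatch_penalty
        chars.append(char)
        total += score
    return "".join(chars), total


# Made-up recognizer output for a three-digit field.
lattice = [
    [("l", 0.6), ("1", 0.5)],
    [("O", 0.7), ("0", 0.6)],
    [("3", 0.9)],
]
text, score = best_match(lattice, [DIGITS] * 3)
print(text, score)
```

Here the top-ranked characters "l" and "O" are replaced by the lower-ranked digits "1" and "0" because only those satisfy the digit criterion.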

[0049] As shown in Figure 1, the regular-expression-based text post-processing method of the present invention proceeds as follows:

[0050] Step 1. Set the post-processing criterion expression of the current recognition area...


Abstract

The invention discloses a character post-processing method and device based on regular expressions and belongs to the field of character recognition. The method and device address shortcomings of existing character post-processing methods, such as poor reusability and extensibility. The method comprises the following steps: a post-processing criterion expression for the current recognition region is set according to the post-processing criterion grammar; the expression is parsed into a tree data structure; the recognition results are matched against it; and the character post-processing result with the highest match score is obtained. Because the grammatical elements of regular expressions are used to describe the post-processing criteria for recognition results with different post-processing requirements, the method offers good generality, extensibility, and expressive power, so that setting post-processing criteria is flexible, convenient, and fast.
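The parsing step named in the abstract (criterion expression to tree data structure) can be sketched as follows. The tiny grammar here (literals, `\d`, `{n}` repetition, `|` alternation, and parentheses) and the `Node` and `parse` names are assumptions made for illustration; the source does not spell out the patent's actual criterion grammar.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Node:
    kind: str                  # "alt", "seq", "repeat", "digit", or "literal"
    value: str = ""            # literal character, or repeat count as text
    children: List["Node"] = field(default_factory=list)


def parse(expr: str) -> Node:
    """Recursive-descent parser for a hypothetical mini criterion grammar."""
    pos = 0

    def parse_alt() -> Node:
        nonlocal pos
        branches = [parse_seq()]
        while pos < len(expr) and expr[pos] == "|":
            pos += 1
            branches.append(parse_seq())
        return branches[0] if len(branches) == 1 else Node("alt", children=branches)

    def parse_seq() -> Node:
        items = []
        while pos < len(expr) and expr[pos] not in "|)":
            items.append(parse_item())
        if not items:
            raise ValueError("empty sequence in expression")
        return items[0] if len(items) == 1 else Node("seq", children=items)

    def parse_item() -> Node:
        nonlocal pos
        atom = parse_atom()
        if pos < len(expr) and expr[pos] == "{":
            close = expr.index("}", pos)       # raises ValueError if missing
            count = int(expr[pos + 1:close])
            pos = close + 1
            return Node("repeat", value=str(count), children=[atom])
        return atom

    def parse_atom() -> Node:
        nonlocal pos
        if expr[pos] == "(":
            pos += 1
            inner = parse_alt()
            if pos >= len(expr) or expr[pos] != ")":
                raise ValueError("missing ')'")
            pos += 1
            return inner
        if expr.startswith("\\d", pos):
            pos += 2
            return Node("digit")
        char = expr[pos]
        pos += 1
        return Node("literal", value=char)

    tree = parse_alt()
    if pos != len(expr):
        raise ValueError(f"unexpected character at position {pos}")
    return tree


tree = parse(r"A\d{3}|B")
print(tree)
```

Each node kind would then get its own matching routine, which is one plausible reading of designing a matching method per grammatical element.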

Description

Technical field

[0001] The invention belongs to the field of character recognition and relates in particular to a method and device for post-processing characters based on regular expressions.

Background

[0002] Text post-processing is the process of selecting, after the recognition result candidates have been obtained, the recognition result character string that best meets preset post-processing criteria. In previous post-processing methods, the criteria and the corresponding matching methods were mostly designed case by case according to actual needs. For example, text recognition based on Optical Character Recognition (OCR) essentially converts text images into text recognition results. If the text content has a certain semantic meaning, post-processing criteria can be used to correct the recognition result: if the text is an ID number, the post-processing can correct the recognition re...
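As a point of reference for the ID-number case, the sketch below shows a plain exact-match filter built on Python's standard `re` module. It is a baseline for comparison, not the patented method. The pattern (17 digits followed by a digit or `X`), the function name `pick_exact`, and the sample strings are assumptions for illustration.

```python
import re
from typing import List, Optional

# Assumed ID-number format for this sketch: 17 digits, then a digit or 'X'.
ID_PATTERN = re.compile(r"\d{17}[\dX]")


def pick_exact(candidates: List[str]) -> Optional[str]:
    """Return the first candidate string that matches the pattern exactly."""
    for text in candidates:
        if ID_PATTERN.fullmatch(text):
            return text
    return None  # no candidate matched, so no result is produced


# Made-up candidate strings, ordered by recognizer rank.
print(pick_exact(["11O10519491231002X", "11010519491231002X"]))
print(pick_exact(["11O10519491231002X"]))
```

The second call returns `None`: with exact matching alone, a single misrecognized character in every candidate leaves the post-processor with no result at all.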


Application Information

Patent Type & Authority: Applications (China)
IPC(8): G06F17/30, G06K9/20
Inventor: 王晓健
Owner: HANVON CORP