
Error detection and correction method for OCR recognition result of character string based on admissible set

An error detection and correction technique for OCR-recognized character strings, applied in the field of image OCR. It addresses the problems that existing solutions ignore the inherent rules of the file numbers being recognized and provide no checking or correction of erroneous data, and it achieves the effects of correcting non-existent character strings and improving execution efficiency.

Active Publication Date: 2017-05-10
西安布斯特信息科技有限公司

AI Technical Summary

Problems solved by technology

[0006] Existing technical solutions ignore the inherent rules of the file numbers being recognized, and therefore do not use those rules to check and correct erroneous data.



Examples


Embodiment Construction

[0034] The present invention will be further described below in conjunction with drawings and embodiments.

[0035] As shown in Figure 1, the present invention provides a method for detecting and correcting errors in character string OCR recognition results according to the allowed (admissible) set, comprising the following steps:

[0036] 1) Input the allowed set and OCR recognition result;

[0037] 2) Find the repeated character strings in the OCR recognition result;

[0038] 3) Find the character strings that exist in the OCR recognition result but not in the allowed set, and record them as non-existent character strings;

[0039] 4) Find the character strings that exist in the allowed set but not in the OCR recognition result, and record them as missing character strings;

[0040] 5) The repeated, non-existent and missing character strings are all regarded as wrong character strings of the OCR recognition result;

[0041] 6) judge whether all character string...
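Steps 1) to 5) above can be sketched in Python as follows. This is only an illustrative sketch, not the patented implementation: the function name `detect_errors` and the sample file numbers are hypothetical, and the truncated step 6) is not covered.

```python
from collections import Counter


def detect_errors(allowed, recognized):
    """Classify OCR output strings against an allowed set (steps 1-5).

    `allowed` and `recognized` are iterables of strings; both names are
    assumptions made for this sketch.
    """
    allowed_set = set(allowed)
    counts = Counter(recognized)

    # Step 2: strings that occur more than once in the OCR result.
    repeated = sorted(s for s, n in counts.items() if n > 1)
    # Step 3: strings in the OCR result that are not in the allowed set.
    nonexistent = sorted(s for s in counts if s not in allowed_set)
    # Step 4: strings in the allowed set that the OCR result lacks.
    missing = sorted(allowed_set - set(counts))

    # Step 5: all three groups together are the wrong strings.
    return repeated, nonexistent, missing


# Hypothetical file numbers, for illustration only.
allowed = ["A001", "A002", "A003", "A004"]
recognized = ["A001", "A001", "A0O2", "A004"]
repeated, nonexistent, missing = detect_errors(allowed, recognized)
print(repeated, nonexistent, missing)
```

Using a `Counter` finds duplicates in one pass, and plain set difference covers the comparison in both directions.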


Abstract

The invention discloses a method for detecting and correcting errors in the OCR recognition results of character strings based on an admissible set, and belongs to the technical field of image optical character recognition (OCR). The method comprises the following steps: inputting an admissible set and an OCR recognition result; finding the repeated, non-existent and missing character strings, which together are the wrong character strings of the OCR recognition result; replacing the characters at the corresponding positions of the wrong character strings with the position-fixed characters; forming a corrected intermediate set from the replaced character strings; selecting from the intermediate set the character strings that are identical to missing character strings, which form the corrected result set; and treating the character strings in the corrected result set as successfully corrected. With this technical scheme, erroneous data in the recognition results produced by OCR software are found and then corrected, which improves the accuracy of the recognition results.
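The correction stage described in the abstract might look like the sketch below. It rests on an assumption the text available here does not confirm: a "position-fixed character" is taken to be a character that is the same at a given position in every string of the admissible set. The function names and file numbers are hypothetical.

```python
def fixed_positions(allowed):
    """Return {position: character} where every allowed string agrees.

    Assumption: this is what "position-fixed character" means here.
    """
    allowed = list(allowed)
    if not allowed:
        return {}
    length = min(len(s) for s in allowed)
    first = allowed[0]
    return {
        i: first[i]
        for i in range(length)
        if all(s[i] == first[i] for s in allowed)
    }


def correct(wrong_strings, missing, allowed):
    """Replace fixed positions, then keep results that match a missing string."""
    fixed = fixed_positions(allowed)
    intermediate = set()  # the corrected intermediate set
    for s in wrong_strings:
        chars = list(s)
        for i, ch in fixed.items():
            if i < len(chars):
                chars[i] = ch
        intermediate.add("".join(chars))
    # Corrected result set: intermediate strings equal to a missing string.
    return sorted(intermediate & set(missing))


# Hypothetical file numbers; the letter "O" was misread in place of "0".
allowed = ["XA2017001", "XA2017002", "XA2017003"]
wrong = ["XA2O17002"]
missing = ["XA2017002"]
print(correct(wrong, missing, allowed))
```

Accepting a corrected string only when it matches a missing string keeps a replacement from producing a string that was already recognized correctly.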

Description

Technical field

[0001] The invention belongs to the technical field of image OCR (optical character recognition). When OCR software reads and analyzes image files and extracts the character strings in them, the invention can check and correct misrecognized character strings, thereby helping the OCR software reduce its recognition error rate.

Background technique

[0002] OCR software is software that uses OCR technology to convert the text content of images, such as pictures and photos, into editable text. It is widely used to extract characters from various kinds of images. Usually, image information is acquired by scanners, cameras and other devices and stored in image files; the OCR software then reads and analyzes the image files and extracts the character strings in them through character recognition.

[0003] A typical application of OCR software is the automatic identification of document numbers. For e...


Application Information

Patent Type & Authority: Applications (China)
IPC(8): G06K9/20, G06F17/30
CPC: G06F16/90344, G06V10/22
Inventor: 史晨旭李向宁程培涛亿珍珍贺奎奎马乐赵志平聂振康焦炜李欢刘欢徐杰徐战辉陈瑞宫文天刘伟马鑫向克进许夏张宗正
Owner: 西安布斯特信息科技有限公司