Longest common substring automatic error correction method and system based on OCR recognition result

A longest-common-substring automatic error correction technology, applied in the field of image optical character recognition. It addresses several problems in existing approaches: they ignore each field's own rules and the business rules between fields, they do not check or correct erroneous data, and they are difficult to apply in a business production environment.

Pending Publication Date: 2020-05-08
上海迈弦网络科技有限公司

AI Technical Summary

Problems solved by technology

The existing method runs multiple OCR programs and identifies wrongly recognized data by voting. It ignores the rules that each field in the recognized file content follows on its own, as well as the business rules between fields, and it does not use these rules to check and correct erroneous data. As a result, when recognizing electronic fax image files with very low resolution, the success rate of the recognition output is very unsatisfactory, and the method is difficult to apply in a business production environment.

Examples

Embodiment Construction

[0037] Embodiments of the present invention are described below through specific examples, and those skilled in the art can easily understand other advantages and effects of the present invention from the content disclosed in this specification. The present invention can also be implemented or applied through other different specific embodiments, and various modifications or changes can be made to the details in this specification based on different viewpoints and applications without departing from the spirit of the present invention. It should be noted that, in the case of no conflict, the following embodiments and features in the embodiments can be combined with each other.

[0038] It should be noted that the diagrams provided in the following embodiments only schematically illustrate the basic ideas of the present invention, and only the components related to the present invention are shown in the diagrams rather than the number, shape and size of the components in ...

Abstract

The invention provides a longest common substring automatic error correction method and system based on an OCR recognition result, and relates to the field of image optical character recognition. The method comprises the following steps: step 1, obtaining a character string in a to-be-detected image file through OCR software recognition; step 2, preprocessing the character string obtained by the OCR software recognition; step 3, performing character error correction replacement processing on the preprocessed character string; and step 4, performing longest common substring matching calculation processing based on the character string subjected to character error correction replacement processing, and outputting a correct result. According to the method, automatic error correction and replacement are carried out on the character string recognized by the OCR software, then longest common substring matching calculation is carried out between the corrected character string and the target character string that needs to be output, and the correct target character string is output. This solves the problem of a low recognition output success rate when image files with low definition are recognized.
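Steps 2 to 4 of the abstract can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the patented implementation: the confusion table `CONFUSIONS`, the function names, and the idea of choosing among a list of known candidate target strings are all hypothetical, since the abstract does not specify the preprocessing rules, the replacement table, or how target strings are supplied.

```python
from typing import Iterable, Optional

# Hypothetical confusion table (an assumption, not from the patent):
# letters that OCR commonly misreads in place of digits in numeric fields.
CONFUSIONS = str.maketrans({"O": "0", "o": "0", "l": "1", "I": "1", "S": "5", "B": "8"})


def preprocess(raw: str) -> str:
    """Step 2 (assumed form): drop whitespace and punctuation noise."""
    return "".join(ch for ch in raw if ch.isalnum())


def correct_chars(text: str) -> str:
    """Step 3 (assumed form): replace commonly confused characters."""
    return text.translate(CONFUSIONS)


def longest_common_substring(a: str, b: str) -> str:
    """Return the longest contiguous substring shared by a and b.

    Classic dynamic programming over two rolling rows, O(len(a) * len(b)).
    """
    best_len = 0
    best_end = 0  # end index (exclusive) of the best match inside `a`
    prev = [0] * (len(b) + 1)
    for i in range(1, len(a) + 1):
        curr = [0] * (len(b) + 1)
        for j in range(1, len(b) + 1):
            if a[i - 1] == b[j - 1]:
                curr[j] = prev[j - 1] + 1
                if curr[j] > best_len:
                    best_len = curr[j]
                    best_end = i
        prev = curr
    return a[best_end - best_len:best_end]


def best_match(ocr_text: str, candidates: Iterable[str]) -> Optional[str]:
    """Step 4 (assumed form): pick the candidate target string that shares
    the longest common substring with the corrected OCR string."""
    cleaned = correct_chars(preprocess(ocr_text))
    return max(
        candidates,
        key=lambda c: len(longest_common_substring(cleaned, c)),
        default=None,
    )
```

For example, with a hypothetical list of known account numbers, `best_match("62l7 O0l2 345", ["6217001234567", "6217991200000"])` cleans the OCR string to `62170012345` and selects the first candidate, because it shares an 11-character substring with it while the second shares only 4 characters. A production system would likely also need a minimum match length so that a poor match is rejected rather than returned.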

Description

Technical field

[0001] The invention relates to the field of image optical character recognition, in particular to a longest common substring automatic error correction method and system based on OCR recognition results.

Background technique

[0002] OCR software refers to software that uses OCR (Optical Character Recognition) technology to recognize, extract and convert text content on pictures, photos, electronic faxes and other images into editable text. Devices such as scanners, cameras, and electronic fax machines acquire and save image files; the OCR software then reads and analyzes the image files and extracts character strings through character recognition.

[0003] When digitally managing various documents such as tax invoices, contracts, fund transaction notes, and transfer instructions, OCR software is required to automatically identify the image content of the fixed area of the document and extract information such as the account and amoun...

Application Information

IPC(8): G06K9/20, G06K9/68
CPC: G06V10/22, G06V30/244, G06V30/248, G06V30/1983, G06V30/10
Inventor: 叶瑞, 叶凯迪, 陆爱亮
Owner 上海迈弦网络科技有限公司