
Method for eliminating interference of seal on document extraction

A technique for eliminating the interference of seals on document extraction. It addresses the problems of existing approaches, such as mis-filtering of valid information, poor removal results, and poor model generalization, with the effect of reducing duplicate recognition and misrecognition, lowering the probability of comparison errors, and improving accuracy.

Pending Publication Date: 2022-02-18
达观数据(苏州)有限公司

AI Technical Summary

Problems solved by technology

[0005] The first method is simple and effective, but it works poorly for stamps that are not pure red. If the removal threshold is widened indiscriminately, the extraction of other valid text suffers, especially when the document contains valid red text: the method mistakenly filters out that text, causing errors in subsequent business processes.
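The threshold trade-off described in [0005] can be illustrated with a minimal sketch. It assumes the first method is a simple per-pixel color filter. The rule `is_seal_red`, the `margin` parameter, and the sample pixel values are all hypothetical and chosen only for illustration; they are not taken from the patent.

```python
WHITE = (255, 255, 255)

def is_seal_red(pixel, margin):
    """Hypothetical rule: the red channel exceeds both others by at least `margin`."""
    r, g, b = pixel
    return r - max(g, b) >= margin

def remove_red(image, margin):
    """Return a copy of `image` (rows of RGB tuples) with seal-red pixels set to white."""
    return [[WHITE if is_seal_red(p, margin) else p for p in row] for row in image]

# Illustrative pixels (assumed values, not measured from real scans).
pure_red_stamp = (230, 20, 20)    # strongly red seal ink
faded_stamp = (200, 120, 130)     # pinkish seal ink, not pure red
red_text = (180, 60, 50)          # valid red text that must be kept
black_text = (30, 30, 30)         # ordinary text

image = [[pure_red_stamp, faded_stamp, red_text, black_text]]

# A strict threshold removes only the pure red stamp and misses the faded one.
strict = remove_red(image, margin=150)

# A widened threshold also removes the faded stamp, but erases the red text too.
loose = remove_red(image, margin=60)
```

With the strict margin the faded stamp survives; with the widened margin the valid red text is erased along with it, which is the failure mode the paragraph describes.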
[0006] The second method works well on complex scenes, but its model is heavy and consumes more system resources. In addition, training a generative adversarial network to remove stamps requires a large amount of data, and labeling real samples alone is often not enough: besides annotating the scanned document that bears the seal, the original document as it was before stamping must also be restored.
Training data therefore often has to be synthesized, but synthetic data inevitably differs from real data, which leads to poor generalization of the trained model.
In addition, the generative adversarial network is not perfect when restoring text: some pixels will be missing or distorted, which reduces the recognition performance of subsequent models.



Examples


Embodiment Construction

[0021] The following will clearly and completely describe the technical solutions in the embodiments of the present invention with reference to the accompanying drawings in the embodiments of the present invention. Obviously, the described embodiments are only some, not all, embodiments of the present invention. Based on the embodiments of the present invention, all other embodiments obtained by persons of ordinary skill in the art without making creative efforts belong to the protection scope of the present invention.

[0022] In describing the present invention, it should be understood that the orientation or positional relationship indicated by the terms "longitudinal", "transverse", "upper", "lower", "front", "rear", "left", "right", "vertical", "horizontal", "top", "bottom", "inner", "outer", etc. is based on the orientation or positional relationship shown in the drawings, and is only for the convenience of describing the present invention and simplifying the descriptio...


Abstract

The invention discloses a method for eliminating the interference of a seal on document extraction. For an image P bearing a seal, the method comprises the steps of: applying an image processing method to the image P to eliminate the seal, obtaining an image Q; performing text detection on the image Q to obtain the coordinates of the text regions; and performing text recognition on the regions of the image P located at those coordinates. In the intelligent auditing of bank statements, eliminating seal interference improves text detection, so that more accurate digit recognition results can be obtained; in the intelligent comparison of contracts, duplicate recognition and misrecognition caused by seal interference are reduced, which lowers the probability of subsequent comparison errors.
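The three steps in the abstract can be sketched as a small pipeline. This is a minimal illustration under stated assumptions, not the patent's implementation: the function names (`extract_text`, `crop`), the box convention, and the toy stand-ins for seal removal, detection, and recognition are all hypothetical. The one detail taken from the abstract is that detection runs on the cleaned image Q while recognition crops from the original image P.

```python
def crop(image, box):
    """Cut a (left, top, right, bottom) box out of an image stored as rows of pixels.
    The right and bottom bounds are exclusive (an assumed convention)."""
    left, top, right, bottom = box
    return [row[left:right] for row in image[top:bottom]]

def extract_text(image_p, remove_seal, detect_text, recognize):
    """Detect text on the seal-free image Q, then recognize on crops of the original P."""
    image_q = remove_seal(image_p)   # step 1: image processing removes the seal
    boxes = detect_text(image_q)     # step 2: text detection on Q yields coordinates
    # Step 3: recognition uses the same coordinates, but on the original image P.
    return [(box, recognize(crop(image_p, box))) for box in boxes]

# Toy stand-ins (hypothetical). Pixel codes: 0 = background, 1 = text,
# 2 = seal ink on background, 3 = text overlapped by seal ink.
def toy_remove_seal(image):
    mapping = {0: 0, 1: 1, 2: 0, 3: 1}
    return [[mapping[v] for v in row] for row in image]

def toy_detect_text(image):
    boxes = []
    for y, row in enumerate(image):
        cols = [x for x, v in enumerate(row) if v != 0]
        if cols:
            boxes.append((cols[0], y, cols[-1] + 1, y + 1))
    return boxes

def toy_recognize(region):
    # Returns the raw pixel codes as a string, to show which image was cropped.
    return "".join(str(v) for row in region for v in row)

image_p = [
    [0, 1, 3, 0],   # a text line partly covered by the seal
    [2, 2, 0, 0],   # seal ink only; no text line should be detected here
]

result = extract_text(image_p, toy_remove_seal, toy_detect_text, toy_recognize)
```

In this toy run the seal-only row produces no text box, and the recognized region still contains the original pixel code 3, showing that the crop came from P rather than Q.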

Description

Technical field

[0001] The invention belongs to the field of optical character recognition, and in particular relates to a method for eliminating the interference of stamps on document extraction.

Background art

[0002] Optical character recognition technology has been widely used in business fields such as intelligent document review and the comparison of electronically scanned documents. The accuracy of the extraction results directly determines the effect of intelligent document review, comparison, and other services. The higher the extraction accuracy, the simpler the post-processing logic of the corresponding business, and the stronger the robustness of the system.

[0003] Electronically scanned documents are often stamped with various official seals and signature stamps. When a stamp overlaps the text, it interferes with the model's detection and recognition of the text, resulting in errors in the extraction results, making the originally normal document rev...


Application Information

IPC(8): G06V10/26, G06V20/62, G06V30/10
Inventor 潘新星陶提黄登高翔陈运文纪达麒
Owner 达观数据(苏州)有限公司