Method and device for extracting characters of document

A document text extraction method in the field of image processing that addresses the insufficient universality of existing methods and achieves wide applicability and accurate extraction.

Inactive Publication Date: 2017-11-17
INST OF AUTOMATION CHINESE ACAD OF SCI +1

AI Technical Summary

Problems solved by technology

In practical applications, stamps in the image often overlap and intersect the text content of the document, stamp locations are random, and stamp types are diverse. Existing methods are not general enough to handle these cases.



Embodiment Construction

[0035] To address the above technical problems, the present invention provides a document text extraction method that processes images containing both a document text area and patterns, prevents the patterns from interfering with the document text area, and separates out a clean document text area. The method uses computer vision and image processing technology. It is suitable for images obtained by scanners and by imaging equipment such as high-speed cameras, and can also be used for electronic official document images, so its range of application is wide and the method is universal.

[0036] In order to make the object, technical solution and advantages of the present invention clearer, the present invention will be described in further detail below in conjunction with specific embodiments and with reference to the accompanying drawings.

[0037] An aspect of the embodiments of the present invention provides a method for extracting document text, ...


Abstract

The invention provides a method for extracting the text of a document. The method comprises the following steps: extracting, from an image containing a document text region and a pattern, the region that has the same color as the pattern; converting the image to gray scale and obtaining the foreground region of the image, where the foreground region contains both the document text region and the pattern; extracting the contour shape of the pattern from the foreground region, where any document text inside the contour shape differs in color from the pattern; fusing the contour shape with the same-color region to obtain their common region; and removing the common region from the foreground region to obtain an image that contains only the document text. The invention also provides a device for extracting the text of a document. The device uses computer vision and image processing technology, is suitable for scanned images and for images obtained through imaging equipment, can also be used for electronic official document images, and is therefore widely applicable and universal.
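The steps in the abstract can be illustrated with a minimal pure-Python sketch on boolean masks. Everything specific here is an assumption made for illustration and is not taken from the patent: the pattern is assumed to be red, the thresholds (`min_red`, `margin`, `threshold`) are arbitrary, the function names are hypothetical, and the contour-shape step is replaced by a crude stand-in (the foreground pixels inside the bounding box of the same-color pixels).

```python
from typing import List, Tuple

Pixel = Tuple[int, int, int]   # (R, G, B), each 0-255
Image = List[List[Pixel]]
Mask = List[List[bool]]


def color_mask(img: Image, min_red: int = 150, margin: int = 60) -> Mask:
    """Region with the same color as the pattern (assumed here to be red)."""
    return [[r >= min_red and r - max(g, b) >= margin for (r, g, b) in row]
            for row in img]


def foreground_mask(img: Image, threshold: int = 200) -> Mask:
    """Gray-scale the image and keep pixels darker than a fixed threshold."""
    def gray(p: Pixel) -> int:
        r, g, b = p
        return (299 * r + 587 * g + 114 * b) // 1000
    return [[gray(p) < threshold for p in row] for row in img]


def shape_mask(fg: Mask, seed: Mask) -> Mask:
    """Stand-in for contour extraction (an assumption, not the patent's step):
    foreground pixels inside the bounding box of the seed pixels."""
    coords = [(y, x) for y, row in enumerate(seed)
              for x, hit in enumerate(row) if hit]
    if not coords:
        return [[False] * len(row) for row in fg]
    ys = [y for y, _ in coords]
    xs = [x for _, x in coords]
    top, bottom, left, right = min(ys), max(ys), min(xs), max(xs)
    return [[v and top <= y <= bottom and left <= x <= right
             for x, v in enumerate(row)]
            for y, row in enumerate(fg)]


def extract_text_mask(img: Image) -> Mask:
    """Foreground minus the region common to the shape and the same-color mask."""
    same_color = color_mask(img)
    fg = foreground_mask(img)
    shape = shape_mask(fg, same_color)
    common = [[s and c for s, c in zip(s_row, c_row)]
              for s_row, c_row in zip(shape, same_color)]
    return [[f and not c for f, c in zip(f_row, c_row)]
            for f_row, c_row in zip(fg, common)]
```

Calling `extract_text_mask` on an image in which dark text overlaps a red stamp keeps the dark pixels inside the stamp's extent and drops the red ones. A real implementation would need a proper contour or shape detector and color thresholds tuned to the stamp ink.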

Description

Technical field

[0001] The invention relates to the technical field of image processing, in particular to a document text extraction method and an extraction device.

Background art

[0002] To improve work efficiency, automatic entry of basic document information is an important part of information protection for relevant departments. Relatively mature technologies and products exist for automatic recognition of text in conventional documents. However, for images in which patterns cover the document text area, the technology for automatically extracting and recognizing the document text content still needs to mature. Take, for example, an image in which a stamp covers the text of the document. Such an image is affected by physical factors such as uneven stamping force, differences in ink pad quality, and uneven paper thickness, by the limitations of imaging equipment such as scanners or cameras, as well as lighting, shooting, etc. Influenced by th...


Application Information

Patent Type & Authority: Applications (China)
IPC(8): G06K9/00, G06K9/38
CPC: G06V30/40, G06V10/28
Inventor: 王彦情, 崔晓光, 张吉祥
Owner: INST OF AUTOMATION CHINESE ACAD OF SCI