
A Content-Based Method for Estimating the Slant Angle of Document Image

A content-based method for estimating the tilt angle of a document image, applied in computing, computer components, instruments, and related fields. It addresses the problems that a general-purpose document image tilt angle estimation method is difficult to establish and that existing methods have low versatility and reduced accuracy. It achieves prominent substantive features, enhanced versatility, and a lower relative error.

Active Publication Date: 2015-08-19
山东山大鸥玛软件股份有限公司

AI Technical Summary

Problems solved by technology

[0006] However, the accuracy of the tilt correction methods used in most image recognition technologies depends heavily on the image texture. Documents come in many types with very complex layouts, including text, tables, images, and graphics, so a general-purpose method for estimating the tilt angle of a document image is difficult to establish.
[0007] The document "Content-Based Document Image Slant Correction" publishes a method for estimating the tilt angle of a document image, but it applies length-smoothing preprocessing to the document image so that each text line merges into a connected region. As a result, only the straight line segments corresponding to tables can be detected, and the method's versatility is limited. It also selects only the longest line segment in the document as the effective straight line segment for estimating the tilt angle, which keeps the calculation fast but reduces its accuracy.



Examples


Embodiment 1

[0042] Figure 2(a) is the original document image containing straight line segments, and Figure 2(b) is the noisy document image obtained by adding Gaussian noise with a variance of 0.1 to the original image in Figure 2(a).

[0043] (1) Read the document image in Figure 2(a).

[0044] (2) Binarize the document image.

[0045] Set the binarization threshold to 128: pixels with a value greater than or equal to 128 are set to 255, and pixels with a value less than 128 are set to 0. The document image then becomes a binary image containing only the pixel values 0 and 255.
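
The thresholding rule in [0045] can be sketched in a few lines of plain Python. This is a minimal illustration rather than the patent's implementation: the function name `binarize` and the list-of-rows image representation are assumptions made for the example.

```python
def binarize(image, threshold=128):
    """Map every grey level >= threshold to 255 and every other level to 0.

    `image` is assumed to be a list of rows of integer grey levels (0-255).
    """
    return [[255 if pixel >= threshold else 0 for pixel in row] for row in image]


# Hypothetical 2x3 greyscale image used only for illustration.
gray = [
    [0, 127, 128],
    [200, 255, 64],
]
binary = binarize(gray)
```

Note that a pixel exactly at the threshold (128) is mapped to 255, matching the "greater than or equal to" wording of the paragraph.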

[0046] (3) Use the straight line segment detection method to detect each straight line segment in the binarized image.

[0047] Let the foreground pixels of the image be 0 and the background pixels be 255. When the run of consecutive foreground pixels to the left and right (or above and below) of a foreground pixel is longer than the set length threshold, that pixel is a foreground point, ...
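
Because paragraph [0047] is truncated in this excerpt, the following is only one plausible reading of the run-length idea it describes: a foreground pixel is kept when it lies in a horizontal run of foreground pixels longer than a length threshold. The function name `keep_long_horizontal_runs` and the parameter `min_length` are hypothetical, and the vertical case would be handled analogously on columns.

```python
FOREGROUND, BACKGROUND = 0, 255  # convention stated in [0047]


def keep_long_horizontal_runs(binary, min_length):
    """Keep foreground pixels only where they lie in a horizontal run
    of foreground pixels strictly longer than min_length."""
    result = [[BACKGROUND] * len(row) for row in binary]
    for y, row in enumerate(binary):
        x = 0
        while x < len(row):
            if row[x] != FOREGROUND:
                x += 1
                continue
            start = x
            # Advance to the end of the current foreground run.
            while x < len(row) and row[x] == FOREGROUND:
                x += 1
            if x - start > min_length:
                for i in range(start, x):
                    result[y][i] = FOREGROUND
    return result


# Hypothetical binary image: one long run in row 0, short runs in row 1.
image = [
    [0, 0, 0, 0, 0, 255],
    [0, 0, 255, 0, 255, 255],
]
filtered = keep_long_horizontal_runs(image, min_length=3)
```

With `min_length=3`, only the five-pixel run in the first row survives; the shorter runs in the second row are treated as non-line content.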

Embodiment 2

[0074] When the document image contains no straight line segment, or the length of its straight line segments does not reach the set threshold, as in the text line image shown in Figure 5(a), the text line detection algorithm is used to locate the centerline of each text line, and these centerlines are used as the straight line segments:

[0075] (1) Read the text image in Figure 5(a).

[0076] (2) Binarize the document image.

[0077] (3) Use the straight line segment detection method to detect each straight line segment in the binarized image.

[0078] (4) Thin each straight line segment obtained in step (3) with a thinning algorithm.

[0079] (5) Set the threshold to 32 pixels, and label each thinned straight line segment with the 8-connected component labeling method. A straight line segment whose length is greater than or equal to the threshold is a valid straight line segment. Retain each effectiv...
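
Step (5) can be sketched as an 8-connected component search followed by a length filter. This is a hedged illustration, not the patent's code: the function name `keep_long_segments` is hypothetical, and the sketch assumes that the "length" of a thinned segment can be approximated by its pixel count, which the excerpt does not state.

```python
from collections import deque

FOREGROUND, BACKGROUND = 0, 255


def keep_long_segments(thinned, min_length=32):
    """Label 8-connected foreground components and keep those whose
    pixel count (assumed proxy for length) is >= min_length."""
    height = len(thinned)
    width = len(thinned[0]) if height else 0
    seen = [[False] * width for _ in range(height)]
    result = [[BACKGROUND] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            if thinned[y][x] != FOREGROUND or seen[y][x]:
                continue
            # Breadth-first search over the 8 neighbours of each pixel.
            component = []
            queue = deque([(y, x)])
            seen[y][x] = True
            while queue:
                cy, cx = queue.popleft()
                component.append((cy, cx))
                for dy in (-1, 0, 1):
                    for dx in (-1, 0, 1):
                        ny, nx = cy + dy, cx + dx
                        if (0 <= ny < height and 0 <= nx < width
                                and not seen[ny][nx]
                                and thinned[ny][nx] == FOREGROUND):
                            seen[ny][nx] = True
                            queue.append((ny, nx))
            if len(component) >= min_length:
                for cy, cx in component:
                    result[cy][cx] = FOREGROUND
    return result


# Hypothetical thinned image: a 35-pixel line and a separate 5-pixel line.
demo = [[BACKGROUND] * 40 for _ in range(3)]
for x in range(35):
    demo[0][x] = FOREGROUND
for x in range(5):
    demo[2][x] = FOREGROUND
kept = keep_long_segments(demo)
```

In this example the 35-pixel line meets the 32-pixel threshold and is retained, while the 5-pixel fragment is discarded.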



Abstract

The invention relates to the electronic processing of documents and provides a content-based method for estimating the slant angle of a document image. The method comprises the following steps: first, the acquired document image is binarized; then each line segment in the image is obtained by a line segment detection method and, when no line segment is found in the image, the line segments are obtained by locating the centerline of each text line with a text line detection algorithm; next, the distribution of the slant angle of each line segment is counted by a voting algorithm, and the slant angle of each line segment is determined by an abrupt-change signal detection method based on the Gaussian wavelet transform; finally, the slant angle of the document is computed from the weight that each line segment's slant angle carries in the slant angle of the document image. Because the slant angle is estimated from either line segments or text lines, the method can estimate the slant angle both of document images that contain line segments and of document images that do not, so it offers good generality, good stability, and high precision.
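
The final step, combining per-segment slant angles by weight, can be illustrated with a weighted mean. The abstract does not say how the weights are defined, so this sketch assumes each segment's weight is its length divided by the total length of all segments; the function name `weighted_document_angle` and the sample values are hypothetical, and the voting and Gaussian wavelet steps are not modelled here.

```python
def weighted_document_angle(segments):
    """Combine per-segment slant angles into one document angle.

    `segments` is a sequence of (angle_in_degrees, length) pairs.
    Assumption: a segment's weight is its share of the total length.
    """
    total_length = sum(length for _, length in segments)
    if total_length == 0:
        raise ValueError("no segments with positive length")
    return sum(angle * length for angle, length in segments) / total_length


# Hypothetical segments: a long one at 2.0 degrees, a short one at 2.4 degrees.
estimate = weighted_document_angle([(2.0, 300.0), (2.4, 100.0)])
```

Under this assumption the longer segment dominates the estimate, which lands between the two angles and closer to 2.0 degrees.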

Description

Technical field

[0001] The invention relates to the field of electronic measurement and processing of documents, in particular to a method for measuring the tilt angle of a document image, and more specifically to a content-based document image tilt angle estimation method.

Background technique

[0002] As carriers of information, documents occupy a very important position in social life. They can be entered into computers through scanners, digital cameras, or document processing systems and converted into document images or electronic documents, so that people can process, store, manage, and transmit them conveniently and effectively.

[0003] In real life, due to mechanical errors in paper-feeding equipment such as scanners or to human factors, the acquired document image usually has a certain degree of inclination. However, the document processing system requires as input a properly aligned document image, or a document image with a known tilt angle, o...


Application Information

Patent Type & Authority: Patents (China)
IPC: IPC(8): G06K9/32
Inventor: 马磊, 刘江, 陈霞
Owner: 山东山大鸥玛软件股份有限公司