Horizontal and vertical line detection and removal for document images

A technique for detecting and removing horizontal and vertical lines in document images, applicable to image enhancement, image generation, image analysis, and related fields. It addresses two problems of known methods: sensitivity to image quality and the degree of binarization, and incomplete line removal.

Active Publication Date: 2016-04-06
KONICA MINOLTA LAB U S A INC

AI Technical Summary

Problems solved by technology

When known line detection and removal methods are applied to real documents, their results usually depend on the image quality and the degree of binarization of the image.
Furthermore, in known line removal methods, removing a text underline often alters the shape of the characters that the underline crosses.
Many known methods also suffer from incomplete line removal.




Embodiment Construction

[0020] Embodiments of the present invention provide a method for removing vertical and horizontal lines from document images. The method is designed to remove these lines as completely as possible while preserving the other content of the document, especially text characters. For vertical line removal, the method is designed to preserve the vertical strokes of text characters. For horizontal line removal, especially the removal of text underlines, the method is designed to remove the horizontal lines completely while preserving the character strokes that intersect them. Line removal is based on stroke width and connected component analysis, which together aim to remove lines while keeping the shapes of characters intact.

[0021] The horizontal and vertical line detection and removal methods are described with reference to Figures 1 to 3. The input document image is a grayscale image, where each pixel has a multi-bit pixel value, for example from 0 to 255. If the orig...


Abstract

The invention relates to the detection and removal of horizontal and vertical lines in document images and provides a method of removing such lines from a document. The horizontal line removal method includes: for the column of black pixels at each horizontal position along the line, removing the column if its maximum stroke width is less than the median of the maximum stroke widths in a small window centered at that horizontal position; removing connected components remaining in the horizontal line's bounding box that do not extend significantly above or below the bounding box boundaries; and performing a closing operation to join broken pieces of character strokes caused by underline removal. This method preserves character strokes while removing underlines. The vertical line removal method includes: for vertical lines that have a large height-to-width ratio, removing the parts of such lines that are not at an intersection with horizontal or near-horizontal lines; and removing all remaining connected components that touch neither the left nor the right boundary of the bounding box.
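
The first horizontal-line step in the abstract (the per-column median test) can be illustrated with a minimal sketch. This is not the patented implementation. It assumes that the "maximum stroke width" of a column is the longest vertical run of black pixels inside the line's bounding box, and it introduces hypothetical names throughout (`max_vertical_run`, `remove_thin_columns`, `half_window`, `slack`). The `slack` factor in particular is an assumption added for this toy example: with a strict "less than the median" comparison, a perfectly uniform line would never fall below its own median.

```python
from statistics import median


def max_vertical_run(column):
    """Longest run of black (1) pixels in one column (assumed stroke-width measure)."""
    best = run = 0
    for pixel in column:
        run = run + 1 if pixel else 0
        best = max(best, run)
    return best


def remove_thin_columns(band, half_window=2, slack=1.5):
    """Clear columns whose stroke width is small relative to the local median.

    band: list of rows (lists of 0/1) cropped to the line's bounding box.
    half_window and slack are hypothetical tuning parameters, not values
    taken from the patent.
    """
    height, width = len(band), len(band[0])
    widths = [
        max_vertical_run([band[y][x] for y in range(height)])
        for x in range(width)
    ]
    out = [row[:] for row in band]
    for x in range(width):
        lo, hi = max(0, x - half_window), min(width, x + half_window + 1)
        local_median = median(widths[lo:hi])
        # Columns no thicker than the surrounding line are treated as pure line.
        if widths[x] < slack * local_median:
            for y in range(height):
                out[y][x] = 0
    return out


# Toy band: a 2-pixel-thick underline (rows 2-3) crossed by a character
# stroke that fills column 4 from top to bottom.
band = [[1 if x == 4 or y in (2, 3) else 0 for x in range(9)] for y in range(5)]
cleaned = remove_thin_columns(band)
```

In this toy band the underline columns are cleared while the taller crossing stroke in column 4 is kept. The later steps in the abstract (connected component filtering and the closing operation) are not covered by this sketch.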

Description

Technical field

[0001] The present invention relates to document image processing, and in particular to methods for detecting and removing horizontal and vertical lines in document images.

Background

[0002] Document images generally refer to digital images representing pages of documents containing large amounts of text. Document images often contain lines, specifically horizontal and vertical lines, such as table lines and text underlines. Because characters (letters and other symbols) are often the focus of document image analysis (such as optical character recognition (OCR) and document authentication), it is often desirable to remove lines. These lines are generally very long in one direction, and if they are not explicitly removed, they may cause errors in the connected component analysis performed later. Various methods have been proposed for line detection and removal, such as Hough transform...


Application Information

Patent Type & Authority: Applications (China)
IPC(8): G06K9/00, G06K9/32, G06K9/34
CPC: G06V30/414, G06V10/243, G06V10/273, G06T5/30, G06T2207/10008, G06T2207/30176, G06T5/77, G06T2207/20061, G06T2210/12, G06F40/177, G06V30/412
Inventor: 方刚
Owner: KONICA MINOLTA LAB U S A INC