Method for removing headers and footers based on Hough transform straight line detection

A method based on Hough transform straight line detection, applied in image data processing, instruments, and computing. It addresses the poor adaptability of existing techniques to contracts with diverse layouts and their low efficiency, and it improves extraction accuracy and recognition efficiency.

Pending Publication Date: 2022-01-28
SHENZHEN QIANHAI HUANRONG LIANYI INFORMATION TECH SERVICES CO LTD

AI Technical Summary

Problems solved by technology

[0003] 1. The existing technique suits text images with a fixed layout, but not text images with diverse layouts;
[0004] 2. The existing technique must compute the line height and line spacing of each image to identify the header and footer text blocks, so its efficiency is relatively low.


Embodiment Construction

[0026] The present invention will be described in detail below in conjunction with the accompanying drawings and specific embodiments.

[0027] Referring to Figure 1, the present invention provides a method for removing headers and footers based on Hough transform line detection, comprising the following steps:

[0028] Step 1: Preprocess the read-in contract text image.

[0029] Text images of the same type provided by users often contain a large amount of texture noise. This noise strongly interferes with the subsequent edge detection and line detection, so the read-in images must be preprocessed to reduce it.

[0030] The frequency of an image is an index that characterizes how sharply the grayscale changes in the image; it is the gradient of the grayscale in the image plane. Generally speaking, if the gradient is large, the brightness of the point is strong, otherwise the brightness of the point is w...
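The idea that image frequency follows the grayscale gradient can be illustrated with a small finite-difference sketch. This is not the preprocessing described in the patent, whose details are truncated here; the function name `gradient_magnitude`, the central-difference scheme, and the toy images are assumptions chosen for illustration.

```python
def gradient_magnitude(img):
    """Central-difference gradient magnitude of a 2-D grayscale grid (list of rows)."""
    h, w = len(img), len(img[0])
    out = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            # Replicate border pixels so every neighbour index stays in range.
            gx = (img[y][min(x + 1, w - 1)] - img[y][max(x - 1, 0)]) / 2.0
            gy = (img[min(y + 1, h - 1)][x] - img[max(y - 1, 0)][x]) / 2.0
            out[y][x] = (gx * gx + gy * gy) ** 0.5
    return out


# A flat region has no grayscale change; a vertical step edge has a strong one.
flat = [[128] * 6 for _ in range(4)]
step = [[0, 0, 0, 255, 255, 255] for _ in range(4)]
grad_flat = gradient_magnitude(flat)
grad_step = gradient_magnitude(step)
```

In this toy case the flat image yields zero gradient everywhere, while the step image yields a nonzero response only in the two columns adjacent to the edge, which is the sense in which sharp grayscale changes correspond to high-frequency content.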


Abstract

The invention discloses a method for removing headers and footers based on Hough transform straight line detection. The method comprises the following steps: extracting edge information from a contract text image using a phase consistency method, detecting the horizontal lines at the header and footer by Hough transform line detection, and filling the header and footer areas with the background color so as to remove them. The method improves the extraction precision of subsequent text information, meets the recognition requirements of contract text images with diverse layouts, and improves recognition efficiency.
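The line detection and fill steps described above can be sketched in plain Python as follows. This is a minimal illustration under stated assumptions, not the patented implementation: the edge map is a simple threshold stand-in rather than a phase consistency result, and the names `hough_horizontal_rows`, `remove_header_footer`, `min_votes`, `max_tilt_deg`, and `margin` are hypothetical.

```python
import math


def hough_horizontal_rows(edges, min_votes, max_tilt_deg=2):
    """Return rows of near-horizontal lines found by a small Hough accumulator."""
    h, w = len(edges), len(edges[0])
    votes = {}
    # Only angles close to 90 degrees are searched (normal form: rho = x cos t + y sin t).
    for t in range(90 - max_tilt_deg, 90 + max_tilt_deg + 1):
        c, s = math.cos(math.radians(t)), math.sin(math.radians(t))
        for y in range(h):
            for x in range(w):
                if edges[y][x]:
                    key = (round(x * c + y * s), t)
                    votes[key] = votes.get(key, 0) + 1
    rows = set()
    for (rho, t), n in votes.items():
        if n >= min_votes:
            theta = math.radians(t)
            # Row where the detected line crosses the vertical centre of the image.
            rows.add(round((rho - (w / 2) * math.cos(theta)) / math.sin(theta)))
    return sorted(rows)


def remove_header_footer(img, edges, background, min_votes, margin=0.25):
    """Fill everything above the header rule and below the footer rule with background."""
    h = len(img)
    rows = hough_horizontal_rows(edges, min_votes)
    # Assumption: rules in the top/bottom `margin` fraction belong to header/footer.
    header = [r for r in rows if r < h * margin]
    footer = [r for r in rows if r >= h * (1 - margin)]
    top = max(header) if header else -1
    bottom = min(footer) if footer else h
    out = [row[:] for row in img]
    for y in range(h):
        if y <= top or y >= bottom:
            out[y] = [background] * len(out[y])
    return out


# Toy page: 0 = ink, 255 = background.
H, W, BG = 20, 40, 255
page = [[BG] * W for _ in range(H)]
page[1][5:15] = [0] * 10    # header text
page[3] = [0] * W           # header rule
page[9][4:30] = [0] * 26    # body text
page[16] = [0] * W          # footer rule
page[18][10:20] = [0] * 10  # footer text
edges = [[1 if v == 0 else 0 for v in row] for row in page]  # stand-in edge map
cleaned = remove_header_footer(page, edges, BG, min_votes=30)
```

On the toy page, only the two full-width rules collect enough votes, so the rows from the top edge through the header rule and from the footer rule to the bottom edge are filled with the background value, while the body text row is left unchanged. The vote threshold and the margin would need tuning for real scans.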

Description

Technical field

[0001] The invention relates to the technical field of OCR character recognition, in particular to a method for removing headers and footers based on Hough transform line detection.

Background

[0002] OCR (optical character recognition) refers to the process in which electronic equipment (such as a scanner or digital camera) examines the characters printed on paper and then uses a character recognition method to translate their shapes into computer text; that is, the text material is scanned, and the resulting image files are analyzed and processed to obtain text and layout information. In the business process, the contract text image uploaded by the user contains header and footer information, and the contract name and company name text in the header and footer greatly interferes with the extraction of key information from the subsequent text. In the contract text image provided by the user, due to the variet...


Application Information

Patent Type & Authority: Applications (China)
IPC (8): G06T7/168, G06V30/148, G06T7/12, G06T7/13
CPC: G06T7/168, G06T7/13, G06T7/12, G06T2207/20061
Inventor: 石朵伟, 陈淑华
Owner: SHENZHEN QIANHAI HUANRONG LIANYI INFORMATION TECH SERVICES CO LTD