A Document Image Binarization Method Based on Support Vector Machine

The method applies a support vector machine to document image processing, in the field of computer components, instruments, and computing. It addresses the poor binarization of low-quality document images and the high computational complexity of existing methods.

Status: Inactive
Publication Date: 2019-04-12
HUBEI UNIV OF TECH

AI Technical Summary

Problems solved by technology

[0005] Researchers in China and abroad have proposed many other methods, such as background estimation, local contrast, stroke edge detection, gradient normalization and saliency maps, texture analysis, Laplacian energy, error diffusion, spectral clustering, and hybrid algorithms. Most of these have relatively high computational complexity, and they either cannot adequately binarize low-quality document images degraded by ink infiltration, page stains, and background textures, or apply only to specific scenarios (such as uneven lighting conditions).


Examples

Embodiment Construction

[0069] To help those of ordinary skill in the art understand and implement the present invention, it is described in further detail below with reference to the accompanying drawings and embodiments. The embodiments described here serve only to illustrate and explain the present invention and are not intended to limit it.

[0070] Referring to Figure 1, the document image binarization method based on a support vector machine provided by the invention comprises the following steps:

[0071] Step 1: Convert the color image to grayscale (this step can be omitted for images that are already grayscale);

[0072] At present, researchers mainly convert color images to grayscale using methods such as the component weighted average, the average value, and the maximum value. These methods are largely based on models of human visual characteristics.
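The three conventional conversions named in [0072] can be sketched in a few lines of Python. This is an illustrative sketch, not the patent's own minimum mean value method. The function names are hypothetical, and the weights 0.299, 0.587, and 0.114 are the widely used ITU-R BT.601 luma coefficients, assumed here as one common choice for the weighted average.

```python
from typing import Callable, List, Tuple

Pixel = Tuple[int, int, int]  # (R, G, B), each component in 0..255


def gray_weighted(p: Pixel) -> int:
    """Component weighted average (assumed weights: ITU-R BT.601 luma)."""
    r, g, b = p
    return round(0.299 * r + 0.587 * g + 0.114 * b)


def gray_average(p: Pixel) -> int:
    """Plain arithmetic mean of the three components."""
    r, g, b = p
    return round((r + g + b) / 3)


def gray_maximum(p: Pixel) -> int:
    """Largest of the three components."""
    return max(p)


def to_gray(image: List[List[Pixel]],
            method: Callable[[Pixel], int]) -> List[List[int]]:
    """Apply one per-pixel conversion to every pixel of an RGB image."""
    return [[method(p) for p in row] for row in image]


if __name__ == "__main__":
    # A tiny 2x2 RGB image: red, green, blue, and mid-gray pixels.
    image = [[(255, 0, 0), (0, 255, 0)],
             [(0, 0, 255), (128, 128, 128)]]
    for method in (gray_weighted, gray_average, gray_maximum):
        print(method.__name__, to_gray(image, method))
```

The weighted average treats the three channels unequally (green contributes most), which reflects the visual-perception modeling the paragraph mentions. The average and maximum methods ignore that weighting.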

[0073] The present invention uses the minimum mean value method to grayscale the c...

Abstract

The invention discloses a document image binarization method based on a support vector machine, comprising eight steps: color image grayscale conversion, document image blocking, local contrast enhancement of image blocks, feature parameter extraction, SVM threshold classification, image block splicing, stroke width estimation, and local binarization. The invention uses the minimum mean method to convert the color image to grayscale, and the resulting grayscale image is color-independent. The defined local contrast not only compensates for changes in image brightness but also takes into account the normalized contribution of every pixel in the neighborhood to the local contrast. SVM threshold classification provides high accuracy and reliability. Stroke width is estimated with a progressive scanning method, which is robust to changes in document image resolution. The invention preserves the stroke details of characters well and, while effectively segmenting the character foreground, better suppresses ink infiltration, page stains, textured backgrounds, and uneven illumination.
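The blocking and splicing steps named in the abstract can be illustrated with a minimal Python sketch. It assumes non-overlapping rectangular blocks keyed by their top-left corner. The abstract does not state the block size or whether blocks overlap, so those choices, and the names `split_blocks` and `splice_blocks`, are hypothetical. The per-block contrast, feature extraction, and SVM steps are not reproduced here.

```python
from typing import Dict, List, Tuple

Image = List[List[int]]  # grayscale image as rows of pixel values


def split_blocks(image: Image, bh: int, bw: int) -> Dict[Tuple[int, int], Image]:
    """Cut the image into non-overlapping blocks of at most bh x bw pixels.

    Blocks are keyed by their top-left (row, col). Blocks on the bottom
    and right edges may be smaller when the size is not an exact multiple.
    """
    blocks: Dict[Tuple[int, int], Image] = {}
    for top in range(0, len(image), bh):
        for left in range(0, len(image[0]), bw):
            blocks[(top, left)] = [row[left:left + bw]
                                   for row in image[top:top + bh]]
    return blocks


def splice_blocks(blocks: Dict[Tuple[int, int], Image],
                  height: int, width: int) -> Image:
    """Reassemble blocks into a height x width image at their original positions."""
    out = [[0] * width for _ in range(height)]
    for (top, left), block in blocks.items():
        for dy, row in enumerate(block):
            out[top + dy][left:left + len(row)] = row
    return out


if __name__ == "__main__":
    # A 5x7 test image whose pixel value encodes its position.
    image = [[r * 7 + c for c in range(7)] for r in range(5)]
    blocks = split_blocks(image, 2, 3)
    # Any per-block processing would happen here, before splicing.
    print(splice_blocks(blocks, 5, 7) == image)
```

Keying each block by its position lets every block be processed independently and then written back without tracking order, which is one simple way to support a block-wise pipeline of this kind.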

Description

Technical field

[0001] The invention belongs to the technical fields of digital image processing, pattern recognition, and machine learning. It relates to a document image binarization method, in particular to a support vector machine (SVM)-based binarization method for low-quality document images.

Background

[0002] Document Analysis and Recognition (DAR) technology has been widely used in printed character and formula recognition, handwritten character recognition, document image segmentation, video subtitle extraction, text information retrieval, and other fields. It mainly comprises image acquisition, preprocessing, binarization, layout analysis, character recognition, and indexing. Image binarization is one of the key processing steps and directly affects the performance of the DAR system. However, binarizing low-quality document images is extremely challenging because of factors such as poor image contrast, ink smearing, page stains, or uneven lighting. [00...

Application Information

Patent Type & Authority: Patents (China)
IPC(8): G06K9/38, G06K9/62
CPC: G06V10/28, G06F18/214
Inventor: 熊炜, 赵诗云, 徐晶晶, 赵楠, 刘敏, 王改华, 李敏, 刘小镜, 吴俊驰
Owner: HUBEI UNIV OF TECH