Adaptive binarization method and device for document images

Adaptive binarization technology for document images, applied in character and pattern recognition, instruments, computer parts, etc. It addresses the poor results of existing methods on weak-stroke characters and backgrounds with varying lighting, the loss of weak strokes, and the unsuitability of existing methods for character-string images that contain table lines.

Inactive Publication Date: 2013-03-20
FUJITSU LTD

AI Technical Summary

Problems solved by technology

[0004] Global binarization methods such as Otsu's method do not work well for degraded weak-stroke characters or for backgrounds with varying lighting.
Adaptive binarization methods such as the Niblack method and the Sauvola method can cope with these situations, but often generate a lot of noise in the background.
Sauvola's method, a modification of Niblack's method, works better on textured backgrounds but may lose weak strokes.
In addition, none of the current binarization methods is suitable for processing character-string images that contain table lines.
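
The trade-off between the two adaptive methods can be illustrated with a minimal Python sketch. It uses the commonly published forms of the two rules, T = m + k*s for Niblack and T = m*(1 + k*(s/R - 1)) for Sauvola, where m and s are the local mean and standard deviation. The window size, the values k = -0.2, k = 0.5 and R = 128, and all function names are illustrative assumptions, not values taken from this patent.

```python
from statistics import mean, pstdev

def local_stats(img, y, x, half):
    """Mean and population std-dev of the window around (y, x), clipped at the borders."""
    h, w = len(img), len(img[0])
    vals = [img[j][i]
            for j in range(max(0, y - half), min(h, y + half + 1))
            for i in range(max(0, x - half), min(w, x + half + 1))]
    return mean(vals), pstdev(vals)

def niblack_threshold(m, s, k=-0.2):
    # Niblack: threshold sits a fraction of the local std-dev below the local mean.
    return m + k * s

def sauvola_threshold(m, s, k=0.5, R=128.0):
    # Sauvola: the threshold drops far below the mean where local contrast s is small.
    return m * (1.0 + k * (s / R - 1.0))

def binarize(img, rule, half=1):
    """Return a 0/255 image; a pixel is black when it is darker than its local threshold."""
    out = []
    for y in range(len(img)):
        row = []
        for x in range(len(img[0])):
            m, s = local_stats(img, y, x, half)
            row.append(0 if img[y][x] < rule(m, s) else 255)
        out.append(row)
    return out

# A nearly flat background patch with one slightly darker (noisy) pixel in the centre.
noisy = [[200, 198, 200],
         [198, 190, 200],
         [200, 198, 200]]
niblack_out = binarize(noisy, niblack_threshold)
sauvola_out = binarize(noisy, sauvola_threshold)
```

On this patch the Niblack rule marks the centre pixel as black, which is the kind of background noise described above, while the Sauvola rule leaves it white. The same low threshold that suppresses the noise is what can cause Sauvola's rule to miss faint strokes.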

Examples

Embodiment Construction

[0022] Embodiments of the present invention will be described below with reference to the drawings. It should be noted that representation and description of components and processes that are not related to the present invention and known to those of ordinary skill in the art are omitted from the drawings and descriptions for the purpose of clarity.

[0023] Figure 1 is a block diagram showing an adaptive binarization device 100 for a document image according to an embodiment of the present invention.

[0024] As shown in Figure 1, the adaptive binarization device 100 includes an estimator 101, a calculator 102 and an extractor 103. In the embodiment illustrated in Figure 1, it is assumed that lines extending in the row direction are distributed in the document image, and that handwritten or printed characters run in the row direction. It is also assumed that the gray scale of the document image ranges from 0 to 255. Grayscale 255 represents pure white, grayscale 0 represe...

Abstract

The invention relates to an adaptive binarization method and device for document images. The adaptive binarization device comprises: an estimator which, for each group of pixels of the document image along one of the row and column directions, estimates a first background gray level of each pixel in the group, so as to obtain a first background image of the document image; a calculator which, for each pixel of the document image, calculates the average distance r from all the pixels in a first pixel area centered on that pixel to the first background image, and takes as the first threshold of the pixel the difference between the first background gray level of the pixel and a value d that is positively correlated with the average distance r; and an extractor which extracts a first binarized image from the document image according to the first threshold. The pixel interval on which the estimate of the first background gray level is based, and the first pixel area, are larger than a predetermined size. In this way, lines in the document image along the horizontal and vertical directions are removed, which helps improve the character recognition rate.
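
The estimator, calculator and extractor described in the abstract can be sketched in Python as follows. This is only an illustration under several assumptions that the abstract does not state: the background of a pixel is taken as the brightest gray level within `span` pixels on the same row, the "distance" to the background image is taken as the absolute gray-level difference, and d is taken as `k * r`. The names `estimate_background_rows`, `first_threshold`, `extract`, `span`, `half` and `k` are hypothetical, and only the row direction is handled.

```python
def estimate_background_rows(img, span):
    """Estimator (ASSUMED rule): brightest gray level within `span` pixels on the same row."""
    bg = []
    for row in img:
        w = len(row)
        bg.append([max(row[max(0, x - span):min(w, x + span + 1)]) for x in range(w)])
    return bg

def first_threshold(img, bg, y, x, half, k):
    """Calculator: threshold = background gray level minus d, with d = k * r (ASSUMED form)."""
    h, w = len(img), len(img[0])
    # ASSUMPTION: distance of a pixel to the background image = absolute gray difference.
    dists = [abs(bg[j][i] - img[j][i])
             for j in range(max(0, y - half), min(h, y + half + 1))
             for i in range(max(0, x - half), min(w, x + half + 1))]
    r = sum(dists) / len(dists)   # average distance r over the first pixel area
    d = k * r                     # d grows with r (positive correlation)
    return bg[y][x] - d

def extract(img, span=3, half=1, k=0.5):
    """Extractor: a pixel is black (0) when it is darker than its threshold, else white (255)."""
    bg = estimate_background_rows(img, span)
    return [[0 if img[y][x] < first_threshold(img, bg, y, x, half, k) else 255
             for x in range(len(img[0]))]
            for y in range(len(img))]

# 5 x 9 test image: bright background (200), a thin vertical stroke at column 4
# in rows 0-1 (gray 50), and a full-width horizontal table line in row 3 (gray 60).
image = [[200] * 9 for _ in range(5)]
image[0][4] = image[1][4] = 50
image[3] = [60] * 9
result = extract(image)
```

Because the table line fills the whole estimation interval along its row, it is absorbed into the estimated background and comes out white, while the narrow stroke stays well below its threshold and comes out black. This mirrors the requirement that the interval be larger than a predetermined size, although the actual estimation rule used by the invention is not given in this excerpt.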

Description

Technical field

[0001] The invention relates to image binarization technology, in particular to an adaptive binarization method and device for document images.

Background art

[0002] In recent years, with the rapid development of image processing technology, optical character recognition (OCR) of document images has come into wide use. As an image preprocessing technique, document image binarization is often used in OCR systems. Binarization is the process of converting a color or grayscale image into a black-and-white image, which has only two gray levels, black and white.

[0003] There are many global or adaptive binarization methods for document images. Examples of binarization methods include Otsu's method for calculating thresholds from gray-level histograms (see "A Threshold Selection Method from Gray-Level Histograms", IEEE Trans. on Systems, Man, and Cybernetics, Vol. SMC-9, No. 1, pp. 62-66, January 1979), the Niblack method for computi...
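
As a point of reference for the global approach cited in the background, Otsu's method picks the single threshold that maximizes the between-class variance of the gray-level histogram. The following is a minimal Python sketch of that idea; the function name `otsu_threshold` and the convention that pixels at or below the threshold form the dark class are assumptions made for illustration.

```python
def otsu_threshold(pixels, levels=256):
    """Return the gray level t that maximizes between-class variance (dark class: p <= t)."""
    hist = [0] * levels
    for p in pixels:
        hist[p] += 1
    total = len(pixels)
    sum_all = sum(i * h for i, h in enumerate(hist))

    w0 = 0          # number of pixels in the dark class so far
    sum0 = 0        # sum of gray levels in the dark class so far
    best_t, best_var = 0, -1.0
    for t in range(levels):
        w0 += hist[t]
        sum0 += t * hist[t]
        if w0 == 0:
            continue            # dark class still empty
        w1 = total - w0
        if w1 == 0:
            break               # bright class empty, nothing left to split
        m0 = sum0 / w0
        m1 = (sum_all - sum0) / w1
        between = w0 * w1 * (m0 - m1) ** 2   # between-class variance, up to a constant factor
        if between > best_var:
            best_var, best_t = between, t
    return best_t

# Ten dark pixels and thirty bright pixels: any threshold between the two modes separates them.
t = otsu_threshold([50] * 10 + [200] * 30)
```

Because a single threshold is applied to the whole image, a faint stroke on a bright region and the background of a darker region can fall on the same side of it, which is the weakness the patent attributes to global methods.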

Application Information

Patent Type & Authority: Patents (China)
IPC: IPC(8): G06K9/38
Inventors: 郑大念, 孙俊, 直井聪, 堀田悦伸, 皆川明洋, 藤本克仁
Owner: FUJITSU LTD