Method for judging typesetting directions of text regions

A text-region and histogram technique, applied in the fields of instruments, electrical digital data processing, and character and pattern recognition. It addresses the problem of poor judgment accuracy and offers a simple, fast statistical method with good application value.

Active Publication Date: 2010-11-10
HANVON CORP
Cites: 0; Cited by: 23

AI Technical Summary

Problems solved by technology

The conventional projection-histogram method is simple and fast, but for text areas that are tilted and/or slightly geometrically distorted (for example, in images captured by a camera), the projection histogram loses its distinctive characteristics and the judgment accuracy becomes very poor.




Embodiment Construction

[0046] In order to understand the technical content of the present invention more clearly, the following examples are used for detailed description.

[0047] Before the method is applied, the document image, whether color or grayscale, is first converted into a binary image by binarization. The connected domains in the image are obtained with a connected-domain labeling algorithm, and large connected domains such as images and tables are removed. Adjacent connected domains among those remaining are then merged into regions. These regions are called text regions, and each text region is represented by a bounding rectangle. The method judges the typesetting direction of these text regions composed of connected domains.
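The preprocessing described in [0047] can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the patent's implementation: the function names, the fixed threshold of 128, the use of 8-connectivity, the `max_side` size filter, and the `gap` merging distance are all hypothetical choices made for the example.

```python
from collections import deque

def binarize(gray, threshold=128):
    """Dark pixels (below the assumed threshold) become foreground (1)."""
    return [[1 if p < threshold else 0 for p in row] for row in gray]

def connected_components(binary):
    """Return bounding boxes (x0, y0, x1, y1) of 8-connected foreground blobs."""
    h = len(binary)
    w = len(binary[0]) if h else 0
    seen = [[False] * w for _ in range(h)]
    boxes = []
    for y in range(h):
        for x in range(w):
            if not binary[y][x] or seen[y][x]:
                continue
            seen[y][x] = True
            queue = deque([(x, y)])
            x0 = x1 = x
            y0 = y1 = y
            while queue:  # breadth-first flood fill over one component
                cx, cy = queue.popleft()
                x0, x1 = min(x0, cx), max(x1, cx)
                y0, y1 = min(y0, cy), max(y1, cy)
                for dy in (-1, 0, 1):
                    for dx in (-1, 0, 1):
                        nx, ny = cx + dx, cy + dy
                        if (0 <= nx < w and 0 <= ny < h
                                and binary[ny][nx] and not seen[ny][nx]):
                            seen[ny][nx] = True
                            queue.append((nx, ny))
            boxes.append((x0, y0, x1, y1))
    return boxes

def drop_large(boxes, max_side):
    """Discard boxes wider or taller than max_side (assumed image/table filter)."""
    return [b for b in boxes
            if b[2] - b[0] + 1 <= max_side and b[3] - b[1] + 1 <= max_side]

def near(a, b, gap):
    """True if the boxes, each expanded by `gap`, overlap on both axes."""
    return (a[0] - gap <= b[2] and b[0] - gap <= a[2]
            and a[1] - gap <= b[3] and b[1] - gap <= a[3])

def merge_adjacent(boxes, gap):
    """Repeatedly merge nearby boxes until no pair is within `gap`."""
    regions = list(boxes)
    merged = True
    while merged:
        merged = False
        for i in range(len(regions)):
            for j in range(i + 1, len(regions)):
                a, b = regions[i], regions[j]
                if near(a, b, gap):
                    regions[i] = (min(a[0], b[0]), min(a[1], b[1]),
                                  max(a[2], b[2]), max(a[3], b[3]))
                    del regions[j]
                    merged = True
                    break
            if merged:
                break
    return regions
```

Chaining `binarize`, `connected_components`, `drop_large`, and `merge_adjacent` yields a list of bounding rectangles, one per candidate text region. A production system would more likely use an adaptive threshold and a library labeling routine; the pure-Python version here only keeps the sketch self-contained.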

[0048] For the embodiment image shown in Figure 2, the processing procedure comprises the following steps, as shown in Figure 1:

[0049] Step 10: Calculate the character height of the text area. Calculate th...


Abstract

The invention provides a method for judging the typesetting direction of text regions, belonging to the OCR field. The method performs statistical analysis on the obtained projection histograms and finds their respective most representative characteristic data triples. For text regions with fewer than three character lines, the typesetting direction is judged from the length-to-width ratio of the region's bounding rectangle. For text regions with at least three character lines, it is judged from the number and statistical positions of abnormal projection bars. Regions that cannot be judged by these methods are judged from the first moment between normal projection bars, and regions that are still undecided are judged from the indentation of the text characters. Judgment is finally abandoned for regions whose typesetting direction still cannot be determined. The method accurately judges whether normal text regions are horizontal or vertical, and can also judge text regions with small inclination angles or slight geometric distortion, with good accuracy, high speed, and good application value.
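The cascade described in the abstract, where each rule is tried in turn and the first decisive answer wins, might be organized as in the sketch below. Only the projection histogram and the aspect-ratio rule are written out. The function names, the ratio threshold of 2.0, and the assumption that a wide region means horizontal text are illustrative choices that the abstract does not specify. The remaining rules (abnormal projection bars, first moment, indentation) are not reproduced here because their details are not given in this text.

```python
from typing import Callable, List, Optional, Sequence, Tuple

# "horizontal", "vertical", or None when a rule cannot decide.
Direction = Optional[str]

def projection_histograms(
    binary: Sequence[Sequence[int]],
) -> Tuple[List[int], List[int]]:
    """Count foreground pixels per row and per column of a binary region."""
    rows = [sum(row) for row in binary]
    cols = [sum(col) for col in zip(*binary)]
    return rows, cols

def aspect_ratio_rule(width: int, height: int, ratio: float = 2.0) -> Direction:
    """Rule for regions with fewer than three character lines.

    Assumption: a clearly wide box suggests horizontal text and a clearly
    tall box suggests vertical text; the threshold is hypothetical.
    """
    if width >= ratio * height:
        return "horizontal"
    if height >= ratio * width:
        return "vertical"
    return None

def judge_direction(rules: Sequence[Callable[[], Direction]]) -> Direction:
    """Try each rule in order; return the first decisive result.

    Returning None corresponds to giving up on the region.
    """
    for rule in rules:
        result = rule()
        if result is not None:
            return result
    return None
```

A caller would pass the rules in the order the abstract lists them, for example `judge_direction([lambda: aspect_ratio_rule(w, h), ...])`, so that later and more expensive checks run only when earlier ones are undecided.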

Description

Technical field

[0001] The invention belongs to the technical field of OCR (Optical Character Recognition), and in particular relates to a method for judging the typesetting direction of a text area.

Background

[0002] The main carriers of information today are paper and electronic media. With the development and popularization of information technology and computer technology, paper media lag behind electronic media in many respects, such as storage cost, recording density, means of sharing, and convenience of reference. To convert information from paper media to electronic storage, the general method is to scan or photograph the paper documents (including paper books, magazines, newspapers, documents, etc.). The results are then processed separately, for example by compressing and storing images and performing OCR on text.

[0003] Layout analysis is the process of automatically segmenting and recognizing images, tables, and texts in document images...


Application Information

Patent Type & Authority: Applications (China)
IPC(8): G06K9/00, G06K9/20, G06F40/189
Inventor: 李永彬
Owner: HANVON CORP