Graph, table and text mixed layout analysis system and method combining threshold value and projection method

A layout analysis technique based on thresholding and projection, applied in the field of text recognition, that addresses the problem of a low overall text recognition rate.

Pending Publication Date: 2019-11-29
WENZHOU UNIVERSITY

AI Technical Summary

Problems solved by technology

[0002] At present, in text recognition, it is difficult for a computer to determine whether a page is upright, upside down, or mirrored, and the overall text recognition rate is low.

Examples

Embodiment Construction

[0102] The present invention is described in detail below with reference to specific embodiments. The following examples will help those skilled in the art to further understand the present invention, but do not limit it in any form. It should be noted that those skilled in the art can make modifications and improvements without departing from the concept of the present invention; these all fall within the protection scope of the present invention.

[0103] As shown in Figure 1, an embodiment of the present invention provides a layout analysis method for mixed graphs, tables, and text that combines a threshold value with a projection method, including the following steps:

[0104] 1. Correct the image

[0105] Input: scanned image rgbImage; number of sampling points on each edge, numSamplesEachEdge;

[0106] Output: rectified image rectifiedImage;

[0107] (1) Find the closed area R in the image that occupies the center, has smooth edges, an...

Abstract

The invention discloses a layout analysis method for mixed graphs, tables, and text that combines a threshold value with a projection method. The method comprises the following steps: S1, converting a corrected grayscale image R' into a binary (black and white) image according to a threshold value Tg; S2, dividing each foreground area in the binary image into character areas and non-character areas; S3, parsing each table sub-image into a table; S4, segmenting each table into rows/columns; S5, detecting whether the page layout of the whole image is correct; S6, sorting the columns/rows to determine the processing order of the next step, with columns sorted before rows; S7, for each column/row of the table, working column first and row second, processing only the pure columns/rows and segmenting them into characters in sequence; and S8, processing the compound columns/rows according to the characters and segmenting them into characters in sequence. The method can analyze the page layout to determine whether the page shows its front or back and whether it is flipped up-down or left-right, can handle characters that touch table lines, and can greatly improve the character recognition rate.
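The two techniques named in the title, thresholding (S1) and projection (used when segmenting into rows/columns, S4), can be illustrated with a minimal sketch. This is not the patent's implementation: the function names `binarize`, `projection`, and `split_runs` are hypothetical, the image is a plain list of grayscale rows, and the sketch assumes that foreground pixels are darker than the threshold Tg, which the abstract does not state.

```python
from typing import List, Tuple

Image = List[List[int]]  # grayscale rows, pixel values 0..255


def binarize(gray: Image, tg: int) -> Image:
    """Threshold a grayscale image: 1 = foreground, 0 = background.

    Assumption: foreground (ink) is darker than the threshold tg.
    """
    return [[1 if px < tg else 0 for px in row] for row in gray]


def projection(binary: Image, axis: str) -> List[int]:
    """Count foreground pixels per row ('rows') or per column ('cols')."""
    if axis == "rows":
        return [sum(row) for row in binary]
    if axis == "cols":
        return [sum(col) for col in zip(*binary)]
    raise ValueError("axis must be 'rows' or 'cols'")


def split_runs(profile: List[int], min_count: int = 1) -> List[Tuple[int, int]]:
    """Return half-open (start, end) ranges where the profile is >= min_count.

    Gaps in the projection profile are treated as separators between
    consecutive rows or columns.
    """
    runs: List[Tuple[int, int]] = []
    start = None
    for i, value in enumerate(profile):
        if value >= min_count and start is None:
            start = i                      # a run of foreground begins
        elif value < min_count and start is not None:
            runs.append((start, i))        # the run ends at a gap
            start = None
    if start is not None:                  # run reaches the image border
        runs.append((start, len(profile)))
    return runs
```

Calling `split_runs(projection(binarize(gray, tg), "rows"))` gives candidate row bands, and the same call with `"cols"` gives candidate column bands. A real page would also need noise filtering and a way to tell characters apart from table lines, which this sketch leaves out.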

Description

Technical field

[0001] The invention relates to the technical field of character recognition, and in particular to a layout analysis system and method for mixed graphs, tables, and text that combines a threshold value with a projection method.

Background

[0002] At present, in text recognition, it is difficult for a computer to determine whether a page is upright, upside down, or mirrored, and the overall text recognition rate is low.

Summary of the invention

[0003] To solve the above problems, the present invention provides a layout analysis system and method for mixed graphs, tables, and text that combines a threshold value with a projection method. It can analyze the page layout to determine whether the page shows its front or back and whether it is flipped up-down or left-right, can handle characters that touch table lines, and can greatly improve the character recognition rate.

[0004] In order to...

Application Information

Patent Type & Authority: Applications (China)
IPC(8): G06K9/00, G06F17/25
CPC: G06V30/43, G06V30/40
Inventor: 罗胜
Owner: WENZHOU UNIVERSITY