Image document layout recognition method, device and system

A recognition method and document technology, applied in the field of image and text recognition, that addresses problems such as poor generalization ability, low recognition accuracy, and the large number of modules involved, and achieves the effect of improving accuracy and facilitating recognition work.

Active Publication Date: 2022-03-11
CHENGDU UNION BIG DATA TECH CO LTD
Cites: 17; Cited by: 2

AI Technical Summary

Problems solved by technology

For example, in the prior art, some approaches use classification models for layout analysis on top of text detection, but they can only classify single-line text; for paragraphs and tables, where one instance contains multiple rows and columns of text, the results are poor. Some technologies fuse image features and semantic features for layout analysis, but they involve many modules (image feature extraction, semantic feature extraction, feature fusion, and reclassification), run slowly, and, because they extract semantic features, require a large amount of supporting data; otherwise their generalization ability is poor. Others cluster using the position information of the label area and the text block area; when the paragraph spacing equals the text line spacing, such distance-based clustering easily identifies multiple adjacent paragraphs as a single paragraph, which reduces recognition accuracy.
[0004] In short, existing layout recognition methods can only classify and recognize single-line, single-column text; they involve many modules and complex programs, generalize poorly, and, when the spacing between paragraphs equals the spacing between text lines, recognize adjacent paragraphs as one paragraph. Their recognition accuracy is low, and they cannot adapt to document layouts that vary across images, which greatly inconveniences the analysis and processing of document layouts.
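
The failure of distance-based clustering described above can be illustrated with a minimal sketch. The grouping rule, the function name `group_lines_by_gap`, and the pixel values are all assumptions chosen for illustration; they are not taken from the patent or from any specific prior-art system.

```python
def group_lines_by_gap(line_tops, line_height, max_gap):
    """Group text lines into paragraphs (hypothetical rule): a new paragraph
    starts whenever the vertical gap to the previous line exceeds max_gap."""
    groups = []
    prev_bottom = None
    for top in sorted(line_tops):
        if prev_bottom is None or top - prev_bottom > max_gap:
            groups.append([])  # gap too large: start a new paragraph
        groups[-1].append(top)
        prev_bottom = top + line_height
    return groups


# Paragraph gap (30) is larger than the line gap (10): two paragraphs are found.
distinct = group_lines_by_gap([0, 30, 60, 110, 140], line_height=20, max_gap=15)

# Paragraph gap equals the line gap (10): all lines merge into one paragraph.
merged = group_lines_by_gap([0, 30, 60, 90, 120], line_height=20, max_gap=15)

print(len(distinct), len(merged))
```

With equal spacing, no threshold can separate the paragraphs, because the distance signal carries no paragraph boundary at all.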

Examples

Embodiment 1

[0079] Referring to Figure 2, the first embodiment of the present invention provides an image document layout recognition method, including:

[0080] Step 100, acquiring the target image corresponding to the document to be processed;

[0081] The document to be processed may be an image document file obtained through image acquisition methods such as scanning, photographing, or photocopying, for example a scanned file or a picture document.

[0082] The target image is one or more images obtained by processing the document to be processed.

[0083] Further, step 100, acquiring the target image corresponding to the document to be processed, may also include:

[0084] Step 110, performing page splitting on the document to be processed to obtain split unit pages;

[0085] Step 120, performing grayscale processing on the split unit page to obtain the target i...
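
Steps 110 and 120 can be sketched roughly as follows. This is a minimal illustration under stated assumptions: the document is assumed to arrive already as a list of RGB pages (rows of `(r, g, b)` tuples), and the grayscale conversion uses the common ITU-R BT.601 luminance weights, which the excerpt does not specify. The function names are hypothetical.

```python
def to_grayscale(page):
    """Convert one RGB page (rows of (r, g, b) tuples) to 8-bit gray values.

    Assumption: BT.601 luminance weights; the patent excerpt does not say
    which grayscale formula is used.
    """
    return [
        [round(0.299 * r + 0.587 * g + 0.114 * b) for (r, g, b) in row]
        for row in page
    ]


def acquire_target_images(document_pages):
    """Sketch of steps 110-120: treat each page as a unit page, then gray it."""
    return [to_grayscale(page) for page in document_pages]


# A two-page toy document, each page a single row of two pixels.
pages = [
    [[(255, 255, 255), (0, 0, 0)]],
    [[(255, 0, 0), (0, 255, 0)]],
]
targets = acquire_target_images(pages)
print(targets)
```

In practice an imaging library would normally perform both the page splitting and the color conversion; plain lists are used here only to keep the sketch self-contained.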

Embodiment 2

[0117] Referring to Figures 3-4, the second embodiment of the present invention provides an image document layout recognition method. Based on Embodiment 1 above, step 200, performing instance segmentation on the target image to obtain the target areas produced by the instance segmentation, includes:

[0118] Step 210, performing semantic segmentation on the target image to obtain a layout analysis feature map of the target image;

[0119] It should be noted that semantic segmentation assigns a category to each pixel in the image but does not distinguish between different objects of the same category.

[0120] The layout analysis feature maps are obtained by semantically segmenting the target image X into several parts. A target image can yield multiple layout analysis feature maps, each of which represents one identified feature region.
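
The distinction in [0119] between per-pixel categories and separate instances can be illustrated with a small sketch. It takes a per-pixel class map (the kind of output semantic segmentation produces) and splits it into instances with 4-connected component labeling. This is an illustrative stand-in only, not the instance segmentation procedure of the patent; the function name and the toy class map are assumptions.

```python
from collections import deque


def split_into_instances(class_map):
    """Label 4-connected regions of equal non-zero class as separate instances.

    Returns (instance_map, instance_count); 0 marks background.
    """
    height, width = len(class_map), len(class_map[0])
    instance_map = [[0] * width for _ in range(height)]
    next_id = 0
    for y in range(height):
        for x in range(width):
            if class_map[y][x] == 0 or instance_map[y][x] != 0:
                continue  # background pixel, or already labeled
            next_id += 1
            instance_map[y][x] = next_id
            queue = deque([(y, x)])
            while queue:  # breadth-first flood fill over same-class neighbors
                cy, cx = queue.popleft()
                for ny, nx in ((cy - 1, cx), (cy + 1, cx), (cy, cx - 1), (cy, cx + 1)):
                    if (0 <= ny < height and 0 <= nx < width
                            and instance_map[ny][nx] == 0
                            and class_map[ny][nx] == class_map[y][x]):
                        instance_map[ny][nx] = next_id
                        queue.append((ny, nx))
    return instance_map, next_id


# Class 1 = "paragraph": one category, but two separate regions.
semantic = [
    [1, 1, 0],
    [0, 0, 0],
    [1, 1, 1],
]
instances, count = split_into_instances(semantic)
print(count, instances)
```

Connected components only separate regions that do not touch; two adjacent paragraphs of the same class would still be merged, which is the case a dedicated instance segmentation step has to handle.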

[0121] As mentioned above...

Embodiment 3

[0138] Referring to Figures 5-6, this embodiment provides an image document layout recognition method. Based on Embodiment 1 above, step 400, determining the target area to which the text box corresponds, includes:

[0139] Step 410, calculating the coordinates of the center point corresponding to the text box;

[0140] The position of each text box is calculated: the coordinates of its center point can be obtained from the coordinates of the four vertices of its rectangle.

[0141] The text box is a box containing text content, defined in accordance with writing habits; the center point coordinates are the coordinates of the center of the text box, calculated from the coordinates of its four vertices.

[0142] Further, step 410, calculating the coordinates of the center point corresponding to the text box, includes:

[0143] Step 411, obtain the vertex coordinate...
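
The center-point calculation described in [0140] and [0141] can be sketched as the mean of the four vertex coordinates. How the center is then matched to a target area is not visible in this excerpt, so the point-in-rectangle test below is an assumption, as are the function names and the `(x_min, y_min, x_max, y_max)` area format.

```python
def box_center(vertices):
    """Center of a text box as the mean of its four (x, y) vertices."""
    xs = [x for x, _ in vertices]
    ys = [y for _, y in vertices]
    return sum(xs) / 4, sum(ys) / 4


def find_target_area(center, areas):
    """Return the index of the first area containing the center, else None.

    Assumption: each area is an axis-aligned rectangle given as
    (x_min, y_min, x_max, y_max).
    """
    cx, cy = center
    for index, (x_min, y_min, x_max, y_max) in enumerate(areas):
        if x_min <= cx <= x_max and y_min <= cy <= y_max:
            return index
    return None


text_box = [(10, 20), (110, 20), (110, 60), (10, 60)]
areas = [(0, 0, 50, 50), (40, 30, 200, 100)]

center = box_center(text_box)
print(center, find_target_area(center, areas))
```

Using the center point rather than the full box means a text box that slightly overlaps a neighboring area is still assigned to exactly one area.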

Abstract

The invention provides an image document layout recognition method, device and system. According to the method, the recognition accuracy of each sub-module of a region which is difficult to divide is improved, paragraph groups with small paragraph intervals can be split and recognized, the paragraph recognition accuracy is greatly improved, and convenience is provided for recognition work of an image document layout.

Description

Technical field

[0001] The present invention relates to the technical field of image character recognition, and more specifically, to a method, device and system for image document layout recognition.

Background technique

[0002] Text is an essential information carrier for communication in human society, and it exists in large quantities in social life and on the Internet. Over time, people came to recognize how cumbersome it is to disseminate and share paper documents, so they began to use fax machines, scanners and other equipment to convert paper documents into electronic documents, and to create electronic documents on computers and mobile devices. With the continuous development of network technology, the dissemination and sharing of these electronic documents have also developed rapidly. Due to the needs of storage, document reprocessing, editing, management, etc., image processing and analysis of electronic documents stored in the form of images, that is, d...

Application Information

Patent Type & Authority: Applications (China)
IPC(8): G06V10/26, G06V30/148, G06V30/40, G06T5/30, G06T7/66
CPC: G06T5/30, G06T7/66
Inventor: not disclosed (the inventor requested non-publication)
Owner: CHENGDU UNION BIG DATA TECH CO LTD