Image document layout recognition method, device and system

A document layout recognition method, applied in the field of image and text recognition, that addresses the poor generalization ability, reduced recognition accuracy, and large number of modules of existing approaches, with the effect of improving recognition accuracy and facilitating layout recognition work.

Active Publication Date: 2022-03-11

CHENGDU UNION BIG DATA TECH CO LTD

Problems solved by technology

In the prior art, some methods apply classification models for layout analysis on top of text detection, but they can only classify single-line text; for paragraphs and tables, where one instance contains multiple rows and columns of text, they perform poorly. Other methods fuse image features with semantic features for layout analysis, but they involve many modules (image feature extraction, semantic feature extraction, feature fusion, and reclassification), run slowly, and, because they extract semantic features, require a large amount of supporting data; otherwise their generalization ability is poor. Still others cluster on the position information of the label area and the text block area. When the paragraph spacing equals the text line spacing, such distance-based clustering easily identifies multiple adjacent paragraphs as a single paragraph, which reduces recognition accuracy.

[0004] In short, existing layout recognition methods can only classify and recognize single-line, single-column text; they involve many modules and complex programs, generalize poorly, and recognize adjacent paragraphs as one paragraph when the paragraph spacing equals the text line spacing. As a result, recognition accuracy is low and the methods cannot adapt to document layouts that differ from image to image, which makes the analysis and processing of document layouts very inconvenient.
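The failure mode of distance-based clustering described above can be illustrated with a minimal sketch. It is not taken from the patent: the `cluster_by_gap` function, the `(top, bottom)` line representation, and the pixel values are all hypothetical, chosen only to show why a gap threshold cannot separate paragraphs whose spacing equals the line spacing.

```python
from typing import List, Tuple

# Hypothetical text line: (top, bottom) y-coordinates in pixels.
Line = Tuple[int, int]


def cluster_by_gap(lines: List[Line], max_gap: int) -> List[List[Line]]:
    """Group lines top to bottom; start a new group when the vertical
    gap to the previous line exceeds max_gap."""
    groups: List[List[Line]] = []
    for line in sorted(lines):
        if groups and line[0] - groups[-1][-1][1] <= max_gap:
            groups[-1].append(line)  # gap small enough: same paragraph
        else:
            groups.append([line])    # first line, or gap too large
    return groups


# Two paragraphs of two lines each; paragraph gap 30 px, line gap 10 px.
well_spaced = [(0, 20), (30, 50), (80, 100), (110, 130)]
# Same text, but the paragraph gap equals the line gap (10 px).
tight = [(0, 20), (30, 50), (60, 80), (90, 110)]

print(len(cluster_by_gap(well_spaced, max_gap=10)))  # 2
print(len(cluster_by_gap(tight, max_gap=10)))        # 1
```

In the second case no threshold can recover the two paragraphs, because the only signal the clustering uses, the gap size, is identical inside and between paragraphs.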

Examples

Embodiment 1

[0079] Referring to Figure 2, the first embodiment of the present invention provides an image document layout recognition method, including:

[0080] Step 100, acquiring the target image corresponding to the document to be processed;

[0081] The document to be processed may be an image document file obtained through an image acquisition method such as scanning, photographing, photocopying, or reproduction; for example, a scanned file or a picture document.

[0082] The target image is one or more images, corresponding to the document to be processed, that are obtained by processing that document.

[0083] Further, step 100, acquiring the target image corresponding to the document to be processed, may also include:

[0084] Step 110, performing page splitting on the document to be processed to obtain split unit pages;

[0085] Step 120, grayscale processing the split unit page to obtain the target i...
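Steps 110 and 120 can be sketched roughly as follows. This is only an illustration under stated assumptions: the source does not say how pages are represented or which grayscale formula is used, so the nested-list page type, the function names `split_pages`, `to_grayscale`, and `acquire_target_images`, and the BT.601 luma weights are all hypothetical choices.

```python
from typing import List, Tuple

Pixel = Tuple[int, int, int]   # (R, G, B), each in 0..255
Page = List[List[Pixel]]       # rows of RGB pixels
GrayPage = List[List[int]]     # rows of gray levels in 0..255


def split_pages(document: List[Page]) -> List[Page]:
    """Step 110 (sketch): return the unit pages of a multi-page document.
    Assumption: the document is already a sequence of page images."""
    return list(document)


def to_grayscale(page: Page) -> GrayPage:
    """Step 120 (sketch): convert one unit page to grayscale.
    Assumption: BT.601 luma weights; the source does not name a formula."""
    return [
        [round(0.299 * r + 0.587 * g + 0.114 * b) for r, g, b in row]
        for row in page
    ]


def acquire_target_images(document: List[Page]) -> List[GrayPage]:
    """Step 100 (sketch): split the document, then grayscale each page."""
    return [to_grayscale(page) for page in split_pages(document)]


# A one-page document with a single row: white, black, pure red.
doc = [[[(255, 255, 255), (0, 0, 0), (255, 0, 0)]]]
print(acquire_target_images(doc))  # [[[255, 0, 76]]]
```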

Embodiment 2

[0117] Referring to Figures 3-4, the second embodiment of the present invention provides a method for recognizing the layout of an image document. Based on embodiment 1 above, step 200, performing instance segmentation on the target image and obtaining the target areas divided by the instance segmentation, includes:

[0118] Step 210, performing semantic segmentation on the target image to obtain a layout analysis feature map of the target image;

[0119] Note that semantic segmentation assigns a category to each pixel in the image during recognition, but it does not distinguish between different objects of the same category.

[0120] The layout analysis feature maps are the parts obtained by semantically segmenting the target image X. One target image can yield multiple layout analysis feature maps, and each layout analysis feature map represents one identified feature region.
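The difference between a per-pixel class map and separate instances can be made concrete with a small sketch. It uses 4-connected component labeling as a stand-in technique; the source does not state that the patented method works this way, and the class ids, the `label_instances` function, and the toy map below are hypothetical.

```python
from collections import deque
from typing import List, Tuple


def label_instances(class_map: List[List[int]],
                    target_class: int) -> Tuple[List[List[int]], int]:
    """Give each 4-connected region of target_class its own id (1, 2, ...).
    Pixels of other classes keep the label 0."""
    h, w = len(class_map), len(class_map[0])
    labels = [[0] * w for _ in range(h)]
    count = 0
    for y in range(h):
        for x in range(w):
            if class_map[y][x] != target_class or labels[y][x]:
                continue
            count += 1                   # found an unlabeled region
            labels[y][x] = count
            queue = deque([(y, x)])
            while queue:                 # breadth-first flood fill
                cy, cx = queue.popleft()
                for ny, nx in ((cy - 1, cx), (cy + 1, cx),
                               (cy, cx - 1), (cy, cx + 1)):
                    if (0 <= ny < h and 0 <= nx < w
                            and class_map[ny][nx] == target_class
                            and not labels[ny][nx]):
                        labels[ny][nx] = count
                        queue.append((ny, nx))
    return labels, count


# Semantic output: 0 = background, 1 = "paragraph" (hypothetical ids).
# All paragraph pixels share class 1, so the regions are not told apart.
semantic = [
    [1, 1, 0, 1, 1],
    [1, 1, 0, 1, 1],
    [0, 0, 0, 0, 0],
    [1, 1, 1, 1, 1],
]
labels, n = label_instances(semantic, target_class=1)
print(n)  # 3
```

Note that connected-component labeling would merge two paragraphs whose pixels touch, so it only illustrates what instance-level output looks like.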

[0121] As mentioned above...

Embodiment 3

[0138] Referring to Figures 5-6, this embodiment provides a method for recognizing the layout of an image document. Based on embodiment 1 above, step 400, determining the target area to which the text box corresponds, includes:

[0139] Step 410, calculating the coordinates of the center point corresponding to the text box;

[0140] The position of each text box is calculated; the coordinates of its center point can be obtained from the coordinates of the four vertices of its rectangle.

[0141] The text box is a box containing text content, defined in accordance with writing habits. The center point coordinates are the coordinates of the center of the text box, which can be calculated from the coordinates of its four vertices.
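The center-point calculation can be sketched as the mean of the four vertex coordinates, which for a rectangle is its geometric center. The `box_center` name and the point representation are assumptions. The `find_region` helper goes one step further and is purely hypothetical: it assumes axis-aligned target areas and a simple containment test, which the source does not confirm.

```python
from typing import List, Optional, Sequence, Tuple

Point = Tuple[float, float]
Region = Tuple[float, float, float, float]  # (x_min, y_min, x_max, y_max)


def box_center(vertices: Sequence[Point]) -> Point:
    """Center of a text box: the mean of its four vertex coordinates."""
    if len(vertices) != 4:
        raise ValueError("a text box needs exactly four vertices")
    xs = [x for x, _ in vertices]
    ys = [y for _, y in vertices]
    return (sum(xs) / 4, sum(ys) / 4)


def find_region(center: Point, regions: List[Region]) -> Optional[int]:
    """Hypothetical helper: index of the first axis-aligned region that
    contains the center point, or None if no region contains it."""
    cx, cy = center
    for i, (x0, y0, x1, y1) in enumerate(regions):
        if x0 <= cx <= x1 and y0 <= cy <= y1:
            return i
    return None


box = [(10, 20), (110, 20), (110, 60), (10, 60)]
center = box_center(box)
print(center)                                                    # (60.0, 40.0)
print(find_region(center, [(0, 0, 50, 50), (50, 0, 200, 100)]))  # 1
```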

[0142] Further, step 410, calculating the coordinates of the center point corresponding to the text box, includes:

[0143] Step 411, obtain the vertex coordinate...

Abstract

The invention provides an image document layout recognition method, device, and system. The method improves the recognition accuracy of each sub-module for regions that are difficult to divide, can split and recognize paragraph groups with small paragraph spacing, greatly improves paragraph recognition accuracy, and makes recognition of image document layouts more convenient.

Description

Technical field

[0001] The present invention relates to the technical field of image character recognition, and more specifically, to a method, device and system for image document layout recognition.

Background technique

[0002] Text is an essential information carrier for communication in human society, and it exists in large quantities in social life and on the Internet. Over time, people came to recognize how cumbersome it is to disseminate and share paper documents, so they began to use fax machines, scanners, and other equipment to convert paper documents into electronic documents, and to create electronic documents on computers and mobile devices. With the continuous development of network technology, the dissemination and sharing of these electronic documents have also developed rapidly. Due to the needs of storage, document reprocessing, editing, management, etc., image processing and analysis of electronic documents stored in the form of images, that is, d...

Application Information

Patent Type & Authority: Applications (China)

IPC(8): G06V10/26, G06V30/148, G06V30/40, G06T5/30, G06T7/66

CPC: G06T5/30, G06T7/66

Inventor: name withheld at the inventor's request

Owner: CHENGDU UNION BIG DATA TECH CO LTD
