Image document layout recognition method, device and system
A layout recognition method for image documents, applied in the field of image and text recognition. It addresses problems such as poor generalization ability, low recognition accuracy, and the large number of modules involved, and achieves the effects of improving accuracy and facilitating recognition work.
Examples
Embodiment 1
[0079] Referring to Figure 2, the first embodiment of the present invention provides an image document layout recognition method, including:
[0080] Step 100, acquiring the target image corresponding to the document to be processed;
[0081] The document to be processed may be an image document file obtained through image acquisition methods such as scanning, photographing, photocopying, and reproduction. For example, it may be a scanned file or a picture document.
[0082] The target image is one or more images corresponding to the document to be processed, obtained by processing that document.
[0083] Further, step 100, acquiring the target image corresponding to the document to be processed, may also include:
[0084] Step 110, performing page splitting on the document to be processed to obtain split unit pages;
[0085] Step 120, grayscale processing the split unit page to obtain the target i...
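Steps 110 and 120 can be sketched as follows. This is a minimal illustration under stated assumptions, not the implementation described by the patent: the source does not say how pages are split or which grayscale conversion is used, so the sketch assumes the document arrives as a list of pages, each a grid of (R, G, B) tuples, and applies the commonly used ITU-R BT.601 luma weights. The names `split_pages`, `to_grayscale`, and `acquire_target_images` are hypothetical.

```python
from typing import List, Tuple

Pixel = Tuple[int, int, int]   # (R, G, B), each in 0..255
Page = List[List[Pixel]]       # rows of pixels
GrayPage = List[List[int]]     # rows of 0..255 intensities


def split_pages(document: List[Page]) -> List[Page]:
    """Step 110 (assumed form): one unit page per entry, skipping empty pages."""
    return [page for page in document if page and page[0]]


def to_grayscale(page: Page) -> GrayPage:
    """Step 120 (assumed formula): weighted sum using BT.601 luma weights."""
    return [
        [round(0.299 * r + 0.587 * g + 0.114 * b) for (r, g, b) in row]
        for row in page
    ]


def acquire_target_images(document: List[Page]) -> List[GrayPage]:
    """Step 100 as a whole: split into unit pages, then convert each to grayscale."""
    return [to_grayscale(page) for page in split_pages(document)]


# A two-page toy document with one empty page in between.
document = [
    [[(255, 255, 255), (0, 0, 0)]],
    [],
    [[(255, 0, 0)]],
]
target_images = acquire_target_images(document)
```

Each grayscale page produced this way would serve as one target image for the later steps.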
Embodiment 2
[0117] Referring to Figures 3-4, the second embodiment of the present invention provides an image document layout recognition method. Based on Embodiment 1 above, step 200, performing instance segmentation on the target image and obtaining the target areas divided by the instance segmentation, includes:
[0118] Step 210, performing semantic segmentation on the target image to obtain a layout analysis feature map of the target image;
[0119] It should be noted that semantic segmentation assigns a category to each pixel in the image, but does not distinguish between separate objects of the same category.
[0120] The layout analysis feature maps are the parts obtained by semantically segmenting the target image X. A target image can yield multiple layout analysis feature maps, and each layout analysis feature map represents one identified feature region.
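The difference between a per-pixel category and a per-object identity can be illustrated on a tiny label map. The sketch below assumes a semantic segmentation has already produced a grid of class IDs, and uses 4-connected component labelling as a simple stand-in for separating objects of the same category. This is an illustrative technique only, not the procedure the patent uses for step 200; `split_instances` and the class label are hypothetical.

```python
from collections import deque
from typing import List


def split_instances(class_map: List[List[int]], target_class: int) -> List[List[int]]:
    """Give each 4-connected region of target_class its own instance ID (1, 2, ...).

    Pixels belonging to other classes are labelled 0.
    """
    height = len(class_map)
    width = len(class_map[0]) if height else 0
    instance_map = [[0] * width for _ in range(height)]
    next_id = 0
    for y in range(height):
        for x in range(width):
            if class_map[y][x] != target_class or instance_map[y][x]:
                continue  # wrong class, or already assigned to an instance
            next_id += 1
            instance_map[y][x] = next_id
            queue = deque([(y, x)])
            while queue:  # breadth-first flood fill over same-class neighbours
                cy, cx = queue.popleft()
                for ny, nx in ((cy - 1, cx), (cy + 1, cx), (cy, cx - 1), (cy, cx + 1)):
                    if (0 <= ny < height and 0 <= nx < width
                            and class_map[ny][nx] == target_class
                            and instance_map[ny][nx] == 0):
                        instance_map[ny][nx] = next_id
                        queue.append((ny, nx))
    return instance_map


# Class 1 stands for a hypothetical "text block" category.
# The semantic map cannot tell the left block from the right one.
semantic = [
    [1, 1, 0, 1],
    [1, 1, 0, 1],
    [0, 0, 0, 0],
]
instances = split_instances(semantic, target_class=1)
```

In `semantic` both blocks carry the same value, whereas `instances` labels the left block 1 and the right block 2.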
[0121] As mentioned above...
Embodiment 3
[0138] Referring to Figures 5-6, this embodiment provides an image document layout recognition method. Based on Embodiment 1 above, step 400, determining the target area to which the text box corresponds, includes:
[0139] Step 410, calculating the coordinates of the center point corresponding to the text box;
[0140] The position of each text box is calculated, and the coordinates of the center point of the text box can be obtained according to the coordinates of the four vertices of its rectangle.
[0141] The text box is a box containing text content, defined in accordance with writing habits. The coordinates of the center point are the coordinates of the center of the text box, calculated from the coordinates of its four vertices.
[0142] Further, step 410, calculating the coordinates of the center point corresponding to the text box, includes:
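The center-point calculation can be sketched as below. The source states only that the center is derived from the four vertex coordinates; this sketch assumes the center is the arithmetic mean of the four vertices, which coincides with the geometric center of any rectangle, whether axis-aligned or rotated. The name `text_box_center` is hypothetical.

```python
from typing import Sequence, Tuple

Point = Tuple[float, float]


def text_box_center(vertices: Sequence[Point]) -> Point:
    """Return the center of a rectangular text box as the mean of its four vertices."""
    if len(vertices) != 4:
        raise ValueError("a text box needs exactly four vertices")
    center_x = sum(x for x, _ in vertices) / 4
    center_y = sum(y for _, y in vertices) / 4
    return (center_x, center_y)


# Vertices listed clockwise from the top-left corner (order does not affect the mean).
center = text_box_center([(10, 20), (110, 20), (110, 60), (10, 60)])
```

The resulting center point can then be tested against the target areas to decide which area the text box belongs to.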
[0143] Step 411, obtain the vertex coordinate...


