Invoice image edge detection method, device and equipment and computer storage medium

By extracting and regressing feature points from document image samples, and training and adjusting a preset edge detection model, the accuracy problem of document image edge detection in complex scenarios is solved, achieving high-precision document edge recognition.

CN115705737B · Active · Publication Date: 2026-06-19 · SF TECH CO LTD

Patent Information

Authority / Receiving Office
CN · China
Patent Type
Patents(China)
Current Assignee / Owner
SF TECH CO LTD
Filing Date
2021-08-11
Publication Date
2026-06-19

AI Technical Summary

Technical Problem

Existing document image edge detection algorithms struggle to accurately detect the edges of multiple document images in complex scenarios, especially when document images overlap or when the background and document edges are highly similar.

Method used

Feature points are extracted from document image samples to obtain vertex information and line segment features. Edge detection results are obtained through regression processing. A preset edge detection model is then used for training and adjustment to improve detection accuracy.

Benefits of technology

It achieves accurate edge detection of document images in complex scenarios, improving the accuracy of edge recognition and the versatility of the detection model.



Abstract

This application provides a method, apparatus, device, and computer storage medium for edge detection of document images. The edge detection method for document images in this application includes: acquiring a target document image to be detected; extracting feature points from the target document image to obtain vertex information; fusing the vertex information to obtain vertex features and line segment features; and performing regression processing on the vertex features and line segment features to obtain an edge detection result. In this embodiment, the target document image is processed to obtain vertex features and line segment features, and the edge information of the target document image is determined based on the vertex features and line segment features. This improves the accuracy of edge recognition, enables edge detection of document images in complex scenarios, and ensures the versatility of document edge detection.

Description

Technical Field

[0001] This application relates to the field of image recognition technology, specifically to a method, apparatus, device, and computer storage medium for edge detection of document images.

Background Technology

[0002] With the rapid development of society, image recognition scenarios are becoming more and more common, and the requirements for the accuracy of image edge detection are also increasing.

[0003] Deep learning-based edge detection for document images is relatively mature, but the accuracy of edge detection in complex scenarios is still not guaranteed. For rectangular document images, when multiple document images overlap, it is difficult to detect the document edges of each document individually because the edges are highly similar to one another. Alternatively, in a single document image, when content extends beyond the document edge, it is difficult to accurately detect the document edge.

[0004] In other words, existing document images are complex and varied: anomalies such as multiple overlapping documents, high similarity between the background and document edges, and incomplete documents are quite common. Traditional deep learning-based document image edge detection algorithms cannot effectively detect the document edges in such images.

Summary of the Invention

[0005] This application provides a method, apparatus, device, and computer storage medium for edge detection of document images, aiming to solve the technical problem that the edges of document images cannot be accurately detected in complex scenarios.

[0006] In one aspect, this application provides an edge detection method for document images, the method including:

[0007] Acquiring a target document image to be detected;

[0008] Feature points are extracted from the target document image to obtain the vertex information of the target document image;

[0009] The vertex information is fused to obtain vertex features and line segment features;

[0010] Regression processing is performed on the vertex features and the line segment features to obtain the edge detection results.

[0011] In some embodiments of this application, the edge detection method for the document image is applied to a preset edge detection model;

[0012] Before extracting feature points from the target document image to obtain the vertex information of the target document image, the method includes:

[0013] Obtain document image samples, and extract features from the preprocessed document image samples to obtain vertex features and line segment features;

[0014] The preset initial detection model is trained using the vertex features and the line segment features to obtain feature point loss values and feature line segment loss values. The preset initial detection model refers to a pre-set recognition algorithm.

[0015] Based on the feature point loss value and the feature line segment loss value, determine whether the obtained training detection model has reached the training endpoint;

[0016] If the training detection model does not reach the training endpoint, the parameters of the training detection model are adjusted until a preset edge detection model is obtained.

[0017] In some embodiments of this application, after obtaining the document image sample, the process includes:

[0018] The document image samples are filtered to obtain denoised document image samples;

[0019] The denoised document image sample is subjected to grayscale processing and edge extraction to obtain a grayscale document image sample with edge information.

[0020] The denoised document image sample and the grayscale document image sample are fused together to obtain the preprocessed document image sample.

[0021] In some embodiments of this application, the step of obtaining document image samples and extracting features from the preprocessed document image samples to obtain vertex features and line segment features includes:

[0022] Obtain document image samples, and extract features from the preprocessed document image samples using a preset initial detection model to obtain labeled vertex information;

[0023] The vertex features are obtained by integrating the vertex coordinates and vertex pixel values in the vertex information using the initial detection model.

[0024] The vertex information is regressed using the initial detection model to obtain line segment features.

[0025] In some embodiments of this application, determining whether the obtained trained detection model has reached the training endpoint based on the feature point loss value and the feature line segment loss value includes:

[0026] The mean squared error loss value is obtained by statistically analyzing the feature point loss value and the feature line segment loss value of each document image sample.

[0027] The mean squared error loss value is compared with a preset loss threshold.

[0028] If the mean squared error loss value is less than the preset loss threshold, then the obtained training detection model is determined to have reached the training endpoint.

[0029] If the mean squared error loss value is greater than or equal to the preset loss threshold, it is determined that the obtained training detection model has not reached the training endpoint.

[0030] In some embodiments of this application, the step of fusing the vertex information to obtain vertex features and line segment features includes:

[0031] The vertex features are obtained by integrating the vertex coordinates and vertex pixel values in the vertex information using the preset edge detection model.

[0032] The vertex information is processed by the preset edge detection model to form a detection rectangle. The target document image is then processed based on the midpoints of each side of the detection rectangle to obtain line segment features.

[0033] In some embodiments of this application, the step of processing the vertex information using the preset edge detection model to form a detection rectangle, and processing the target document image based on the midpoints of each side of the detection rectangle to obtain line segment features includes:

[0034] The vertex information is processed by the preset edge detection model to obtain the number of vertices. Based on the vertex information and the number of vertices, a first detection rectangle is generated.

[0035] The vertex information is fused with the preset edge detection model to generate several detection boxes. Based on the midline connection of each detection box, four detection regions are formed corresponding to each detection box. Based on the corner points of the four detection regions corresponding to each detection box, a second detection rectangle is determined.

[0036] By fusing the first detection rectangle and the second detection rectangle, line segment features are obtained.
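The text does not specify how the detection rectangle or its side midpoints are computed. As a minimal sketch, assuming four detected vertices given as (x, y) pixel coordinates, the vertices can be ordered around their centroid and the midpoint of each side taken. The function names `order_vertices` and `side_midpoints` are hypothetical and do not come from the patent.

```python
import math

def order_vertices(points):
    """Order (x, y) vertices by angle around their centroid (assumed strategy)."""
    cx = sum(x for x, _ in points) / len(points)
    cy = sum(y for _, y in points) / len(points)
    return sorted(points, key=lambda p: math.atan2(p[1] - cy, p[0] - cx))

def side_midpoints(ordered):
    """Return the midpoint of each side of the closed polygon."""
    n = len(ordered)
    return [
        ((ordered[i][0] + ordered[(i + 1) % n][0]) / 2,
         (ordered[i][1] + ordered[(i + 1) % n][1]) / 2)
        for i in range(n)
    ]

# Four unordered vertices of an axis-aligned document (illustrative values).
vertices = [(100, 20), (10, 20), (100, 80), (10, 80)]
rect = order_vertices(vertices)
mids = side_midpoints(rect)
```

Ordering by angle is only one way to turn an unordered vertex set into a quadrilateral; it works for convex shapes such as a document seen under mild perspective.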

[0037] In some embodiments of this application, after performing regression processing on the vertex features and the line segment features to obtain the edge detection result, the method includes:

[0038] The target document image is divided into regions based on the edge detection results to obtain the document internal region of the target document image;

[0039] The document content information is obtained by recognizing characters in the internal area of the document using a preset character recognition model.

[0040] Extract the document identifier from the document content information, and associate and save the document identifier with the document content information.
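The association step in [0040] could look roughly like the following sketch. It assumes the character recognition model has already returned the document content as text, and it assumes a hypothetical identifier format ("No." followed by letters and digits). Neither the format nor the names `extract_identifier` and `save_document` come from the patent.

```python
import re

# Hypothetical identifier format: "No." followed by uppercase letters/digits.
ID_PATTERN = re.compile(r"No\.\s*([A-Z0-9]+)")

def extract_identifier(content):
    """Return the document identifier found in the text, or None."""
    match = ID_PATTERN.search(content)
    return match.group(1) if match else None

def save_document(store, content):
    """Associate the identifier with the content and save both in `store`."""
    doc_id = extract_identifier(content)
    if doc_id is not None:
        store[doc_id] = content
    return doc_id

store = {}
text = "Waybill No. SF1234567890\nSender: A\nReceiver: B"  # made-up sample text
doc_id = save_document(store, text)
```

A dictionary stands in for whatever storage the device actually uses. The point is only that the identifier becomes the key under which the recognized content is kept.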

[0041] In another aspect, this application provides an edge detection device for a document image, the edge detection device for the document image comprising:

[0042] The acquisition module is used to acquire the image of the target document to be detected;

[0043] The input module is used to extract feature points from the target document image to obtain the vertex information of the target document image;

[0044] The processing module is used to fuse the vertex information to obtain vertex features and line segment features;

[0045] The detection module is used to perform regression processing on the vertex features and the line segment features to obtain edge detection results.

[0046] In another aspect, this application also provides an edge detection device for a document image, the edge detection device for the document image comprising:

[0047] One or more processors;

[0048] Memory; and

[0049] One or more applications, wherein the one or more applications are stored in the memory and configured to be executed by the processor to implement the edge detection method for the document image.

[0050] In another aspect, this application also provides a computer storage medium storing a computer program that is loaded by a processor to execute the steps in the document image edge detection method.

[0051] In the technical solution of this application, the target document image is processed to obtain vertex information. Vertex features and line segment features are obtained based on the vertex information. The edge information of the target document image is determined based on the vertex features and line segment features. By processing the vertex features and line segment features, regression of vertices and line segments is performed to improve the detection accuracy of target edges in the target document image. This enables document edge detection in document images in complex scenarios, ensures the universality of the preset edge detection model for document edge detection, and improves the accuracy of edge recognition.

Attached Figure Description

[0052] To more clearly illustrate the technical solutions in the embodiments of this application, the accompanying drawings used in the description of the embodiments will be briefly introduced below. Obviously, the accompanying drawings described below are only some embodiments of the present invention. For those skilled in the art, other drawings can be obtained based on these drawings without creative effort.

[0053] Figure 1 is a schematic diagram of a scenario for the edge detection method of document images provided in the embodiments of this application;

[0054] Figure 2 is a schematic diagram of an embodiment of the edge detection model construction in the edge detection method for document images in this application;

[0055] Figure 3 is a schematic flowchart of an embodiment of the document image sample preprocessing in the document image edge detection method provided in this application;

[0056] Figure 4 is a schematic flowchart of an embodiment of the edge detection method for document images provided in this application;

[0057] Figure 5 is a schematic diagram of a scenario for edge detection of a document image in the edge detection method for document images provided in the embodiments of this application;

[0058] Figure 6 is a schematic flowchart of an embodiment of the document content recognition in a target document image in the edge detection method for document images provided in this application;

[0059] Figure 7 is a schematic diagram of an embodiment of the edge detection device for document images involved in this invention;

[0060] Figure 8 is a schematic diagram of the structure of the edge detection device for document images involved in the embodiments of the present invention.

Detailed Implementation

[0061] The technical solutions of the embodiments of this application will be clearly and completely described below with reference to the accompanying drawings. Obviously, the described embodiments are only some embodiments of the present invention, and not all embodiments. Based on the embodiments of the present invention, all other embodiments obtained by those skilled in the art without creative effort are within the scope of the present invention.

[0062] In the description of this invention, it should be understood that the terms "first" and "second" are used for descriptive purposes only and should not be construed as indicating or implying relative importance or implicitly specifying the number of indicated technical features. Therefore, features defined as "first" and "second" may explicitly or implicitly include one or more of the stated features. In the description of this invention, "a plurality of" means two or more, unless otherwise explicitly specified.

[0063] In this application, the term "exemplary" is used to mean "serving as an example, illustration, or description." Any embodiment described as "exemplary" in this application is not necessarily to be construed as being more preferred or advantageous than other embodiments. The following description is provided to enable any person skilled in the art to make and use the invention. Details are set forth in the following description for purposes of explanation. It should be understood that those skilled in the art will recognize that the invention can be made without using these specific details. In other instances, well-known structures and processes will not be described in detail to avoid obscuring the description of the invention with unnecessary detail. Therefore, the invention is not intended to be limited to the embodiments shown, but is consistent with the broadest scope of the principles and features disclosed in this application.

[0064] This application provides a method, apparatus, device, and computer storage medium for edge detection of document images, which will be described in detail below.

[0065] The edge detection method for document images in this embodiment of the invention is applied to an edge detection device for document images. The edge detection device for document images is set in an edge detection equipment for document images. The edge detection equipment for document images includes one or more processors, a memory, and one or more application programs. The one or more application programs are stored in the memory and configured to be executed by the processor to implement the edge detection method for document images. The edge detection equipment for document images can be a server or a terminal, such as a mobile phone, a tablet computer, or a camera.

[0066] As shown in Figure 1, a schematic diagram of a scenario for the edge detection method of a document image according to an embodiment of this application, the edge detection scenario in this embodiment includes an edge detection device 100 for the document image (the edge detection device 100 for the document image integrates an edge detection apparatus for the document image). The edge detection device 100 for the document image runs the computer storage medium corresponding to the edge detection of the document image in order to execute the steps of the edge detection method for the document image.

[0067] It is understood that the edge detection device for the document image in the scenario shown in Figure 1, or the apparatus included in that device, does not constitute a limitation on the embodiments of the present invention. That is, the number or type of devices included in the edge detection scenario, or the number or type of apparatus included in each device, does not affect the overall implementation of the technical solution in the embodiments of the present invention, and can all be considered equivalent substitutions or derivatives of the technical solutions claimed in the embodiments of the present invention.

[0068] In this embodiment of the invention, the edge detection device 100 for document images is mainly used for: acquiring a target document image to be detected; extracting feature points from the target document image to obtain vertex information of the target document image; fusing the vertex information to obtain vertex features and line segment features; and performing regression processing on the vertex features and the line segment features to obtain edge detection results.

[0069] In this embodiment of the invention, the edge detection device 100 for document images can be an independent edge detection device for document images, or it can be a network or cluster of edge detection devices for document images. For example, the edge detection device 100 for document images described in this embodiment of the invention includes, but is not limited to, computers, network hosts, single network document image edge detection devices, sets of multiple network document image edge detection devices, or cloud document image edge detection devices composed of multiple document image edge detection devices. The cloud document image edge detection device is composed of a large number of computer or network document image edge detection devices based on cloud computing.

[0070] Those skilled in the art will understand that the application environment shown in Figure 1 is merely one application scenario of the solution in this application and does not constitute a limitation on its application scenarios. Other application environments may include more or fewer edge detection devices for document images than shown in Figure 1, or different network connections between those devices. For example, Figure 1 shows only one edge detection device for a document image. It is understood that the edge detection scenario may also include one or more other edge detection devices for document images, which is not limited here. The edge detection device 100 for document images may also include a memory for storing data, such as captured photographs of documents.

[0071] Furthermore, in the scenario of edge detection of document images in this application, the edge detection device 100 for document images may itself be equipped with a display device, or, if it has none, may be connected to an external display device 200. The display device 200 is used to output the results of the edge detection method for document images executed in the edge detection device. The edge detection device 100 for document images can access a background database 300 (the background database can be located in the local storage of the edge detection device for document images, or it can be located in the cloud). The background database 300 stores information related to the edge detection of document images, such as document photos.

[0072] It should be noted that the scenario schematic for the edge detection method for document images shown in Figure 1 is merely an example. The edge detection scenario of document images described in this embodiment of the invention is intended to more clearly illustrate the technical solution of this embodiment and does not constitute a limitation on the technical solution provided by this embodiment of the invention.

[0073] In this embodiment, the edge detection method for document images is applied to an edge detection device for document images. The type of edge detection device for document images is not specifically limited. That is, the edge detection device for document images can be a terminal or a server. In this embodiment, the edge detection device has a preset edge detection model, which is a deep neural network model used to identify the edges of documents.

[0074] Figure 2 is a schematic diagram of an embodiment of the edge detection method for document images in this application, illustrating the construction of a preset edge detection model.

[0075] In this embodiment, the edge detection model construction in the document image edge detection method includes steps 201-204:

[0076] 201. Obtain document image samples, extract features from the preprocessed document image samples, and obtain vertex features and line segment features.

[0077] The edge detection device receives model building instructions. The triggering method of the model building instructions is not specifically limited. That is, the model building instructions can be triggered actively by the user. For example, the user can enter "training" in the display interface of the edge detection device to trigger the model building instructions. In addition, the model building instructions can also be triggered automatically by the edge detection device. For example, when the number of preset document image samples in the edge detection device exceeds 10,000, the model building instructions are automatically triggered.

[0078] After receiving the model building instruction, the edge detection device acquires a massive number of document image samples. Document image samples refer to document images used to train the edge detection model. The format and quantity of document image samples are not specifically limited, nor are the types of documents in the document image samples.

[0079] The edge detection device marks the vertex and edge information of document image samples according to predefined document edge recognition rules. These rules include: the rule for the distance from the document edge to the image border, the coordinate range of document vertex positions, and the positional relationship between the straight lines of the document edges. The edge detection device marks each document image sample according to these rules to obtain the marked document image samples.

[0080] The edge detection device preprocesses the labeled document image samples. The preprocessing method is not limited. The main purpose of the edge detection device in preprocessing the document image samples is to eliminate irrelevant information in the image, restore useful real information, enhance the detectability of relevant information, simplify the data to the maximum extent, and thus improve the reliability of feature extraction, image segmentation, matching and recognition.

[0081] In this embodiment, the preprocessing procedure for the edge detection device is: grayscale conversion, geometric transformation, and image enhancement; wherein:

[0082] (1) Grayscale conversion: processing a color image directly requires handling its three channels in sequence, which is very time-consuming. To improve the processing speed of the entire application system, the color image is converted to grayscale to reduce the amount of data to be processed.

[0083] (2) Geometric transformation, also known as image space transformation, processes the acquired image through geometric transformations such as translation, transposition, mirroring, rotation, and scaling. Geometric transformation is used to correct the systematic errors of the image acquisition system and the random errors of the instrument position (imaging angle, perspective relationship, and even the lens itself).

[0084] (3) Image enhancement refers to enhancing the useful information in an image. Image enhancement can be a distortion process. Its purpose is to improve the visual effect of the image. For a given image application scenario, it aims to emphasize the overall or local characteristics of the image, make the originally unclear image clear or emphasize certain features of interest, expand the differences between the features of different objects in the image, suppress features of no interest, improve the image quality, enrich the amount of information, enhance the image interpretation and recognition effect, and meet the needs of certain special analyses.

[0085] In this embodiment, the preprocessing of the document image samples is to remove noise from the document image samples, so that the features extracted from the document image samples are more accurate.

[0086] The edge detection device inputs the preprocessed document image samples into a preset initial detection model. The preset initial detection model refers to the initial algorithm for image feature extraction. Features are extracted from the preprocessed document image samples using the preset initial detection model to obtain vertex features and line segment features. In this embodiment, both vertex features and line segment features are extracted and fused so that the trained model detects edges more accurately. Specifically, this includes:

[0087] (1) The preprocessed document image sample is subjected to feature extraction by a preset initial detection model to obtain the labeled vertex information;

[0088] (2) The vertex coordinates and vertex pixel values in the vertex information are integrated through the initial detection model to obtain vertex features;

[0089] (3) The vertex information is regressed using the initial detection model to obtain line segment features.

[0090] The edge detection device extracts features from the preprocessed document image samples using a preset initial detection model to obtain labeled vertex information. The edge detection module integrates the vertex coordinates and vertex pixel values in the vertex information using the initial detection model to obtain vertex features, which can be a vertex-related pixel matrix. The edge detection device performs regression processing on the vertex information using the initial detection model to obtain line segment features.

[0091] For example, in this embodiment, based on the original edge detection algorithm, regression of the target vertex feature map and the line segment feature map of the line segments forming the target vertices is added. Specifically, a pure black background image with the same size as the original image is generated, with all pixel values being 0. Based on this, the vertex feature map used for regression is generated: according to the vertex coordinates labeled, corresponding pure white pixels are drawn on the background image with a pixel value of 255. Then, Gaussian filtering is applied to the generated image. The purpose of filtering is twofold: first, to magnify the target, as a single pixel is too small for training to converge; and second, to make the transition between the target region and non-target regions smoother. The line segment feature map: similar to the vertex feature map, the region is drawn as a line segment connecting four points, and Gaussian filtering is applied to this line segment for smoothing.
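The construction of the two regression targets described above can be sketched as follows. This is an illustration under stated assumptions, not the patent's implementation: it uses NumPy only, implements the Gaussian filter as a separable convolution with a kernel radius of 3σ, and assumes the image is at least as wide and tall as the kernel. The names `gaussian_blur`, `vertex_map`, and `segment_map` and the value σ = 2 are hypothetical.

```python
import numpy as np

def gaussian_blur(img, sigma=2.0):
    """Separable Gaussian filter; kernel radius of 3*sigma is an assumption."""
    radius = int(3 * sigma)
    x = np.arange(-radius, radius + 1, dtype=np.float64)
    kernel = np.exp(-(x ** 2) / (2 * sigma ** 2))
    kernel /= kernel.sum()
    conv = lambda v: np.convolve(v, kernel, mode="same")
    out = np.apply_along_axis(conv, 1, img.astype(np.float64))  # along rows
    return np.apply_along_axis(conv, 0, out)                    # along columns

def vertex_map(shape, vertices, sigma=2.0):
    """Black canvas, white (255) pixel at each labeled vertex, then blur."""
    canvas = np.zeros(shape, dtype=np.float64)
    for x, y in vertices:
        canvas[y, x] = 255.0
    return gaussian_blur(canvas, sigma)

def segment_map(shape, vertices, sigma=2.0):
    """Black canvas, white line segments joining consecutive vertices, then blur."""
    canvas = np.zeros(shape, dtype=np.float64)
    n = len(vertices)
    for i in range(n):
        (x0, y0), (x1, y1) = vertices[i], vertices[(i + 1) % n]
        steps = max(abs(x1 - x0), abs(y1 - y0)) + 1
        xs = np.rint(np.linspace(x0, x1, steps)).astype(int)
        ys = np.rint(np.linspace(y0, y1, steps)).astype(int)
        canvas[ys, xs] = 255.0
    return gaussian_blur(canvas, sigma)

# Illustrative labels: four vertices (x, y) of a document in a 64x64 image.
labels = [(10, 10), (50, 10), (50, 50), (10, 50)]
vm = vertex_map((64, 64), labels)
sm = segment_map((64, 64), labels)
```

After blurring, each vertex becomes a small blob rather than a single pixel, which matches the stated purpose of enlarging the target and smoothing its boundary.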

[0092] 202. The preset initial detection model is trained using the vertex features and the line segment features to obtain the feature point loss value and the feature line segment loss value. The preset initial detection model refers to a pre-set edge recognition algorithm.

[0093] The edge detection device trains the initial detection model using vertex features and line segment features. That is, the edge detection device inputs vertex features and line segment features into the initial detection model, and the initial detection model processes the vertex features and line segment features to obtain feature point loss values and feature line segment loss values.

[0094] 203. Based on the feature point loss value and the feature line segment loss value, determine whether the obtained training detection model has reached the training endpoint.

[0095] The edge detection device integrates the feature point loss value and the feature line segment loss value to obtain a comprehensive loss value. The edge detection device uses this comprehensive loss value to determine whether the trained detection model has reached the training endpoint. Specifically, step 203 includes:

[0096] (1) Calculate the feature point loss value and the feature line segment loss value of each document image sample to obtain the mean square error loss value;

[0097] (2) Compare the mean squared error loss value with a preset loss threshold;

[0098] (3) If the mean squared error loss value is less than the preset loss threshold, it is determined that the obtained training detection model has reached the training endpoint.

[0099] (4) If the mean squared error loss value is greater than or equal to the preset loss threshold, it is determined that the obtained training detection model has not reached the training endpoint.

[0100] That is, the edge detection device counts the feature point loss value and feature line segment loss value of each document image sample, and calculates the mean squared error loss value; the edge detection device compares the mean squared error loss value with a preset loss threshold; wherein, the preset loss threshold can be set according to the specific scenario, for example, the preset loss threshold is 0.1; if the mean squared error loss value is less than the preset loss threshold, the edge detection device determines that the obtained training detection model has reached the training endpoint; if the mean squared error loss value is greater than or equal to the preset loss threshold, the edge detection device determines that the obtained training detection model has not reached the training endpoint.
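The endpoint check can be sketched as below. The text does not say exactly how the two loss values are combined, so this sketch assumes each per-sample loss is a mean squared error between predicted and target values, that the point and line segment losses are summed per sample, and that the result is averaged over samples. The function names are hypothetical; the threshold 0.1 is the example value from the text.

```python
def mse(pred, target):
    """Mean squared error between two equal-length flat sequences."""
    return sum((p - t) ** 2 for p, t in zip(pred, target)) / len(pred)

def combined_loss(samples):
    """Average of (point loss + segment loss) over samples (assumed combination).

    Each sample is (pred_vertex, target_vertex, pred_segment, target_segment).
    """
    per_sample = [mse(pv, tv) + mse(ps, ts) for pv, tv, ps, ts in samples]
    return sum(per_sample) / len(per_sample)

def reached_endpoint(loss, threshold=0.1):
    """Training endpoint is reached only when the loss is strictly below the threshold."""
    return loss < threshold

# Two toy samples with flattened feature-map values (illustrative numbers).
samples = [
    ([0.0, 0.0], [0.0, 0.0], [1.0, 1.0], [1.0, 1.0]),
    ([0.2, 0.0], [0.0, 0.0], [1.0, 1.0], [1.0, 1.0]),
]
loss = combined_loss(samples)
done = reached_endpoint(loss)
```

Note the strict inequality: a loss exactly equal to the threshold counts as not yet converged, matching items (3) and (4) above.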

[0101] 204. If the training detection model does not reach the training endpoint, the parameters of the training detection model are adjusted until a preset edge detection model is obtained.

[0102] If the training detection model does not reach the training endpoint, the edge detection device will adjust the parameters of the training detection model. That is, the edge detection device will iterate the training detection model with new document image samples until the training detection model reaches the training endpoint, and use the trained detection model as the preset edge detection model.
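Steps 202 to 204 form a loop: compute the loss, test it against the threshold, and adjust parameters if the endpoint has not been reached. The sketch below shows that control flow with a toy one-parameter model fitted by gradient descent. The toy model, the learning rate, and the `max_steps` safety cap are assumptions made for illustration and stand in for the deep neural network and optimizer the patent leaves unspecified.

```python
def train_until_endpoint(xs, ys, threshold=0.1, lr=0.05, max_steps=1000):
    """Adjust a single weight w until the MSE of w*x against y drops below threshold."""
    w = 0.0
    loss = float("inf")
    for step in range(max_steps):
        errors = [w * x - y for x, y in zip(xs, ys)]
        loss = sum(e * e for e in errors) / len(errors)
        if loss < threshold:          # training endpoint reached
            return w, loss, step
        # Endpoint not reached: adjust the parameter along the negative gradient.
        grad = 2 * sum(e * x for e, x in zip(errors, xs)) / len(xs)
        w -= lr * grad
    return w, loss, max_steps         # stopped by the safety cap

w, final_loss, steps = train_until_endpoint([1.0, 2.0, 3.0], [2.0, 4.0, 6.0])
```

The cap on iterations is a practical guard that the text does not mention. Without it, a threshold set too low would make the loop run forever.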

[0103] In this embodiment, the edge detection device performs comprehensive feature extraction on document image samples to obtain vertex features and line segment features. The initial detection model is trained using vertex features and line segment features, resulting in a more accurate edge detection model.

[0104] Referring to Figure 3, Figure 3 is a schematic flowchart of an embodiment of document image sample preprocessing in the edge detection method for document images provided in this application.

[0105] In some embodiments of this application, the preprocessing of document image samples by the edge detection device includes the following steps 301 to 303:

[0106] 301. Filter the document image sample to obtain a denoised document image sample.

[0107] The edge detection device performs Gaussian filtering on the labeled document image samples to obtain denoised document image samples.

[0108] 302. Perform grayscale processing and edge extraction on the denoised document image sample to obtain a grayscale document image sample with edge information.

[0109] The edge detection device performs grayscale processing on the denoised document image samples. That is, since the pixel values of the document edges in the document image samples are relatively uniform, the edge detection device performs grayscale processing on the document image samples to obtain grayscale document image samples. The grayscale document image samples effectively preserve the document edge information, making edge feature extraction more accurate.

[0110] 303. The denoised document image sample and the grayscale document image sample are fused together to obtain the preprocessed document image sample.

[0111] The edge detection device fuses the denoised document image sample with the grayscale document image sample to obtain the preprocessed document image sample. That is, the edge detection device fuses the denoised three-channel document image sample with the grayscale single-channel document image sample to obtain a four-channel document image sample as the preprocessed document image sample.

[0112] For example, in this embodiment, Gaussian filtering is first applied to the initial document image sample, because edge detection algorithms are easily affected by noise in the image and may then fail to extract edge information accurately. The smoothed image is then converted into a single-channel grayscale image. If the edges of a color image were detected directly, the edges in the R, G, and B channels would need to be calculated separately; however, the gradient directions of the three primary colors at the same point may differ, which yields different edges and inaccurate edge information. Converting the image into a single-channel grayscale image avoids this problem. Next, an edge detection algorithm is used to extract the edge information of the grayscale image. Finally, the denoised document image sample and the grayscale document image sample with edge information are fused to form the preprocessed document image sample.
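Steps 301 to 303 can be sketched in plain Python as follows. The text does not name the kernel size, the grayscale weights, or the edge detection algorithm, so this sketch assumes a 3x3 Gaussian kernel, the common 0.299/0.587/0.114 luma weights, and a Sobel gradient magnitude as the edge extractor. The names `filter3x3` and `preprocess` are hypothetical.

```python
from typing import List, Sequence, Tuple

RGB = Tuple[float, float, float]
Plane = List[List[float]]

GAUSS = [[1, 2, 1], [2, 4, 2], [1, 2, 1]]        # 3x3 kernel, weights sum to 16
SOBEL_X = [[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]]   # horizontal gradient
SOBEL_Y = [[-1, -2, -1], [0, 0, 0], [1, 2, 1]]   # vertical gradient


def filter3x3(plane: Plane, kernel: Sequence[Sequence[float]],
              scale: float = 1.0) -> Plane:
    """Apply a 3x3 kernel to one channel, clamping coordinates at the border."""
    h, w = len(plane), len(plane[0])
    out = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            acc = 0.0
            for dy in (-1, 0, 1):
                for dx in (-1, 0, 1):
                    yy = min(max(y + dy, 0), h - 1)
                    xx = min(max(x + dx, 0), w - 1)
                    acc += plane[yy][xx] * kernel[dy + 1][dx + 1]
            out[y][x] = acc * scale
    return out


def preprocess(image: List[List[RGB]]) -> List[List[Tuple[float, ...]]]:
    """Return an H x W image whose pixels are (r, g, b, edge) tuples."""
    h, w = len(image), len(image[0])
    # Step 301: Gaussian filtering of each color channel.
    r, g, b = (
        filter3x3([[px[c] for px in row] for row in image], GAUSS, 1 / 16)
        for c in range(3)
    )
    # Step 302: grayscale conversion, then edge extraction on the gray image.
    gray = [[0.299 * r[y][x] + 0.587 * g[y][x] + 0.114 * b[y][x]
             for x in range(w)] for y in range(h)]
    gx, gy = filter3x3(gray, SOBEL_X), filter3x3(gray, SOBEL_Y)
    edge = [[(gx[y][x] ** 2 + gy[y][x] ** 2) ** 0.5 for x in range(w)]
            for y in range(h)]
    # Step 303: fuse the three denoised channels with the single edge channel.
    return [[(r[y][x], g[y][x], b[y][x], edge[y][x]) for x in range(w)]
            for y in range(h)]


# A 4 x 6 test image: black on the left half, white on the right half.
step_image = [[(0.0,) * 3] * 3 + [(255.0,) * 3] * 3 for _ in range(4)]
four_channel = preprocess(step_image)
print(len(four_channel), len(four_channel[0]), len(four_channel[0][0]))
```

In the example, the fourth channel is near zero in the flat left part of the image and large at the black-to-white transition, which is the edge cue the four-channel input is meant to carry.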

[0113] In this embodiment, the document image samples are preprocessed so that the model trained on them is more accurate. Because the four vertices of the document lie on the document edges in the document image sample, the preprocessing of the input document image sample differs from other schemes, which input RGB color document image samples or single-channel grayscale images. In this embodiment, a grayscale layer carrying edge information is added to the RGB color image, so that the input is a four-channel document image sample containing edge information, which guides the convolutional layers to extract edge features.

[0114] Referring to Figure 4, Figure 4 is a schematic flowchart of an embodiment of the edge detection method for document images provided in this application.

[0115] In some embodiments of this application, the edge detection device performs edge detection on a document image, including the following steps:

[0116] 401. Obtain the target document image to be detected.

[0117] The edge detection device receives an edge detection instruction. The manner in which the edge detection instruction is triggered is not specifically limited. The instruction can be triggered manually by the user, for example, by the user clicking an edge detection button for the document image on the edge detection device; or it can be triggered automatically by the edge detection device, for example, by configuring the edge detection device to trigger the edge detection instruction whenever a new document image is detected.

[0118] After receiving the edge detection instruction, the edge detection device acquires the target document image to be detected. The number and type of the target document image are not limited. For example, the target document image can be a medical document image, a waybill image, an insurance document image, or other billing documents.

[0119] 402. Extract feature points from the target document image to obtain the vertex information of the target document image.

[0120] The edge detection device extracts feature points from the target document image. The method of feature point extraction is not specifically limited: it can be performed through pixel value analysis or through a preset edge detection model. In the latter case, the edge detection device inputs the target document image into the preset edge detection model and extracts features from the target document image through the preset edge detection model to obtain the vertex information of the target document image. The vertex information is information that represents the position and pixel value of the document vertices.

[0121] 403. Fuse the vertex information to obtain vertex features and line segment features.

[0122] The edge detection device performs convolution processing on vertex information using an edge detection model to obtain vertex features and line segment features; specifically, step 403 includes:

[0123] (1) The vertex coordinates and vertex pixel values in the vertex information are integrated by the preset edge detection model to obtain vertex features;

[0124] (2) The vertex information is processed by the preset edge detection model to form a detection rectangle. The target document image is processed according to the midpoint of each side of the detection rectangle to obtain line segment features.

[0125] That is, the edge detection device integrates the vertex coordinates and vertex pixel values in the vertex information through a preset edge detection model to obtain vertex features; the edge detection device processes the vertex information through the preset edge detection model to form a detection rectangle; the edge detection device processes the target document image based on the midpoints of each side of the detection rectangle to obtain line segment features, specifically including:

[0126] a. Process the vertex information using the preset edge detection model to obtain the number of vertices, and generate a first detection rectangle based on the vertex information and the number of vertices;

[0127] b. Generate several detection boxes by fusing the vertex information through the preset edge detection model. Divide the detection boxes into four detection regions based on the midline connection of each detection box. Determine the second detection rectangle based on the corner points of the four detection regions corresponding to each detection box.

[0128] c. Merge the first detection rectangle and the second detection rectangle to obtain the line segment feature.

[0129] That is, the edge detection device can merge the detection rectangles that the preset edge detection model generates based on points and lines into a first detection rectangle. At the same time, the edge detection device directly generates a second detection rectangle based on points and lines. Referring to Figure 5, Figure 5 is a schematic diagram of a scenario for edge detection of a document image in the edge detection method provided in the embodiments of this application. The left side of Figure 5 shows the original image with bounding boxes, including the labeled data. The feature points on the right side indicate that the region is a text region. Assuming the width and height of the region are w and h respectively, there are w*h corresponding points, and each point corresponds to an offset in each of four directions. The edge detection device therefore determines w*h bounding boxes based on the number of feature points. The edge detection device calculates the confidence score of each bounding box and compares it with a preset confidence score, which can be set according to the specific scenario. For example, if the preset confidence score is 0.5, the edge detection device deletes the bounding boxes whose confidence score is lower than 0.5 and obtains the standard bounding boxes.
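The decoding of w*h bounding boxes from per-point offsets, followed by confidence filtering, can be sketched as below. The layout of the offsets is an assumption: each point (x, y) is taken to store its distances to the left, top, right, and bottom sides of the box it predicts. The names `Box`, `decode_boxes`, and `filter_by_confidence` are hypothetical.

```python
from dataclasses import dataclass
from typing import List, Sequence, Tuple

Offsets = Tuple[float, float, float, float]  # (to_left, to_top, to_right, to_bottom)


@dataclass(frozen=True)
class Box:
    left: float
    top: float
    right: float
    bottom: float
    score: float


def decode_boxes(offsets: Sequence[Sequence[Offsets]],
                 scores: Sequence[Sequence[float]]) -> List[Box]:
    """Turn an h x w grid of four-direction offsets into h*w candidate boxes."""
    boxes = []
    for y, row in enumerate(offsets):
        for x, (to_left, to_top, to_right, to_bottom) in enumerate(row):
            boxes.append(Box(x - to_left, y - to_top,
                             x + to_right, y + to_bottom, scores[y][x]))
    return boxes


def filter_by_confidence(boxes: Sequence[Box], preset: float = 0.5) -> List[Box]:
    """Delete boxes whose confidence is lower than the preset confidence."""
    return [box for box in boxes if box.score >= preset]


# A 2 x 2 region: every point predicts the same box (0, 0) to (2, 2).
offsets = [[(0, 0, 2, 2), (1, 0, 1, 2)],
           [(0, 1, 2, 1), (1, 1, 1, 1)]]
scores = [[0.9, 0.4],
          [0.6, 0.2]]
candidates = decode_boxes(offsets, scores)
standard = filter_by_confidence(candidates)
print(len(candidates), len(standard))
```

With a preset confidence of 0.5, the two boxes scored 0.4 and 0.2 are removed and the remaining two are kept as standard bounding boxes.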

[0130] The edge detection device divides each standard bounding box into four detection regions by connecting the midpoints of its sides. The edge detection device then determines the second detection rectangle based on the corner points of the four detection regions corresponding to each standard bounding box. That is, the edge detection device determines the top-left corner point of each standard bounding box and obtains the top-left corner point of the detection rectangle by averaging the points in the top-left region of the four detection regions. Likewise, the edge detection device obtains the top-right corner point of the final rectangle by averaging the points in the top-right region. The other two corner points are obtained in the same way, which finally yields the second detection rectangle.
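One possible reading of this region-wise averaging is sketched below. It assumes image coordinates with y increasing downward, uses the midlines of an initial rectangle to define the four regions, and assigns every corner of every standard bounding box to the region it falls in before averaging. If a region receives no corner, the sketch falls back to the corner of the initial rectangle, which is an added assumption. The function name `fuse_corners_by_region` is hypothetical.

```python
from typing import Dict, List, Sequence, Tuple

Point = Tuple[float, float]
Rect = Tuple[float, float, float, float]  # (left, top, right, bottom)


def fuse_corners_by_region(initial: Rect,
                           boxes: Sequence[Rect]) -> Dict[str, Point]:
    """Average the box corners that fall in each quadrant of the initial rectangle."""
    left, top, right, bottom = initial
    # The midlines of the initial rectangle split it into four regions.
    center_x, center_y = (left + right) / 2, (top + bottom) / 2
    buckets: Dict[str, List[Point]] = {
        "top_left": [], "top_right": [], "bottom_right": [], "bottom_left": [],
    }
    for box_left, box_top, box_right, box_bottom in boxes:
        corners = ((box_left, box_top), (box_right, box_top),
                   (box_right, box_bottom), (box_left, box_bottom))
        for x, y in corners:
            vertical = "top" if y < center_y else "bottom"
            horizontal = "left" if x < center_x else "right"
            buckets[f"{vertical}_{horizontal}"].append((x, y))
    fallback = {
        "top_left": (left, top), "top_right": (right, top),
        "bottom_right": (right, bottom), "bottom_left": (left, bottom),
    }
    fused = {}
    for name, points in buckets.items():
        if points:
            fused[name] = (sum(p[0] for p in points) / len(points),
                           sum(p[1] for p in points) / len(points))
        else:
            fused[name] = fallback[name]  # no candidate corner in this region
    return fused


second_rectangle = fuse_corners_by_region(
    initial=(0.0, 0.0, 10.0, 10.0),
    boxes=[(0.0, 0.0, 10.0, 10.0), (2.0, 2.0, 12.0, 12.0)],
)
print(second_rectangle)
```

Averaging within a region means that each fused corner is influenced only by candidate corners near it, rather than by every box in the image.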

[0131] 404. Regression processing is performed on the vertex features and the line segment features to obtain the edge detection results.

[0132] The edge detection device fuses vertex features and line segment features using a preset edge detection model to obtain the edge detection result. In this embodiment, by fusing vertex features and line segment features, the edge recognition is more accurate.

[0133] For ease of understanding, this embodiment provides an example, specifically:

[0134] In this embodiment, the final target detection box is obtained based on the vertex features and the fused small detection boxes, by selecting the locally optimal detection box. To improve the accuracy of the four points of the detection box, this scheme uses an improved approach. First, the four initial detection vertices are found. Then, based on the midpoints of the sides of the rectangle formed by connecting the four vertices, the initial box is divided into four regions, each corresponding to the vertex in one direction. The detection boxes obtained within a region are fused to obtain the vertex of that region. This avoids the situation in which the four vertices obtained by global fusion are not accurate enough in regions far from the vertices, and thus improves the detection accuracy of the four vertices after fusion. Next, the vertex features and line segment features are determined from the vertex information of the four vertices. Further, the four vertices can be obtained separately from the feature map regressed from the vertex features and from the feature map regressed from the line segment features. Part of the target document may lie outside the image, in which case the feature map obtained by vertex feature regression contains fewer than four vertices. The missing vertices can be supplemented by extending the line segments in the line segment features and finding their intersection points. Finally, the vertices in the four directions obtained by the different methods are combined by mean fusion to obtain the final coordinates of the four vertices.

[0135] In this embodiment, the target document image is processed by a preset edge detection model to obtain vertex information. The preset edge detection model obtains vertex features and line segment features based on the vertex information, and determines the edge information of the target document image based on the vertex features and line segment features. Adding regression of vertices and line segments improves the detection accuracy of target edges in the target document image and enables document edge detection in complex scenarios. This ensures the universality of the preset edge detection model for document edge detection and improves the accuracy of edge recognition.
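The last two operations of this example, recovering a missing vertex by extending line segments and then mean-fusing the vertex estimates, can be sketched as follows. The sketch assumes each line segment is given by two endpoints and that a missing estimate is represented by `None`; the names `line_intersection` and `mean_fuse` are hypothetical.

```python
from typing import Optional, Sequence, Tuple

Point = Tuple[float, float]
Segment = Tuple[Point, Point]


def line_intersection(a: Segment, b: Segment) -> Optional[Point]:
    """Intersect the infinite lines through two segments; None if parallel."""
    (x1, y1), (x2, y2) = a
    (x3, y3), (x4, y4) = b
    denominator = (x1 - x2) * (y3 - y4) - (y1 - y2) * (x3 - x4)
    if abs(denominator) < 1e-12:
        return None
    det_a = x1 * y2 - y1 * x2
    det_b = x3 * y4 - y3 * x4
    x = (det_a * (x3 - x4) - (x1 - x2) * det_b) / denominator
    y = (det_a * (y3 - y4) - (y1 - y2) * det_b) / denominator
    return (x, y)


def mean_fuse(estimates: Sequence[Optional[Point]]) -> Optional[Point]:
    """Average the available estimates of one vertex, skipping missing ones."""
    points = [p for p in estimates if p is not None]
    if not points:
        return None
    return (sum(p[0] for p in points) / len(points),
            sum(p[1] for p in points) / len(points))


# The top-right vertex lies outside the image, so vertex regression misses it.
top_edge: Segment = ((0.0, 0.0), (5.0, 0.0))
right_edge: Segment = ((10.0, 3.0), (10.0, 8.0))
from_vertex_regression = None
from_line_segments = line_intersection(top_edge, right_edge)
top_right = mean_fuse([from_vertex_regression, from_line_segments])
print(top_right)
```

When both methods produce an estimate for a vertex, `mean_fuse` returns their average; when only the line segments do, the intersection point is used as is.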

[0136] Referring to Figure 6, Figure 6 is a schematic flowchart of an embodiment of document content recognition in a target document image in the edge detection method for document images provided in this application.

[0137] In this embodiment, the edge detection method for document images further includes recognizing the document content in the target document image, specifically including:

[0138] 501. Based on the edge detection results, the target document image is divided into regions to obtain the document internal region of the target document image.

[0139] The edge detection device receives the document content recognition instruction, divides the target document image into regions based on the edge detection results, and obtains the document's internal region and document's external region.
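The division into an internal and an external region can be sketched as a point-in-quadrilateral test. This sketch assumes the edge detection result is a convex quadrilateral given by its four vertices in order, and it treats pixels on the boundary as internal; neither detail is stated in the text. The names `inside_quad` and `split_regions` are hypothetical.

```python
from typing import List, Sequence, Tuple

Point = Tuple[float, float]


def inside_quad(point: Point, quad: Sequence[Point]) -> bool:
    """Return True if the point lies inside or on a convex quadrilateral.

    The vertices must be listed in order, clockwise or counter-clockwise.
    """
    crosses = []
    for i in range(4):
        (x1, y1), (x2, y2) = quad[i], quad[(i + 1) % 4]
        # Sign of the cross product tells which side of the edge the point is on.
        crosses.append((x2 - x1) * (point[1] - y1) - (y2 - y1) * (point[0] - x1))
    return all(c >= 0 for c in crosses) or all(c <= 0 for c in crosses)


def split_regions(width: int, height: int,
                  quad: Sequence[Point]) -> Tuple[List[Point], List[Point]]:
    """Split all pixel coordinates into (internal, external) lists."""
    internal, external = [], []
    for y in range(height):
        for x in range(width):
            (internal if inside_quad((x, y), quad) else external).append((x, y))
    return internal, external


document_quad = [(1, 1), (3, 1), (3, 3), (1, 3)]
internal, external = split_regions(5, 5, document_quad)
print(len(internal), len(external))
```

Only the pixels in `internal` would then be passed to the character recognition step.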

[0140] 502. The characters in the internal area of the document are identified by a preset character recognition model to obtain the document content information.

[0141] The edge detection device identifies the characters within the internal area of the document using a preset character recognition model. The preset character recognition model refers to a pre-trained character recognition algorithm, such as Optical Character Recognition (OCR), a process in which an electronic device examines the characters in an image, determines their shapes by detecting dark and light patterns, and then translates the shapes into computer text. The document content information includes the content of the document, such as the mailing address, recipient information, and weight of the mailed item.

[0142] 503. Extract the document identifier from the document content information and associate and save the document identifier with the document content information.

[0143] The edge detection device extracts the document identifier from the document content information. The document identifier is a unique identifier for the document, such as the document number. The edge detection device associates and saves the document identifier with the document content information for easy retrieval later.
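Step 503 can be illustrated with a small sketch that extracts an identifier from recognized text and stores the text under it. The identifier format used here, two capital letters followed by twelve digits, is purely hypothetical, as are the in-memory dictionary used as storage and the names `extract_document_id` and `save_document`.

```python
import re
from typing import Dict, Optional

# Hypothetical identifier format: two capital letters followed by 12 digits.
DOCUMENT_ID_PATTERN = re.compile(r"\b[A-Z]{2}\d{12}\b")


def extract_document_id(content: str) -> Optional[str]:
    """Return the first identifier found in the recognized text, if any."""
    match = DOCUMENT_ID_PATTERN.search(content)
    return match.group(0) if match else None


def save_document(store: Dict[str, str], content: str) -> Optional[str]:
    """Associate the content with its identifier and save it in the store."""
    document_id = extract_document_id(content)
    if document_id is not None:
        store[document_id] = content
    return document_id


store: Dict[str, str] = {}
recognized = "Waybill No: AB123456789012 Recipient: Zhang San Weight: 1.5 kg"
saved_id = save_document(store, recognized)
print(saved_id, saved_id in store)
```

Keying the store by the document identifier is what makes later retrieval of the content a direct lookup.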

[0144] In this embodiment, after identifying the edge information of the document, the edge detection device divides the target document image into regions according to the edge detection results to obtain the document's internal region. Then, it recognizes the characters in the document's internal region to obtain the document content information and saves the document content information for easy retrieval later.

[0145] To better implement the edge detection method for document images in the embodiments of this application, an edge detection device for document images is also provided in the embodiments of this application, based on the edge detection method for document images. As shown in Figure 7, which is a schematic diagram of an embodiment of the document image edge detection device, the document image edge detection device includes:

[0146] The acquisition module 601 is used to acquire the image of the target document to be detected;

[0147] The input module 602 is used to extract feature points from the target document image to obtain the vertex information of the target document image;

[0148] The processing module 603 is used to fuse the vertex information to obtain vertex features and line segment features;

[0149] The detection module 604 is used to perform regression processing on the vertex features and the line segment features to obtain edge detection results.
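The chaining of modules 601 to 604 can be sketched as a simple pipeline. This is a structural illustration only: each module is modeled as a plain callable, the class name `EdgeDetectionDevice` and its field names are hypothetical, and the stub functions in the example stand in for the real processing.

```python
from dataclasses import dataclass
from typing import Any, Callable


@dataclass
class EdgeDetectionDevice:
    """Chains the four modules in the order given in the text."""
    acquire: Callable[[Any], Any]            # acquisition module 601
    extract_vertices: Callable[[Any], Any]   # input module 602
    fuse_features: Callable[[Any], Any]      # processing module 603
    regress_edges: Callable[[Any], Any]      # detection module 604

    def detect(self, source: Any) -> Any:
        image = self.acquire(source)
        vertex_info = self.extract_vertices(image)
        features = self.fuse_features(vertex_info)
        return self.regress_edges(features)


# Stub modules that only pass fixed data along, to show the call order.
device = EdgeDetectionDevice(
    acquire=lambda path: {"path": path},
    extract_vertices=lambda image: [(0, 0), (4, 0), (4, 3), (0, 3)],
    fuse_features=lambda vertices: {"vertices": vertices, "segments": len(vertices)},
    regress_edges=lambda features: features["vertices"],
)
result = device.detect("waybill.png")
print(result)
```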

[0150] In some embodiments of this application, the edge detection device for the document image includes:

[0151] Acquire a large number of document image samples and label the vertex and edge information of the document image samples;

[0152] Preprocess the labeled document image samples;

[0153] Obtain document image samples, and extract features from the preprocessed document image samples to obtain vertex features and line segment features;

[0154] The initial detection model is trained using the vertex features and the line segment features to obtain feature point loss values and feature line segment loss values;

[0155] Based on the feature point loss value and the feature line segment loss value, determine whether the obtained training detection model has reached the training endpoint;

[0156] If the training detection model does not reach the training endpoint, the parameters of the training detection model are adjusted until a preset edge detection model is obtained.

[0157] In some embodiments of this application, when preprocessing the labeled document image samples, the edge detection device for the document image performs the following:

[0158] The document image samples are filtered to obtain denoised document image samples;

[0159] The denoised document image sample is subjected to grayscale processing and edge extraction to obtain a grayscale document image sample with edge information.

[0160] The denoised document image sample and the grayscale document image sample are fused together to obtain the preprocessed document image sample.

[0161] In some embodiments of this application, when acquiring document image samples and extracting features from the preprocessed document image samples to obtain vertex features and line segment features, the edge detection device for the document image performs the following:

[0162] Feature extraction is performed on the preprocessed document image sample using a preset initial detection model to obtain labeled vertex information;

[0163] Obtain document image samples, and integrate the vertex coordinates and vertex pixel values in the vertex information through the initial detection model to obtain vertex features;

[0164] The vertex information is regressed using the initial detection model to obtain line segment features.

[0165] In some embodiments of this application, when determining whether the obtained training detection model has reached the training endpoint based on the feature point loss value and the feature line segment loss value, the edge detection device for the document image performs the following:

[0166] The mean squared error loss value is obtained by statistically analyzing the feature point loss value and the feature line segment loss value of each document image sample.

[0167] The mean squared error loss value is compared with a preset loss threshold.

[0168] If the mean squared error loss value is less than the preset loss threshold, then the obtained training detection model is determined to have reached the training endpoint.

[0169] If the mean squared error loss value is greater than or equal to the preset loss threshold, it is determined that the obtained training detection model has not reached the training endpoint.

[0170] In some embodiments of this application, the processing module 603 in the edge detection device for the document image includes:

[0171] The vertex features are obtained by integrating the vertex coordinates and vertex pixel values in the vertex information using the preset edge detection model.

[0172] The vertex information is processed by the preset edge detection model to form a detection rectangle. The target document image is then processed based on the midpoints of each side of the detection rectangle to obtain line segment features.

[0173] In some embodiments of this application, the processing module 603 in the edge detection device for the document image performs the following steps: processing the vertex information through the preset edge detection model to form a detection rectangle; processing the target document image based on the midpoints of each side of the detection rectangle to obtain line segment features, including:

[0174] The vertex information is processed by the preset edge detection model to obtain the number of vertices, and a first detection rectangle is generated based on the vertex information and the number of vertices.

[0175] The vertex information is fused with the preset edge detection model to generate several detection boxes. Based on the midline connection of each detection box, four detection regions are formed corresponding to each detection box. Based on the corner points of the four detection regions corresponding to each detection box, a second detection rectangle is determined.

[0176] By fusing the first detection rectangle and the second detection rectangle, line segment features are obtained.

[0177] In some embodiments of this application, the edge detection device for the document image further includes:

[0178] The target document image is divided into regions based on the edge detection results to obtain the document internal region of the target document image;

[0179] The document content information is obtained by recognizing characters in the internal area of the document using a preset character recognition model.

[0180] Extract the document identifier from the document content information, and associate and save the document identifier with the document content information.

[0181] In this embodiment of the invention, the edge detection device for document images processes the target document image to obtain vertex information, obtains vertex features and line segment features based on the vertex information, determines the edge information of the target document image based on the vertex features and line segment features, and improves the detection accuracy of target edges in the target document image by processing vertex features and line segment features to regress vertices and line segments. This enables document edge detection in document images in complex scenarios, ensures the universality of the preset edge detection model for document edge detection, and improves the accuracy of edge recognition.

[0182] This invention also provides an edge detection device for document images. As shown in Figure 8, Figure 8 is a schematic diagram of the edge detection device for document images according to an embodiment of the present invention.

[0183] The document image edge detection device integrates any of the document image edge detection devices provided in the embodiments of the present invention, wherein the document image edge detection device includes:

[0184] One or more processors;

[0185] Memory; and

[0186] One or more applications, wherein the one or more applications are stored in the memory and configured to be executed by the processor to implement the steps of the document image edge detection method described in any of the above embodiments.

[0187] Specifically, the edge detection device for document images may include components such as a processor 701 with one or more processing cores, a memory 702 with one or more computer storage media, a power supply 703, and an input unit 704. Those skilled in the art will understand that the structure of the edge detection device for document images shown in Figure 8 does not constitute a limitation on the edge detection device for document images, which may include more or fewer components than shown, combine certain components, or have a different arrangement of components. Wherein:

[0188] The processor 701 is the control center of the edge detection device for the document image. It connects various parts of the edge detection device via various interfaces and lines. By running or executing software programs and / or modules stored in the memory 702, and by calling data stored in the memory 702, it performs various functions and processes data of the edge detection device, thereby providing overall monitoring of the edge detection device. Optionally, the processor 701 may include one or more processing cores; preferably, the processor 701 may integrate an application processor and a modem processor, wherein the application processor mainly handles the operating system, user interface, and applications, and the modem processor mainly handles wireless communication. It is understood that the modem processor may not be integrated into the processor 701.

[0189] The memory 702 can be used to store software programs and modules. The processor 701 executes various functional applications and data processing by running the software programs and modules stored in the memory 702. The memory 702 may mainly include a program storage area and a data storage area. The program storage area may store the operating system, application programs required for at least one function (such as sound playback function, image playback function, etc.), etc.; the data storage area may store data created based on the use of the edge detection device for the document image, etc. In addition, the memory 702 may include high-speed random access memory, and may also include non-volatile memory, such as at least one disk storage device, flash memory device, or other volatile solid-state storage device. Accordingly, the memory 702 may also include a memory controller to provide the processor 701 with access to the memory 702.

[0190] The edge detection device for document images also includes a power supply 703 that supplies power to the various components. Preferably, the power supply 703 can be logically connected to the processor 701 through a power management system, thereby enabling functions such as charging, discharging, and power consumption management through the power management system. The power supply 703 may also include one or more DC or AC power supplies, recharging systems, power fault detection circuits, power converters or inverters, power status indicators, and other arbitrary components.

[0191] The edge detection device for the document image may also include an input unit 704, which can be used to receive input digital or character information, and generate keyboard, mouse, joystick, optical or trackball signal inputs related to user settings and function control.

[0192] Although not shown, the edge detection device for document images may also include a display unit, etc., which will not be described in detail here. Specifically, in this embodiment, the processor 701 in the edge detection device for document images loads the executable files corresponding to the processes of one or more applications into the memory 702 according to the following instructions, and the processor 701 runs the applications stored in the memory 702 to realize various functions, as follows:

[0193] Acquire the image of the target document to be inspected;

[0194] Feature points are extracted from the target document image to obtain the vertex information of the target document image;

[0195] The vertex information is fused to obtain vertex features and line segment features;

[0196] Regression processing is performed on the vertex features and the line segment features to obtain the edge detection results.

[0197] Those skilled in the art will understand that all or part of the steps in the various methods of the above embodiments can be implemented by instructions, or by instructions controlling related hardware. These instructions can be stored in a computer storage medium and loaded and executed by a processor.

[0198] Therefore, embodiments of the present invention provide a computer storage medium, which may include: read-only memory (ROM), random access memory (RAM), a disk, or an optical disk, etc. A computer program is stored thereon, which is loaded by a processor to execute the steps in any of the edge detection methods for document images provided in the embodiments of the present invention. For example, the computer program loaded by the processor can execute the following steps:

[0199] Acquire the image of the target document to be inspected;

[0200] Feature points are extracted from the target document image to obtain the vertex information of the target document image;

[0201] The vertex information is fused to obtain vertex features and line segment features;

[0202] Regression processing is performed on the vertex features and the line segment features to obtain the edge detection results.

[0203] In the above embodiments, the descriptions of each embodiment have different focuses. For parts not described in detail in a certain embodiment, please refer to the detailed descriptions of other embodiments above, which will not be repeated here.

[0204] In practice, each of the above units or structures can be implemented as an independent entity or can be arbitrarily combined to be implemented as the same or several entities. For the specific implementation of each of the above units or structures, please refer to the previous method embodiments, which will not be repeated here.

[0205] For details on the implementation of each of the above operations, please refer to the previous examples, which will not be repeated here.

[0206] The foregoing has provided a detailed description of an edge detection method for document images provided in the embodiments of this application. Specific examples have been used to illustrate the principles and implementation methods of the present invention. The descriptions of the above embodiments are only for the purpose of helping to understand the method and core ideas of the present invention. At the same time, for those skilled in the art, there will be changes in the specific implementation methods and application scope based on the ideas of the present invention. Therefore, the content of this specification should not be construed as a limitation of the present invention.

Claims

1. An edge detection method for a document image, characterized in that the edge detection method for the document image includes: acquiring a target document image to be detected; extracting feature points from the target document image to obtain the vertex information of the target document image; fusing the vertex information to obtain vertex features and line segment features; and performing regression processing on the vertex features and the line segment features to obtain an edge detection result; wherein fusing the vertex information to obtain vertex features and line segment features includes: integrating the vertex coordinates and vertex pixel values in the vertex information through a preset edge detection model to obtain the vertex features; and processing the vertex information through the preset edge detection model to form a detection rectangle, and processing the target document image based on the midpoints of each side of the detection rectangle to obtain the line segment features; and wherein processing the vertex information through the preset edge detection model to form a detection rectangle, and processing the target document image based on the midpoints of each side of the detection rectangle to obtain the line segment features, includes: processing the vertex information through the preset edge detection model to obtain the number of vertices, and generating a first detection rectangle based on the vertex information and the number of vertices; fusing the vertex information through the preset edge detection model to generate several detection boxes, forming four detection regions corresponding to each detection box based on the midline connection of each detection box, and determining a second detection rectangle based on the corner points of the four detection regions corresponding to each detection box; and fusing the first detection rectangle and the second detection rectangle to obtain the line segment features.

2. The edge detection method for document images according to claim 1, characterized in that the method is applied to a preset edge detection model, and, before extracting feature points from the target document image to obtain the vertex information of the target document image, the method comprises: obtaining document image samples, and extracting features from the preprocessed document image samples to obtain vertex features and line segment features; training a preset initial detection model, which is a preset edge recognition algorithm, with the vertex features and the line segment features to obtain feature point loss values and feature line segment loss values; determining, based on the feature point loss values and the feature line segment loss values, whether the resulting training detection model has reached the training endpoint; and, if the training detection model has not reached the training endpoint, adjusting the parameters of the training detection model until the preset edge detection model is obtained.

3. The edge detection method for document images according to claim 2, characterized in that, after obtaining the document image samples, the method comprises: filtering the document image samples to obtain denoised document image samples; performing grayscale processing and edge extraction on the denoised document image samples to obtain grayscale document image samples carrying edge information; and fusing the denoised document image samples with the grayscale document image samples to obtain the preprocessed document image samples.
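The preprocessing chain in claim 3 (filter, grayscale, edge extraction, fusion) can be sketched as follows. The claim does not name a filter, an edge operator, or a fusion rule, so every concrete choice here is an assumption made for illustration: a 3x3 mean filter, ITU-R BT.601 luma weights, Sobel gradient magnitude, and fusion by appending the edge map as a fourth channel. All function names are hypothetical.

```python
import math
from typing import List, Tuple

Pixel = Tuple[float, ...]
Image = List[List[Pixel]]
Gray = List[List[float]]


def mean_filter(img: Image) -> Image:
    """Denoise with a 3x3 mean filter; border pixels average over in-bounds neighbours."""
    h, w = len(img), len(img[0])
    out: Image = []
    for y in range(h):
        row = []
        for x in range(w):
            neigh = [img[j][i]
                     for j in range(max(0, y - 1), min(h, y + 2))
                     for i in range(max(0, x - 1), min(w, x + 2))]
            row.append(tuple(sum(p[c] for p in neigh) / len(neigh) for c in range(3)))
        out.append(row)
    return out


def to_gray(img: Image) -> Gray:
    """Convert RGB to grayscale using the ITU-R BT.601 luma weights."""
    return [[0.299 * p[0] + 0.587 * p[1] + 0.114 * p[2] for p in row] for row in img]


def sobel_magnitude(gray: Gray) -> Gray:
    """Gradient magnitude from the 3x3 Sobel kernels, replicating border pixels."""
    h, w = len(gray), len(gray[0])

    def at(y: int, x: int) -> float:
        return gray[min(max(y, 0), h - 1)][min(max(x, 0), w - 1)]

    out = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            gx = ((at(y - 1, x + 1) + 2 * at(y, x + 1) + at(y + 1, x + 1))
                  - (at(y - 1, x - 1) + 2 * at(y, x - 1) + at(y + 1, x - 1)))
            gy = ((at(y + 1, x - 1) + 2 * at(y + 1, x) + at(y + 1, x + 1))
                  - (at(y - 1, x - 1) + 2 * at(y - 1, x) + at(y - 1, x + 1)))
            out[y][x] = math.hypot(gx, gy)
    return out


def preprocess(img: Image) -> Image:
    """Fuse the denoised RGB image with its edge map as a fourth channel (assumed fusion rule)."""
    denoised = mean_filter(img)
    edges = sobel_magnitude(to_gray(denoised))
    return [[denoised[y][x] + (edges[y][x],) for x in range(len(img[0]))]
            for y in range(len(img))]
```

Appending the edge map as an extra channel keeps the colour information and the edge information aligned pixel by pixel. Other fusion rules, such as a weighted sum, would fit the claim language equally well.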

4. The edge detection method for document images according to claim 2, characterized in that obtaining document image samples and extracting features from the preprocessed document image samples to obtain vertex features and line segment features comprises: obtaining document image samples, and extracting features from the preprocessed document image samples by the preset initial detection model to obtain labeled vertex information; integrating, by the initial detection model, the vertex coordinates and vertex pixel values in the vertex information to obtain the vertex features; and performing regression on the vertex information by the initial detection model to obtain the line segment features.

5. The edge detection method for document images according to claim 2, characterized in that determining, based on the feature point loss values and the feature line segment loss values, whether the resulting training detection model has reached the training endpoint comprises: computing a mean squared error loss value from the feature point loss value and the feature line segment loss value of each document image sample; comparing the mean squared error loss value with a preset loss threshold; if the mean squared error loss value is less than the preset loss threshold, determining that the training detection model has reached the training endpoint; and if the mean squared error loss value is greater than or equal to the preset loss threshold, determining that the training detection model has not reached the training endpoint.
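The stopping test in claim 5 can be sketched as follows. The claim does not spell out how the per-sample loss values are combined into a mean squared error, so this sketch assumes one plausible reading: all feature point and feature line segment loss values are pooled, squared, and averaged. The function names and the pooling rule are assumptions. Only the strict "less than the threshold" comparison is taken from the claim.

```python
from typing import Sequence


def mse_loss(point_losses: Sequence[float], segment_losses: Sequence[float]) -> float:
    """Mean of the squared loss values pooled over all samples (assumed aggregation rule)."""
    values = list(point_losses) + list(segment_losses)
    if not values:
        raise ValueError("at least one loss value is required")
    return sum(v * v for v in values) / len(values)


def reached_training_endpoint(point_losses: Sequence[float],
                              segment_losses: Sequence[float],
                              loss_threshold: float) -> bool:
    """True only when the aggregated loss is strictly below the preset threshold."""
    return mse_loss(point_losses, segment_losses) < loss_threshold
```

Because the comparison is strict, a loss exactly equal to the threshold counts as not having reached the training endpoint, which matches the "greater than or equal to" branch of the claim.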

6. The edge detection method for document images according to any one of claims 1 to 5, characterized in that, after performing regression processing on the vertex features and the line segment features to obtain the edge detection result, the method comprises: dividing the target document image into regions based on the edge detection result to obtain the internal document region of the target document image; recognizing characters in the internal document region with a preset character recognition model to obtain document content information; and extracting a document identifier from the document content information, and associating and saving the document identifier with the document content information.
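The post-detection flow in claim 6 (crop the document interior, recognize its text, extract an identifier, store the pair) can be sketched as follows. Several parts are assumptions made for illustration: the interior is approximated by the bounding box of the detected vertices, the character recognition model is passed in as a hypothetical callable `recognize`, the identifier format (two capital letters followed by at least eight digits) is invented, and a plain dictionary stands in for storage.

```python
import re
from typing import Callable, Dict, List, Optional, Sequence, Tuple

Point = Tuple[int, int]
Image = List[List[int]]


def crop_interior(image: Image, vertices: Sequence[Point]) -> Image:
    """Crop the bounding box of the detected vertices, clamped to the image bounds."""
    xs = [x for x, _ in vertices]
    ys = [y for _, y in vertices]
    x0, x1 = max(0, min(xs)), min(len(image[0]), max(xs) + 1)
    y0, y1 = max(0, min(ys)), min(len(image), max(ys) + 1)
    return [row[x0:x1] for row in image[y0:y1]]


def extract_identifier(text: str, pattern: str = r"\b[A-Z]{2}\d{8,}\b") -> Optional[str]:
    """Return the first substring matching the (hypothetical) identifier pattern."""
    match = re.search(pattern, text)
    return match.group(0) if match else None


def archive_document(image: Image,
                     vertices: Sequence[Point],
                     recognize: Callable[[Image], str],
                     store: Dict[str, str]) -> Optional[str]:
    """Recognize the interior region and save its content under the extracted identifier."""
    content = recognize(crop_interior(image, vertices))
    identifier = extract_identifier(content)
    if identifier is not None:
        store[identifier] = content  # associate identifier with content
    return identifier
```

Passing the recognizer as a callable keeps the sketch independent of any particular character recognition library. If no identifier is found, nothing is stored and `None` is returned.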

7. An edge detection device for a document image, characterized in that the device comprises: an acquisition module for acquiring a target document image to be detected; an input module for extracting feature points from the target document image to obtain vertex information of the target document image; a processing module for fusing the vertex information to obtain vertex features and line segment features; and a detection module for performing regression processing on the vertex features and the line segment features to obtain an edge detection result; wherein fusing the vertex information to obtain vertex features and line segment features comprises: integrating, by a preset edge detection model, the vertex coordinates and vertex pixel values in the vertex information to obtain the vertex features; and processing the vertex information by the preset edge detection model to form a detection rectangle, and processing the target document image based on the midpoints of the sides of the detection rectangle to obtain the line segment features; and wherein forming the detection rectangle and processing the target document image based on the midpoints of its sides to obtain the line segment features comprises: processing the vertex information by the preset edge detection model to obtain the number of vertices, and generating a first detection rectangle based on the vertex information and the number of vertices; fusing the vertex information by the preset edge detection model to generate several detection boxes; forming, based on the midline connections of each detection box, four detection regions corresponding to that detection box; determining a second detection rectangle based on the corner points of the four detection regions corresponding to each detection box; and fusing the first detection rectangle and the second detection rectangle to obtain the line segment features.

8. An edge detection device for document images, characterized in that the device comprises: one or more processors; a memory; and one or more applications, wherein the one or more applications are stored in the memory and configured to be executed by the one or more processors to implement the steps of the edge detection method for document images according to any one of claims 1 to 6.

9. A computer storage medium, characterized in that it stores a computer program which is loaded by a processor to perform the steps of the edge detection method for document images according to any one of claims 1 to 6.