Document processing method, device and system, electronic equipment and storage medium
A document processing technology, applied in the fields of electronic digital data processing, special data processing applications, and instruments, that addresses problems such as the difficulty of managing and storing paper documents and the inability to retrieve their information effectively, with the effect of improving accuracy.
Examples
Embodiment 1
[0036] This embodiment of the present application provides a document processing method which, as shown in Figure 1, includes:
[0037] S101: Acquiring an image of the first historical document;
[0038] For example, the first historical document may be any one of multiple historical documents that currently need to be stored. Each of these historical documents can be processed subsequently using the solution provided by the present application, so this embodiment does not describe them one by one.
[0039] In addition, the first historical document may be a book, in which case the image of the first historical document may consist of one or more images. If a book is to be archived electronically, every page of the book can be scanned, and the resulting page images together serve as the image of the first historical document. Since the same follow-up processing is adopted no matter whether ...
Embodiment 2
[0048] On the basis of Embodiment 1, as shown in Figure 2, after the image of the first historical document is acquired, the method may further include: S100: preprocessing the image of the first historical document to obtain a preprocessed image of the first historical document.
[0049] In this embodiment, the preprocessing of the image of the first historical document may include noise removal, image binarization, tilt correction, and the like. When the first historical document is scanned, factors such as the paper quality of the document itself and the illumination during scanning mean that the scanned image generally contains noise and defects. In addition, factors such as uneven paper edges, uneven placement of the paper, or poor skew-correction performance of the scanner can cause the scanned image to be skewed. These will reduce the accuracy of the next document imag...
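The noise removal and binarization steps named in [0049] can be illustrated with a minimal sketch. The source does not say which algorithms are used, so the 3x3 median filter and Otsu thresholding below are assumptions chosen only because they are common choices for scanned pages; the function names are hypothetical, and tilt correction is omitted. The image is assumed to be a 2-D list of grey levels in the range 0 to 255.

```python
from statistics import median


def median_filter(img):
    """Assumed denoising step: 3x3 median filter; border pixels are copied unchanged."""
    h, w = len(img), len(img[0])
    out = [row[:] for row in img]
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            window = [img[y + dy][x + dx] for dy in (-1, 0, 1) for dx in (-1, 0, 1)]
            out[y][x] = int(median(window))
    return out


def otsu_threshold(img):
    """Assumed binarization rule: pick the grey level that maximizes between-class variance."""
    hist = [0] * 256
    for row in img:
        for v in row:
            hist[v] += 1
    total = sum(hist)
    sum_all = sum(level * count for level, count in enumerate(hist))
    best_t, best_var = 0, -1.0
    w0 = 0      # number of pixels with grey level <= t
    sum0 = 0    # sum of grey levels of those pixels
    for t in range(256):
        w0 += hist[t]
        sum0 += t * hist[t]
        if w0 == 0:
            continue
        w1 = total - w0
        if w1 == 0:
            break
        m0 = sum0 / w0
        m1 = (sum_all - sum0) / w1
        var = w0 * w1 * (m0 - m1) ** 2
        if var > best_var:
            best_var, best_t = var, t
    return best_t


def binarize(img, t):
    """Map pixels brighter than t to 255 (paper) and the rest to 0 (ink)."""
    return [[255 if v > t else 0 for v in row] for row in img]


def preprocess(img):
    """Hypothetical S100 pipeline: denoise, then binarize. Tilt correction is not shown."""
    denoised = median_filter(img)
    return binarize(denoised, otsu_threshold(denoised))
```

Calling `preprocess(page)` on one scanned page returns a black-and-white image of the same size; for a multi-page book the same call would be applied to each page image in turn.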
Embodiment 3
[0061] As shown in Figure 3, on the basis of Figure 1 of Embodiment 1, S102 in Figure 1 may specifically include:
[0062] S1021: Perform region division on the image of the first historical document to obtain at least one type of region among a table region, a text region, and a picture region.
[0063] As mentioned in the foregoing embodiments, the first historical document may correspond to one or more images, and each image can be divided into regions to obtain at least one of a table region, a text region, and a picture region for that image. Because feature extraction differs between text regions, picture regions, and table regions, the regions need to be separated. Specifically, the regions may be divided as follows:
[0064] The picture region and the table region are detected using a first model.
[0065] Specifically, the first model may be an M2Det model. The M2Det model is based on MLFP...
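The reason given in [0063] for region division, that each region type needs its own feature extraction, can be sketched as a simple dispatch over typed regions. This is only an illustration under stated assumptions: the `Region` structure, the bounding-box layout, and the three placeholder extractor functions are hypothetical and do not come from the source, and the detector that produces the regions (the first model) is not implemented here.

```python
from dataclasses import dataclass
from enum import Enum


class RegionType(Enum):
    TABLE = "table"
    TEXT = "text"
    PICTURE = "picture"


@dataclass(frozen=True)
class Region:
    kind: RegionType
    box: tuple  # assumed layout: (x0, y0, x1, y1) in pixels


def group_by_type(regions):
    """Collect the regions of one page image by type; unused types map to empty lists."""
    groups = {kind: [] for kind in RegionType}
    for region in regions:
        groups[region.kind].append(region)
    return groups


# Placeholder extractors (hypothetical): a real system would run a different
# feature extraction method for each region type.
def extract_table(region):
    return ("table-features", region.box)


def extract_text(region):
    return ("text-features", region.box)


def extract_picture(region):
    return ("picture-features", region.box)


EXTRACTORS = {
    RegionType.TABLE: extract_table,
    RegionType.TEXT: extract_text,
    RegionType.PICTURE: extract_picture,
}


def extract_features(regions):
    """Route every region to the extractor registered for its type."""
    return [EXTRACTORS[region.kind](region) for region in regions]
```

Keeping the extractors in a lookup table keyed by region type means that a page containing only some of the three region types is handled without special cases.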