Document recognition method and device and storage medium
A document recognition technology based on a recognition model, applied in the field of text recognition, which addresses problems such as the inability to recognize specially shaped Excel tables
Examples
Embodiment 1
[0059] Figure 1A is a flow chart of a document recognition method provided by Embodiment 1 of the present invention. This embodiment is suitable for recognizing information in non-editable documents (such as pictures or documents in PDF format), especially irregular forms and the text belonging to those forms. The method can be performed by a document recognition device, which can be implemented in software and/or hardware and configured in an electronic device with data processing capabilities, such as a mobile phone, a tablet computer, or a wearable device (such as smart glasses or a smart watch). The electronic device is equipped with a screen and a central processing unit (CPU).
[0060] Referring to Figure 1A, the method specifically includes:
[0061] S101. Receive a first document.
[0062] There are pages in the first document, and the number of pages is not limited. Each page can include different...
Embodiment 2
[0085] Figure 2A is a flow chart of a document recognition method provided by Embodiment 2 of the present invention. This embodiment refines Embodiment 1 and describes in detail the specific steps of locating, within the region, the sub-regions formed by the intersection points. Referring to Figure 2A, the method includes:
[0086] S201. Receive a first document.
[0087] S202. Determine an element recognition model.
[0088] The element recognition model is a pre-trained model for recognizing target elements. The model can be constructed by means of deep learning or a neural network.
[0089] In a feasible implementation, an ANN classification model is built from training samples to recognize target elements and is then applied to test samples to output detection results. First, for a given set of sample pairs $\{(x_i, y_i)\}$ with $x_i \in \mathbb{R}^N$ and $y_i \in \{0, 1, 2, \ldots, 100\}$, where $x_i$ is a training sample and $x$ is the sample to be judged, a parameter-adaptively adjusted ...
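The shape of such a classifier can be illustrated with a minimal sketch. The source text is truncated before it specifies the network, so everything below is an assumption made for illustration: a single hidden layer with ReLU activation, no bias terms, a softmax output over the 101 labels, and randomly initialized (untrained) weights. The names `make_layer`, `dense`, `softmax` and `predict` are hypothetical and do not come from the patent.

```python
import math
import random
from typing import List, Tuple

NUM_CLASSES = 101  # labels y_i range over {0, 1, ..., 100}


def make_layer(n_in: int, n_out: int, rng: random.Random) -> List[List[float]]:
    """Create an n_out x n_in weight matrix with small random values (untrained)."""
    return [[rng.uniform(-0.1, 0.1) for _ in range(n_in)] for _ in range(n_out)]


def dense(weights: List[List[float]], x: List[float]) -> List[float]:
    """Matrix-vector product: one output value per weight row."""
    return [sum(w * v for w, v in zip(row, x)) for row in weights]


def softmax(z: List[float]) -> List[float]:
    """Numerically stable softmax that turns scores into probabilities."""
    m = max(z)
    exps = [math.exp(v - m) for v in z]
    total = sum(exps)
    return [e / total for e in exps]


def predict(x: List[float],
            w_hidden: List[List[float]],
            w_out: List[List[float]]) -> Tuple[int, List[float]]:
    """Forward pass for a sample x in R^N; returns (predicted label, class probabilities)."""
    hidden = [max(0.0, h) for h in dense(w_hidden, x)]  # ReLU (assumed activation)
    probs = softmax(dense(w_out, hidden))
    label = max(range(len(probs)), key=probs.__getitem__)
    return label, probs


# Usage: N = 4 input features and 8 hidden units are arbitrary illustrative sizes.
rng = random.Random(0)
w_hidden = make_layer(4, 8, rng)
w_out = make_layer(8, NUM_CLASSES, rng)
label, probs = predict([0.1, 0.2, 0.3, 0.4], w_hidden, w_out)
```

In practice the weights would be fitted to the training pairs $(x_i, y_i)$ before `predict` is applied to a sample $x$ to be judged; the training procedure is not shown because the source does not describe it in the available text.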
Embodiment 3
[0117] Figure 3 is a structural diagram of a device for document recognition provided by Embodiment 3 of the present invention. The device comprises: a first document receiving module 31, an area extracting module 32, an intersection detection module 33, a sub-region determining module 34, a character recognition module 35, a second form generating module 36 and a second form writing module 37, wherein:
[0118] The first document receiving module 31 is configured to receive a first document, the first document having pages;
[0119] The area extracting module 32 is configured to extract an area having a target element from the page, the target element including a first table;
[0120] The intersection detection module 33 is configured to detect intersections in the area, where an intersection is a position where at least two line segments intersect;
[0121] The sub-region determining module 34 is configured to locate in the area a sub-region composed of the intersection points, the s...
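The work of the intersection detection module and the sub-region determining module can be sketched as follows. This is a minimal illustration under stated assumptions, not the patented implementation: the ruling lines are assumed to be already extracted as axis-aligned segments with integer coordinates, collinear pieces of one ruling line are assumed to be merged into a single segment, cells are assumed to be rectangular, and the y axis grows downward as in image coordinates. The names `find_intersections`, `_covered` and `locate_cells` are hypothetical.

```python
from typing import List, Optional, Set, Tuple

Point = Tuple[int, int]
Segment = Tuple[int, int, int, int]  # (x1, y1, x2, y2), axis-aligned only
Cell = Tuple[int, int, int, int]     # (left, top, right, bottom)


def find_intersections(segments: List[Segment]) -> Set[Point]:
    """Return every point where a horizontal and a vertical segment meet."""
    horizontals = [s for s in segments if s[1] == s[3]]
    verticals = [s for s in segments if s[0] == s[2]]
    points: Set[Point] = set()
    for hx1, hy, hx2, _ in horizontals:
        for vx, vy1, _, vy2 in verticals:
            if min(hx1, hx2) <= vx <= max(hx1, hx2) and min(vy1, vy2) <= hy <= max(vy1, vy2):
                points.add((vx, hy))
    return points


def _covered(p: Point, q: Point, segments: List[Segment]) -> bool:
    """True if a single segment covers the whole span between p and q."""
    (px, py), (qx, qy) = p, q
    for x1, y1, x2, y2 in segments:
        if py == qy == y1 == y2 and min(x1, x2) <= min(px, qx) and max(px, qx) <= max(x1, x2):
            return True
        if px == qx == x1 == x2 and min(y1, y2) <= min(py, qy) and max(py, qy) <= max(y1, y2):
            return True
    return False


def locate_cells(segments: List[Segment]) -> List[Cell]:
    """For each intersection taken as a top-left corner, find the smallest
    rectangle whose corners are intersections and whose edges are drawn."""
    points = find_intersections(segments)
    cells: List[Cell] = []
    for x0, y0 in sorted(points):
        rights = sorted(x for x, y in points if y == y0 and x > x0)
        downs = sorted(y for x, y in points if x == x0 and y > y0)
        cell: Optional[Cell] = next(
            ((x0, y0, x1, y1)
             for x1 in rights for y1 in downs
             if (x1, y1) in points
             and _covered((x0, y0), (x1, y0), segments)    # top edge
             and _covered((x0, y0), (x0, y1), segments)    # left edge
             and _covered((x1, y0), (x1, y1), segments)    # right edge
             and _covered((x0, y1), (x1, y1), segments)),  # bottom edge
            None,
        )
        if cell is not None:
            cells.append(cell)
    return cells


# Usage: a 2 x 2 table whose top row is merged into one cell.
table = [
    (0, 0, 2, 0), (0, 1, 2, 1), (0, 2, 2, 2),  # horizontal rulings
    (0, 0, 0, 2), (2, 0, 2, 2),                # outer vertical rulings
    (1, 1, 1, 2),                              # inner ruling, bottom row only
]
print(locate_cells(table))
# prints [(0, 0, 2, 1), (0, 1, 1, 2), (1, 1, 2, 2)]
```

Checking that each edge of a candidate rectangle is actually drawn, rather than only checking its four corners, is what lets the sketch treat the merged top row as one sub-region instead of splitting it at the position of the missing ruling.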