A method for table recognition in PDF documents
The method, applied in the field of PDF document table recognition, addresses problems that reduce the accuracy and efficiency of table recognition, and thereby improves both recognition accuracy and recognition efficiency.
Embodiment Construction
[0024] In order to make the object, technical solution and advantages of the present invention clearer, the present invention will be further described in detail below in conjunction with the embodiments and accompanying drawings.
[0025] As shown in Figure 1, which is a flow chart of the PDF document table recognition method, the method comprises:
[0026] Obtain the set of characters on the page, and merge the characters into rows to create a row set;
[0027] Extract the horizontal and vertical lines from the paths on the page, and create a line set;
[0028] Detect suspected table titles in the row set and suspected table lines in the line set;
[0029] If both suspected table titles and suspected table lines are present, identify the table using a region growing method based on the table titles and the line set;
[0030] If only suspected table lines are present, use the line set and the row set to detect full-line tables first and then three-line tables;
[0031]...
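By way of illustration only, steps [0026] to [0030] above might be sketched in Python as follows. This is a minimal sketch under stated assumptions, not the disclosed implementation: the names `Char`, `Segment`, `merge_chars_into_rows`, `extract_lines` and `choose_strategy`, the tolerance values, the "Table N" title pattern, the minimum line length, and the top-left coordinate origin are all hypothetical. The region growing, full-line and three-line detectors themselves are not shown, and cases beyond [0030] are left unhandled because they are not given in this excerpt.

```python
import re
from dataclasses import dataclass


@dataclass
class Char:
    text: str
    x: float  # left edge of the glyph (assumed top-left page origin)
    y: float  # vertical position of the glyph


@dataclass
class Segment:
    x0: float
    y0: float
    x1: float
    y1: float


# Assumed title pattern; the source does not say how titles are detected.
TITLE_RE = re.compile(r"^\s*Table\s*\d+", re.IGNORECASE)


def merge_chars_into_rows(chars, y_tol=2.0):
    """Merge characters into rows: a character joins the current row when
    its y is within y_tol of that row's first character."""
    rows = []  # list of (row_y, [Char, ...])
    for ch in sorted(chars, key=lambda c: (c.y, c.x)):
        if rows and abs(rows[-1][0] - ch.y) <= y_tol:
            rows[-1][1].append(ch)
        else:
            rows.append((ch.y, [ch]))
    # Read each row left to right and join its characters into a string.
    return ["".join(c.text for c in sorted(group, key=lambda c: c.x))
            for _, group in rows]


def extract_lines(segments, tol=0.5):
    """Split path segments into horizontal and vertical lines; segments
    that are neither (for example diagonals) are discarded."""
    horizontal = [s for s in segments
                  if abs(s.y0 - s.y1) <= tol and abs(s.x0 - s.x1) > tol]
    vertical = [s for s in segments
                if abs(s.x0 - s.x1) <= tol and abs(s.y0 - s.y1) > tol]
    return horizontal, vertical


def choose_strategy(rows, horizontal, vertical, min_len=50.0):
    """Pick the recognition branch described in [0029] and [0030]."""
    titles = [r for r in rows if TITLE_RE.match(r)]
    # Assumption: a line is a suspected table line if it is long enough.
    table_lines = [s for s in horizontal + vertical
                   if max(abs(s.x1 - s.x0), abs(s.y1 - s.y0)) >= min_len]
    if titles and table_lines:
        return "region_growing"
    if table_lines:
        return "full_line_then_three_line"
    return None  # remaining cases are not given in this excerpt
```

A caller would pass the page's characters to `merge_chars_into_rows`, the page's path segments to `extract_lines`, and then hand both results to `choose_strategy` to decide which detector to run.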