Method for extracting formatted print content
A method for extracting formatted print content, applicable to character and pattern recognition, instruments, computer components, and related fields. It addresses the floating position of extracted information, the difficulty of extracting complex content, and the difficulty of determining the number of lines, thereby simplifying extraction design and improving computational efficiency.
Embodiment Construction
[0039] The technical solution of this patent will be further described in detail below in conjunction with specific embodiments.
[0040] Referring to Figure 1, in an embodiment of the present invention, a method for extracting formatted print content comprises the following steps:
[0041] S1. Convert the printed content of the print document into print elements (each comprising the text content, the x and y coordinates of its upper-left corner relative to the page, and the displayed height and width of the text), and generate a print element set (comprising the name of the print document, the total number of printed pages, the index number of each page, the height and width of each page, the print elements contained in each page, and a standalone page image of each page);
[0042] S2. Design extraction elements based on the sampled print element set (mainly including extraction element types, keywords, extraction range (extract x, y coordin...
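The print element set described in step S1 could be represented roughly as follows. This is a minimal sketch, not the patent's implementation: the class names (`PrintElement`, `Page`, `PrintElementSet`), the field names, and the sample values are all hypothetical and chosen only to mirror the items listed in S1.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class PrintElement:
    """One piece of printed text with its position and displayed size (hypothetical layout)."""
    text: str
    x: float       # x coordinate of the upper-left corner, relative to the page
    y: float       # y coordinate of the upper-left corner, relative to the page
    width: float   # displayed width of the text
    height: float  # displayed height of the text


@dataclass
class Page:
    """One printed page: its index, size, elements, and standalone image."""
    index: int
    width: float
    height: float
    elements: List[PrintElement] = field(default_factory=list)
    image_path: Optional[str] = None  # assumed: the page picture is stored as a file


@dataclass
class PrintElementSet:
    """All print elements of one print document, grouped by page."""
    document_name: str
    pages: List[Page] = field(default_factory=list)

    @property
    def total_pages(self) -> int:
        # Derived from the page list rather than stored separately.
        return len(self.pages)


# Illustrative values only; none of these come from the patent.
page = Page(index=1, width=595.0, height=842.0)
page.elements.append(
    PrintElement("Invoice No.", x=40.0, y=32.0, width=80.0, height=12.0)
)
doc = PrintElementSet(document_name="sample.pdf", pages=[page])
```

Keeping the coordinates and size on each element is what would let a later step select elements by keyword or by position on the page.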
