Method and device for identifying reading sequence of layout
A technique for identifying the reading order of a page layout, applicable to fields such as instruments, computing, and electronic digital data processing. It addresses problems such as poor applicability to reading-order recognition on complex e-book pages, low accuracy of reading-order recognition and content rearrangement, and image elements being left out of the sorting.
Examples
Example 1
[0104] In this embodiment, the e-book "21st Century Computer Basic Tutorial" (Beijing University of Posts and Telecommunications Press) is used. The e-book has 317 pages, and page 135, chosen arbitrarily (shown in Figure 8), serves as the layout to be recognized.
[0105] According to the present invention, identifying the reading order of the page includes the following steps:
[0106] (1) Read the page shown in Figure 8 and analyze the layout to obtain the layout information and the object properties of the text objects and image objects.
[0107] (2) Perform logical paragraph recognition on the text objects and image objects obtained in step (1), as follows:
[0108] Step S21: calculate the type area (page body) as C(65, 36, 387, 529), where coordinates are in points and the origin is the lower-left vertex; the type area is shown as the dotted rectangle in Figure 8;
[0109] Step S22: judge whether the text chara...
Example 2
[0122] In this embodiment, the layout to be identified is shown in Figure 10a and Figure 10b. This layout contains a ring-shaped arrangement.
[0123] After the logical paragraph recognition steps described above, the paragraph recognition result for this layout is shown by the rectangular frames in Figure 10a and Figure 10b. Inspection shows that the paragraphs on this page have no cutting position in either the X direction or the Y direction and essentially form a single ring. If only the global sorting method is used in this case, the cutting algorithm does not execute and exits immediately, so the resulting reading order is the natural output order of the page, that is, the typesetting order. The typesetting order of this page, shown in Figure 10b, is "text near the page number - text in the left column - title - text in the right column - legend in the left column - legend in the r...
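The failure mode described in [0123] can be illustrated with a minimal sketch of a cut-based global sort. This is not the patent's implementation: the function names `find_cut` and `xy_cut_order`, the `(x0, y0, x1, y1)` box format, and the choice to try a Y cut before an X cut are all assumptions made for illustration. The only behavior taken from the text is that, when no cutting position exists in either direction, the sort exits and returns the paragraphs in their input (typesetting) order.

```python
from typing import List, Optional, Tuple

# Hypothetical box format: (x0, y0, x1, y1), origin at the lower-left corner.
Box = Tuple[float, float, float, float]


def find_cut(boxes: List[Box], axis: int) -> Optional[float]:
    """Return a coordinate on `axis` (0 = X, 1 = Y) lying in a gap that no
    box overlaps, so it splits the boxes into two groups; None if no gap."""
    intervals = sorted((b[axis], b[axis + 2]) for b in boxes)
    reach = intervals[0][1]          # farthest extent covered so far
    for lo, hi in intervals[1:]:
        if lo > reach:               # empty strip between `reach` and `lo`
            return (reach + lo) / 2
        reach = max(reach, hi)
    return None


def xy_cut_order(boxes: List[Box]) -> List[Box]:
    """Order boxes by recursive cuts (assumed: Y cut first, then X cut).
    If neither cut exists, the input order is returned unchanged."""
    if len(boxes) <= 1:
        return list(boxes)
    y = find_cut(boxes, 1)
    if y is not None:
        upper = [b for b in boxes if b[1] >= y]   # read the upper group first
        lower = [b for b in boxes if b[3] <= y]
        return xy_cut_order(upper) + xy_cut_order(lower)
    x = find_cut(boxes, 0)
    if x is not None:
        left = [b for b in boxes if b[2] <= x]    # read the left group first
        right = [b for b in boxes if b[0] >= x]
        return xy_cut_order(left) + xy_cut_order(right)
    return list(boxes)  # no cutting position: fall back to typesetting order


# A simple page: title above two columns, given in a scrambled input order.
title, left_col, right_col = (0, 90, 100, 100), (0, 0, 45, 80), (55, 0, 100, 80)
simple_order = xy_cut_order([right_col, title, left_col])

# A ring of four boxes (pinwheel): every candidate cut line crosses some box.
top, right, bottom, left = (0, 8, 7, 10), (8, 3, 10, 10), (3, 0, 10, 2), (0, 0, 2, 7)
ring_input = [bottom, top, left, right]
ring_order = xy_cut_order(ring_input)
```

On the simple page the cuts recover title, left column, right column. On the ring, `find_cut` finds no gap on either axis, so `ring_order` equals `ring_input`, which mirrors the situation in Figure 10b where the global sort alone leaves the typesetting order untouched.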