Method and system for converting file from portable document format (PDF) to electronic publication (EPUB) format
A file format conversion technology, applied in the field of document processing, that can solve problems such as resolution degradation and difficult text recognition
Examples
Embodiment 1
[0066] Figure 1 is a flow chart of the method for converting a PDF format file into the EPUB format according to Embodiment 1 of the present invention. As shown in Figure 1, the method includes the following steps:
[0067] S101: Identify text elements and image elements in the PDF format file;
[0068] Because text elements and image elements have different properties, their data streams carry different identifiers when the PDF format file is read. The text elements and image elements in the PDF file can therefore be identified according to the identifiers in the data streams.
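One plausible reading of the "identifiers" in paragraph [0068] is the operator set of a PDF content stream, where text is shown inside `BT` ... `ET` blocks and an image XObject is painted with the `Do` operator. The sketch below classifies a decoded content stream on that basis. It is an assumption for illustration, not the patent's stated implementation: the function name `classify_elements` and the sample stream are hypothetical, and the tokenizer ignores inline images, form XObjects (which also use `Do`), and string literals that contain whitespace or operator names.

```python
import re

# Whitespace-delimited tokenizer; adequate only for simple, already-decoded streams.
TOKEN = re.compile(rb"\S+")


def classify_elements(content_stream: bytes) -> list[str]:
    """Return 'text' / 'image' labels in the order the elements appear."""
    elements = []
    in_text = False
    for token in TOKEN.findall(content_stream):
        if token == b"BT":            # begin text object
            in_text = True
        elif token == b"ET" and in_text:  # end text object
            elements.append("text")
            in_text = False
        elif token == b"Do" and not in_text:  # paint an XObject (assumed to be an image)
            elements.append("image")
    return elements


# Hypothetical content stream: one text object followed by one image placement.
sample = b"BT /F1 12 Tf 72 700 Td (Hello) Tj ET q 200 0 0 150 72 500 cm /Im1 Do Q"
print(classify_elements(sample))
```

A production converter would normally rely on a PDF parsing library rather than token matching, since real content streams are compressed and may nest graphics states.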
[0069] S102: Obtain the coordinates of the text element and the coordinates of the image element;
[0070] S103: According to the coordinates of the text element and the coordinates of the image element, determine the position of the text element and the image element in the newly generated HTML format file, so that th...
Embodiment 2
[0081] Figure 2 is a flow chart of the method for converting a PDF format file into the EPUB format according to Embodiment 2 of the present invention. This embodiment illustrates the practical application process of the present invention in more detail. As shown in Figure 2, the method includes the following steps:
[0082] S201: Identify text elements and image elements in the PDF format file;
[0083] S202: Obtain the coordinates of the text element and the coordinates of the image element;
[0084] S203: Determine whether the ordinate of the lower right point of the text element is smaller than the ordinate of the upper left point of the image element;
[0085] If yes, execute step S204; otherwise, execute step S205;
[0086] S204: Position the text element above the image element;
[0087] S205: Determine whether the abscissa of the lower right point of the text element is smaller than the abscissa of the upper left point of the image element;
[0088] If yes, ex...
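The comparisons in steps S203 to S205 can be sketched as follows. This is a minimal illustration under stated assumptions: the `Box` type and the function names are hypothetical, and the ordinate is taken to grow downward, which is what the S203 test implies (a smaller ordinate means higher on the page). Because the source text is cut off after S205, the sketch only reports the result of the S205 comparison and does not guess which placement follows from it.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Box:
    """Bounding box described by its upper left and lower right points."""
    left: float    # abscissa of the upper left point
    top: float     # ordinate of the upper left point
    right: float   # abscissa of the lower right point
    bottom: float  # ordinate of the lower right point


def s203_text_above_image(text: Box, image: Box) -> bool:
    # S203: ordinate of the text's lower right point < ordinate of the image's upper left point
    return text.bottom < image.top


def s205_abscissa_test(text: Box, image: Box) -> bool:
    # S205: abscissa of the text's lower right point < abscissa of the image's upper left point
    return text.right < image.left


def decide(text: Box, image: Box) -> str:
    if s203_text_above_image(text, image):
        return "above"  # S204: position the text element above the image element
    # The action taken after S205 is not visible in the source, so only the test result is returned.
    return "s205_yes" if s205_abscissa_test(text, image) else "s205_no"


heading = Box(left=0, top=0, right=100, bottom=50)
picture = Box(left=0, top=60, right=100, bottom=200)
print(decide(heading, picture))
```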
Embodiment 3
[0101] Compared with the second embodiment, this embodiment adopts another way of determining the positions of the text elements and the image elements in the newly generated HTML format file.
[0102] Figure 3 is a flow chart of the method for converting a PDF format file into the EPUB format according to Embodiment 3 of the present invention.
[0103] As shown in Figure 3, the method includes the following steps:
[0104] S301: Identify text elements and image elements in the PDF format file;
[0105] S302: Obtain the coordinates of the text element and the coordinates of the image element;
[0106] S303: Determine whether the ordinate of the upper left point of the text element is greater than the ordinate of the lower right point of the image element;
[0107] If yes, execute step S304; otherwise execute step S305;
[0108] S304: Position the text element below the image element;
[0109] S305: Determine whether the abscissa of the upper left point of the text element ...
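To show how the S303/S304 decision could carry over into the generated HTML, the sketch below emits the image before the text when the text element lies below the image element. It is an assumption for illustration only: the function `render_fragment`, its parameters, and the choice of `<img>` and `<p>` markup are hypothetical, and the ordinate is taken to grow downward as the S303 test implies. The branch after S305 is cut off in the source, so the function returns `None` instead of guessing a layout for that case.

```python
from html import escape
from typing import Optional


def render_fragment(text: str, text_top: float,
                    image_src: str, image_bottom: float) -> Optional[str]:
    """Return an HTML fragment when the S303 test holds, otherwise None.

    text_top:     ordinate of the upper left point of the text element
    image_bottom: ordinate of the lower right point of the image element
    """
    # S303: is the text's upper left ordinate greater than the image's lower right ordinate?
    if text_top > image_bottom:
        # S304: text goes below the image, so the image comes first in document order.
        return (f'<img src="{escape(image_src, quote=True)}" alt=""/>\n'
                f'<p>{escape(text)}</p>')
    return None  # S305 and later steps are not visible in the source


fragment = render_fragment("Caption & notes", 320.0, "images/fig1.png", 300.0)
print(fragment)
```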