PDF file conversion method based on OCR pre-judgment
A file conversion technology with OCR pre-judgment, applied in the field of PDF file conversion. It addresses the problems of low judgment accuracy and text deviation, and achieves broad applicability, improved accuracy, and a better conversion result.
Embodiment Construction
[0030] Embodiments of the present invention are described in detail below with reference to the attached Figures 1-4.
[0031] As shown in Figure 1, a PDF file conversion method based on OCR pre-judgment includes the following steps:
[0032] Step 1: Parse the PDF file to determine whether OCR is required for each page in the PDF file;
[0033] Step 2: Perform OCR on the pages that require it to obtain text information; for the pages that do not require OCR, extract the text information directly from the text encoding information of the text objects in the PDF page;
[0034] Step 3: Convert the obtained text information into a corresponding editable document through the PDF parsing algorithm and the Office file reconstruction algorithm.
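The three steps above can be illustrated with a minimal Python sketch. It is not the patented implementation: the `Page` class, the `needs_ocr` rule (a page with no non-empty text objects is treated as image-based), the `fake_ocr` placeholder, and the plain-text output that stands in for Office file reconstruction are all assumptions made for illustration.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Optional


@dataclass
class Page:
    """Hypothetical parsed page: decoded text objects plus an optional page image."""
    text_objects: List[str] = field(default_factory=list)
    image: Optional[bytes] = None


def needs_ocr(page: Page) -> bool:
    # Step 1 (assumed rule): a page with no non-empty text objects is image-based.
    return not any(t.strip() for t in page.text_objects)


def extract_text(page: Page, ocr: Callable[[bytes], str]) -> str:
    # Step 2: OCR the pages that need it, read text objects directly otherwise.
    if needs_ocr(page):
        return ocr(page.image) if page.image is not None else ""
    return "\n".join(page.text_objects)


def convert(pages: List[Page], ocr: Callable[[bytes], str]) -> str:
    # Step 3 stand-in: join page texts with form feeds instead of
    # rebuilding an Office document.
    return "\f".join(extract_text(p, ocr) for p in pages)


def fake_ocr(image: bytes) -> str:
    # Placeholder OCR engine for the demo; a real engine would recognise glyphs.
    return f"<ocr:{len(image)} bytes>"


pages = [Page(text_objects=["Hello", "world"]), Page(image=b"\x89PNG")]
result = convert(pages, fake_ocr)
print(repr(result))
```

Passing the OCR engine in as a callable keeps the per-page decision separate from any particular OCR library, so the pre-judgment rule can be changed or tested on its own.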
[0035] We refer to scanned PDFs, PDFs converted from images, and other PDF files whose content is stored as images as image-based PDFs. Text information cannot be extracted directly from an image-based PDF, so its text cannot be obtained by parsing the file ...