PDF table extraction method
Patent Information
- Authority / Receiving Office
- CN · China
- Current Assignee / Owner
- 南京述酷信息技术有限公司
- Publication Date
- 2017-06-27
Abstract
Description
Technical field
[0001] The invention relates to the technical field of PDF document data mining and extraction, and in particular to a method for extracting tables from PDF documents.
Background art
[0002] PDF (Portable Document Format) is a file format developed by Adobe Systems for document exchange that is independent of applications, operating systems, and hardware. PDF is based on the PostScript language imaging model, which ensures accurate colors and accurate print output on any printer; that is, a PDF faithfully reproduces every character, color, and image in the document. With the rapid development of computer and Internet technology, PDF documents are increasingly widely used in fields such as economics, finance, education, scientific research, and academia. Since PDF was designed only for displaying or printing documents, it does not have the function of communicating and...