PDF file analysis method, device and equipment and computer readable storage medium

A file parsing technology, applied in the field of file parsing, that addresses the problems of untargeted parsing, low parsing efficiency, and incomplete data.

Pending Publication Date: 2019-05-07
ONE CONNECT SMART TECH CO LTD SHENZHEN

AI Technical Summary

Problems solved by technology

[0003] The main purpose of the present invention is to provide a PDF file parsing method, device, equipment and computer-readable storage medium, aiming at solving




Embodiment Construction

[0048] It should be understood that the specific embodiments described here are only used to explain the present invention, not to limit the present invention.

[0049] The invention provides a PDF file parsing method.

[0050] Please refer to Figure 1, which is a schematic flowchart of the first embodiment of the PDF file parsing method of the present invention. In this embodiment, the PDF file parsing method includes:

[0051] Step S10, when receiving the PDF file to be parsed, identifying the keyword sample carried by the PDF file to be parsed, and determining the content type of the PDF file to be parsed according to the keyword sample;

[0052] The PDF file parsing method of the present invention can be applied not only to servers but also to terminals such as mobile computers and desktop computers. A PDF (Portable Document Format) file is a file format that combines graphics and text, and the information carried in the PDF file is extracted by parsing th...
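Step S10 can be illustrated with a minimal Python sketch. It assumes the text of the PDF file has already been extracted by some other means, and it uses a hypothetical keyword table (`KEYWORDS_BY_TYPE`) and a simple hit-count rule; the source does not specify the concrete keywords, the content types, or how keyword samples are scored, so all of these are illustrative assumptions.

```python
from collections import Counter

# Hypothetical keyword table: the source names neither concrete keywords
# nor concrete content types, so these entries are assumptions.
KEYWORDS_BY_TYPE = {
    "annual_report": ["annual report", "fiscal year"],
    "loan": ["loan", "borrower"],
    "tax": ["tax return", "taxpayer"],
}


def determine_content_type(text, keywords_by_type=KEYWORDS_BY_TYPE):
    """Return the content type whose keywords occur most often in `text`.

    `text` is assumed to be the already-extracted text of the PDF file.
    Returns None when no keyword occurs at all.
    """
    lowered = text.lower()
    hits = Counter()
    for content_type, keywords in keywords_by_type.items():
        # Count every (case-insensitive) occurrence of each keyword.
        hits[content_type] = sum(lowered.count(k) for k in keywords)
    best_type, best_count = hits.most_common(1)[0]
    return best_type if best_count > 0 else None


if __name__ == "__main__":
    sample = "Loan agreement: the borrower repays the loan monthly."
    print(determine_content_type(sample))
```

Counting keyword hits is only one possible rule; a real system might weight keywords or look only at specific regions of the file, such as the title page.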


Abstract

The invention discloses a PDF (Portable Document Format) file parsing method, device, equipment, and computer-readable storage medium. The method comprises the following steps: when receiving a PDF file to be parsed, identifying a keyword carried by the PDF file to be parsed, and determining the content type of the PDF file to be parsed according to the keyword; calling each parsing template corresponding to the content type, matching the PDF file to be parsed against each parsing template, and determining a target parsing template according to the matching rate between the PDF file to be parsed and each parsing template; and parsing the PDF file to be parsed according to the parsing rule in the target parsing template to generate parsed data. Because the PDF file to be parsed is parsed with the parsing rule of the target parsing template that matches it, the parsing is targeted, the completeness of the data parsed from the PDF file can be ensured, and parsing efficiency is improved.
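The template-selection step described in the abstract can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the patented implementation: the `ParseTemplate` class, the use of regular expressions as parsing rules, and the definition of matching rate as the fraction of a template's rules that find a match are all hypothetical, since the source does not define them.

```python
import re
from dataclasses import dataclass


@dataclass
class ParseTemplate:
    """Hypothetical template: field name -> regex with one capture group."""
    name: str
    content_type: str
    rules: dict


def matching_rate(text, template):
    # Assumed definition: fraction of the template's rules that match the text.
    if not template.rules:
        return 0.0
    matched = sum(1 for p in template.rules.values() if re.search(p, text))
    return matched / len(template.rules)


def select_target_template(text, content_type, templates):
    # Consider only the templates registered for the detected content type,
    # then pick the one with the highest matching rate.
    candidates = [t for t in templates if t.content_type == content_type]
    if not candidates:
        return None
    return max(candidates, key=lambda t: matching_rate(text, t))


def parse_with_template(text, template):
    # Apply each rule of the target template; unmatched fields become None.
    data = {}
    for field_name, pattern in template.rules.items():
        m = re.search(pattern, text)
        data[field_name] = m.group(1) if m else None
    return data


if __name__ == "__main__":
    templates = [
        ParseTemplate("loan_a", "loan", {"borrower": r"Borrower:\s*(\w+)",
                                         "amount": r"Amount:\s*(\d+)"}),
        ParseTemplate("loan_b", "loan", {"lender": r"Lender:\s*(\w+)",
                                         "amount": r"Amount:\s*(\d+)"}),
    ]
    text = "Borrower: Alice\nAmount: 5000"
    target = select_target_template(text, "loan", templates)
    print(target.name, parse_with_template(text, target))
```

Restricting the candidates to one content type before computing matching rates keeps the comparison small, which is consistent with the efficiency goal stated in the abstract.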

Description

Technical field

[0001] The present invention mainly relates to the technical field of file parsing, and in particular to a PDF file parsing method, device, equipment and computer-readable storage medium.

Background

[0002] At present, with the development of big data technology, statistical analysis of data has penetrated all levels of society, for example company monthly reports, annual reports, personal loan information, and tax information. Part of this data comes from PDF files and can be obtained by parsing them. Currently, PDF files are parsed one by one in a uniform way. Because the parsing method is not targeted to the different types of PDF files, parsing efficiency is low and incomplete data parsing is prone to occur.

Contents of the invention

[0003] The main purpose of the present invention is to provide a PDF file parsing method, device, equ...


Application Information

IPC(8): G06F17/27
Inventor: 夏良超, 王盼
Owner: ONE CONNECT SMART TECH CO LTD SHENZHEN