Method for converting form in portable document format (PDF) document into Excel form

An Excel table technology applied in the field of information conversion. It addresses the failure to correctly identify table data and to generate an Excel table, and it improves how faithfully the table is restored and how editable the result is.

Inactive Publication Date: 2012-10-10
WONDERSHARE TECH CO LTD

AI Technical Summary

Problems solved by technology

[0008] When a table in a PDF document is converted into an Excel table, the table is recognized from the border lines of the table in the document; its content is then extracted and written into the corresponding Excel table in a certain order. Because this approach relies too heavily on border lines, the conversion cannot correctly identify table data that has no border lines or has incomplete border lines, and it cannot generate the corresponding Excel table.


Examples

Example 2

[0067] Example 2: in the region at column 5, row 2 of Figure 4, the "H" text block corresponds to the third text block described above and the "I" text block corresponds to the fourth text block, so these two text blocks serve as the reference. A vertical division is made at the position coordinate of the right boundary of the "H" text block, and the maximum number of columns increases by 1, to 6 columns, as shown in Figure 5.

[0068] Next, judge whether the divided columns meet the preset column-setting requirements: the sixth text block lies to the right of the right end of the fifth text block and to the left of the left end of the seventh text block, and the four coordinates (the right end of the fifth text block, the right end of the seventh text block, and the left and right ends of the sixth text block) lie between the coordinates of different columns.

[0069] If an independent sixth text block ap...

Example 3

[0070] Example 3: in the region at column 2, row 6 of Figure 5, the "L" text block corresponds to the fifth text block described above, the "M" text block to the sixth text block, and the "N" text block to the seventh text block. An independent text block "M" appears between the rightmost end of the "L" text block and the leftmost end of the "N" text block. The "M" text block is therefore used as the reference, and a vertical division is made at the position coordinate of its right boundary. The maximum number of columns increases by 1, to 7 columns, as shown in Figure 6.
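The column split described in Example 3 can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the patent's implementation: the `TextBlock` type, the `split_column_at` helper, the choice to represent columns as a sorted list of interior divider x-coordinates, and every coordinate value are hypothetical.

```python
from bisect import insort
from dataclasses import dataclass


@dataclass(frozen=True)
class TextBlock:
    """Hypothetical text block: only the horizontal extent is needed here."""
    text: str
    left: float
    right: float


def split_column_at(dividers, left_block, middle_block, right_block):
    """Return a new sorted list of vertical divider x-coordinates.

    If middle_block sits strictly between the right end of left_block and
    the left end of right_block, a divider is added at middle_block's right
    boundary, which raises the column count by one.
    """
    is_independent = (
        left_block.right < middle_block.left
        and middle_block.right < right_block.left
    )
    new_dividers = list(dividers)
    if is_independent and middle_block.right not in new_dividers:
        insort(new_dividers, middle_block.right)
    return new_dividers


# Assumed layout: 5 interior dividers, so 6 columns before the split.
dividers = [100, 200, 300, 400, 500]
block_l = TextBlock("L", 110, 130)
block_m = TextBlock("M", 140, 160)
block_n = TextBlock("N", 170, 195)

updated = split_column_at(dividers, block_l, block_m, block_n)
column_count = len(updated) + 1  # n interior dividers bound n + 1 columns
```

With these assumed coordinates the divider list grows from five entries to six, so the column count goes from 6 to 7, which mirrors the increase described in the example.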

[0071] Step 3: According to the position coordinates of the text blocks, determine the divided areas to which each text block belongs.

[0072] According to the position coordinates of the upper end and the left end of the text block, divide the text blocks in the non-reference row into the reference row, divide the ...
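One way to picture Step 3 is a lookup from a text block's top and left coordinates to a (row, column) index. The sketch below is an assumption-laden illustration rather than the patent's procedure: `locate_cell`, the divider lists, and the convention that y grows downward are all hypothetical, and the handling of reference and non-reference rows from the truncated paragraph is not modeled.

```python
from bisect import bisect_right


def locate_cell(top, left, row_dividers, col_dividers):
    """Map a block's top-left corner to a zero-based (row, column) index.

    row_dividers and col_dividers are sorted lists of interior dividing
    coordinates. A coordinate that lies exactly on a divider is placed in
    the area that starts at that divider.
    """
    row = bisect_right(row_dividers, top)
    col = bisect_right(col_dividers, left)
    return row, col


# Assumed grid: 3 rows and 3 columns, with y increasing downward.
row_dividers = [50, 100]
col_dividers = [100, 200]

first_cell = locate_cell(10, 5, row_dividers, col_dividers)
middle_cell = locate_cell(60, 120, row_dividers, col_dividers)
```

Binary search keeps each lookup logarithmic in the number of dividers, which matters little for small tables but keeps the helper simple.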

Abstract

The invention discloses a method for converting a table in a Portable Document Format (PDF) document into an Excel table. The method comprises the following steps: identifying the boundary position coordinates of the text blocks in the table of the PDF document; dividing the table into rows and columns according to those boundary position coordinates to obtain a plurality of divided areas; determining the divided area to which each text block belongs; and writing the text blocks in the divided areas into the corresponding Excel table. The method therefore does not depend on the border lines of the table in the PDF document, and a table with no border lines or with incomplete border lines can still be converted into an Excel table.
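The overall flow in the abstract (coordinates, then row and column division, then area assignment, then output) can be sketched as below. This is a simplified stand-in, not the claimed method: rows and columns are found by merging overlapping coordinate intervals instead of applying the patent's reference-block rules, the output is CSV text instead of a real Excel workbook, and the block tuples and function names are hypothetical.

```python
import csv
import io


def merge_intervals(intervals):
    """Merge overlapping (start, end) intervals into sorted bands."""
    merged = []
    for start, end in sorted(intervals):
        if merged and start <= merged[-1][1]:
            merged[-1][1] = max(merged[-1][1], end)
        else:
            merged.append([start, end])
    return merged


def band_index(bands, coordinate):
    """Return the index of the band that contains the coordinate."""
    for index, (low, high) in enumerate(bands):
        if low <= coordinate <= high:
            return index
    raise ValueError(f"coordinate {coordinate} is outside every band")


def blocks_to_grid(blocks):
    """Place (text, left, top, right, bottom) blocks into a 2-D grid."""
    columns = merge_intervals([(b[1], b[3]) for b in blocks])
    rows = merge_intervals([(b[2], b[4]) for b in blocks])
    grid = [["" for _ in columns] for _ in rows]
    for text, left, top, _right, _bottom in blocks:
        grid[band_index(rows, top)][band_index(columns, left)] = text
    return grid


def grid_to_csv(grid):
    """Serialize the grid as CSV text, which Excel can open."""
    buffer = io.StringIO()
    csv.writer(buffer, lineterminator="\n").writerows(grid)
    return buffer.getvalue()


# Assumed borderless 2 x 2 table; y increases downward.
blocks = [
    ("A", 0, 0, 10, 10),
    ("B", 20, 0, 30, 10),
    ("C", 0, 20, 10, 30),
    ("D", 22, 20, 35, 30),
]
grid = blocks_to_grid(blocks)
csv_text = grid_to_csv(grid)
```

No border lines appear anywhere in the input, only text block coordinates, which is the point the abstract makes. Interval merging is the simplest grouping rule that shows this and would not handle the split cases covered in Examples 2 and 3.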

Description

Technical field

[0001] The invention relates to the field of information conversion, in particular to a method for converting a table in a PDF document into an Excel table.

Background art

[0002] PDF is the abbreviation of Portable Document Format, an electronic document format. The format is independent of the operating system platform: it works the same on Windows, Unix, and Mac OS. A PDF file can encapsulate text, fonts, formatting, colors, and device- and resolution-independent graphics and images in a single file. It faithfully reproduces every character, color, and image of the original and ensures accurate color and accurate printing results on a printer. Files in this format can also contain electronic information such as hypertext links, sound, and moving images. The format supports very long files and offers a high degree of integration, security, and reliability. Therefo...

Application Information

Patent Type & Authority: Applications (China)
IPC: IPC(8): G06F17/22, G06F17/25, G06F40/189
Inventor: 原野
Owner: WONDERSHARE TECH CO LTD