Positioning method and device for tables in PDF document

A table positioning method, applied in the field of data processing, addresses the problem of poor table recognition accuracy. It achieves high accuracy and fast positioning speed, and meets the requirements of online real-time processing.

Active Publication Date: 2018-08-31
ABC FINTECH CO LTD
Cites: 9 | Cited by: 27

AI Technical Summary

Problems solved by technology

[0004] The purpose of the present invention is to improve the defects of poor form recognition a



Embodiment Construction

[0042] The following will clearly and completely describe the technical solutions in the embodiments of the present invention with reference to the accompanying drawings in the embodiments of the present invention. Obviously, the described embodiments are only some, not all, embodiments of the present invention. The components of the embodiments of the invention generally described and illustrated in the figures herein may be arranged and designed in a variety of different configurations. Accordingly, the following detailed description of the embodiments of the invention provided in the accompanying drawings is not intended to limit the scope of the claimed invention, but merely represents selected embodiments of the invention. Based on the embodiments of the present invention, all other embodiments obtained by those skilled in the art without making creative efforts belong to the protection scope of the present invention.

[0043] Referring to Figure 1, a method for locating tables in...

Abstract

The invention relates to a positioning method and device for tables in a PDF document. The method comprises the following steps: receiving a PDF document containing tables; extracting character information and straight line information from the vector flow information of the PDF document; and locating a table region in the PDF document according to the extracted character information and straight line information. Table region positioning is performed based on all straight lines and text blocks in a page. Compared with the prior art, this improves the accuracy of table region positioning and thereby provides a basis for accurate analysis of table information.
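As a rough illustration of locating a table region from straight lines and text blocks, the following Python sketch assumes the lines and text-block bounding boxes have already been extracted from a page. The excerpt does not give the details of the patented algorithm, so everything here (the `Line` type, `locate_table_regions`, the `tol` and `min_lines` parameters, and the connected-lines heuristic itself) is a hypothetical stand-in rather than the claimed method.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple

# Axis-aligned bounding box: (x0, y0, x1, y1). Hypothetical representation.
Box = Tuple[float, float, float, float]


@dataclass(frozen=True)
class Line:
    """A straight line segment taken from the page's drawing operations."""
    x0: float
    y0: float
    x1: float
    y1: float

    def box(self, tol: float) -> Box:
        # Pad the segment by `tol` so lines that nearly touch count as joined.
        return (min(self.x0, self.x1) - tol, min(self.y0, self.y1) - tol,
                max(self.x0, self.x1) + tol, max(self.y0, self.y1) + tol)


def overlaps(a: Box, b: Box) -> bool:
    return a[0] <= b[2] and b[0] <= a[2] and a[1] <= b[3] and b[1] <= a[3]


def union(a: Box, b: Box) -> Box:
    return (min(a[0], b[0]), min(a[1], b[1]), max(a[2], b[2]), max(a[3], b[3]))


def locate_table_regions(lines: List[Line], text_blocks: List[Box],
                         tol: float = 1.0, min_lines: int = 4) -> List[Box]:
    """Assumed heuristic: a table is a group of at least `min_lines`
    connected lines whose bounding box contains or touches some text."""
    boxes = [ln.box(tol) for ln in lines]
    parent = list(range(len(lines)))

    def find(i: int) -> int:
        # Union-find root lookup with path halving.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    # Join every pair of lines whose padded boxes touch.
    for i in range(len(lines)):
        for j in range(i + 1, len(lines)):
            if overlaps(boxes[i], boxes[j]):
                parent[find(i)] = find(j)

    groups: Dict[int, List[int]] = {}
    for i in range(len(lines)):
        groups.setdefault(find(i), []).append(i)

    regions: List[Box] = []
    for members in groups.values():
        if len(members) < min_lines:
            continue  # e.g. an isolated underline or page rule
        region = boxes[members[0]]
        for i in members[1:]:
            region = union(region, boxes[i])
        if any(overlaps(region, tb) for tb in text_blocks):
            regions.append(region)
    return sorted(regions)
```

Calling `locate_table_regions` with the six lines of a 2x2 grid plus one stray underline would report only the grid, since the underline forms a group of a single line. The pairwise comparison is quadratic in the number of lines, which is acceptable for a sketch but would likely need a spatial index on dense pages.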

Description

Technical field

[0001] The invention relates to the technical field of data processing, and in particular to a method and device for locating tables in PDF documents.

Background

[0002] PDF documents are based on the imaging model of the PostScript language. On any printer, a PDF can faithfully reproduce every character, color and image of the original. Because PDF is independent of the operating system platform, it has become a widely used and well-suited document format for electronic document distribution and digital information dissemination.

[0003] Although a PDF document displays its layout accurately, structural information such as the position of a table is not effectively recorded or stored, which makes it difficult to recover the table information in the PDF. The Chinese patent application with publication number CN105589841A provides a PDF document table recognition method, which uses the title feature and table line...


Application Information

IPC(8): G06F17/24
CPC: G06F40/183, G06F40/177, G06F40/103
Inventor: 余宙, 杨永智, 汪贤
Owner ABC FINTECH CO LTD