Method and device for acquiring PDF (Portable Document Format) document annotations

A method and a device for acquiring PDF document annotations, applied in the field of PDF annotation acquisition. The technology addresses the inconvenience of using PDF document annotations directly and makes the annotations convenient to extract and to process afterwards.

Inactive Publication Date: 2016-03-02
PEKING UNIV FOUNDER GRP CO LTD +2
3 Cites, 5 Cited by

AI Technical Summary

Problems solved by technology

[0005] The purpose of the present invention is to provide a method and device for acquiring PDF document annotations that solve the prior-art problem that PDF document annotations are inconvenient to use directly.



Embodiment Construction

[0057] In order to make the object, technical solution and advantages of the present invention clearer, the present invention will be described in detail below with reference to the accompanying drawings and specific embodiments.

[0058] As shown in Figure 1, the method for acquiring PDF document annotations according to the present invention comprises the following steps:

[0059] Step 11, parsing the structure of the PDF document to obtain the cross-reference table of the PDF document;

[0060] Step 12, searching the cross-reference table to obtain the Trailer dictionary at the end of the file;

[0061] Step 13, parsing the Trailer dictionary at the end of the file to obtain the Catalog dictionary referenced by the key Root;

[0062] Step 14, searching the Catalog dictionary to obtain the page dictionary of the PDF document, wherein the page dictionary includes the pages of the PDF document;

[0063] Step 15, searching the page dictionary of the PDF document to obtain the annota...
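The chain of lookups in Steps 11 to 15 can be illustrated with a minimal Python sketch. It is not the patented implementation: it assumes a classic, uncompressed cross-reference table, a trailer without nested dictionaries, and annotation text stored as a simple literal string under /Contents. All function names and the sample document built by `build_sample_pdf` are hypothetical, and the reading of Step 15 (collecting each page's annotations) follows the Abstract, since the step text itself is truncated.

```python
import re

def build_sample_pdf():
    """Assemble a tiny one-page PDF with one text annotation (illustrative data only)."""
    bodies = [
        b"<< /Type /Catalog /Pages 2 0 R >>",
        b"<< /Type /Pages /Kids [3 0 R] /Count 1 >>",
        b"<< /Type /Page /Parent 2 0 R /MediaBox [0 0 200 200] /Annots [4 0 R] >>",
        b"<< /Type /Annot /Subtype /Text /Rect [10 10 30 30] /Contents (Check this claim) >>",
    ]
    out, offsets = b"%PDF-1.4\n", []
    for num, body in enumerate(bodies, start=1):
        offsets.append(len(out))                      # byte offset of "num 0 obj"
        out += b"%d 0 obj\n%s\nendobj\n" % (num, body)
    xref_pos = len(out)
    out += b"xref\n0 %d\n0000000000 65535 f \n" % (len(bodies) + 1)
    for off in offsets:
        out += b"%010d 00000 n \n" % off              # fixed 20-byte xref entries
    out += b"trailer\n<< /Size %d /Root 1 0 R >>\n" % (len(bodies) + 1)
    out += b"startxref\n%d\n%%%%EOF\n" % xref_pos
    return out

def parse_xref(data):
    """Step 11: find the cross-reference table via the last startxref and read its entries."""
    start = int(data[data.rfind(b"startxref"):].split()[1])
    m = re.compile(rb"xref\s+(\d+)\s+(\d+)\s+").match(data, start)
    first, count, pos = int(m.group(1)), int(m.group(2)), m.end()
    xref = {}
    for i in range(count):
        entry = data[pos + 20 * i : pos + 20 * i + 20]
        if entry[17:18] == b"n":                      # keep in-use entries only
            xref[first + i] = int(entry[:10])
    return xref, pos + 20 * count

def parse_trailer(data, pos):
    """Step 12: read the trailer dictionary that follows the table (no nested dicts assumed)."""
    return re.compile(rb"\s*trailer\s*<<(.*?)>>", re.S).match(data, pos).group(1)

def read_object(data, xref, num):
    """Return the raw body of indirect object `num`, located through its xref offset."""
    pattern = re.compile(rb"\d+\s+\d+\s+obj\s*(.*?)\s*endobj", re.S)
    return pattern.match(data, xref[num]).group(1)

def refs(body, key):
    """Object numbers referenced under `key`, given as a single reference or an array."""
    m = re.search(rb"/" + key + rb"\s*(\[[^\]]*\]|\d+\s+\d+\s+R)", body)
    return [int(n) for n in re.findall(rb"(\d+)\s+\d+\s+R", m.group(1))] if m else []

def iter_pages(data, xref, num):
    """Step 14: walk the page tree and yield the body of every leaf page object."""
    body = read_object(data, xref, num)
    if re.search(rb"/Type\s*/Pages\b", body):
        for kid in refs(body, b"Kids"):
            yield from iter_pages(data, xref, kid)
    else:
        yield body

def extract_annotations(data):
    """Steps 11 to 15 chained: xref -> trailer -> Catalog -> pages -> annotations."""
    xref, after_table = parse_xref(data)
    trailer = parse_trailer(data, after_table)
    catalog = read_object(data, xref, refs(trailer, b"Root")[0])   # Step 13
    notes = []
    pages = iter_pages(data, xref, refs(catalog, b"Pages")[0])
    for page_no, page in enumerate(pages, start=1):
        for annot in refs(page, b"Annots"):                        # Step 15
            found = re.search(rb"/Contents\s*\((.*?)\)", read_object(data, xref, annot), re.S)
            if found:
                notes.append((page_no, found.group(1).decode("latin-1")))
    return notes

if __name__ == "__main__":
    print(extract_annotations(build_sample_pdf()))
```

Real-world files often use cross-reference streams, object streams, incremental updates or encryption, none of which this sketch handles; a production extractor would more likely rely on an established PDF library than on regular expressions.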


Abstract

The invention provides a method and device for acquiring PDF (Portable Document Format) document annotations. The method comprises the following steps: parsing the structure of a PDF document to obtain the cross-reference table of the PDF document; searching the cross-reference table to obtain the end-of-file Trailer dictionary; parsing the end-of-file Trailer dictionary to obtain the Catalog dictionary referenced by the key Root; searching the Catalog dictionary to obtain the page dictionary of the PDF document, wherein the page dictionary comprises the pages of the PDF document; and searching the page dictionary of the PDF document to obtain the page annotations of the PDF document. The scheme of the invention can extract the annotations in a PDF document conveniently, accurately and efficiently, and makes it easier for the user to process the extracted annotations afterwards.

Description

Technical field

[0001] The present invention relates to the field of information extraction, and in particular to a method and device for acquiring annotations of PDF documents.

Background technique

[0002] Annotations are added with PDF reading tools while people read PDFs; they are usually a reader's comments on particular content of the PDF document. These annotations are of great significance for future reuse. For the same PDF with the same content, different users may give different annotations.

[0003] The PDF format has distinctive technical characteristics: it offers superior cross-platform support; it can integrate the publishing and distribution of multiple media, combining electronic information such as hypertext links, audio and dynamic images; and it provides support for releasing information over networks. Among them, in terms of the credibility and reliability of PDF, the maintenance of information integrity and consistency, and the maintenance...


Application Information

Patent Type & Authority: Applications (China)
IPC(8): G06F17/30
Inventor: 刘利川
Owner: PEKING UNIV FOUNDER GRP CO LTD