Atlas data reduction method based on PDF file analysis

A file analysis and spectrum (atlas) technology, applied to text database querying, unstructured text data retrieval, and special data processing applications. It addresses the problems of incomplete report data and poor analysis of spectra, and it facilitates automatic analysis, unified management, and quick analysis of results.

Pending Publication Date: 2021-05-28
刘羽

AI Technical Summary

Problems solved by technology

As far as we know, the analysis of PDF files is usually only for the character data in the file according to the rules, and t


Examples

Embodiment 1

[0036] For the target PDF page, see Figure 2. The PDF spectrum (atlas) of this embodiment contains a coordinate axis frame 2 drawn by an LTCurve object and an integral line 5 drawn by an LTLine object; see Figure 3.

[0037] 1. Use software to parse the PDF and analyze the path objects (Path Object) in the file that are generated with the PDF page as the reference. Path objects of this type are defined as LTRect objects in Pdfminer. Calculate the maximum values of x1-x0 and y1-y0 from the properties of these objects, analyze the position information of the LTRect object that qualifies, and obtain the spectrum range 1.
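The rectangle selection in step 1 can be sketched as follows. This is a minimal illustration, not the patent's implementation: `Box` is a hypothetical stand-in for the `x0`, `y0`, `x1`, `y1` attributes that Pdfminer layout objects expose, and the selection rule (largest width, with height as the tie-breaker) is one possible reading of "the maximum value of x1-x0 and y1-y0".

```python
from typing import Iterable, NamedTuple


class Box(NamedTuple):
    """Hypothetical stand-in for the x0, y0, x1, y1 attributes of an LTRect."""
    x0: float
    y0: float
    x1: float
    y1: float

    @property
    def width(self) -> float:
        return self.x1 - self.x0

    @property
    def height(self) -> float:
        return self.y1 - self.y0


def spectrum_range(rects: Iterable[Box]) -> Box:
    """Pick the rectangle assumed to bound the spectrum.

    Assumption: the plot area is the rectangle with the largest width,
    using height to break ties between equally wide rectangles.
    """
    candidates = list(rects)
    if not candidates:
        raise ValueError("no rectangles found on the page")
    return max(candidates, key=lambda r: (r.width, r.height))
```

With real Pdfminer output, the same function could be fed boxes built from each LTRect's coordinates, for example `Box(r.x0, r.y0, r.x1, r.y1)`.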

[0038] 2. Use software to parse the PDF and analyze the path objects (Path Object) in the file that display the spectrum, generated with the PDF page as the reference. Path objects of this type are defined as LTCurve objects in Pdfminer. Identify the LTCurve objects and distinguish between axis frame 2 and spectrum curve 4; see Figure 5. Analyze the path ...
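The text here is truncated before it states how the axis frame and the spectrum curve are told apart, so the sketch below uses a purely illustrative assumption: a frame is drawn with only a handful of points, while a spectrum curve has many. The function name, the `max_frame_points` threshold, and the use of plain point lists (in place of the `pts` attribute of an LTCurve) are all hypothetical.

```python
from typing import List, Sequence, Tuple

Point = Tuple[float, float]


def split_frame_and_curves(
    paths: Sequence[Sequence[Point]],
    max_frame_points: int = 5,
) -> Tuple[List[List[Point]], List[List[Point]]]:
    """Separate short paths (assumed axis frames) from long ones (assumed curves).

    Assumption: a path with at most `max_frame_points` points is an axis
    frame; anything longer is treated as spectrum data.
    """
    frames: List[List[Point]] = []
    curves: List[List[Point]] = []
    for pts in paths:
        target = frames if len(pts) <= max_frame_points else curves
        target.append(list(pts))
    return frames, curves
```

One caveat when applying this to real Pdfminer output: LTLine and LTRect are subclasses of LTCurve, so an `isinstance(obj, LTCurve)` filter also matches lines and rectangles unless they are excluded explicitly.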

Embodiment 2

[0048] The analyzed spectrum is the same as in Embodiment 1, and the implementation approach is similar. The difference is that the specific points selected for calculation are specific points 13 on the ordinate axis and specific points 11 on the abscissa axis with identifiable scale marks (see Figure 3), instead of finding the absolute coordinates of a specific point by reading the data summary table.

Embodiment 3

[0050] The analyzed spectrum is the same as in Embodiment 1, and the implementation approach is similar. The difference is that one of the specific points selected for calculation is the starting point of the spectrum, specific point 12. In spectra similar to the one in this embodiment, the starting point is usually the origin by default, so its relative coordinate is the first coordinate in the relative coordinate data of the spectrum, and its absolute coordinates are (0, 0). The other specific point is specific point 11 on the abscissa axis with identifiable scale marks (see Figure 3), instead of finding the absolute coordinates of a specific point by reading the data summary table.
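The embodiments above rely on specific points whose relative (page) and absolute (data) coordinates are both known. A minimal sketch of how a correction coefficient for one axis might be derived from two such points is given below. It assumes a linear relationship between relative and absolute coordinates; the function names and the offset term are illustrative and are not taken from the patent text.

```python
from typing import Iterable, List, Tuple


def axis_correction(
    rel_a: float, abs_a: float, rel_b: float, abs_b: float
) -> Tuple[float, float]:
    """Return (coefficient, offset) so that absolute = coefficient * relative + offset.

    Assumption: the axis is linear, and the two reference points have
    different relative coordinates on this axis.
    """
    if rel_a == rel_b:
        raise ValueError("reference points must differ along this axis")
    coefficient = (abs_b - abs_a) / (rel_b - rel_a)
    offset = abs_a - coefficient * rel_a
    return coefficient, offset


def to_absolute(
    rel_values: Iterable[float], coefficient: float, offset: float
) -> List[float]:
    """Convert relative coordinates on one axis to absolute values."""
    return [coefficient * v + offset for v in rel_values]
```

For the case in Embodiment 3, the starting point would supply the pair with absolute value 0 and the scale mark on the abscissa axis would supply the second pair, for example `axis_correction(100.0, 0.0, 300.0, 10.0)` with made-up page coordinates. The ordinate axis would need its own two reference values, since both of these points may lie at the same height on the page.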

Abstract

The invention discloses an atlas data reduction method based on PDF (Portable Document Format) file analysis. The method comprises the following steps: obtaining an atlas position range by analyzing a file; identifying and classifying data with different functions and relative coordinates according to position attributes of various related objects in the atlas; obtaining relative coordinates and absolute coordinates of a specific point in the atlas through a mutual relation between the data, and further obtaining a horizontal coordinate correction coefficient and a vertical coordinate correction coefficient corresponding to the relative coordinates and the absolute coordinates; and converting the obtained relative coordinate data to obtain absolute coordinate data for constructing the atlas, thereby realizing the reduction of the PDF atlas data. Herein, the map content in the PDF format is converted into data which reflects map characteristics, has a numerical value close to that of original data and can be operated and retrieved, so that the use of the map data is not limited by an original special system, a workstation and a working program, the convenience of exchange, query and comparison of the map data is improved, and the unified management of the data is facilitated.

Description

Technical field

[0001] The invention relates to a spectrum (atlas) data restoration method based on PDF file analysis, and belongs to the field of file data analysis.

Background technique

[0002] As an important means of scientific research, spectra play a major role in analytical experiments. A spectrum usually appears as a scatter diagram with an ordinate and an abscissa, usually shows continuous change, and has a characteristic correlation between the ordinate and the abscissa. Examples include the liquid chromatography spectrum (the correspondence between the absorption value of the eluted substance and the elution time), the scanning spectrum of an ultraviolet spectrophotometer (the correspondence between the sample absorbance value and the stepwise-changing wavelength), and X-ray diffraction of a crystal (the correspondence between the stepwise-changing diffraction angle 2θ and the intensity value I).

[0003] This characteristic correlation directly or i...

Application Information

IPC(8): G06K9/00, G06F16/33
CPC: G06F16/3331, G06V30/40
Inventor 刘羽王贺王辉李姜晖刘永付俐
Owner 刘羽