
Document comparison analysis method and system based on table structure analysis

A table structure analysis method applied in the field of data processing. It addresses problems such as limited applicability, high resource usage, and poor comparison quality, and broadens the scope of application.

Active Publication Date: 2022-02-08
杭州实在智能科技有限公司
6 Cites, 3 Cited by

AI Technical Summary

Benefits of technology

According to the abstract, this technology compares documents at the level of content and semantics, including the structure and semantics of tables. The stated benefits are good comparison quality, low resource usage, and accurate character recognition.

Problems solved by technology

The technical problem addressed by this patent is quality control (QC) in document verification, where variations arise over time from factors such as format type, file organization, processing techniques, and human subjectivity. Current solutions rely on manual checks every few months, which lack precision because they work well only within specific boundaries. In addition, conventional approaches use complex algorithms based on pixel template matching, which incur high computational cost.


Examples


Embodiment 1

[0068] A document comparison and analysis method based on table structure analysis, comprising the following steps:

[0069] S1, receiving source files of various types and converting them into PDF files;

[0070] S2, for the different types of content in the PDF file, using different tools to extract, segment, and recognize that content, obtaining table data and non-table data that carry text content, coordinate information, and table structure;

[0071] S3, comparing the table data and the non-table data separately, and finally obtaining the text differences outside the tables and the table differences.
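Step S3 above can be illustrated with a minimal sketch. It is not the patented implementation: the `Block` structure, the function names, the line-level use of Python's `difflib`, and the pairing of tables by their order in the document are all assumptions made for illustration, since the source does not specify how differences are computed.

```python
from dataclasses import dataclass, field
from difflib import SequenceMatcher


@dataclass
class Block:
    """One extracted content part from step S2 (hypothetical structure)."""
    kind: str                    # "table" or "text"
    text: str = ""               # text content of a non-table block
    bbox: tuple = (0, 0, 0, 0)   # coordinate information (x0, y0, x1, y1)
    cells: list = field(default_factory=list)  # rows of cell strings for a table


def diff_text(old_lines, new_lines):
    """Return (tag, old_chunk, new_chunk) for every changed run of lines."""
    sm = SequenceMatcher(a=old_lines, b=new_lines, autojunk=False)
    return [(tag, old_lines[i1:i2], new_lines[j1:j2])
            for tag, i1, i2, j1, j2 in sm.get_opcodes() if tag != "equal"]


def diff_table(old_cells, new_cells):
    """Compare two tables cell by cell; report (row, col, old, new) changes."""
    changes = []
    for r in range(max(len(old_cells), len(new_cells))):
        old_row = old_cells[r] if r < len(old_cells) else []
        new_row = new_cells[r] if r < len(new_cells) else []
        for c in range(max(len(old_row), len(new_row))):
            old = old_row[c] if c < len(old_row) else None
            new = new_row[c] if c < len(new_row) else None
            if old != new:
                changes.append((r, c, old, new))
    return changes


def compare(old_blocks, new_blocks):
    """Step S3: compare table data and non-table data separately."""
    old_text = [b.text for b in old_blocks if b.kind == "text"]
    new_text = [b.text for b in new_blocks if b.kind == "text"]
    old_tables = [b.cells for b in old_blocks if b.kind == "table"]
    new_tables = [b.cells for b in new_blocks if b.kind == "table"]
    # Assumption: tables are paired by their order of appearance.
    table_diffs = [diff_table(o, n) for o, n in zip(old_tables, new_tables)]
    return {"text": diff_text(old_text, new_text), "tables": table_diffs}
```

Keeping the two comparisons separate means a changed cell is reported by its row and column rather than as a changed line of flattened text, which is the distinction the method draws between table differences and out-of-table text differences.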

[0072] Different types of files are uniformly converted to PDF because the PDF format keeps the document layout stable: whether the file is moved across system platforms or printed, its structure is not scrambled. Likewise, both images and commonly used WORD (the Microsoft Office word processor) documents can be converted to PDF, and the unified f...


Abstract

The invention belongs to the technical field of data processing and relates in particular to a document comparison analysis method and system based on table structure analysis. The method comprises the following steps: S1, receiving source files of various types and uniformly converting them into PDF files; S2, for the different types of content in the PDF file, extracting, segmenting, and recognizing with different tools to obtain table data and non-table data that carry text content, coordinate information, and table structure; and S3, comparing the table data and the non-table data separately to obtain the out-of-table text differences and the table differences. The system comprises a file conversion module, a file identification module, and a data comparison module. The method focuses on comparing document content at the semantic level, can compare both the structure and the semantics of tables, and offers good comparison results, low resource usage, and accurate character recognition.
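The three-module system named in the abstract could be wired together roughly as follows. This is a sketch under stated assumptions: the class name `DocumentComparisonSystem`, the method `run`, and the choice to model each module as a plain callable are hypothetical, and the source does not describe the modules' interfaces.

```python
from typing import Any, Callable


class DocumentComparisonSystem:
    """Chains the three modules named in the abstract (hypothetical interfaces)."""

    def __init__(
        self,
        convert: Callable[[str], str],       # file conversion module: source path -> PDF path
        identify: Callable[[str], Any],      # file identification module: PDF path -> extracted data
        compare: Callable[[Any, Any], Any],  # data comparison module: two extractions -> differences
    ) -> None:
        self.convert = convert
        self.identify = identify
        self.compare = compare

    def run(self, old_source: str, new_source: str) -> Any:
        # S1: convert both source files to PDF.
        old_pdf = self.convert(old_source)
        new_pdf = self.convert(new_source)
        # S2: extract table and non-table data from each PDF.
        old_data = self.identify(old_pdf)
        new_data = self.identify(new_pdf)
        # S3: compare the two extractions.
        return self.compare(old_data, new_data)
```

Passing the modules in as callables keeps each stage replaceable, so a different converter or recognizer can be substituted without touching the comparison logic.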


Application Information

Owner 杭州实在智能科技有限公司