A method for identifying contract elements of PDF files

An identification method for contract elements, applied in the field of information technology, that addresses the low efficiency, long processing time, and heavy resource consumption of manual marking

Active Publication Date: 2021-02-12
TIANGU INFORMATION SCI TECH HANGZHOU
Cites: 4; Cited by: 0

AI Technical Summary

Problems solved by technology

At present, each item can only be marked manually, which is inefficient, slow, and resource-intensive.
A variety of solutions for automatically identifying contract content already exist, but they mostly apply to Word files, and there is currently no method for automatically identifying contract elements in PDF files.

Examples

Embodiment Construction

[0068] The technical solutions in the embodiments of the present invention will be clearly and completely described below in conjunction with the accompanying drawings. Obviously, the described embodiments are only some of the embodiments of the present invention, rather than all of them. All other embodiments obtained by those of ordinary skill in the art on the basis of these embodiments, without creative work, shall fall within the protection scope of the present invention.

[0069] It should be noted that the embodiments of the present invention and the features in the embodiments can be combined with each other where there is no conflict.

[0070] The present invention will be further described below with reference to the drawings and specific embodiments, which do not limit the present invention.

[0071] The present invention includes a method for identifying contract elements of PDF files, as shown in Figure 1 ...

Abstract

The present invention provides a method for identifying contract elements of a PDF file, comprising: reading the text blocks of the PDF file according to a preset reading method and storing key information for each text block, the key information including page number, text content, and coordinates; obtaining the text blocks that belong to the same line according to the coordinates of the text blocks on the same page, and dividing the text blocks of the same line and of two adjacent lines into sentences; classifying each sentence according to the characteristics of clauses and titles so as to identify the corresponding clauses and titles, and forming the contract content from the identified sentences; and matching the contract content with at least one contract template and identifying the contract content according to the matched contract template, so as to identify the contract elements. The beneficial effects of the present invention are that natural sentences are formed from scattered and complex PDF text blocks, and that the contract content is identified according to the matched contract template, thereby improving the accuracy of identifying contract elements.
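The steps in the abstract can be illustrated with a minimal Python sketch. It is not the patented implementation: the `TextBlock` fields, the `y_tolerance` threshold, the `TITLE_RE` and `CLAUSE_RE` patterns, the `TEMPLATES` dictionary, and the overlap score in `best_template` are all assumptions made for illustration, and the text blocks are assumed to have already been read from the PDF by some extraction library.

```python
import re
from dataclasses import dataclass

@dataclass
class TextBlock:
    page: int    # page number
    text: str    # text content
    x: float     # left edge of the block
    y: float     # top edge of the block (assumed to grow downward)

# Hypothetical features: numbered headings are titles, dotted numbers are clauses.
TITLE_RE = re.compile(r"^(Article|Chapter)\s+\d+")
CLAUSE_RE = re.compile(r"^\d+(\.\d+)+\s")
SENTENCE_END = (".", ";", ":", "。", "；", "：")

def group_into_lines(blocks, y_tolerance=2.0, sep=" "):
    """Group blocks on the same page whose y coordinates are within y_tolerance."""
    lines = []
    for block in sorted(blocks, key=lambda b: (b.page, b.y, b.x)):
        last = lines[-1] if lines else None
        if last and last[0].page == block.page and abs(last[0].y - block.y) <= y_tolerance:
            last.append(block)
        else:
            lines.append([block])
    # Within each line, order blocks left to right and join their text.
    return [sep.join(b.text for b in sorted(line, key=lambda b: b.x)) for line in lines]

def merge_into_sentences(lines):
    """Join adjacent lines until a title line or sentence-ending punctuation is seen."""
    sentences, buffer = [], ""
    for line in (raw.strip() for raw in lines):
        if not line:
            continue
        is_title = bool(TITLE_RE.match(line))
        # A new title or clause number closes any unfinished sentence.
        if buffer and (is_title or CLAUSE_RE.match(line)):
            sentences.append(buffer)
            buffer = ""
        buffer = f"{buffer} {line}".strip()
        if is_title or buffer.endswith(SENTENCE_END):
            sentences.append(buffer)
            buffer = ""
    if buffer:
        sentences.append(buffer)
    return sentences

def classify(sentence):
    """Label a sentence as a title, a clause, or plain text."""
    if TITLE_RE.match(sentence):
        return "title"
    if CLAUSE_RE.match(sentence):
        return "clause"
    return "text"

def best_template(titles, templates):
    """Pick the template whose expected headings appear most often in the titles."""
    found = [t.lower() for t in titles]
    def score(expected):
        hits = sum(any(h.lower() in f for f in found) for h in expected)
        return hits / max(len(expected), 1)
    return max(templates, key=lambda name: score(templates[name]))

# Hypothetical templates: template name -> headings expected in that contract type.
TEMPLATES = {"sales": ["Payment", "Delivery"], "lease": ["Rent", "Deposit"]}

blocks = [
    TextBlock(1, "Payment", 130, 100.5),
    TextBlock(1, "Article 1", 72, 100),
    TextBlock(1, "1.1 The buyer shall pay", 72, 120),
    TextBlock(1, "within 30 days.", 72, 134),
]
lines = group_into_lines(blocks)
sentences = merge_into_sentences(lines)
labels = [classify(s) for s in sentences]
titles = [s for s, label in zip(sentences, labels) if label == "title"]
print(list(zip(labels, sentences)), best_template(titles, TEMPLATES))
```

Sorting by page, then `y`, then `x` before grouping means a fragment that was extracted out of order (here "Payment" before "Article 1") still ends up in its correct position on the line. A real contract would need title and clause patterns suited to its language and numbering style.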

Description

Technical field

[0001] The present invention relates to the field of information technology, in particular to a method for identifying contract elements of PDF files.

Background technique

[0002] Many contracts are formatted chaotically and have no hierarchical structure: the content reads as one continuous body of text, with no structured presentation of the data. The business needs to break the contract down to identify titles at different levels, contract statement content, and contract terms. Currently, each item can only be marked manually, which is inefficient, takes a long time, and consumes a lot of resources.

[0003] A variety of schemes for automatically identifying contract content are currently available, but they mostly apply to Word files, and there is currently no method for automatically identifying contract elements in PDF files.

Summary of the invention

[0004] In view of the above-mentioned problems in the p...

Application Information

Patent Type & Authority: Patents (China)
IPC(8): G06F16/33, G06F40/295, G06F40/205
CPC: G06F16/3344, G06F40/205, G06F40/295
Inventor: 石伟坚, 金宏洲, 程亮
Owner: TIANGU INFORMATION SCI TECH HANGZHOU