A fault-tolerant information extraction method for contract documents based on graph attention network

An information extraction technology based on graph attention, applied in unstructured text data retrieval, text database clustering/classification, instruments, etc. It addresses errors in recognition results and achieves high accuracy, a wide application range, and high recognition efficiency.

Active Publication Date: 2022-05-24
四川国路安数据技术有限公司
20 Cites 0 Cited by
AI Technical Summary

Problems solved by technology

[0005] The purpose of the present invention is to provide a fault-tolerant information extraction method for contract documents based on a graph attention network, so as to solve the problem in the prior art that, when the text printed on the original standard image is misaligned, the information extraction algorithm is affected by the misalignment and produces errors in the recognition results. The method extracts misaligned information from standard images with better accuracy.

Examples


Embodiment Construction

[0071] In the following, only certain exemplary embodiments are briefly described. As those skilled in the art would realize, the described embodiments may be modified in various different ways, all without departing from the spirit or scope of the embodiments of the present application. Accordingly, the drawings and description are to be regarded as illustrative in nature and not restrictive.

[0072] The embodiments of the present invention will be described in detail below with reference to the accompanying drawings.

[0073] The embodiments of the present application provide a fault-tolerant method for extracting information from contract documents based on a graph attention network, which is used to extract effective information from standard image data formed from contract documents.

[0074] As shown in Figure 1, an embodiment of the fault-tolerant method for extracting information from contract documents based on a graph attention network provided by this application i...

Abstract

The invention provides a fault-tolerant method for extracting information from contract documents based on a graph attention network, and relates to the technical field of computer and information processing. The method first performs character recognition on the contract with an OCR engine to obtain the text content and the corresponding position coordinates. It then extracts text information features, including the position vector of each piece of text and the word embedding representation of each text string. The features extracted from the contract document are used as graph node features to construct a fault-tolerant contract text relationship graph. Next, the number of layers and the activation function of the graph attention network are set, and the training set is input into the constructed graph attention network for training until the loss function converges. Finally, the contract to be recognized is modeled as a text relationship graph and input into the trained graph attention network to obtain the category of each piece of text information. The invention extracts misaligned information from contract documents, achieves higher recognition efficiency and accuracy than existing post-OCR information extraction technology, and supports intelligent office work.
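The graph-building and attention steps described in the abstract can be illustrated with a minimal sketch. It is not the patented implementation: the edge rule (each text box linked to its `k` nearest boxes plus itself), the toy node features, the random weights, and the names `knn_edges` and `gat_layer` are all assumptions made for illustration. The attention computation follows the standard single-head graph attention formulation (a LeakyReLU score on concatenated projected features, normalized with a softmax over each node's neighbors).

```python
import math
import random

def knn_edges(centers, k=2):
    """Link each text box to itself and its k nearest boxes (assumed edge rule)."""
    neighbors = []
    for i, (xi, yi) in enumerate(centers):
        dists = sorted(
            (math.hypot(xi - xj, yi - yj), j)
            for j, (xj, yj) in enumerate(centers) if j != i
        )
        neighbors.append([i] + [j for _, j in dists[:k]])
    return neighbors

def matvec(W, x):
    """Multiply matrix W (list of rows) by vector x."""
    return [sum(w * v for w, v in zip(row, x)) for row in W]

def leaky_relu(x, slope=0.2):
    return x if x > 0 else slope * x

def gat_layer(features, neighbors, W, a):
    """Single-head graph attention layer.

    Returns the updated node features and the attention weights per node.
    """
    z = [matvec(W, h) for h in features]  # project every node feature
    out, attn = [], []
    for i, nbrs in enumerate(neighbors):
        # Unnormalized score for each neighbor j: LeakyReLU(a . [z_i || z_j])
        scores = [
            leaky_relu(sum(c * v for c, v in zip(a, z[i] + z[j])))
            for j in nbrs
        ]
        # Numerically stable softmax over the neighborhood
        m = max(scores)
        exps = [math.exp(s - m) for s in scores]
        total = sum(exps)
        alpha = [e / total for e in exps]
        # Attention-weighted sum of the projected neighbor features
        out.append([
            sum(al * z[j][d] for al, j in zip(alpha, nbrs))
            for d in range(len(z[i]))
        ])
        attn.append(alpha)
    return out, attn

# Hypothetical OCR output: box centers plus a 2-dimensional toy "embedding".
centers = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (5.0, 5.0)]
embeddings = [[0.1, 0.9], [0.8, 0.2], [0.4, 0.4], [0.7, 0.7]]
# Node feature = position vector concatenated with the text embedding.
features = [list(c) + e for c, e in zip(centers, embeddings)]

random.seed(0)
W = [[random.uniform(-1, 1) for _ in range(4)] for _ in range(3)]  # 4 -> 3 dims
a = [random.uniform(-1, 1) for _ in range(6)]  # scores the pair [z_i || z_j]

neighbors = knn_edges(centers, k=2)
out, attn = gat_layer(features, neighbors, W, a)
print(neighbors)
print(attn)
```

In a full pipeline, several such layers would be stacked and followed by a classifier over the node outputs, with the weights learned by minimizing a loss on labeled contracts rather than drawn at random as here.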

Description

Technical field

[0001] The invention relates to the technical field of computer and information processing, and in particular to a fault-tolerant method for extracting information from contract documents based on a graph attention network.

Background

[0002] With the development of network and computer technology, intelligent computer algorithms have been widely used as business support technology in Internet finance, Internet government affairs, and other fields. Among them, optical character recognition (OCR), as a core technology, plays a pivotal role. Commercial banks, insurers, and other financial institutions often use OCR to automatically recognize receipts, invoices, or contract content, which spares staff cumbersome manual data entry and improves work efficiency and user experience. In the field of Internet + government affairs services, using OCR technology to identify key information of proof materials such as house purchase con...

Application Information

Patent Type & Authority: Patents (China)
IPC(8): G06V30/148, G06V10/774, G06F16/35
Inventor: 高菱范攀
Owner: 四川国路安数据技术有限公司