Method for extracting fault-tolerant information of contract document based on graph attention network

An information extraction and attention technology, applied in unstructured text data retrieval, text database clustering/classification, instruments, etc. It addresses problems such as incorrect recognition results and achieves high accuracy, a wide application range, and high recognition efficiency.

Active Publication Date: 2022-04-12
四川国路安数据技术有限公司

AI Technical Summary

Problems solved by technology

[0005] The purpose of the present invention is to provide a fault-tolerant information extraction method for contract documents based on a graph attention network, so as to solve the problem in the prior art that, when the printed text in the original standard image is misaligned, the information extraction algorithm is affected by the misalignment, resulting in errors in the recognition results. The method extracts misaligned information from standard images with better accuracy.

Examples

Embodiment Construction

[0071] In the following, only some exemplary embodiments are briefly described. As those skilled in the art would realize, the described embodiments may be modified in various different ways, all without departing from the spirit or scope of the embodiments of the present application. Accordingly, the drawings and descriptions are to be regarded as illustrative in nature and not restrictive.

[0072] Embodiments of the present invention will be described in detail below in conjunction with the accompanying drawings.

[0073] The embodiment of the present application provides a method for extracting fault-tolerant information of contract documents based on graph attention network, which is used to extract effective information from standard image data formed by contract documents.

[0074] As shown in Figure 1, an embodiment of the fault-tolerant contract document information extraction method based on a graph attention network provided by this application includes OCR pro...

Abstract

The invention provides a fault-tolerant information extraction method for contract documents based on a graph attention network, and relates to the technical field of computers and information processing. The method comprises the following steps: first, character recognition is performed on the contract by an OCR (optical character recognition) engine to obtain the text content and the corresponding position coordinates; text information features are then extracted, comprising position vectors of the text information and word-embedding representations of the text strings; the features extracted from the contract document are taken as graph node features, and a fault-tolerant contract text relation graph is constructed; the layer structure and activation function of the graph attention network are then set; the training set is input into the constructed graph attention network for training until the loss function converges; finally, a contract to be recognized is modeled as a text relation graph and input into the trained graph attention network to obtain the category of each piece of text information. The method extracts misaligned information from contract documents, achieves higher recognition efficiency and accuracy than existing post-OCR information extraction techniques, and supports intelligent office automation.
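The pipeline in the abstract (OCR boxes, node features, text relation graph, graph attention) can be illustrated with a minimal NumPy sketch. Everything specific here is an assumption made for illustration and is not taken from the patent: the helper names `build_text_graph` and `gat_layer`, the k-nearest-neighbour rule used to connect text boxes, the random vectors standing in for word embeddings, the page size, and the untrained random weights. The attention step follows the standard single-head graph attention formulation (LeakyReLU scores, softmax over neighbours), which may differ from the layer structure the invention actually sets.

```python
import numpy as np

def build_text_graph(boxes, k=2):
    """Hypothetical graph rule: link each OCR box to its k nearest boxes
    by center distance, add self-loops, and symmetrize."""
    boxes = np.asarray(boxes, dtype=float)   # rows: (x_min, y_min, x_max, y_max)
    centers = np.stack([(boxes[:, 0] + boxes[:, 2]) / 2,
                        (boxes[:, 1] + boxes[:, 3]) / 2], axis=1)
    dist = np.linalg.norm(centers[:, None, :] - centers[None, :, :], axis=-1)
    n = len(boxes)
    adj = np.eye(n, dtype=bool)              # self-loops
    for i in range(n):
        neighbors = [j for j in np.argsort(dist[i]) if j != i][:k]
        adj[i, neighbors] = True
    return adj | adj.T

def gat_layer(h, adj, W, a, slope=0.2):
    """Single-head graph attention: returns updated node features and
    the attention matrix alpha (rows sum to 1 over each node's neighbours)."""
    z = h @ W                                # project node features
    f_out = z.shape[1]
    # e_ij = LeakyReLU(a^T [z_i || z_j]), split into source and target parts
    e = (z @ a[:f_out])[:, None] + (z @ a[f_out:])[None, :]
    e = np.where(e > 0, e, slope * e)
    e = np.where(adj, e, -np.inf)            # mask non-neighbours
    e = e - e.max(axis=1, keepdims=True)     # numerically stable softmax
    alpha = np.exp(e)
    alpha = alpha / alpha.sum(axis=1, keepdims=True)
    return alpha @ z, alpha

# Four made-up OCR boxes on an assumed 600 x 800 page.
boxes = [(50, 40, 200, 60), (220, 42, 400, 62),
         (50, 300, 180, 320), (200, 310, 380, 330)]
rng = np.random.default_rng(0)
pos = np.asarray(boxes, dtype=float) / np.array([600, 800, 600, 800])
emb = rng.normal(size=(4, 8))                # stand-in for word embeddings
h = np.concatenate([pos, emb], axis=1)       # node features: position + text

adj = build_text_graph(boxes, k=1)
W = rng.normal(size=(h.shape[1], 3))         # untrained weights, 3 output classes
a = rng.normal(size=6)
out, alpha = gat_layer(h, adj, W, a)
pred = out.argmax(axis=1)                    # predicted category per text box
```

In a real system `W` and `a` would be learned by minimizing a classification loss over labeled contracts until it converges, as the abstract describes; the sketch only shows the forward pass.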

Description

Technical field

[0001] The invention relates to the technical field of computers and information processing, in particular to a method for extracting fault-tolerant information of a contract document based on a graph attention network.

Background

[0002] With the development of network and computer technology, intelligent computer algorithms have been widely used as supporting business technology in Internet finance, Internet government affairs and other fields. Among them, optical character recognition (OCR), as a core technology, plays a pivotal role: commercial banks, insurers and other financial institutions often use OCR to automatically recognize the contents of receipts, invoices or contracts, sparing staff cumbersome manual input, improving work efficiency and improving the user experience; in the field of Internet + government services, using OCR technology to identify key information of proof materials such as house purch...

Application Information

Patent Type & Authority: Applications (China)
IPC(8): G06V30/148, G06V10/774, G06F16/35
Inventor: 高菱, 范攀
Owner: 四川国路安数据技术有限公司