Table information extraction model training method and device

A table information extraction technology, applied in character and pattern recognition, instruments, electrical digital data processing, etc. It addresses the problems of heavy development workload, low table information extraction efficiency, and low algorithm reuse value, with the effects of reducing training time, reducing the number of training samples, and achieving high iteration efficiency.

Pending Publication Date: 2021-11-30
SHANGHAI CLOUDWALK HUILIN ARTIFICIAL INTELLIGENCE TECH CO LTD

AI Technical Summary

Problems solved by technology

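However, because rule algorithms generalize poorly, a corresponding rule algorithm must be redeveloped for each type of table and each extraction requirement, which makes the development workload heavy and the algorithm reuse value low, adapts poorly to different extraction tasks, and results in low table information extraction efficiency.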


Examples

Detailed Description of Embodiments

[0061] Exemplary embodiments of the present application are described below in conjunction with the accompanying drawings, which include various details of the embodiments of the present application to facilitate understanding, and they should be regarded as exemplary only. Accordingly, those of ordinary skill in the art will recognize that various changes and modifications of the embodiments described herein can be made without departing from the scope and spirit of the application. Also, descriptions of well-known functions and constructions are omitted in the following description for clarity and conciseness.

[0062] As shown in Figure 1, this embodiment provides a training method for a table information extraction model, including:

[0063] Step S110: Process the cells of the tabular corpus to obtain the feature vectors of the cells;

[0064] Step S120: Calculate the adjacency matrix according to the position information of the cell, perform feature extrac...
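As an illustration of Step S120, the following minimal Python sketch builds an adjacency matrix from cell positions and runs one round of neighbor averaging over the cell feature vectors. The excerpt does not state how adjacency is defined or which feature extractor is used, so everything here is an assumption: the `Cell` fields (`row`, `col`, `rowspan`, `colspan`), the rule that two cells are adjacent when they share a grid edge, the self-loops, and the mean aggregation that stands in for the actual feature extraction network.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Cell:
    """Hypothetical position record for one table cell (grid coordinates)."""
    row: int
    col: int
    rowspan: int = 1
    colspan: int = 1


def _overlap(a_start, a_len, b_start, b_len):
    # True when the half-open ranges [a_start, a_start + a_len) and
    # [b_start, b_start + b_len) share at least one grid index.
    return a_start < b_start + b_len and b_start < a_start + a_len


def are_adjacent(a, b):
    # Assumption: two cells are adjacent when they share a grid edge.
    side_by_side = (
        (a.col + a.colspan == b.col or b.col + b.colspan == a.col)
        and _overlap(a.row, a.rowspan, b.row, b.rowspan)
    )
    stacked = (
        (a.row + a.rowspan == b.row or b.row + b.rowspan == a.row)
        and _overlap(a.col, a.colspan, b.col, b.colspan)
    )
    return side_by_side or stacked


def adjacency_matrix(cells):
    # Symmetric 0/1 matrix; the diagonal is set to 1 (self-loops) so that
    # every cell keeps its own features during aggregation.
    n = len(cells)
    adj = [[0] * n for _ in range(n)]
    for i in range(n):
        adj[i][i] = 1
        for j in range(i + 1, n):
            if are_adjacent(cells[i], cells[j]):
                adj[i][j] = adj[j][i] = 1
    return adj


def aggregate(features, adj):
    # Stand-in for feature extraction: each cell's new vector is the mean
    # of its own vector and the vectors of its adjacent cells.
    out = []
    for i, row in enumerate(adj):
        neighbors = [features[j] for j, flag in enumerate(row) if flag]
        dim = len(features[i])
        out.append(
            [sum(vec[k] for vec in neighbors) / len(neighbors) for k in range(dim)]
        )
    return out


# A 2 x 2 table: cells are listed row by row.
grid = [Cell(0, 0), Cell(0, 1), Cell(1, 0), Cell(1, 1)]
adj = adjacency_matrix(grid)
mixed = aggregate([[1.0], [2.0], [3.0], [4.0]], adj)
```

In this sketch diagonal cells are not adjacent, and a merged header cell such as `Cell(0, 0, colspan=2)` is adjacent to every cell directly beneath it. A real implementation would likely replace `aggregate` with a trained graph network layer.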


Abstract

The invention discloses a training method for a table information extraction model, comprising the following steps: processing the cells of a table corpus to obtain feature vectors of the cells; calculating an adjacency matrix from the position information of the cells, and performing feature extraction on the cell feature vectors and the adjacency matrix to obtain high-order feature vectors of the cells; predicting the original text of each cell from its high-order feature vector, and training the model against the cell text to obtain a table language model; and training the table language model with the training samples corresponding to the current table information extraction task to obtain a table information extraction model. For each new table extraction task, only a small number of task-specific training samples are needed to obtain the corresponding table information extraction model on top of the already trained table language model. The table language model does not need to be retrained each time, so both the training time and the number of training samples are significantly reduced.
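The self-supervised step in the abstract, predicting a cell's original text and training against that text, needs no manual labels because the table itself supplies the targets. The sketch below shows one way such training examples could be assembled. It assumes a masking scheme in which the target cell's text is hidden from the input; the excerpt does not say whether masking is used, and the `[MASK]` token and the function name `make_pretraining_examples` are hypothetical.

```python
MASK_TOKEN = "[MASK]"  # hypothetical placeholder token


def make_pretraining_examples(cell_texts):
    """Build (masked_inputs, target_index, target_text) triples.

    Assumption: one example per cell, with that cell's text hidden so the
    model must recover it from the remaining cells of the table.
    """
    examples = []
    for index, text in enumerate(cell_texts):
        masked = list(cell_texts)  # copy so the source table is untouched
        masked[index] = MASK_TOKEN
        examples.append((masked, index, text))
    return examples


# Cell texts of a small table, listed row by row.
table = ["Name", "Age", "Alice", "30"]
examples = make_pretraining_examples(table)
```

Each triple pairs a partially hidden table with the text that was removed. Pretraining on many such triples would yield the table language model, which is then trained further on a small labeled set for each extraction task.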

Description

Technical field

[0001] This application relates to the technical field of artificial intelligence, and in particular to a training method and device for a table information extraction model.

Background

[0002] As information technology becomes more widespread, the demand for extracting information from tables has grown increasingly prominent. Currently, rule algorithms are mainly used to extract information from tables. However, because rule algorithms generalize poorly, a corresponding rule algorithm must be redeveloped for each type of table and each extraction requirement. This makes the development workload heavy and the algorithm reuse value low, and it also leads to poor adaptability to different extraction tasks and low table information extraction efficiency.

Summary of the invention

[0003] The present application provides a training method and device for a table information extraction model to solve one or more tec...


Application Information

IPC(8): G06K9/00, G06K9/46, G06K9/62, G06F40/126, G06F40/18
CPC: G06F40/126, G06F40/18, G06F18/253
Inventors: 李彦达, 郝东
Owner: SHANGHAI CLOUDWALK HUILIN ARTIFICIAL INTELLIGENCE TECH CO LTD