A method and device for analyzing line wrapping and page breaks in tables

This table analysis method, applied in the field of recognition, addresses the difficulty of judging whether a line break has or has not occurred.

Active Publication Date: 2019-03-12
上海犀语科技有限公司

AI Technical Summary

Problems solved by technology

For example, when a page break or line break occurs, it is difficult to tell a line break from a non-line break simply by the separator line or s...

Detailed Description of the Embodiments

[0026] As shown in Figure 1, the present invention provides a method for analyzing line wrapping and page breaks in tables, comprising the following steps:

[0027] Step 1. Identify unambiguous line-wrap and page-break cases using rules summarized from expert experience.

[0028] In step 1, an unambiguous line break is identified from cues such as the upper of two text segments containing a left bracket while the lower contains the right bracket, and the upper and lower segments together forming a complete date.
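The two cues in step 1 can be sketched as simple rules. This is a minimal illustration, not the patent's actual rule set: the function names, the bracket pairs, and the two date formats below are all assumptions chosen for the example.

```python
import re

# Assumed bracket pairs (ASCII and full-width); the source does not list them.
BRACKET_PAIRS = {"(": ")", "（": "）", "[": "]", "【": "】"}

# Assumed date formats for illustration: 2018年12月31日 or 2018-12-31.
DATE_PATTERN = re.compile(r"\d{4}年\d{1,2}月\d{1,2}日|\d{4}-\d{1,2}-\d{1,2}")


def unclosed_left_bracket(text):
    """Return the first left bracket that has no matching right bracket in text."""
    for left, right in BRACKET_PAIRS.items():
        if text.count(left) > text.count(right):
            return left
    return None


def is_clear_wrap(upper, lower):
    """Return True when the upper and lower segments clearly belong together."""
    upper, lower = upper.strip(), lower.strip()

    # Bracket cue: the upper segment opens a bracket that the lower one closes.
    left = unclosed_left_bracket(upper)
    if left is not None:
        right = BRACKET_PAIRS[left]
        if lower.count(right) > lower.count(left):
            return True

    # Date cue: neither segment is a full date, but their concatenation is.
    if (not DATE_PATTERN.fullmatch(upper)
            and not DATE_PATTERN.fullmatch(lower)
            and DATE_PATTERN.fullmatch(upper + lower)):
        return True

    return False
```

Pairs that match neither cue return False here, which corresponds to the cases left for the model-based steps that follow.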

[0029] Step 2. Use a deep learning model to obtain a labeled corpus.

[0030] In step 2, the labeled corpus that is obtained includes the semantic information of two adjacent rows and the associated cell information in the table.
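One way to picture a single entry of such a labeled corpus is a record holding the two adjacent rows, the column of the candidate cells, and the merge label. The class and field names below are hypothetical; the source does not describe the corpus schema.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Cell:
    row: int
    col: int
    text: str


@dataclass
class LabeledPair:
    """Hypothetical corpus record: two adjacent rows plus a merge label."""
    upper_row: List[Cell]
    lower_row: List[Cell]
    col: int          # column of the two candidate cells
    can_merge: bool   # label: do the two cells belong to one logical cell?

    def candidate_texts(self) -> Tuple[str, str]:
        """Text of the upper and lower candidate cells."""
        upper = next(c.text for c in self.upper_row if c.col == self.col)
        lower = next(c.text for c in self.lower_row if c.col == self.col)
        return upper, lower

    def row_context(self) -> Tuple[str, str]:
        """Flattened text of both rows, usable as context for a model."""
        return (" | ".join(c.text for c in self.upper_row),
                " | ".join(c.text for c in self.lower_row))


example = LabeledPair(
    upper_row=[Cell(3, 0, "Net cash flow from operating"), Cell(3, 1, "1,200")],
    lower_row=[Cell(4, 0, "activities"), Cell(4, 1, "")],
    col=0,
    can_merge=True,
)
```

The row context carries the neighbouring cells (here an empty amount cell in the lower row), which is the kind of associated cell information the paragraph above refers to.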

[0031] Step 3. Train a deep learning language model on the labeled corpus and use it to determine whether two adjacent cells can be merged.

[0032] Step 4. Verify the merged cell information to improve the accuracy of the judgment.
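Steps 3 and 4 can be sketched as a merge loop with two pluggable callbacks. Everything here is an assumption for illustration: `should_merge` stands in for the trained language model, and `brackets_balanced` is one possible verification check, since the source does not say how verification is done.

```python
from typing import Callable, List


def merge_wrapped_cells(cells: List[str],
                        should_merge: Callable[[str, str], bool],
                        verify: Callable[[str], bool]) -> List[str]:
    """Merge vertically adjacent cell texts when the predictor and the check agree."""
    merged: List[str] = []
    for text in cells:
        if merged and should_merge(merged[-1], text):
            candidate = merged[-1] + text
            # Verification step: keep the merge only if the result looks valid.
            if verify(candidate):
                merged[-1] = candidate
                continue
        merged.append(text)
    return merged


def toy_should_merge(upper: str, lower: str) -> bool:
    """Stand-in for the trained model: merge if upper has an unclosed '('."""
    return upper.count("(") > upper.count(")")


def brackets_balanced(text: str) -> bool:
    """Assumed verification rule: parentheses in the merged text must balance."""
    depth = 0
    for ch in text:
        if ch == "(":
            depth += 1
        elif ch == ")":
            depth -= 1
            if depth < 0:
                return False
    return depth == 0


result = merge_wrapped_cells(
    ["Revenue (in", " thousands)", "Cost"], toy_should_merge, brackets_balanced
)
```

Keeping the predictor and the verifier as separate callables means a real model can replace `toy_should_merge` without touching the merge loop.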

[0033] As shown in Figure ...

Abstract

The invention provides a method for analyzing line wrapping and page breaks in tables, comprising the following steps: identifying unambiguous line-wrap and page-break cases using rules summarized from expert experience; obtaining a labeled corpus using a deep learning model; and determining, based on the labeled corpus and a trained deep learning language model, whether two adjacent cells can be merged. The device for implementing the method comprises: a line-wrap and page-break judging module, which identifies unambiguous line-wrap and page-break cases using the rules summarized from expert experience; a labeled corpus acquisition module, which obtains the labeled corpus using the deep learning model; and a cell merge judging module, which determines whether two adjacent cells can be merged based on the labeled corpus and the trained deep learning language model. The invention uses a deep learning model to mine the semantic information contained in a table, and can accurately analyze whether two adjacent cells can be merged in line-wrap and page-break scenarios.

Description

Technical field

[0001] The invention relates to a recognition method, in particular to a method and device for analyzing line wrapping and page breaks in tables.

Background

[0002] In recent years, deep learning has been widely used in fields such as natural language processing, graphics and images, and autonomous driving, where its performance is significantly better than that of traditional methods.

[0003] In natural language processing, deep learning can capture deep grammatical and semantic information by encoding text in a high-dimensional space, which provides a technical basis for more advanced, semantics-driven applications in the field.

[0004] Text information processing involves a large number of tables in different styles, and current techniques for extracting tabular information still have many problems. For example, when there is a page break or li...

Application Information

IPC(8): G06K9/00, G06F16/332, G06F16/36
CPC: G06V30/412
Inventor: 李鹏辉, 竺晨曦, 邱锡鹏
Owner: 上海犀语科技有限公司