
Table recognition method, model training method and device, and equipment

A table recognition method and recognition model technique, applied in fields such as character and pattern recognition, instruments, and computing, that addresses problems such as easily missed content, table layout restrictions, and the inability to reliably recognize cross-page breaks in tables, thereby improving recognition accuracy.

Pending Publication Date: 2021-10-22
TENCENT TECH (SHENZHEN) CO LTD

AI Technical Summary

Problems solved by technology

[0003] However, because of page layout constraints, a complete table is often truncated and laid out across multiple pages.
In some of these tables the text itself breaks across pages: content that originally belonged to a single cell ends up split between two cells on different pages. Such content is easy to miss when reading, which degrades the user experience.
In related technologies, table recognition usually only determines whether the layouts on different pages belong to the same table, generally by comparing the layout characteristics of the cells on the corresponding pages (such as the number, length, and width of the cells). This approach cannot reliably identify a table whose cell text breaks across pages.
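The limitation of the layout-based approach can be illustrated with a minimal sketch. The descriptor format (a list of `(width, height)` pairs per row), the function name `same_layout`, and the tolerance value are assumptions made for illustration and are not taken from the application.

```python
from typing import List, Tuple

# Hypothetical layout descriptor: one (width, height) pair per cell in the
# table row nearest the page boundary.
CellSize = Tuple[float, float]


def same_layout(row_a: List[CellSize], row_b: List[CellSize], tol: float = 1.0) -> bool:
    """Layout-only check: same number of cells and matching column widths."""
    if len(row_a) != len(row_b):
        return False
    return all(abs(wa - wb) <= tol for (wa, _), (wb, _) in zip(row_a, row_b))


# Last row on page 1 and first row on page 2 of a two-column table.
bottom_of_page_1 = [(40.0, 12.0), (120.0, 12.0)]
top_of_page_2 = [(40.0, 12.0), (120.0, 30.0)]

# The check reports "same table" whether the page boundary falls between two
# rows or in the middle of a cell's text, because it never reads the text.
print(same_layout(bottom_of_page_1, top_of_page_2))
```

A check of this kind can indicate that two page fragments belong to one table, but it carries no information about whether a cell's text continues on the next page.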



Embodiment Construction

[0055] Embodiments of the present application are described in detail below, and examples of them are shown in the drawings, in which the same or similar reference numerals denote the same or similar elements, or elements having the same or similar functions, throughout. The embodiments described below with reference to the figures are exemplary; they serve only to explain the present application and should not be construed as limiting it. The step numbers in the following embodiments are provided only for convenience of illustration and description and do not limit the order of the steps in any way. The execution order of the steps in the embodiments can be adjusted adaptively according to the understanding of those skilled in the art.

[0056] Unless otherwise defined, all technical and scientific terms used herein have the same meaning as commonly understood by one of ordinary skill in the technical field to which this application belongs. ...


Abstract

The invention discloses a table recognition method, a model training method, a device, and equipment. The table recognition method comprises the following steps: when cross-page break recognition is performed on a table, first text information is obtained from a first cell in a first page; a second cell corresponding to the first cell is determined in a second page; second text information is obtained from the second cell; natural language analysis is performed on the first text information and the second text information; and when the analysis result indicates that the first text information and the second text information are contextually continuous statements, the table is determined to have a cross-page break between the first page and the second page. The method thus recognizes whether the table is truncated between the two pages by judging whether the text in the cells on the two pages forms contextually continuous statements. Because it relies on the logical coherence of the original text of the truncated cell, it can effectively improve the precision of cross-page break recognition. The invention can be widely applied in the technical field of artificial intelligence.
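The steps in the abstract can be sketched roughly as follows. This is an illustration under stated assumptions, not the application's implementation: the page model (a dict from `(row, column)` to cell text), the rule that the corresponding cell is the same column in the first row of the second page, and the punctuation-and-case heuristic standing in for the natural language analysis are all hypothetical. The application describes a trained model for that analysis step.

```python
from typing import Callable, Dict, Tuple

# Hypothetical page model: maps (row, column) to the text of that cell.
Page = Dict[Tuple[int, int], str]

SENTENCE_END = ".!?。！？"


def is_continuation(first_text: str, second_text: str) -> bool:
    """Stand-in for the natural language analysis step (assumption).

    Treats the pair as contextually continuous when the first text does not
    end a sentence and the second text starts with a lowercase letter.
    """
    first = first_text.rstrip()
    second = second_text.lstrip()
    if not first or not second:
        return False
    return first[-1] not in SENTENCE_END and second[0].islower()


def has_cross_page_break(
    first_page: Page,
    second_page: Page,
    analyze: Callable[[str, str], bool] = is_continuation,
) -> bool:
    """Return True if any boundary cell pair reads as one continued text."""
    bottom = max(row for row, _ in first_page)  # last row on the first page
    top = min(row for row, _ in second_page)    # first row on the second page
    for (row, col), first_text in first_page.items():
        if row != bottom:
            continue
        # Assumed correspondence: same column, first row of the second page.
        second_text = second_page.get((top, col))
        if second_text is not None and analyze(first_text, second_text):
            return True
    return False


page_1 = {(0, 0): "Item", (0, 1): "Description",
          (1, 0): "Pump", (1, 1): "Delivers coolant to the"}
page_2 = {(0, 0): "", (0, 1): "main heat exchanger."}
print(has_cross_page_break(page_1, page_2))
```

The `analyze` parameter is left pluggable so that the heuristic can be replaced by a learned classifier without changing the cell-pairing logic.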

Description

Technical field

[0001] The present application relates to the technical field of artificial intelligence, and in particular to a table recognition method, a model training method, a device, and equipment.

Background

[0002] Since the start of the information age, the means of processing information have become increasingly diverse. Among them, the table is a highly visual, regularly laid out way of presenting information. It is logically clear and easy to read, and tabular text is focused and easy to extract, which makes it well suited to large-scale analysis and processing. Tables are therefore widely used across industries.

[0003] However, because of page layout constraints, a complete table is often truncated and laid out across multiple pages. In some of these tables the text itself breaks across pages, that is, the text content originally belonging to the ...


Application Information

IPC(8): G06K9/00, G06F40/289
CPC: G06F40/289
Inventor: 朱龙军
Owner: TENCENT TECH (SHENZHEN) CO LTD