Table identification method and device

A table recognition technique, applied in the computer field, that addresses problems such as table layout changes. It offers strong adaptability and a wide range of application, and overcomes inherent defects.

Active Publication Date: 2020-04-24
TAIKANG LIFE INSURANCE CO LTD

AI Technical Summary

Problems solved by technology

Of these factors, 1) and 2) can still be avoided through image preprocessing, but 3), 4), 5), 6), and 7) involve changes to the table layout, which have been difficult to resolve through image processing.

Embodiment Construction

[0031] Exemplary embodiments of the present invention are described below in conjunction with the accompanying drawings, which include various details of the embodiments of the present invention to facilitate understanding, and they should be regarded as exemplary only. Accordingly, those of ordinary skill in the art will recognize that various changes and modifications of the embodiments described herein can be made without departing from the scope and spirit of the invention. Also, descriptions of well-known functions and constructions are omitted in the following description for clarity and conciseness.

[0032] The difficulty of table recognition lies in recognizing the table layout and determining the text data of each cell. Most existing techniques detect the layout structure of a table by detecting the positions of its row lines, column lines, and frame lines. However, the accuracy of this detection is low because of the following factors. The case where the fram...

Abstract

The invention discloses a table identification method and device and relates to the technical field of computers. A specific embodiment of the method comprises: identifying characters in a to-be-detected image and generating text lines from the identified characters according to their longitudinal position information; performing word segmentation on the text lines, determining from the segmentation result the category to which each text line belongs in a table line attribute dimension and/or a table content dimension, and using the determined text line categories to obtain a plurality of text lines belonging to the same table; and determining a column separation line shared by the plurality of text lines, thereby realizing table identification. According to the embodiment, the format structure of the table and the text data in the table cells can be accurately recognized.
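The first and last steps of the abstract (building text lines from vertical character positions, then finding column separators shared by those lines) can be illustrated with a minimal Python sketch. Everything here is an assumption made for illustration and is not taken from the patent: the `Char` record, the helper names, the tolerance `y_tol`, the gap threshold `min_gap`, and the rule that a shared separator is a horizontal gap no character in any line overlaps. The word segmentation and line classification step is omitted, so the input lines are assumed to belong to one table already.

```python
from dataclasses import dataclass


@dataclass
class Char:
    """One recognized character with a hypothetical bounding box."""
    text: str
    x0: float  # left edge
    x1: float  # right edge
    y: float   # vertical center


def group_into_lines(chars, y_tol=5.0):
    """Group characters into text lines by vertical position."""
    lines = []
    for ch in sorted(chars, key=lambda c: c.y):
        if lines and abs(ch.y - lines[-1]["y"]) <= y_tol:
            line = lines[-1]
            line["chars"].append(ch)
            # Keep a running mean so the line center follows its members.
            line["y"] = sum(c.y for c in line["chars"]) / len(line["chars"])
        else:
            lines.append({"y": ch.y, "chars": [ch]})
    # Within each line, order characters left to right.
    return [sorted(l["chars"], key=lambda c: c.x0) for l in lines]


def shared_column_separators(lines, min_gap=8.0):
    """Return midpoints of horizontal gaps that are empty in every line."""
    intervals = sorted((c.x0, c.x1) for line in lines for c in line)
    separators = []
    if not intervals:
        return separators
    right = intervals[0][1]  # right edge of the merged run so far
    for x0, x1 in intervals[1:]:
        if x0 - right >= min_gap:
            separators.append((right + x0) / 2)
        right = max(right, x1)
    return separators


def split_cells(line, separators):
    """Assign each character to a column by its horizontal center."""
    cells = [""] * (len(separators) + 1)
    for ch in line:
        center = (ch.x0 + ch.x1) / 2
        cells[sum(center > s for s in separators)] += ch.text
    return cells


def word(text, x, y, w=10.0):
    """Build evenly spaced characters for a test word (illustrative only)."""
    return [Char(t, x + i * w, x + (i + 1) * w, y) for i, t in enumerate(text)]


chars = (word("Name", 0, 10) + word("Age", 100, 10) + word("City", 200, 10)
         + word("Ann", 0, 30) + word("31", 100, 30) + word("Oslo", 200, 30))
lines = group_into_lines(chars)
separators = shared_column_separators(lines)
table = [split_cells(line, separators) for line in lines]
```

With the synthetic input above, `table` holds one list of cell strings per text line. Merging the character intervals of all lines before looking for gaps is one simple way to require that a separator be shared by every line; a real implementation would likely need to tolerate cells that span columns.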

Description

Technical field

[0001] The invention relates to the field of computer technology, and in particular to a table recognition method and device.

Background

[0002] With the advance of the paperless office, and with business processes and regulatory authorities requiring electronic archiving of customer data, paper documents that previously existed as printed or copied pages are now usually entered into office information systems as scanned or photographed digital images, so that a large amount of digital image data accumulates. The text contained in these digital images cannot be processed directly by an information system; it must first be recognized as computer character data by an Optical Character Recognition (OCR) system. However, for text content organized in the form of a table, the OCR system can only recognize character data one by one or further recognize word data, and...

Application Information

Patent Type & Authority: Applications (China)
IPC(8): G06K9/00, G06F40/289, G06K9/62
CPC: G06V30/412, G06V30/416, G06V30/10, G06F18/23213
Inventors: 刘亚, 宋慧驹, 刘兴旺, 刘岩
Owner: TAIKANG LIFE INSURANCE CO LTD