CRNN-based picture table extraction method

A table extraction method applied in the field of CRNN-based image table extraction. It addresses the problems of unclear tables, time- and labor-intensive manual parameter tuning, and the difficulty of character segmentation, and achieves high recognition accuracy, improved efficiency, and an improved user experience.

Pending Publication Date: 2021-07-20
康旭科技有限公司

AI Technical Summary

Problems solved by technology

[0005] (1) In the existing technology, connected-component analysis is generally used to extract the table skeleton from a picture. When content around the table touches the table border, when the table in the picture is unclear, or when lighting, tilt, or blur affects the picture, the connected-component analysis fails, so a large amount of manual parameter adjustment is required, which is time-consuming and labor-intensive;
[0006] (2) Existing text recognition technology usually builds a single-character recognition model, which requires cutting the character sequence in a cell into individual characters. For printed fonts the projection method is generally used, but handwriting contains connected strokes, which increases the difficulty of character cutting and makes the character-cutting algorithm complex to design;
[0007] (3) The recognition accuracy for handwritten text and digits is low.

Examples

Embodiment 1

[0044] Referring to figure 1, the present invention provides a technical solution: a CRNN-based picture table extraction method, comprising the following steps:

[0045] S1, applying a perspective transformation to the picture to be detected and correcting it, which effectively keeps straight lines from deforming and gives better adaptability to the input image;

[0046] The specific steps of the picture correction are as follows:

[0047] S11, using cv2.findContours() to detect the table cell contours in the picture and taking the largest table contour, that is, its four vertex coordinates;

[0048] S12, obtaining the transformation matrix M from the vertices of the picture to be detected and the reference vertices via cv2.getPerspectiveTransform() of the OpenCV API;

[0049] S13, computing the tilt-corrected image with cv2.warpPerspective() of the OpenCV API; a minimal code sketch of steps S11-S13 is given below.

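The following sketch illustrates steps S11-S13 in Python with OpenCV. The function names correct_perspective and order_corners, the output size, the Otsu thresholding, and the polygon approximation are illustrative assumptions and are not specified by the patent; it also assumes the largest contour approximates to exactly four table corners (OpenCV 4.x signatures).

    import cv2
    import numpy as np

    def order_corners(pts):
        # Order the four vertices as top-left, top-right, bottom-right, bottom-left
        s, d = pts.sum(axis=1), np.diff(pts, axis=1).ravel()
        return np.float32([pts[np.argmin(s)], pts[np.argmin(d)],
                           pts[np.argmax(s)], pts[np.argmax(d)]])

    def correct_perspective(img, out_w=1000, out_h=1400):
        gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
        _, binary = cv2.threshold(gray, 0, 255, cv2.THRESH_BINARY_INV + cv2.THRESH_OTSU)

        # S11: detect contours, keep the largest one (the outer table border),
        # and approximate it with four vertex coordinates
        contours, _ = cv2.findContours(binary, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
        table = max(contours, key=cv2.contourArea)
        peri = cv2.arcLength(table, True)
        quad = cv2.approxPolyDP(table, 0.02 * peri, True).reshape(-1, 2).astype(np.float32)

        # Reference vertices of the corrected, axis-aligned table
        dst = np.float32([[0, 0], [out_w, 0], [out_w, out_h], [0, out_h]])

        # S12: transformation matrix M between detected and reference vertices
        M = cv2.getPerspectiveTransform(order_corners(quad), dst)

        # S13: compute the tilt-corrected image
        return cv2.warpPerspective(img, M, (out_w, out_h))
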
[0050] S2, using a deep neural network to perform table skeleton extraction on the corrected picture, effectively reducing human ...

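The embodiment text is truncated at this point; according to the abstract, the skeleton obtained in S2 is then used to obtain cell ROIs in S3. Purely as an illustrative assumption, and assuming S2 yields a binary mask in which table lines are white (255) and the background is black (0), the ROI step could look like the sketch below; the function name cell_rois and the min_area filter are not from the patent.

    import cv2

    def cell_rois(skeleton_mask, min_area=100):
        # Invert the skeleton so that cell interiors become foreground regions
        cells = cv2.bitwise_not(skeleton_mask)
        n, labels, stats, _ = cv2.connectedComponentsWithStats(cells, connectivity=4)

        # Any region touching the image border lies outside the table and is skipped
        border = set(labels[0, :]) | set(labels[-1, :]) | set(labels[:, 0]) | set(labels[:, -1])

        rois = []
        for i in range(1, n):                    # label 0 is the skeleton itself
            x, y, w, h, area = stats[i]
            if area >= min_area and i not in border:
                rois.append((x, y, w, h))        # one bounding box per cell

        # Sort by row then column (assuming a well-aligned grid) so the table
        # typesetting (S5) can later be restored from the ROI order
        return sorted(rois, key=lambda r: (r[1], r[0]))
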
Abstract

The invention discloses a CRNN-based picture table extraction method, comprising the following steps: S1, carrying out a perspective transformation of a to-be-detected picture and correcting the picture; S2, performing table skeleton extraction on the corrected picture by using a deep neural network; S3, obtaining cell ROIs (Regions of Interest) from the table skeleton; S4, recognizing the text content in all cell ROIs through an OCR recognition model; and S5, restoring the text content to a table according to the table skeleton typesetting from step S2, so that the picture table is converted into a data table and the extraction of the picture table is completed. The method applies a single perspective transformation to the to-be-recognized picture to correct its angle and then uses a deep neural network model to extract the overall table skeleton. This avoids the prior-art problems of handwritten characters touching cell edges, of unclear tables in the picture, and of lighting, tilt, and blur, and removes the need for extensive, time-consuming manual parameter adjustment.

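Step S4 recognizes the cell contents with an OCR recognition model; the title indicates a CRNN. Below is a minimal PyTorch sketch of a generic CRNN of the usual convolution + bidirectional LSTM + CTC form, given only to illustrate the model family; the layer sizes, the 32-pixel input height, and the CTC note are assumptions and are not the architecture disclosed by the patent.

    import torch
    import torch.nn as nn

    class CRNN(nn.Module):
        def __init__(self, num_classes, img_h=32, channels=1, hidden=256):
            super().__init__()
            # Convolutional feature extractor: shrinks the height, keeps the width
            # as a sequence axis, so no per-character cutting is needed
            self.cnn = nn.Sequential(
                nn.Conv2d(channels, 64, 3, 1, 1), nn.ReLU(), nn.MaxPool2d(2, 2),
                nn.Conv2d(64, 128, 3, 1, 1), nn.ReLU(), nn.MaxPool2d(2, 2),
                nn.Conv2d(128, 256, 3, 1, 1), nn.ReLU(), nn.MaxPool2d((2, 1), (2, 1)),
                nn.Conv2d(256, 256, 3, 1, 1), nn.ReLU(), nn.MaxPool2d((2, 1), (2, 1)),
            )
            # Recurrent part: bidirectional LSTM over the width-wise feature sequence
            self.rnn = nn.LSTM(256 * (img_h // 16), hidden, num_layers=2,
                               bidirectional=True, batch_first=True)
            self.fc = nn.Linear(hidden * 2, num_classes + 1)   # +1 for the CTC blank

        def forward(self, x):                      # x: (batch, channels, img_h, width)
            f = self.cnn(x)                        # (batch, 256, img_h // 16, width // 4)
            f = f.permute(0, 3, 1, 2).flatten(2)   # (batch, sequence_len, features)
            seq, _ = self.rnn(f)
            return self.fc(seq)                    # per-timestep class scores

    # Training would typically use nn.CTCLoss on these per-timestep scores, which
    # avoids cutting handwritten character sequences into single characters.
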
Description

Technical field
[0001] The present invention relates to the field of image table extraction, and in particular to a CRNN-based picture table extraction method.
Background technique
[0002] In the artificial intelligence era, AI technology has developed two major branches close to people's lives, natural language processing and image recognition, and image recognition technology is especially important in all walks of life. Tables are one of the most important forms of image text and a basic element of various data analysis tools; tables are very common in network data, and many of them are provided for download as pictures, such as scanned files and PDF files.
[0003] To quickly process and analyze these data, such image tables must be recognized automatically and their content extracted and restored. This usually involves multiple steps; a typical complete image table extra...

Application Information

Patent Type & Authority: Application (China)
IPC(8): G06K9/00; G06K9/20; G06K9/32; G06K9/46; G06N3/04; G06N3/08
CPC: G06N3/084; G06V30/412; G06V10/22; G06V10/25; G06V10/44; G06N3/045
Inventor: 励建科许化顾淼陈再蝶朱晓秋樊伟东章星星
Owner: 康旭科技有限公司