Form recognition method, recognition system and computer device

A form recognition method and related technology, applied in computer components, computing, and character and pattern recognition, that addresses the complex structure, long processing time, and heavy demand for manpower, material and financial resources of existing approaches.

Active Publication Date: 2018-12-25
国科赛思(北京)科技有限公司

AI Technical Summary

Problems solved by technology

Existing OCR methods generally recognize poor-quality images and noisy scanned documents with low accuracy, and they are time-consuming.
In addition, training a neural network to recognize Chinese characters requires a great deal of manpower, material and financial resources, and time, because Chinese characters are numerous and structurally complex.



Examples


Embodiment 1

[0086] As shown in Figure 1, the form recognition method provided by Embodiment 1 of the present invention first determines the format of the input document. If it is a PDF file, the format conversion module converts it into a JPG image and saves it. The RGB image is then converted into a binary image, using nonlinear contrast enhancement based on weighted RC threshold iteration together with the LoG operator binarization method, and saved. Next, a tilt correction algorithm based on perspective transformation corrects the tilt of the image according to four selected perspective corner points. At the same time, the frame lines of the table are extracted by image morphology processing, and each cell is segmented. Finally, a dedicated character database is built to reflect the characteristics of the form's application field, and a customized neural network is trained to recognize characters.
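The frame-line extraction and cell segmentation step can be illustrated with a minimal sketch. It assumes the page is already a binary grid (1 = ink) and that frame lines are axis-aligned after tilt correction. Morphological opening with a long, thin structuring element is one common way to isolate table lines; the source does not say which structuring elements the invention uses, so the function names and the run length `k` below are hypothetical.

```python
def open_horizontal(img, k):
    """Opening with a 1 x k line: keep only horizontal runs of 1s at least k long."""
    out = [[0] * len(row) for row in img]
    for y, row in enumerate(img):
        x = 0
        while x < len(row):
            if not row[x]:
                x += 1
                continue
            start = x
            while x < len(row) and row[x]:
                x += 1
            if x - start >= k:  # the run is long enough to be a frame line
                for i in range(start, x):
                    out[y][i] = 1
    return out


def transpose(img):
    return [list(col) for col in zip(*img)]


def open_vertical(img, k):
    """Same opening applied along columns, implemented by transposing."""
    return transpose(open_horizontal(transpose(img), k))


def segment_cells(img, k):
    """Return cell interiors as (top, left, bottom, right), all inclusive."""
    horizontal = open_horizontal(img, k)
    vertical = open_vertical(img, k)
    ys = [y for y, row in enumerate(horizontal) if any(row)]
    xs = [x for x, col in enumerate(transpose(vertical)) if any(col)]
    cells = []
    for top, bottom in zip(ys, ys[1:]):
        for left, right in zip(xs, xs[1:]):
            # Skip neighbouring rows/columns that belong to one thick line.
            if bottom - top > 1 and right - left > 1:
                cells.append((top + 1, left + 1, bottom - 1, right - 1))
    return cells


# Toy 7 x 9 form: frame lines on rows 0, 3, 6 and columns 0, 4, 8.
page = [[0] * 9 for _ in range(7)]
for y in (0, 3, 6):
    page[y] = [1] * 9
for row in page:
    for x in (0, 4, 8):
        row[x] = 1
page[1][2] = 1  # a stray "text" pixel that must not be treated as a line

cells = segment_cells(page, k=5)
```

On the toy page this yields four cell interiors, and the isolated text pixel is discarded because it never forms a run of length `k`. A real scan would need tolerance for broken or slightly skewed lines, which this sketch does not attempt.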

[0087] In pr...

Embodiment 2

[0183] As shown in Figure 11, Embodiment 2 of the present invention provides a method for character recognition by training a dedicated neural network. Specifically:

[0184] First, count the high-frequency characters and character strings that appear in the forms to be recognized in the specific domain, and collect character images as the sample set for the neural network. Then binarize the images, segment each character, and normalize the results so that the image format and size are uniform. Next, perform feature extraction on the preprocessed images, extracting character structure point features, character projection features, and so on. Finally, train the network using ten-fold cross-validation and use the tuned network to recognize characters; from the recognition results, calculate the edit distance between the recognized string and each string in the string database, and compare the minimum edit distance with the credibil...
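The edit-distance lookup against the string database can be sketched as follows. The paragraph above is cut off before it explains how the minimum edit distance is compared with the threshold, so the acceptance rule here is an assumption: the closest database entry replaces the recognized string when its distance is at most `max_distance`, and the raw result is kept otherwise. The function names, the sample database, and the threshold value are all hypothetical.

```python
def edit_distance(a, b):
    """Levenshtein distance between strings a and b (two-row dynamic programming)."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(
                prev[j] + 1,               # delete ca
                cur[j - 1] + 1,            # insert cb
                prev[j - 1] + (ca != cb),  # substitute, free if the characters match
            ))
        prev = cur
    return prev[-1]


def correct(recognized, database, max_distance):
    """Return (string, distance): the closest entry if near enough, else the raw input.

    The acceptance rule is an assumption; the source text is truncated here.
    """
    best = min(database, key=lambda s: edit_distance(recognized, s))
    distance = edit_distance(recognized, best)
    if distance <= max_distance:
        return best, distance
    return recognized, distance


# Hypothetical domain vocabulary and a recognition result with one wrong character.
database = ["resistor", "capacitor", "inductor"]
fixed = correct("resist0r", database, max_distance=2)
unchanged = correct("xyz", database, max_distance=1)
```

Here `"resist0r"` is one substitution away from `"resistor"` and is corrected, while `"xyz"` is far from every entry and passes through unchanged.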


Abstract

The invention provides a form recognition method and system in the technical field of form recognition. Form images that conform to the required format are binarized using nonlinear contrast enhancement based on weighted RC threshold iteration and the Laplacian of Gaussian (LoG) operator, and their tilt is corrected with a tilt correction algorithm based on perspective transformation. An image morphology processing method extracts the table frame lines and segments the cells to obtain the smallest cells. A character database is established for the smallest cells, a neural network is trained on it, and the resulting form recognition model is used to recognize the form. The invention is computationally simple and fast, and accurately recognizes form images with weak contrast, uneven light and dark distribution, and blurred backgrounds. Recognition speed and precision are improved by establishing a dedicated set of high-frequency characters, training a dedicated neural network, and applying template matching. The customized neural network is also simple in structure, which reduces the time and workload of training and tuning.
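The visible text names "RC threshold iteration" without defining it. One plausible reading, which is an assumption and not confirmed by the source, is Ridler-Calvard iterative threshold selection: start from the mean gray level, then repeatedly set the threshold to the midpoint of the mean of the darker pixels and the mean of the brighter pixels. The sketch below shows only that basic iteration. The weighting, the nonlinear contrast enhancement, and the LoG step are not specified in this excerpt and are left out.

```python
def iterative_threshold(pixels, eps=0.5, max_iter=100):
    """Ridler-Calvard style threshold for a flat list of gray levels (assumed reading of 'RC')."""
    t = sum(pixels) / len(pixels)  # initial guess: the global mean
    for _ in range(max_iter):
        low = [p for p in pixels if p <= t]
        high = [p for p in pixels if p > t]
        if not low or not high:  # all pixels fall on one side, nothing to split
            return t
        new_t = (sum(low) / len(low) + sum(high) / len(high)) / 2
        if abs(new_t - t) < eps:  # threshold has stabilized
            return new_t
        t = new_t
    return t


def binarize(gray, t):
    """Map a 2-D gray image to 1 (ink) / 0 (background), assuming dark ink on light paper."""
    return [[1 if p <= t else 0 for p in row] for row in gray]


# Toy 2 x 4 image: four dark ink pixels and four bright background pixels.
gray = [[10, 12, 200, 205],
        [11, 198, 13, 201]]
t = iterative_threshold([p for row in gray for p in row])
binary = binarize(gray, t)
```

On this toy image the threshold settles between the two clusters, so the dark pixels map to 1 and the bright ones to 0.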

Description

Technical field

[0001] The invention relates to the technical field of form image recognition processing, and in particular to a form recognition method, a recognition system, and computer equipment that are computationally simple, fast, and low in time and space costs, and that can accurately recognize forms with weak contrast, uneven light and dark distribution, and blurred backgrounds.

Background art

[0002] When existing OCR technology is used for form recognition, the main techniques used to binarize images generally include the global threshold method, the local threshold method, the region growing method, the watershed algorithm, the minimum description length method, and methods based on Markov random fields. Each of these binarization methods has defects. For example, the global threshold method considers only the grayscale information of the image and ignores the spatial information in the i...
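The weakness of a purely global threshold can be seen on a small synthetic example. The sketch below is an illustration only, not the method of the invention: it builds a one-dimensional scanline whose background brightens from left to right, darkens two "ink" pixels, and compares a single mean-based threshold with a simple local-mean rule. The window radius and offset are arbitrary assumptions.

```python
def global_ink(values):
    """Mark pixels darker than the global mean as ink; position is ignored."""
    t = sum(values) / len(values)
    return [i for i, v in enumerate(values) if v < t]


def local_ink(values, radius=2, offset=15):
    """Mark pixels clearly darker than the mean of their own neighbourhood as ink."""
    hits = []
    for i, v in enumerate(values):
        window = values[max(0, i - radius): i + radius + 1]
        if v < sum(window) / len(window) - offset:
            hits.append(i)
    return hits


# Unevenly lit background (50 on the left, 200 on the right) with ink at 2 and 13.
scanline = [50 + 10 * i for i in range(16)]
for i in (2, 13):
    scanline[i] -= 40

global_hits = global_ink(scanline)  # flags the whole dark side, misses index 13
local_hits = local_ink(scanline)    # finds exactly the two ink pixels
```

The global rule labels the entire dark half of the scanline as ink and misses the stroke on the bright half, while the local rule recovers both strokes because it compares each pixel with its surroundings.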


Application Information

Patent Type & Authority: Applications (China)
IPC(8): G06K9/00, G06K9/34
CPC: G06V30/40, G06V30/414, G06V30/153, G06V10/267
Inventor: 李自豪
Owner: 国科赛思(北京)科技有限公司