
Character image serialization identification and structured data output method

A technology for structured data output, applied in character and pattern recognition, neural learning methods, neural architectures, etc. It addresses problems such as weak anti-interference ability, low efficiency, and unrecognizable pictures, achieving a higher recognition rate, reduced storage space, and good recognition accuracy and robustness.

Inactive Publication Date: 2019-05-07
SUNYARD SYST ENG CO LTD
4 Cites, 6 Cited by

AI Technical Summary

Problems solved by technology

Traditional document OCR technology has weak anti-interference ability, cannot recognize pictures with complex backgrounds, and is inefficient when outputting "line text information" and "column text information".



Examples


Embodiment Construction

[0023] The present invention is described in detail below with reference to the accompanying drawings and preferred embodiments, from which its purpose and effect will become clearer. It should be understood that the specific embodiments described here only explain the present invention and do not limit it.

[0024] As shown in Figure 1, the method for text image serialization recognition and structured data output includes the following steps:

[0025] S1: Obtain multiple text image blocks;

[0026] S2: Use a full-depth convolutional neural network to extract text image features from each text image block, and express each text image block as a feature vector (the deep neural network is a two-layer recurrent neural network); specifically, the network takes an image of any size as input and outputs a respons...
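The point in S2 that an image of any size can serve as input follows from how convolution works: a kernel slides over whatever width it is given, so the output grows with the input. The following is a minimal sketch of that property only, not the network described in the patent. The names `conv2d_valid`, `column_features` and `EDGE_KERNEL` are hypothetical, and the hand-written 3x3 kernel stands in for kernels that a trained network would learn.

```python
from typing import List

Image = List[List[float]]  # grayscale image stored as rows of pixel values


def conv2d_valid(image: Image, kernel: Image) -> Image:
    """Slide `kernel` over `image` (no padding, stride 1) and return the response map."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    return [
        [
            sum(image[r + i][c + j] * kernel[i][j]
                for i in range(kh) for j in range(kw))
            for c in range(out_w)
        ]
        for r in range(out_h)
    ]


def column_features(response: Image) -> List[List[float]]:
    """Read the response map column by column: one feature vector per horizontal position."""
    return [list(col) for col in zip(*response)]


# Hypothetical vertical-edge kernel (assumption for illustration only).
EDGE_KERNEL = [[-1.0, 0.0, 1.0]] * 3

narrow = [[0.0, 0.0, 1.0, 1.0]] * 4            # 4 rows x 4 columns
wide = [[0.0, 0.0, 1.0, 1.0, 1.0, 0.0]] * 4    # 4 rows x 6 columns

narrow_seq = column_features(conv2d_valid(narrow, EDGE_KERNEL))
wide_seq = column_features(conv2d_valid(wide, EDGE_KERNEL))

print(len(narrow_seq), len(wide_seq))  # sequence length follows the image width
```

The same kernel handles both images without any resizing; the wider block simply yields a longer sequence of feature vectors, which is the form a recurrent network can consume.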


Abstract

The invention discloses a method for character image serialization recognition and structured data output. The method comprises the steps of: obtaining a plurality of character image blocks; extracting text image features from each character image block with a full-depth convolutional neural network, and expressing each character image block as a feature vector; processing the feature vectors with a deep neural network, and outputting a probability distribution over a character set; using a connectionist temporal classification (CTC) layer as the transcription layer, which applies a dynamic programming algorithm of forward computation and backward gradient propagation to the probability distribution over the character set and outputs computer-readable text; and correcting errors in the computer-readable text with a language model to obtain structured data, which is then output. The method offers high recognition accuracy, good robustness and a high recognition rate.
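To illustrate how a transcription layer turns per-step probability distributions into text, here is a minimal sketch of CTC best-path (greedy) decoding. This is a simpler substitute for the dynamic programming procedure the abstract names, not the patent's implementation. The function name `ctc_greedy_decode`, the blank index `BLANK = 0` and the two-letter character set are assumptions made for the example.

```python
from typing import List, Sequence

BLANK = 0  # assumed index of the CTC blank symbol


def ctc_greedy_decode(probs: Sequence[Sequence[float]], charset: str) -> str:
    """Best-path decoding: take the argmax per step, merge repeats, drop blanks."""
    best_path = [max(range(len(step)), key=step.__getitem__) for step in probs]
    chars: List[str] = []
    previous = BLANK
    for index in best_path:
        # Emit a character only when it is not blank and differs from the previous step.
        if index != BLANK and index != previous:
            chars.append(charset[index - 1])  # index 0 is reserved for blank
        previous = index
    return "".join(chars)


# Six time steps over the classes [blank, "A", "B"] (made-up probabilities).
example_probs = [
    [0.10, 0.80, 0.10],  # A
    [0.20, 0.70, 0.10],  # A (repeat, merged)
    [0.90, 0.05, 0.05],  # blank (separates the two A's)
    [0.10, 0.60, 0.30],  # A
    [0.10, 0.20, 0.70],  # B
    [0.30, 0.10, 0.60],  # B (repeat, merged)
]

print(ctc_greedy_decode(example_probs, "AB"))  # AAB
```

The blank step is what lets the decoder keep a genuinely doubled character: without it, the two runs of "A" would merge into one.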

Description

Technical field

[0001] The invention relates to the technical field of image recognition in computer software, and in particular to a method for character image serialization recognition and structured data output.

Background technique

[0002] In the financial field, OCR-based text area detection, positioning and recognition refers to using computers and other equipment to automatically extract and recognize valid information in paper materials by means of OCR (optical character recognition), and to process it accordingly. It is one of the key technologies for automatic computer processing in paperless banking. In financial-industry OCR, text often appears as sequences rather than in isolation. Traditional document OCR technology has weak anti-interference ability, cannot recognize pictures with complex backgrounds, and has low efficiency when outputting "line text information" and "column text i...


Application Information

IPC(8): G06K9/32, G06K9/34, G06K9/46, G06K9/62, G06N3/04, G06N3/08
Inventor: 雷钧林路林康王慜骊安通鉴
Owner: SUNYARD SYST ENG CO LTD