An optical character recognition error correction method based on natural language recognition

The method applies optical character recognition and natural language technology to the field of image text recognition. It addresses the lack of error checking and correction in existing approaches and overcomes the difficulty of determining the correct result for isolated characters and words.

Pending Publication Date: 2019-04-05
SUNYARD SYST ENG

AI Technical Summary

Problems solved by technology

Existing technical solutions do not use their



Examples


Example Embodiment

[0036] The present invention is described in detail below with reference to the accompanying drawings and preferred embodiments, from which its purpose and effects will become more apparent. It should be understood that the specific embodiments described here serve only to explain the present invention and do not limit it.

[0037] As shown in Figure 1, a dictionary-based optical character recognition error correction method includes the following steps:

[0038] S1: Obtain a text image;

[0039] S2: The text image is recognized by OCR to obtain an initial recognition result;

[0040] S3: Build a dictionary;

[0041] The dictionary here comes from the common word list of the search engine Bing, which contains 1,000,000 keywords frequently used by users of the search engine, and is provided...
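The text above is truncated before it explains how the dictionary is applied, so the following is only a minimal sketch of one common way a word list can be used to correct an initial OCR result (steps S2 and S3). It assumes a nearest-match lookup with Python's standard `difflib`; the function names, the tiny word list, and the `cutoff` value are hypothetical and are not taken from the patent.

```python
import difflib

def build_dictionary(words):
    """Build a set of lowercase words (a stand-in for the large keyword list)."""
    return {w.strip().lower() for w in words if w.strip()}

def correct_token(token, dictionary, cutoff=0.75):
    """Keep tokens found in the dictionary; otherwise return the closest
    dictionary word above the similarity cutoff, or the token unchanged."""
    lowered = token.lower()
    if lowered in dictionary:
        return token
    # get_close_matches ranks candidates by difflib's similarity ratio.
    matches = difflib.get_close_matches(lowered, sorted(dictionary), n=1, cutoff=cutoff)
    return matches[0] if matches else token

def correct_text(text, dictionary):
    """Correct a whitespace-separated OCR result token by token."""
    return " ".join(correct_token(t, dictionary) for t in text.split())

# Hypothetical word list and OCR output, for illustration only.
dictionary = build_dictionary(["bank", "account", "transfer", "balance"])
corrected = correct_text("bank acc0unt transfor", dictionary)
print(corrected)
```

A lookup like this treats each token in isolation, which is exactly the limitation the summary mentions for isolated characters and words; context is not considered here.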


Abstract

The invention discloses an optical character recognition error correction method based on natural language recognition. The method fuses a lexical analysis model with a semantic analysis model and uses the resulting fusion model to obtain a high-precision optical character recognition result. The fusion model takes into account both the characteristics of Chinese characters captured by the lexical model and salient features of Chinese syntax and semantics, such as context relations, when correcting optical character recognition results, which improves model precision.
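The abstract does not say how the two models are fused. As a purely illustrative sketch, the snippet below assumes the simplest option: each correction candidate carries a lexical score and a semantic (context) score, and the two are combined by a weighted sum. The function names, the weight `alpha`, and the scores are all hypothetical.

```python
def fuse_scores(lexical, semantic, alpha=0.5):
    """Weighted sum of a lexical score and a semantic score (assumed fusion rule)."""
    return alpha * lexical + (1.0 - alpha) * semantic

def pick_candidate(candidates, alpha=0.5):
    """Return the candidate text with the highest fused score.

    candidates: list of (text, lexical_score, semantic_score) tuples.
    """
    return max(candidates, key=lambda c: fuse_scores(c[1], c[2], alpha))[0]

# Hypothetical candidates for one OCR token: "bark" looks slightly closer
# to the recognized glyphs, but "bank" fits the surrounding context better.
candidates = [("bark", 0.9, 0.1), ("bank", 0.8, 0.9)]
best = pick_candidate(candidates)
print(best)
```

With `alpha=1.0` the choice falls back to the lexical score alone, which shows why adding a context term can change the outcome for visually similar candidates.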

Description

Technical field

[0001] The invention relates to the field of image character recognition, and in particular to an optical character recognition error correction method based on natural language recognition.

Background

[0002] In the financial field, OCR-based text area detection, positioning, and recognition refers to using computers and other equipment to automatically extract and recognize valid information in paper materials with optical character recognition (OCR) and to process it accordingly. It is one of the key technologies for automated computer processing in paperless banking. Traditional image text recognition relies on OCR, which works on electronic images produced by scanning the paper documents to be recognized. However, considering the quality of the scanning effect, the quality of the paper document itself (such as printing quality, font clarity, font standardization, etc.),...


Application Information

IPC(8): G06F17/27, G06K9/34
CPC: G06F40/232, G06F40/289, G06F40/30, G06V10/26, G06V30/153, Y02D10/00
Inventor: 林康林路王慜骊安通鉴雷钧
Owner: SUNYARD SYST ENG