OCR (Optical Character Recognition) result processing method and device, equipment and storage medium

A technique for processing OCR recognition results, applied in character and pattern recognition, instruments, and computing, that addresses problems such as high labor cost, misaligned recognition results, and low efficiency.

Pending Publication Date: 2022-05-17
上海微问家信息技术有限公司

AI Technical Summary

Problems solved by technology

Existing OCR technology outputs only editable strings together with the text box and coordinate position of each string; the result contains no structural information. To enter the recognition results, one must usually either establish a series of rules to filter the items or type them in manually. The former is not robust: there is currently no fully effective rule set for screening the various kinds of information, so recognition results are prone to misalignment. The latter is inefficient and consumes a great deal of time and labor.

Examples

Embodiment 1

[0049] This embodiment provides a method for processing OCR recognition results. As shown in Figure 1, the method includes the following steps:

[0050] S101. Acquire an OCR recognition result of a target image, where the OCR recognition result includes several text boxes, text content in each text box, and coordinate information of each text box.

[0051] In a specific implementation, text in the target image is recognized using OCR technology to obtain the OCR recognition result of the target image. The OCR recognition result includes several text boxes, the text content in each text box, and the coordinate information of each text box. The coordinate information is plane coordinate information comprising an X-axis coordinate value and a Y-axis coordinate value. The coordinate information of a text box includes the coordinates of a pair of diagonal points of the text box. The two diagonal points can be selected from the upper left corner a...
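The data described in step S101 can be sketched as a small record type. This is a minimal illustration, not the patent's implementation: the names `TextBox`, `corner_a`, `corner_b` and `center` are hypothetical, and the sample coordinates are invented. The center of a box is taken as the midpoint of its two diagonal points, which is the center point the abstract refers to.

```python
from dataclasses import dataclass
from typing import Tuple

Point = Tuple[float, float]  # (x, y) in image plane coordinates


@dataclass(frozen=True)
class TextBox:
    """One OCR text box: its text plus a pair of diagonal corner points."""
    text: str
    corner_a: Point  # e.g. upper-left corner (assumed)
    corner_b: Point  # e.g. lower-right corner (assumed)

    def center(self) -> Point:
        # The midpoint of a diagonal is the center of the rectangle,
        # whichever of the two diagonals was supplied.
        (x1, y1), (x2, y2) = self.corner_a, self.corner_b
        return ((x1 + x2) / 2.0, (y1 + y2) / 2.0)


# Invented sample box for illustration only.
box = TextBox("Name", (10.0, 20.0), (110.0, 50.0))
print(box.center())  # (60.0, 35.0)
```

Storing only two diagonal points keeps the record compact, at the cost of assuming axis-aligned boxes.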

Embodiment 2

[0073] This embodiment provides an OCR recognition result processing device. As shown in Figure 2, the device includes an acquisition unit, a classification unit, a determination unit, a first calculation unit, a second calculation unit, an extraction unit and a display unit, wherein:

[0074] The acquisition unit is configured to acquire an OCR recognition result of the target image, the OCR recognition result including several text boxes, the text content in each text box, and the coordinate information of each text box;

[0075] The classification unit is configured to import the text content of each text box into the trained text recognition model to obtain a classification result for each text content;

[0076] The determination unit is configured to determine, according to the classification result, whether each text content is first-type text or second-type text, and to determine that the text box corresponding to first-type text is a first-type text box,...
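The work of the classification and determination units can be sketched as a partition of the text boxes by label. This is a hedged sketch only: the source does not describe the trained model, so `toy_classifier` is a made-up stand-in rule (text ending in a colon counts as first-type), and `split_by_type`, the label strings and the sample boxes are all hypothetical.

```python
from typing import Callable, List, Tuple

Point = Tuple[float, float]
Box = Tuple[str, Point, Point]  # (text, diagonal corner 1, diagonal corner 2)


def split_by_type(
    boxes: List[Box], classify: Callable[[str], str]
) -> Tuple[List[Box], List[Box]]:
    """Partition boxes into first-type and second-type lists by text label."""
    first: List[Box] = []
    second: List[Box] = []
    for box in boxes:
        label = classify(box[0])
        if label == "first":
            first.append(box)
        elif label == "second":
            second.append(box)
        else:
            raise ValueError(f"unexpected label {label!r} for text {box[0]!r}")
    return first, second


def toy_classifier(text: str) -> str:
    # Stand-in for the trained text recognition model (an assumption,
    # not the patent's classifier): a trailing colon marks first-type text.
    return "first" if text.rstrip().endswith(":") else "second"


sample = [
    ("Name:", (0.0, 0.0), (40.0, 10.0)),
    ("Alice", (50.0, 0.0), (100.0, 10.0)),
]
first_boxes, second_boxes = split_by_type(sample, toy_classifier)
```

Passing the classifier in as a callable keeps the partition logic independent of whichever model is actually trained.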

Embodiment 3

[0083] This embodiment provides an OCR recognition result processing device. As shown in Figure 3, at the hardware level the device includes:

[0084] A memory for storing instructions;

[0085] A processor configured to read the instructions stored in the memory and, according to the instructions, execute the OCR recognition result processing method described in Embodiment 1.

[0086] Optionally, the computer device also includes an internal bus and a communication interface. The processor, the memory and the communication interface can be connected to each other through the internal bus, which can be an ISA (Industry Standard Architecture) bus, a PCI (Peripheral Component Interconnect) bus, an EISA (Extended Industry Standard Architecture) bus, etc. The bus can be divided into an address bus, a data bus, a control bus and so on.

[0087] The memory may include,...

Abstract

The invention relates to the technical field of OCR recognition processing, and in particular to an OCR recognition result processing method, device, equipment and storage medium. In the method, the OCR recognition result of a target image is imported into a text recognition model for classification, which determines the first-type text, the second-type text, and the corresponding first-type and second-type text boxes. The first center point coordinates or second center point coordinates of each text box are determined, and a proximity algorithm is used to calculate the nearest second center point coordinates for each first center point coordinates. The first-type text corresponding to the first center point coordinates and the second-type text corresponding to the second center point coordinates are then extracted, and each first-type text is paired with its associated second-type text for output and display. In this way, the OCR recognition result of the target image can be efficiently classified and extracted, and the classified texts accurately matched, so that mutually associated text contents are output and displayed in a structured manner.
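The center-point pairing summarized in the abstract can be sketched as follows. The abstract names only "a proximity algorithm", so a brute-force Euclidean nearest-neighbor search is assumed here; the function names `center` and `pair_nearest`, the tuple layout and the sample boxes are hypothetical.

```python
import math
from typing import List, Optional, Tuple

Point = Tuple[float, float]
Box = Tuple[str, Point, Point]  # (text, diagonal corner 1, diagonal corner 2)


def center(box: Box) -> Point:
    """Center point of a box, taken as the midpoint of its diagonal."""
    _, (x1, y1), (x2, y2) = box
    return ((x1 + x2) / 2.0, (y1 + y2) / 2.0)


def pair_nearest(
    first_boxes: List[Box], second_boxes: List[Box]
) -> List[Tuple[str, Optional[str]]]:
    """Pair each first-type text with the second-type text whose
    center point is nearest (Euclidean distance, assumed metric)."""
    pairs: List[Tuple[str, Optional[str]]] = []
    for f in first_boxes:
        if not second_boxes:
            pairs.append((f[0], None))  # nothing to pair with
            continue
        fc = center(f)
        nearest = min(second_boxes, key=lambda s: math.dist(fc, center(s)))
        pairs.append((f[0], nearest[0]))
    return pairs


# Invented sample layout: two labels on the left, two values on the right.
first = [("Name:", (0.0, 0.0), (40.0, 10.0)), ("City:", (0.0, 20.0), (40.0, 30.0))]
second = [("Alice", (50.0, 0.0), (100.0, 10.0)), ("Shanghai", (50.0, 20.0), (100.0, 30.0))]
pairs = pair_nearest(first, second)
print(pairs)  # [('Name:', 'Alice'), ('City:', 'Shanghai')]
```

This sketch compares every first-type center against every second-type center, and it allows several first-type texts to map to the same second-type text; whether the patented method restricts either behavior is not stated in the visible text.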

Description

Technical field

[0001] The present invention relates to the technical field of OCR recognition processing, and in particular to a method, device, equipment and storage medium for processing an OCR recognition result.

Background art

[0002] OCR (Optical Character Recognition) technology mainly recognizes text in an image as editable character strings. Early OCR technology mainly recognized simple document images. With the development of deep learning, OCR technology is now widely used for image text recognition in various complex scenes.

[0003] At present, the text recognition results for form images are lines of text with corresponding coordinate positions. These unstructured results must be edited and sorted with additional rules tailored to each application scenario before a structured display result is obtained. The result recognized by the existing OCR technolo...

Application Information

IPC(8): G06V30/40, G06V30/413, G06V30/418, G06V30/16, G06V10/764, G06K9/62
CPC: G06F18/24143
Inventor: 杨沅霖, 孙安国
Owner: 上海微问家信息技术有限公司