Method and equipment for generating OCR image recognition model training data

An image recognition and model training technology, applied in character recognition, character and pattern recognition, instruments, etc., that can solve the problem of low OCR image recognition accuracy.

Active Publication Date: 2021-03-16
SHANGHAI ZHANWAN INFORMATION SCI & TECHCO LTD

AI Technical Summary

Problems solved by technology

[0004] The purpose of this application is to provide a method and equipment for generating OCR image recognition model training data, so as to solve the technical problem of low OCR image recognition accuracy in the existing industrial control field.



Embodiment Construction

[0065] The present invention will be described in further detail below in conjunction with the accompanying drawings.

[0066] In a typical configuration of the present application, each module and trusted party of the system includes one or more processors (CPU), input/output interfaces, network interfaces and memory.

[0067] Memory may include non-permanent storage in computer-readable media, in the form of random access memory (RAM) and/or nonvolatile memory such as read only memory (ROM) or flash RAM. Memory is an example of computer-readable media.

[0068] Computer-readable media, including both permanent and non-permanent, removable and non-removable media, can be implemented by any method or technology for storage of information. Information may be computer readable instructions, data structures, modules of a program, or other data. Examples of computer storage media include, but are not limited to, phase change memory (PRAM), static random access memory (SRAM), dynam...


Abstract

The invention provides a method and device for generating OCR image recognition model training data. The method comprises the steps of: obtaining an OCR image, wherein the OCR image comprises the names and parameter values of one or more parameters; cutting the OCR image into a plurality of parameter pictures based on the parameters; identifying each character in each parameter picture; based on a preset character database and a preset annotation database, sequentially splicing the character pictures corresponding to the characters and combining the annotation sequences of the character pictures, to obtain a spliced picture of the parameter picture and the annotation of that spliced picture; traversing each parameter picture, sequentially splicing the obtained spliced pictures to determine the spliced picture corresponding to the OCR image, and sequentially combining the obtained annotation sequences to determine the annotation of the spliced picture corresponding to the OCR image; and determining the spliced picture corresponding to the OCR image, together with its annotation, as the training data of the OCR image recognition model. Through this method, high-quality training data can be obtained.
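The splicing steps of the abstract can be illustrated with a minimal sketch. It is not the applicant's implementation: it assumes that pictures are small 2D lists of pixel values, that the preset character database maps each character to one or more equal-height character pictures, that the preset annotation database maps each character to a label, and that splicing means horizontal concatenation. The names `hconcat`, `splice_parameter`, `build_training_sample`, `char_db` and `annotation_db` are hypothetical, and the cutting and character identification steps are taken as already done.

```python
import random


def hconcat(pictures):
    """Join equal-height pictures (lists of pixel rows) side by side."""
    if not pictures:
        return []
    height = len(pictures[0])
    if any(len(p) != height for p in pictures):
        raise ValueError("all pictures must have the same height")
    return [sum((p[r] for p in pictures), []) for r in range(height)]


def splice_parameter(chars, char_db, annotation_db, rng):
    """Splice one parameter: pick a character picture per identified
    character and combine the matching annotations in the same order."""
    pictures = [rng.choice(char_db[c]) for c in chars]
    annotations = [annotation_db[c] for c in chars]
    return hconcat(pictures), annotations


def build_training_sample(parameters, char_db, annotation_db, seed=0):
    """Traverse every parameter, then splice the per-parameter results
    into one picture and one annotation sequence for the whole image."""
    rng = random.Random(seed)
    spliced, labels = [], []
    for chars in parameters:
        picture, annotation = splice_parameter(chars, char_db, annotation_db, rng)
        spliced.append(picture)
        labels.extend(annotation)
    return hconcat(spliced), labels


# Hypothetical 2x2 character pictures; "5" has two stored variants.
char_db = {
    "T": [[[1, 1], [0, 1]]],
    "2": [[[1, 0], [1, 1]]],
    "5": [[[0, 1], [1, 0]], [[0, 1], [1, 1]]],
}
annotation_db = {"T": 0, "2": 1, "5": 2}

# Characters already identified in two parameter pictures: "T" and "25".
picture, labels = build_training_sample(["T", "25"], char_db, annotation_db)
print(labels)
for row in picture:
    print(row)
```

Drawing each character picture at random from the database is one possible way to vary the generated samples; the abstract itself only states that the character pictures are spliced in sequence.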

Description

Technical field

[0001] The present application relates to the technical field of computer image processing, and in particular to a technique for generating training data for an OCR image recognition model.

Background technique

[0002] OCR image recognition technology obtains the text and image information on paper through optical input methods such as scanning and photography, and uses various pattern recognition algorithms to analyze the morphological characteristics of the text, so that bills, newspapers, books, manuscripts and other printed materials can be converted into text and image information. Image recognition technology is then used to convert this text and image information into a form usable as computer input. Usually, the final recognition rate, recognition speed, accuracy of layout understanding and satisfaction with layout restoration are used as the basis for evaluating OCR image recognition technology.

[0003] Most of the core algorithms of the existing OCR image recognition technolog...


Application Information

Patent Type & Authority: Applications (China)
IPC(8): G06K9/20, G06K9/36, G06K9/62
CPC: G06V10/22, G06V10/20, G06V30/10, G06F18/214
Inventor: 唐栎, 谢利如
Owner: SHANGHAI ZHANWAN INFORMATION SCI & TECHCO LTD