Method and equipment for generating OCR image recognition model training data

An image recognition and model training technology, applied in the fields of character recognition, character and pattern recognition, instruments, etc., that addresses the problem of low OCR image recognition accuracy.
CN112508000A · Active · Publication Date: 2021-03-16 · SHANGHAI ZHANWAN INFORMATION SCI & TECH CO LTD

Patent Information

Authority / Receiving Office
CN · China
Patent Type
Applications(China)
Current Assignee / Owner
SHANGHAI ZHANWAN INFORMATION SCI & TECH CO LTD
Publication Date
2021-03-16


Abstract

The invention provides a method and device for generating training data for an OCR image recognition model. The method comprises the following steps: obtaining an OCR image, wherein the OCR image comprises the names and parameter values of one or more parameters; cutting the OCR image into a plurality of parameter pictures based on the parameters; identifying each character in each parameter picture; based on a preset character database and a preset annotation database, sequentially splicing the character pictures corresponding to the characters and combining the annotations of the character pictures in sequence, to obtain a spliced picture of the parameter picture and the annotation of that spliced picture; traversing each parameter picture, sequentially splicing the obtained spliced pictures to determine the spliced picture corresponding to the OCR image, and sequentially combining the annotation sequences of the obtained spliced pictures to determine the annotation of the spliced picture corresponding to the OCR image; and taking the spliced picture corresponding to the OCR image, together with its annotation, as training data for the OCR image recognition model. With this method, high-quality training data can be obtained.
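
The splicing steps described in the abstract can be illustrated with a minimal Python sketch. It is not the patented implementation: it assumes the OCR image has already been cut into parameters and that the characters of each parameter have already been identified, and it represents pictures as plain nested lists of pixel values. The names `CHAR_DB`, `ANNOTATION_DB`, `splice_horizontally`, `build_parameter_sample` and `build_training_sample` are hypothetical and chosen only for illustration.

```python
import random


def splice_horizontally(pictures):
    """Concatenate pictures (lists of pixel rows) left to right."""
    if not pictures:
        raise ValueError("nothing to splice")
    height = len(pictures[0])
    if any(len(p) != height for p in pictures):
        raise ValueError("all pictures must have the same height")
    return [sum((p[row] for p in pictures), []) for row in range(height)]


def build_parameter_sample(chars, char_db, annotation_db, rng):
    """Splice one character picture per identified character, in order,
    and collect the matching annotations in the same order."""
    glyphs = [rng.choice(char_db[c]) for c in chars]  # pick one stored picture
    annotations = [annotation_db[c] for c in chars]
    return splice_horizontally(glyphs), annotations


def build_training_sample(parameters, char_db, annotation_db, rng=None):
    """Traverse every parameter, then splice the per-parameter results
    into one picture with one combined annotation sequence."""
    rng = rng or random.Random(0)  # fixed seed so the sketch is repeatable
    pictures, annotations = [], []
    for chars in parameters:
        picture, labels = build_parameter_sample(chars, char_db, annotation_db, rng)
        pictures.append(picture)
        annotations.extend(labels)
    return splice_horizontally(pictures), annotations


# Hypothetical preset databases: tiny 2-row bitmaps stand in for character pictures.
CHAR_DB = {
    "T": [[[1, 1, 1], [0, 1, 0]]],
    "3": [[[1, 1], [1, 1]]],
    "7": [[[1, 1], [0, 1]]],
}
ANNOTATION_DB = {"T": "T", "3": "3", "7": "7"}

# One parameter name ("T") followed by its value ("37").
picture, labels = build_training_sample([["T"], ["3", "7"]], CHAR_DB, ANNOTATION_DB)
```

In this sketch the annotation of the final picture is simply the ordered list of per-character annotations, so the label stays aligned with the spliced pixels by construction. A real pipeline would use actual image data and would likely need to normalize character heights before splicing.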

Description

Technical field

[0001] The present application relates to the technical field of computer image processing, in particular to a technique for generating training data for an OCR image recognition model.

Background art

[0002] OCR image recognition technology obtains the text and image information on paper through optical input methods such as scanning and photography, and uses pattern recognition algorithms to analyze the morphological characteristics of the text. In this way, bills, newspapers, books, manuscripts and other printed materials are converted into text and image information, which image recognition technology then turns into input that a computer can use. The final recognition rate, recognition speed, accuracy of layout understanding and satisfaction with layout restoration are usually used as the criteria for evaluating OCR image recognition technology.

[0003] Most of the core algorithms of the existing OCR image recognition technolog...

Claims
