OCR identification method based on template matching

A template-matching recognition method, applied in the field of image recognition, that addresses problems such as difficult character recognition, very large model files, and heavy model computation, and achieves high recognition efficiency and reliability with a simple calculation process.

Inactive. Publication Date: 2017-05-17
成都数联铭品科技有限公司

AI Technical Summary

Problems solved by technology

When the characters in an image touch one another, or the image contains Chinese characters with a left-right structure, a simple projection method rarely segments them well. For this reason segmentation has always been a difficult point in OCR, and the quality of segmentation directly affects the quality of text recognition.
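The two failure cases can be reproduced on a toy image. The sketch below is only an illustration under stated assumptions: the binary image, the glyph shapes, and the helper names (`parse`, `column_projection`, `split_on_gaps`) are invented for the example and are not taken from the patent.

```python
def parse(rows):
    """Turn '#'/'.' strings into a binary image (1 = ink, 0 = background)."""
    return [[1 if ch == "#" else 0 for ch in row] for row in rows]

def column_projection(img):
    """Count the ink pixels in every column of a binary image."""
    return [sum(col) for col in zip(*img)]

def split_on_gaps(profile):
    """Return (start, end) column ranges in which the profile is non-zero."""
    spans, start = [], None
    for x, count in enumerate(profile):
        if count and start is None:
            start = x                    # a run of ink columns begins
        elif not count and start is not None:
            spans.append((start, x))     # an empty column closes the run
            start = None
    if start is not None:
        spans.append((start, len(profile)))
    return spans

# Hypothetical text line with three true characters:
#   columns 0-4  : one left-right structured character (internal gap at column 2)
#   columns 6-8  : a character that touches its neighbour
#   columns 9-11 : the neighbour
line = parse([
    "##.##.######",
    "#..##.#.##.#",
    "##.#..######",
])

spans = split_on_gaps(column_projection(line))
print(spans)
```

The projection returns the spans (0, 2), (3, 5) and (6, 12): the left-right character is cut in two at its internal gap, and the two touching characters come out as a single span, so neither boundary matches the true characters.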
[0005] In addition, the main function of optical character recognition software is to recognize characters in photographed and scanned images. Some scanned or photographed material, such as early printed books, official seals, and certificates issued by government bodies, uses specially made fonts, whether for historical reasons or for confidentiality and security needs. Existing optical character recognition software relies mainly on machine learning. The models are computationally heavy and, because the training font samples do not cover these special fonts, recognition accuracy on them is low, which seriously hinders the digitization of paper documents.
[0006] Most existing techniques recognize characters with neural-network machine learning algorithms, which require producing a large number of samples and a long training time, and the resulting model files are very large. The recognition rate also varies from font to font: it is relatively low for some special fonts, which makes it hard to meet the needs of character recognition in some special scenarios.

Examples

Embodiment 1

[0083] For example, when recognizing text in an image, the text is judged by inspection to be closest to the Microsoft YaHei font. After the image is binarized, the text is split into lines by row projection. Column projection is then performed on each line image to find the initial segmentation points, and the line is preliminarily segmented at those points into sub-images. The sub-images of numbers, letters and punctuation are extracted with a rule, which can be chosen as follows: sub-image width L < 0.4h, where h is the row height. After the number, letter and punctuation sub-images have been identified and marked (the mark records only the type of the sub-image and does not identify the specific character), only one sub-image is selected for each distinct number, letter or punctuation mark to prepare a template (the image can be selected manually, numbers:...
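The segmentation and marking steps of this embodiment can be sketched as follows. This is a minimal illustration, not the patent's implementation: the fixed binarization threshold of 128, the type labels, the function names, and the synthetic test page are all assumptions made for the example. Only the L < 0.4h width rule comes from the text.

```python
def binarize(gray, threshold=128):
    """Dark pixels become ink (1). The fixed threshold is an assumption."""
    return [[1 if px < threshold else 0 for px in row] for row in gray]

def runs(profile):
    """Return (start, end) index ranges where a projection profile is non-zero."""
    spans, start = [], None
    for i, count in enumerate(profile):
        if count and start is None:
            start = i
        elif not count and start is not None:
            spans.append((start, i))
            start = None
    if start is not None:
        spans.append((start, len(profile)))
    return spans

def segment(binary, ratio=0.4):
    """Split into lines by row projection, then into sub-images by column
    projection, and mark each sub-image by type using the width rule."""
    boxes = []
    for top, bottom in runs([sum(row) for row in binary]):
        line = binary[top:bottom]
        height = bottom - top                      # h: row height
        for left, right in runs([sum(col) for col in zip(*line)]):
            width = right - left                   # L: sub-image width
            kind = "alnum_or_punct" if width < ratio * height else "character"
            boxes.append((top, bottom, left, right, kind))
    return boxes

# Synthetic grayscale page (hypothetical): solid blocks stand in for glyphs.
WHITE, BLACK = 255, 0
page = [[WHITE] * 20 for _ in range(24)]

def fill(top, bottom, left, right):
    for y in range(top, bottom):
        for x in range(left, right):
            page[y][x] = BLACK

fill(1, 11, 1, 11)    # line 1: 10 x 10 block, roughly square like a Chinese character
fill(1, 11, 13, 16)   # line 1: 3 wide, 10 tall, narrow like a digit or letter
fill(13, 23, 2, 5)    # line 2: another narrow glyph

boxes = segment(binarize(page))
for box in boxes:
    print(box)
```

On this page the row height is 10, so the rule threshold is 4 pixels: the 10-pixel-wide block is marked as a character and both 3-pixel-wide blocks are marked as number, letter or punctuation candidates. As in the text, the mark records only the type; identifying the specific character is left to the template comparison.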

Abstract

The invention relates to the field of image recognition and processing, in particular to an OCR recognition method based on template matching. The text image to be recognized is segmented into sub-images that each contain a single character, and the number, letter, punctuation and character sub-images are marked by type. Feature images are prepared in the font that corresponds to the text to be recognized. The sub-images to be recognized and the feature images of the corresponding type are normalized. Feature comparison is then performed: the number, letter and punctuation sub-images and the character sub-images to be recognized are each compared against the feature templates of the corresponding type, using an XOR of the pixels at the same position and counting the errors. The label of the feature image with the lowest error count is output as the recognition result. The method achieves image recognition with a simple calculation process, saves labor and material resources, and improves recognition efficiency.
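The comparison step described in the abstract can be sketched as below. The abstract specifies only normalization, a same-position XOR, an error count, and selection of the lowest count. The nearest-neighbour resize, the target size, the function names, and the tiny "L" and "I" templates are assumptions chosen for illustration.

```python
def parse(rows):
    """Turn '#'/'.' strings into a binary image (1 = ink, 0 = background)."""
    return [[1 if ch == "#" else 0 for ch in row] for row in rows]

def normalize(img, height=16, width=16):
    """Nearest-neighbour resize to a fixed size (one possible normalization)."""
    h, w = len(img), len(img[0])
    return [[img[y * h // height][x * w // width] for x in range(width)]
            for y in range(height)]

def xor_errors(a, b):
    """Number of positions where two equally sized binary images differ."""
    return sum(pa ^ pb for ra, rb in zip(a, b) for pa, pb in zip(ra, rb))

def recognize(sub_image, templates, height=16, width=16):
    """Return the label with the fewest XOR errors, plus all error counts."""
    target = normalize(sub_image, height, width)
    scores = {label: xor_errors(target, normalize(tpl, height, width))
              for label, tpl in templates.items()}
    return min(scores, key=scores.get), scores

# Hypothetical letter templates; templates of other types would be kept separately.
templates = {
    "L": parse(["#..", "#..", "#..", "#..", "###"]),
    "I": parse(["###", ".#.", ".#.", ".#.", "###"]),
}

# A sub-image to recognize: an "L" at twice the template size with one noisy pixel.
sample = parse([
    "##....", "##....",
    "##..#.", "##....",
    "##....", "##....",
    "##....", "##....",
    "######", "######",
])

label, scores = recognize(sample, templates, height=5, width=3)
print(label, scores)   # L {'L': 1, 'I': 9}
```

The noisy sample differs from the "L" template at one pixel and from the "I" template at nine, so "L" is output. Restricting `templates` to the type already marked during segmentation keeps each sub-image from being compared against templates of other types.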

Description

Technical field

[0001] The invention relates to the field of image recognition, in particular to an OCR recognition method based on template matching.

Background

[0002] With the development of society and the advancement of science and technology, the amount of knowledge created by humans is growing exponentially. Before electronic books emerged, most knowledge was passed down in the form of books. China has produced a large number of excellent books, many of which have been damaged to varying degrees over the course of history, so storing them digitally is urgent. Because the books are numerous and early printed books have no electronic manuscripts from their authors, the paper books themselves must be digitized.

[0003] Optical character recognition software is a powerful tool for converting paper books into electronic documents. It mainly uses a large number of character samples to generate corresponding model file...

Application Information

Patent Type & Authority: Applications (China)
IPC(8): G06K9/62, G06K9/34, G06K9/00
CPC: G06V30/40, G06V10/267, G06V10/751, G06V30/10, G06F18/22, G06F18/214
Inventor: 景亮康青杨唐涔轩刘世林
Owner: 成都数联铭品科技有限公司