Tibetan language recognition method, device, and electronic device

A text recognition method, applied in the field of optical character recognition, that addresses the problem of low image recognition accuracy.

Active Publication Date: 2019-07-19
BEIJING HANVON DIGITAL TECH CO LTD +1

AI Technical Summary

Problems solved by technology

[0004] The Tibetan text recognition methods in the prior art therefore have at least one defect: low recognition accuracy on Tibetan text line images of poor image quality.


Examples


Embodiment 1

[0065] This embodiment provides a Tibetan recognition method. As shown in Figure 1, the method includes Steps 10 to 12.

[0066] Step 10: determine the target image blocks distributed sequentially in the text line image to be recognized, and the overlapping information of the target image blocks.

[0067] A large body of documents, ancient books, and scriptures is recorded in Tibetan, and these Tibetan resources are generally preserved in the form of woodblocks. Because of their age and the characteristics of woodcut text itself, Tibetan woodblock images obtained by photographing or scanning typically have poor image quality, blurred text, and heavy noise interference, and many characters touch one another. Syllable separators and single vertical marks are relatively narrow and easily touch adjacent characters; because they are narrower than other characters, they are difficult to recognize once they touch. Therefore, if the Tibe...
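Step 10 can be illustrated with a minimal sketch. The visible text does not state how the target image blocks are chosen, so the fixed-width sliding window below is only an assumption, and the names `Block`, `split_line`, `block_width`, and `stride` are hypothetical. The sketch works on column indices only and records, for each block, how many pixel columns it shares with the previous block as its overlapping information.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Block:
    start: int         # left column of the block (inclusive)
    end: int           # right column of the block (exclusive)
    overlap_prev: int  # number of columns shared with the previous block


def split_line(width: int, block_width: int, stride: int) -> List[Block]:
    """Cover a text line image of `width` columns with sequential blocks.

    Assumption (not from the patent text): blocks have a fixed width and
    advance by a fixed stride, so adjacent blocks overlap by
    block_width - stride columns, except where the last block is clipped.
    """
    if not 0 < stride <= block_width:
        raise ValueError("stride must satisfy 0 < stride <= block_width")
    if width <= 0:
        return []
    blocks: List[Block] = []
    start = 0
    while True:
        end = min(start + block_width, width)  # clip at the right edge
        prev_end = blocks[-1].end if blocks else start
        blocks.append(Block(start, end, max(0, prev_end - start)))
        if end >= width:  # the line is fully covered
            break
        start += stride
    return blocks


if __name__ == "__main__":
    for block in split_line(width=100, block_width=40, stride=30):
        print(block)
```

A smaller stride gives larger overlaps, which leaves more shared context for merging the per-block results later, at the cost of recognizing more blocks.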

Embodiment 2

[0111] This embodiment provides a Tibetan recognition method. As shown in Figure 4, the method includes Steps 40 to 46.

[0112] Step 40: determine the target image blocks distributed sequentially in the text line image to be recognized, and the overlapping information of each target image block.

[0113] For a specific implementation manner of determining the sequentially distributed target image blocks in the text line image to be recognized and the overlapping information of each target image block, refer to Embodiment 1, and details will not be repeated in this embodiment.

[0114] Step 41: Recognize each of the above-mentioned target image blocks by using a preset first text string recognition model, and determine the text recognition results of each of the above-mentioned target image blocks.
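As a rough illustration of Step 41, the sketch below applies a string recognition model to each target image block in turn. The first text string recognition model itself is not described in the visible text, so it is represented here by an arbitrary callable; `crop_columns`, `recognize_blocks`, and the toy model in the usage part are hypothetical names introduced only for this example.

```python
from typing import Callable, List, Sequence, Tuple

Image = List[List[int]]  # rows of grayscale pixel values


def crop_columns(image: Image, start: int, end: int) -> Image:
    """Return the sub-image covering columns [start, end) of every row."""
    return [row[start:end] for row in image]


def recognize_blocks(
    image: Image,
    spans: Sequence[Tuple[int, int]],
    model: Callable[[Image], str],
) -> List[str]:
    """Run the string recognition model on each block, in left-to-right order.

    `model` stands in for the preset first text string recognition model;
    any callable that maps a block image to a text string can be used.
    """
    return [model(crop_columns(image, start, end)) for start, end in spans]


if __name__ == "__main__":
    # Toy stand-in model: reports the block width instead of recognizing text.
    toy_model = lambda block: f"<{len(block[0])} cols>"
    line_image = [[0] * 6 for _ in range(2)]  # 2 rows, 6 columns
    print(recognize_blocks(line_image, [(0, 4), (2, 6)], toy_model))
```

Each block is passed to the model as a whole, so no single-character segmentation is needed before recognition.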

[0115] For a specific implementation manner of recognizing each of the above-mentioned target image blocks by using the preset first text string recognition model, and ...

Embodiment 3

[0151] Correspondingly, this application also discloses a Tibetan recognition device. As shown in Figure 6, the device includes:

[0152] The target image block information determination module 60, which is used to determine the target image blocks distributed sequentially in the text line image to be recognized and the overlapping information of the target image blocks;

[0153] The target image block recognition module 61, which is used to recognize each of the target image blocks by using the preset first text string recognition model, and to determine the text recognition result of each of the target image blocks;

[0154] The recognition result integration module 62 is configured to integrate the text recognition results of each target image block according to the overlapping information of the target image block, and determine the text recognition result of the to-be-recognized text line image.
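The integration performed by module 62 can be sketched as follows. The visible text does not say how the overlapping information is used during integration, so this example makes an assumption: the overlap between adjacent blocks is expressed as the maximum number of characters that may appear in both results, and the merge drops the longest duplicated run within that limit. The names `merge_pair` and `integrate` are hypothetical.

```python
from typing import Sequence


def merge_pair(left: str, right: str, max_overlap: int) -> str:
    """Join two block results, dropping characters duplicated at the seam.

    Looks for the longest suffix of `left` that equals a prefix of `right`,
    up to `max_overlap` characters, and keeps that run only once.
    """
    limit = min(max_overlap, len(left), len(right))
    for k in range(limit, 0, -1):
        if left[-k:] == right[:k]:
            return left + right[k:]
    return left + right  # no duplicated run found: plain concatenation


def integrate(texts: Sequence[str], max_overlaps: Sequence[int]) -> str:
    """Merge per-block results into one text line result.

    max_overlaps[i] bounds the duplicated characters between block i and
    block i + 1 (an assumed form of the overlapping information).
    """
    if not texts:
        return ""
    if len(max_overlaps) != len(texts) - 1:
        raise ValueError("need exactly one overlap bound per adjacent pair")
    result = texts[0]
    for text, bound in zip(texts[1:], max_overlaps):
        result = merge_pair(result, text, bound)
    return result


if __name__ == "__main__":
    print(integrate(["abcd", "cdef", "efgh"], [2, 2]))
```

This simple rule can wrongly merge a character that legitimately repeats across a block boundary, and it fails when the two blocks disagree inside the overlap; a practical system would need a more robust alignment.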

[0155] Optionally, as shown in Figure 7, the target image block information deter...


Abstract

The invention provides a Tibetan language recognition method, which belongs to the technical field of optical character recognition and solves the problem of low Tibetan recognition accuracy in the prior art. The method comprises: determining the target image blocks that are sequentially distributed in a text line image to be recognized, together with the overlapping information of the target image blocks; recognizing each target image block through a preset first text string recognition model, and determining a text recognition result for each target image block; and integrating the text recognition results of the target image blocks according to their overlapping information to determine the text recognition result of the text line image to be recognized. With the disclosed method, the Tibetan text line image does not need to be segmented into individual characters; instead, it is recognized in the form of image blocks by the string recognition model, which can effectively improve recognition accuracy.

Description

Technical field

[0001] The present application relates to the technical field of optical character recognition, and in particular to a Tibetan recognition method, device, and electronic device.

Background

[0002] The Tibetan image recognition technology in the prior art is mainly aimed at modern printed Tibetan images, and the general recognition process is as follows: first, the image is preprocessed with operations such as grayscale conversion, binarization, and denoising; then, character segmentation and normalization are performed on the Tibetan text line image, and the features of each single character are extracted; after that, the single-character features are sent to the single-character recognition core to obtain single-character recognition results; finally, the single-character recognition results are post-processed to obtain the text line recognition result. The single-character recognition core adopts the traditional patt...


Application Information

Patent Type & Authority: Applications (China)
IPC(8): G06K9/00, G06K9/32, G06K9/34
CPC: G06V30/40, G06V10/242, G06V30/153, G06V10/267
Inventor: 尼玛扎西, 韦秋华, 刘正珍, 拥措, 洛桑嘎登
Owner: BEIJING HANVON DIGITAL TECH CO LTD