Method and device for cutting Chinese and English mixed text images

A cutting method and cutting device, applied in the field of image processing, that address the low recognition accuracy of Chinese characters in mixed text, improving recognition accuracy and reducing misreading.

Status: Inactive; Publication Date: 2016-02-24
深圳深讯和科技有限公司

AI Technical Summary

Problems solved by technology

[0003] Therefore, in the traditional technology, the recognition accuracy of Chinese characters in text line images is low.


Embodiment Construction

[0039] In one embodiment, as shown in Figure 1, a method for cutting Chinese and English mixed text images includes:

[0040] Step S102, acquiring the text line image area, and acquiring the line height of the text line image area.

[0041] The text line image area is the connected domain of a single line of text in a binary image. Before OCR (optical character recognition) is performed on images such as business card photos and text scans, the text line image areas in the binarized image are usually extracted first, and each text line image area is then processed further for recognition. The line height of an extracted text line image area is the height of its connected domain, that is, the height of the tallest character in the text line image area. For example, the line height of the text line image area "Access" is the height of the character "A".
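The line-height definition above can be illustrated with a minimal sketch. It assumes the text line image area is given as a list of pixel rows with 1 for ink and 0 for background; the function name `line_height` and this representation are illustrative assumptions, not part of the patent text.

```python
def line_height(binary):
    """Return the height of the ink extent in a binary text-line image.

    `binary` is a list of rows; each row is a list of 0 (background) or 1 (ink).
    This representation is an assumption made for illustration.
    """
    # Indices of rows that contain at least one ink pixel
    rows_with_ink = [y for y, row in enumerate(binary) if any(row)]
    if not rows_with_ink:
        return 0
    # Height spans from the topmost to the bottommost inked row
    return rows_with_ink[-1] - rows_with_ink[0] + 1


# A tall character (3 rows) next to a shorter one (2 rows):
# the line height follows the tallest character.
image = [
    [1, 0, 0],
    [1, 0, 1],
    [1, 0, 1],
]
print(line_height(image))  # 3
```

As in the "Access" example, the shorter character does not affect the result; only the tallest ink extent in the line does.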

[0042] Step S104, segmenting the image area of the text...

Abstract

The invention relates to a method for segmenting Chinese and English mixed text images. The method includes acquiring a text line image area and acquiring the line height of the text line image area; splitting the text line image area by projection to obtain character blocks; acquiring the block heights and block widths of the character blocks; and extracting Chinese character areas according to the line height, the block heights and the block widths. In addition, the invention discloses a device for segmenting Chinese and English mixed text images. The method and the device can increase recognition accuracy.
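The projection and measurement steps named in the abstract might look roughly like the following sketch. It assumes a vertical projection (ink count per column) with blocks separated by empty columns, and the same 0/1 list-of-rows image representation; the names `split_by_projection` and `_measure` are hypothetical. The rule that extracts Chinese character areas from the line height, block heights and block widths is not given in this excerpt, so it is not sketched here.

```python
def split_by_projection(binary):
    """Split a binary text-line image into character blocks.

    Assumes `binary` is a list of rows of 0 (background) / 1 (ink).
    Returns a list of dicts with keys x0, x1 (inclusive columns),
    width and height for each block.
    """
    if not binary:
        return []
    n_cols = len(binary[0])
    # Vertical projection: number of ink pixels in each column
    projection = [sum(row[x] for row in binary) for x in range(n_cols)]

    blocks = []
    start = None
    # A trailing 0 acts as a sentinel that closes a block touching the right edge
    for x, count in enumerate(projection + [0]):
        if count > 0 and start is None:
            start = x  # a new block begins at the first inked column
        elif count == 0 and start is not None:
            blocks.append(_measure(binary, start, x - 1))
            start = None
    return blocks


def _measure(binary, x0, x1):
    """Measure block width and height for columns x0..x1 (inclusive)."""
    rows = [y for y, row in enumerate(binary) if any(row[x0:x1 + 1])]
    return {
        "x0": x0,
        "x1": x1,
        "width": x1 - x0 + 1,
        "height": rows[-1] - rows[0] + 1,
    }


image = [
    [1, 1, 0, 0, 0],
    [1, 1, 0, 1, 0],
    [1, 1, 0, 1, 0],
]
for block in split_by_projection(image):
    print(block)
```

Each block's width and height can then be compared against the line height of the text line image area, which is the input the abstract describes for the Chinese character extraction step.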

Description

Technical field

[0001] The present invention relates to the field of image processing, and in particular to a method and device for cutting Chinese and English mixed text images.

Background

[0002] A text line image is a rectangular image whose content is a line of text, for example the image area containing text information in a business card image captured by business card recognition software. In the traditional technology, when a text image with mixed Chinese and English characters is recognized, Chinese characters are often misread because of their radicals. For example, if the Chinese character "引" is passed directly to OCR software, there is a high probability that it will be misrecognized as the Chinese character "别" and the English character "I".

[0003] Therefore, in the traditional technology, the recognition accuracy of Chinese characters in text line images is low.

Contents of ...

Application Information

Patent Type & Authority: Patents (China)
IPC(8): G06K9/34, G06T7/00
Inventor: 李冰, 陈小平, 肖方明, 汪利
Owner: 深圳深讯和科技有限公司