Image text segmentation method

An image text segmentation technology, applied in character and pattern recognition, instruments, calculation, etc., that addresses the limited adaptability of existing methods, the poor quality of their binary images, and the lack of web page information for many images.

Inactive Publication Date: 2008-02-13
PEKING UNIV

AI Technical Summary

Problems solved by technology

Existing retrieval methods that rely on the text of the surrounding web page have two disadvantages: (1) it is very difficult to find text in the web page that accurately describes the content of the picture itself; (2) a large number of pictures have no corresponding web page information.
For text areas with a complex background and weak text contrast, the binary images produced by existing segmentation methods are of poor quality and contain considerable noise, so those methods are not widely applicable and the segmentation result needs to be improved.

Examples

Embodiment Construction

[0033] The present invention will be further described in detail below in conjunction with the accompanying drawings and specific embodiments.

[0034] In this embodiment, the text area in the picture is first detected by an image text detection method, and the text area picture is then converted into a binary text picture using the image text segmentation method of the present invention. The procedure includes the following steps:

[0035] 1. Image text detection, including:

[0036] (1) Merge the edge maps of the original image on multiple color components to obtain the cumulative edge map.

[0037] The cumulative edge map is obtained by merging the edge maps detected by the improved Sobel edge detection operator on each YUV component of the picture. The merging method is shown in formula 1, where E is the cumulative edge map, E_Y, E_U and E_V are the edge maps detected by the improved Sobel edge detection operator on the Y, U and V components of the picture, and E(x, y) is the...
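
The merging step in [0037] can be illustrated with a minimal sketch. Several things here are assumptions and not taken from the patent text: the standard 3x3 Sobel operator stands in for the "improved" operator, the RGB-to-YUV conversion uses the common BT.601 coefficients, and the merge rule (a per-pixel sum clipped to 255) is a guess, because formula 1 is not reproduced above. The function names are hypothetical.

```python
def rgb_to_yuv(pixel):
    """Convert one (r, g, b) pixel to (y, u, v) with BT.601 coefficients (assumed)."""
    r, g, b = pixel
    y = 0.299 * r + 0.587 * g + 0.114 * b
    u = -0.14713 * r - 0.28886 * g + 0.436 * b
    v = 0.615 * r - 0.51499 * g - 0.10001 * b
    return y, u, v


def sobel_magnitude(channel):
    """Standard 3x3 Sobel gradient magnitude; border pixels are left at 0.

    This is the plain operator, not the patent's improved variant.
    """
    h, w = len(channel), len(channel[0])
    out = [[0.0] * w for _ in range(h)]
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            gx = (channel[y - 1][x + 1] + 2 * channel[y][x + 1] + channel[y + 1][x + 1]
                  - channel[y - 1][x - 1] - 2 * channel[y][x - 1] - channel[y + 1][x - 1])
            gy = (channel[y + 1][x - 1] + 2 * channel[y + 1][x] + channel[y + 1][x + 1]
                  - channel[y - 1][x - 1] - 2 * channel[y - 1][x] - channel[y - 1][x + 1])
            out[y][x] = (gx * gx + gy * gy) ** 0.5
    return out


def cumulative_edge_map(rgb_image):
    """Merge the Y, U and V edge maps into one map E.

    Assumed merge rule: E(x, y) = min(255, E_Y + E_U + E_V).
    """
    h, w = len(rgb_image), len(rgb_image[0])
    planes = [[[0.0] * w for _ in range(h)] for _ in range(3)]
    for y in range(h):
        for x in range(w):
            for c, value in enumerate(rgb_to_yuv(rgb_image[y][x])):
                planes[c][y][x] = value
    e_y, e_u, e_v = (sobel_magnitude(plane) for plane in planes)
    return [[min(255.0, e_y[y][x] + e_u[y][x] + e_v[y][x]) for x in range(w)]
            for y in range(h)]
```

Summing the three maps lets an edge that is visible only in a chrominance component (for example, colored text on a background of equal brightness) still appear in E, which is presumably the motivation for working on all YUV components.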

Abstract

The invention provides an image text segmentation method that converts the text area images produced by image text detection into binary images that OCR software can recognize. It comprises the following steps: (I) selecting, for the text area image, the color component in which the characters are clearest; (II) binarizing the text area image on the basis of the color component selected in step (I); (III) de-noising the binary image obtained in step (II). The invention adaptively selects the color component best suited to binarization and thereby achieves a better binarization result; in addition, de-noising with a color-based clustering method yields a clearer, low-noise binary text image and a better image text recognition result.
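
Steps (I) and (II) of the abstract can be sketched as follows. This is only an illustration under stated assumptions: the abstract does not say how "clearest" is measured or which thresholding rule is used, so the sketch scores each of the R, G and B components by its Otsu between-class variance and binarizes the winner with the Otsu threshold. The choice of RGB, the scoring criterion, and the names `otsu` and `segment_text_region` are all hypothetical. Step (III), the color-clustering de-noising, is not sketched because the text above gives no detail about it.

```python
def otsu(values):
    """Return (threshold, between_class_variance) for integer values in 0..255."""
    hist = [0] * 256
    for v in values:
        hist[v] += 1
    total = len(values)
    sum_all = sum(level * count for level, count in enumerate(hist))
    best_t, best_var = 0, 0.0
    w0 = 0      # number of pixels with value <= t
    sum0 = 0    # sum of those pixel values
    for t in range(256):
        w0 += hist[t]
        sum0 += t * hist[t]
        w1 = total - w0
        if w0 == 0:
            continue
        if w1 == 0:
            break
        m0 = sum0 / w0
        m1 = (sum_all - sum0) / w1
        var = (w0 / total) * (w1 / total) * (m0 - m1) ** 2
        if var > best_var:
            best_t, best_var = t, var
    return best_t, best_var


def segment_text_region(region):
    """Pick the color component with the best separation, then binarize it.

    `region` is a 2D list of (r, g, b) integer pixels.
    Returns (component_index, threshold, binary_image), where binary pixels
    are 1 if the selected component exceeds the threshold and 0 otherwise.
    """
    flat = [px for row in region for px in row]
    scored = []
    for c in range(3):
        t, var = otsu([px[c] for px in flat])
        scored.append((var, c, t))
    # Assumed criterion: highest between-class variance means clearest characters.
    _, component, threshold = max(scored, key=lambda s: s[0])
    binary = [[1 if px[component] > threshold else 0 for px in row]
              for row in region]
    return component, threshold, binary
```

For example, with a background of (200, 100, 100) and text pixels of (20, 100, 110), the G component is constant and B differs only slightly, so this sketch selects R and separates text from background on it.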

Description

Technical field

[0001] The invention belongs to the technical field of image processing and retrieval, and in particular relates to a method for image text segmentation.

Background technique

[0002] With the rapid development of Internet technology and multimedia technology, the picture content on the Internet is growing explosively. How to quickly retrieve the desired picture from this massive picture content has become a key problem that urgently needs to be solved. The existing methods are mainly based on the text description in the web page corresponding to the picture and do not analyze the picture content in depth. These methods have the following disadvantages: (1) it is very difficult to find text in the web page that accurately describes the content of the picture itself; (2) a large number of pictures have no corresponding web page information. On the other hand, a large number of pictures contain tex...

Application Information

Patent Type & Authority: Applications (China)
IPC(8): G06K9/36, G06K9/40, G06K9/46
Inventor: 易剑, 彭宇新, 肖建国
Owner: PEKING UNIV