Method and device for extracting text from image

A technique for extracting text from images, applied in the field of image processing, that addresses the poor accuracy of existing approaches and improves extraction accuracy.

Active Publication Date: 2013-02-13
BEIJING BAIDU NETCOM SCI & TECH CO LTD

AI Technical Summary

Problems solved by technology

[0005] The technical problem to be solved by the present invention is to provide a method and device for extracting text li




Example Embodiment

[0045] In order to make the objectives, technical solutions and advantages of the present invention clearer, the present invention will be described in detail below with reference to the accompanying drawings and specific embodiments.

[0046] Please refer to Figure 1, which is a schematic flowchart of Embodiment 1 of the method for extracting text lines from an image according to the present invention. As shown in Figure 1, this embodiment includes:

[0047] Step S101: Binarize the image to obtain each connected domain of the image.

[0048] Step S102: Filter out the connected domains that do not satisfy the first statistical characteristic.

[0049] Step S103: Extract text lines in the image from each connected domain after filtering.
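Steps S102 and S103 can be illustrated with a minimal Python sketch that works on the bounding boxes of connected domains. The `Box` type, the numeric ranges standing in for the first statistical characteristic, and the vertical-overlap rule used to group boxes into lines are all assumptions made for illustration; the text shown here does not specify them.

```python
from typing import List, NamedTuple


class Box(NamedTuple):
    """Bounding box of one connected domain (hypothetical representation)."""
    x: int
    y: int
    w: int
    h: int


# Hypothetical acceptance ranges standing in for the "first statistical characteristic".
MIN_HEIGHT, MAX_HEIGHT = 8, 60
MIN_ASPECT, MAX_ASPECT = 0.2, 3.0


def looks_like_text(box: Box) -> bool:
    """Step S102 (sketch): keep a box only if its height and aspect ratio are in range."""
    if box.h <= 0:
        return False
    aspect = box.w / box.h
    return MIN_HEIGHT <= box.h <= MAX_HEIGHT and MIN_ASPECT <= aspect <= MAX_ASPECT


def vertical_overlap(a: Box, b: Box) -> int:
    """Number of rows shared by two boxes (negative when they do not overlap)."""
    return min(a.y + a.h, b.y + b.h) - max(a.y, b.y)


def extract_lines(boxes: List[Box]) -> List[List[Box]]:
    """Filter boxes, then group the survivors into lines from left to right."""
    kept = [b for b in boxes if looks_like_text(b)]
    lines: List[List[Box]] = []
    for box in sorted(kept, key=lambda b: b.x):
        for line in lines:
            last = line[-1]
            # Assumed grouping rule: join a line when at least half of the
            # shorter box overlaps vertically with the line's last box.
            if vertical_overlap(last, box) * 2 >= min(last.h, box.h):
                line.append(box)
                break
        else:
            lines.append([box])
    return lines


boxes = [
    Box(10, 10, 12, 20), Box(26, 12, 10, 18), Box(40, 11, 11, 19),  # first row
    Box(10, 50, 12, 20), Box(25, 51, 11, 19),                       # second row
    Box(0, 0, 200, 3),    # thin horizontal rule, rejected by height
    Box(100, 5, 2, 90),   # tall vertical stroke, rejected by height
]
lines = extract_lines(boxes)
print([len(line) for line in lines])
```

In this toy input the two non-text boxes are discarded before grouping, so they cannot merge into or split a text line, which is the motivation for filtering before extraction.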

[0050] The above method will be described in detail below.

[0051] Step S101 performs binarization processing on the image, which is a common technique in image preprocessing, and its purpose is to separate the text foreground area and the background ar...
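As a rough illustration of Step S101, the following Python sketch binarizes a small grayscale grid and collects its connected domains. The fixed threshold of 128, the assumption that text is darker than the background, and the use of 4-connectivity are illustrative choices, not details taken from the patent text.

```python
from collections import deque


def binarize(gray, threshold=128):
    """Return a 0/1 grid; 1 marks pixels darker than `threshold` (assumed foreground)."""
    return [[1 if px < threshold else 0 for px in row] for row in gray]


def connected_domains(binary):
    """Collect 4-connected foreground regions; returns one pixel list per region."""
    height = len(binary)
    width = len(binary[0]) if height else 0
    seen = [[False] * width for _ in range(height)]
    domains = []
    for y in range(height):
        for x in range(width):
            if binary[y][x] != 1 or seen[y][x]:
                continue
            # Breadth-first flood fill starting from an unvisited foreground pixel.
            seen[y][x] = True
            queue = deque([(y, x)])
            pixels = []
            while queue:
                cy, cx = queue.popleft()
                pixels.append((cy, cx))
                for ny, nx in ((cy - 1, cx), (cy + 1, cx), (cy, cx - 1), (cy, cx + 1)):
                    inside = 0 <= ny < height and 0 <= nx < width
                    if inside and binary[ny][nx] == 1 and not seen[ny][nx]:
                        seen[ny][nx] = True
                        queue.append((ny, nx))
            domains.append(pixels)
    return domains


gray = [
    [255, 255, 255, 255, 255],
    [255,  10,  20, 255,  30],
    [255,  15, 255, 255,  25],
    [255, 255, 255, 255, 255],
]
domains = connected_domains(binarize(gray))
print(len(domains), sorted(len(d) for d in domains))
```

Each returned pixel list is one connected domain; descriptive features such as bounding box size or fill ratio can then be computed per domain for the filtering step.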



Abstract

The invention provides a method and a device for extracting text from an image. The method comprises the following steps: A, binarizing the image to obtain all connected domains of the image; B, filtering out the connected domains that do not satisfy a first statistical characteristic, wherein the first statistical characteristic is a statistical characteristic of text connected domains, obtained by statistical learning on the descriptive features of connected domains extracted from labeled samples; and C, extracting the text in the image from the connected domains that remain after filtering. The method can greatly improve the accuracy of extracting text from images.
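One way to picture how a statistical characteristic could be learned from labeled samples is sketched below in Python. The feature names (`aspect_ratio`, `fill_ratio`), the sample values, and the mean plus or minus k standard deviations rule are assumptions for illustration only; the abstract does not state which features or which learning procedure the invention uses.

```python
from statistics import mean, pstdev


def learn_ranges(text_samples, k=2.0):
    """Learn a (low, high) range per feature from labeled text connected domains."""
    ranges = {}
    for name in text_samples[0]:
        values = [sample[name] for sample in text_samples]
        mu, sigma = mean(values), pstdev(values)
        ranges[name] = (mu - k * sigma, mu + k * sigma)
    return ranges


def satisfies(features, ranges):
    """True when every feature of a connected domain lies inside its learned range."""
    return all(low <= features[name] <= high for name, (low, high) in ranges.items())


# Hypothetical descriptive features of connected domains labeled as text.
labeled_text = [
    {"aspect_ratio": 0.6, "fill_ratio": 0.40},
    {"aspect_ratio": 0.8, "fill_ratio": 0.50},
    {"aspect_ratio": 1.0, "fill_ratio": 0.45},
    {"aspect_ratio": 0.7, "fill_ratio": 0.55},
]
ranges = learn_ranges(labeled_text)

candidate_text = {"aspect_ratio": 0.75, "fill_ratio": 0.48}
candidate_noise = {"aspect_ratio": 12.0, "fill_ratio": 0.05}
print(satisfies(candidate_text, ranges), satisfies(candidate_noise, ranges))
```

A connected domain whose features fall outside the learned ranges would be filtered out in step B before text is extracted in step C.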

Description

【Technical Field】

[0001] The invention relates to image processing technology, and in particular to a method and device for extracting text lines from images.

【Background Art】

[0002] Extracting text lines from images can be applied not only to text recognition in scanned documents but also to text recognition in natural scene images. The accuracy of text line extraction directly determines the quality of text recognition.

[0003] As prior art, Chinese invention patent application No. 201010568411.2 discloses a method for extracting text lines from an image. According to that document, the prior art extracts text lines by binarizing the image and then directly extracting the text lines from the connected domains of the binarized image.

[0004] The text line extraction method of the prior art does not consider the noise influence of a large number of non-text areas existing in natur...


Application Information

IPC(8): G06K9/20, G06K9/54
Inventors: 韩钧宇, 刘经拓, 丁二锐
Owner: BEIJING BAIDU NETCOM SCI & TECH CO LTD