Morphology and integral projection-based printed Uygur document segmentation method

A segmentation technique based on morphology and integral projection, applied in the fields of instruments, character and pattern recognition, and computer parts. It addresses the limited flexibility and missed character segmentation of existing methods, and has a wide range of application.

Active Publication Date: 2017-02-01
XIDIAN UNIV

AI Technical Summary

Problems solved by technology

Although this existing method can segment the text-line images out of a complete Uyghur document image, it has two disadvantages. First, its line segmentation step sets a threshold to decide whether a gap is line spacing or intra-line spacing, which limits the flexibility of the method. Second, its character segmentation step suffers from both over-segmentation and missed segmentation: characters shaped like [character image not reproduced in source] are over-segmented, and vertically overlapping characters shaped like [character image not reproduced in source] are segmented incorrectly.
Although this method avoids missed segmentation when characters overlap vertically, it in turn causes missed segmentation of characters shaped like [character image not reproduced in source].


Examples


Embodiment Construction

[0069] The present invention will be further described below in conjunction with the accompanying drawings.

[0070] The specific steps of the present invention are described below with reference to Figure 1.

[0071] Step 1, input a binary image.

[0072] Input a binary image of a printed Uyghur document, 2362×3327 (width × height), that is free of noise and not skewed.

[0073] Step 2, obtain the text-line (row document) images.

[0074] Using the morphological dilation algorithm, the input binary image is dilated to obtain a dilated image in which the characters belonging to the same text line of the printed Uyghur document image merge into one another.
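The dilation in this step can be sketched roughly as follows. This is a minimal pure-Python illustration, not the patented implementation: the rectangular structuring element and its size (`half_h`, `half_w`) are assumptions, since the excerpt does not state which element is used, and the tiny `page` array is made-up test data.

```python
def dilate(image, half_h=0, half_w=1):
    """Binary dilation with a (2*half_h+1) x (2*half_w+1) rectangle.

    image: list of rows, 1 = ink (foreground), 0 = background.
    The element shape and size are illustrative assumptions.
    """
    rows, cols = len(image), len(image[0])
    out = [[0] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            if image[r][c]:
                # Paint the whole structuring element around each ink pixel.
                for rr in range(max(0, r - half_h), min(rows, r + half_h + 1)):
                    for cc in range(max(0, c - half_w), min(cols, c + half_w + 1)):
                        out[rr][cc] = 1
    return out


# Two "text lines" of isolated strokes separated by a blank row (toy data).
page = [
    [1, 0, 0, 1, 0, 0, 1],
    [0, 0, 0, 0, 0, 0, 0],
    [0, 1, 0, 0, 1, 0, 0],
]
dilated = dilate(page, half_h=0, half_w=1)
```

With a purely horizontal element, strokes on the same line fuse while the blank row between the lines stays empty, which is the property the next step relies on.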

[0075] A four-neighborhood seed-fill connected-component algorithm is used to extract each connected component of the dilated image.
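A four-neighborhood seed fill can be sketched as a breadth-first flood fill that only visits the pixels above, below, left and right of the current one. The function name `label_components` and the toy input are hypothetical; the patent excerpt does not give an implementation.

```python
from collections import deque


def label_components(image):
    """Label 4-connected foreground regions; returns (label map, count)."""
    rows, cols = len(image), len(image[0])
    labels = [[0] * cols for _ in range(rows)]
    count = 0
    for r in range(rows):
        for c in range(cols):
            if image[r][c] and not labels[r][c]:
                count += 1                      # new seed, new component
                labels[r][c] = count
                queue = deque([(r, c)])
                while queue:
                    y, x = queue.popleft()
                    # Four-neighborhood: up, down, left, right only.
                    for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                        if (0 <= ny < rows and 0 <= nx < cols
                                and image[ny][nx] and not labels[ny][nx]):
                            labels[ny][nx] = count
                            queue.append((ny, nx))
    return labels, count


dilated = [
    [1, 1, 0, 0],
    [0, 0, 1, 0],   # touches the first region only diagonally
    [0, 0, 0, 0],
    [1, 1, 1, 1],
]
labels, count = label_components(dilated)
```

Because diagonal contact does not join pixels under four-connectivity, the pixel in the second row forms its own component here.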

[0076] The upper side of the bounding rectangle of each connected component is used as the upper boundary of each text-line image, and the lower side as the lower boundary of each row document ...
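As a rough sketch of how the bounding rectangles could yield text-line images: the top and bottom rows of each labeled component give the line's vertical extent, and those rows are then cut from the original (undilated) image. The paragraph above is truncated in the source, so cropping the full page width is an assumption, and the helper names and sample data are hypothetical.

```python
def line_bounds(labels, count):
    """Return sorted (top_row, bottom_row) pairs, one per labeled component."""
    tops = [None] * (count + 1)
    bottoms = [None] * (count + 1)
    for r, row in enumerate(labels):
        for lab in row:
            if lab:
                if tops[lab] is None:
                    tops[lab] = r       # first row where the label appears
                bottoms[lab] = r        # last row seen so far
    return sorted((tops[k], bottoms[k])
                  for k in range(1, count + 1) if tops[k] is not None)


def crop_lines(original, bounds):
    """Cut each text line out of the original image (full width assumed)."""
    return [original[top:bottom + 1] for top, bottom in bounds]


# Toy label map for a dilated page with two merged text lines.
labels = [
    [0, 0, 0, 0],
    [1, 1, 1, 1],
    [1, 1, 0, 1],
    [0, 0, 0, 0],
    [2, 2, 2, 0],
    [0, 0, 0, 0],
]
original = [
    [0, 0, 0, 0],
    [1, 0, 1, 0],
    [1, 0, 0, 1],
    [0, 0, 0, 0],
    [0, 1, 0, 0],
    [0, 0, 0, 0],
]
bounds = line_bounds(labels, 2)
lines = crop_lines(original, bounds)
```

No spacing threshold appears anywhere in this sketch: the line boundaries come directly from the component extents, which matches the flexibility advantage claimed for the method.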


Abstract

The invention discloses a printed Uygur document segmentation method based on morphology and integral projection. It mainly solves two problems of existing segmentation methods: limited flexibility when acquiring the row document image, and missed segmentation of a particular character (defined in the specification) when acquiring single-character images. The method comprises the steps of (1) inputting a binary image; (2) acquiring the row document image; (3) acquiring a sub-word image; (4) acquiring a connected segment image; (5) acquiring a connected segment image containing only the main body stroke part; (6) determining the baseline domain of the connected segment image containing only the main body stroke part; and (7) acquiring the single-character image. Compared with existing printed Uygur document segmentation methods, this method sets no threshold when acquiring the row document image, so it is more flexible; it also avoids the missed segmentation of the character defined in the specification, and can therefore improve the accuracy of printed Uygur document segmentation.

Description

Technical field

[0001] The invention belongs to the field of character segmentation in optical character recognition, and further relates to a method for segmenting printed Uyghur documents based on morphology and integral projection. The invention can be used to segment a scanned image of a paper Uyghur document into individual Uyghur character images, as a preprocessing step for segmentation-based printed Uyghur document recognition.

Background technique

[0002] At present, segmentation-based recognition of printed Uyghur documents is widely used. Accurately segmenting Uyghur characters from Uyghur document images is therefore the premise and basis of printed Uyghur document recognition. However, because the Uyghur language borrows the writing form of the Arabic and Persian alphabets, which is a connected phonetic script, and its shape is similar to our Chinese cursive sc...


Application Information

Patent Type & Authority: Applications (China)
IPC(8): G06K9/34
CPC: G06V30/153, G06V30/293
Inventor: 卢朝阳, 王小弟, 李静, 郎潇, 艾合买提·阿卜力皮孜
Owner: XIDIAN UNIV