Recognition method for printed Mongolian characters

A recognition method for printed text, applied in the field of character recognition, that addresses problems such as the many subsets of similar characters, sparse stroke-structure information, and the difficulty of extracting that information.

Status: Inactive. Publication date: 2007-08-15
TSINGHUA UNIV
Cites: 0; cited by: 42

AI Technical Summary

Problems solved by technology

[0007] Mongolian characters have few strokes, and those strokes consist mainly of arcs, so stroke-structure information is sparse and hard to extract. The character set contains many subsets of highly similar characters. Character widths and heights are inconsistent, and the upper and lower boundaries of a character are uncertain. Typefaces differ widely from one another, some are close to handwritten cursive, and the font sizes in common use are small. Together, these properties make recognition of the Mongolian character set highly challenging.

Examples

Embodiment 1

[0379] Embodiment 1: multi-font and multi-size printed Mongolian character recognition system

[0380] The experiment uses the multi-font, multi-size printed Mongolian character recognition system of the present invention, as shown in Figure 14. The hardware platform is a scanner (model: Uniscan 1248US) and an ordinary PC (CPU: Intel Pentium 4, 3.00 GHz; memory: 1.00 GB RAM; OS: Microsoft Windows XP). The experiment is carried out on 1600 collected sets of printed documents. Most of these sample documents come from today's major Mongolian publishing systems, and a small number were printed directly from Windows TrueType fonts. The fonts cover most of the commonly used typefaces, some less common ones, and a few rare ones, 26 in total. Font sizes range from Small Five to Size One (Chinese type sizes). Sample quality varies, and the ratio of normal, broken, and touching (glued) characters is about 2:1:1. After scanning input, text line segm...

Abstract

This invention relates to a recognition method for printed Mongolian characters. The method comprises the following steps: extracting information on the overall form of the character and pre-classifying it to determine the type of the input character; extracting directional features of the strokes; applying a two-step feature optimization to those features; and finally using MQDF (modified quadratic discriminant function) as the statistical classifier to make the classification decision.
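The final step of the abstract names MQDF as the statistical classifier. As a rough illustration only, the sketch below implements the commonly used MQDF2 form in Python with NumPy: each class keeps its `k` largest covariance eigenvalues and replaces the remaining ones with a constant `delta`. The class name `MQDFClassifier`, the parameters `k` and `delta`, and the rule for estimating `delta` are assumptions made for this example; the source does not give the patent's parameter settings, feature dimensions, or implementation.

```python
import numpy as np


class MQDFClassifier:
    """Illustrative MQDF2-style classifier (hypothetical; not the patent's code)."""

    def __init__(self, k=2, delta=None):
        self.k = k          # number of dominant eigenvectors kept per class (must be < feature dim)
        self.delta = delta  # constant replacing the minor eigenvalues; estimated per class if None
        self.params = {}

    def fit(self, X, y):
        X = np.asarray(X, dtype=float)
        y = np.asarray(y)
        for label in np.unique(y):
            Xc = X[y == label]
            mean = Xc.mean(axis=0)
            cov = np.cov(Xc, rowvar=False)
            # eigh returns eigenvalues in ascending order; reverse to descending.
            eigvals, eigvecs = np.linalg.eigh(cov)
            eigvals, eigvecs = eigvals[::-1], eigvecs[:, ::-1]
            lam = np.maximum(eigvals[:self.k], 1e-8)   # dominant eigenvalues
            phi = eigvecs[:, :self.k]                  # dominant eigenvectors
            if self.delta is not None:
                delta = self.delta
            else:
                # Assumption: use the mean of the minor eigenvalues as delta.
                delta = max(float(eigvals[self.k:].mean()), 1e-8)
            self.params[label] = (mean, lam, phi, delta)
        return self

    def distance(self, x, label):
        """MQDF2 discriminant value; smaller means a better match."""
        mean, lam, phi, delta = self.params[label]
        diff = np.asarray(x, dtype=float) - mean
        proj = phi.T @ diff                            # coordinates in the principal subspace
        residual = max(float(diff @ diff - proj @ proj), 0.0)
        d = diff.shape[0]
        return (float(np.sum(proj ** 2 / lam)) + residual / delta
                + float(np.sum(np.log(lam))) + (d - self.k) * np.log(delta))

    def predict(self, x):
        # Choose the class with the smallest discriminant value.
        return min(self.params, key=lambda label: self.distance(x, label))
```

In a pipeline like the one the abstract outlines, `x` would be the optimized directional feature vector of one character, and `predict` would be run only over the candidate classes left after pre-classification.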

Description

Technical field

[0001] A printed Mongolian character recognition method, belonging to the field of character recognition.

Background

[0002] Mongolian belongs to the Mongolian branch of the Altaic language family and is the main language of the Mongolian people, who are widely distributed across Inner Mongolia, Xinjiang, Beijing, Liaoning, Heilongjiang, Jilin, Gansu, Qinghai and other provinces and regions. Its current written form, the Mongolian script, is a phonetic script based on the Uyghur alphabet, with distinctive letter shapes and writing variations.

[0003] Mongolian is written or printed vertically in units of words, and words are separated by clear spaces. Each word consists of one or more letters, and within a word the characters are joined to one another along the baseline (Fig. 6). Mongolian has 35 letters in total: 7 vowels and 28 consonants. These letters are the nominal forms of Mongolian characters. Each letter has three different ...

Application Information

Patent Type & Authority: Applications (China)
IPC(8): G06K9/72, G06K9/00
Inventors: 丁晓青 (Ding Xiaoqing), 王华 (Wang Hua), 彭良瑞 (Peng Liangrui), 刘长松 (Liu Changsong), 方驰 (Fang Chi), 文迪 (Wen Di)
Owner: TSINGHUA UNIV