Identification method of incomplete Chinese character

A Chinese character recognition technology, applied in the field of Chinese information processing, that improves the effectiveness and accuracy of recognizing incomplete Chinese characters.

Active Publication Date: 2018-05-15
KUNMING UNIV OF SCI & TECH
7 cites, 10 cited by

AI Technical Summary

Problems solved by technology

[0004] The technical problem to be solved by the present invention is to provide a method for identifying incomplete Chinese characters that addresses the limitations and deficiencies of the prior art, in which identifying incomplete Chinese characters is labor-intensive and inaccurate. The method aims to improve the effectiveness and accuracy of computer-based recognition of incomplete Chinese characters.



Examples


Embodiment 1

[0034] Embodiment 1: As shown in Figure 1, a method for identifying incomplete Chinese characters specifically includes the following steps:

[0035] Step0: Extract Chinese character features and build a Chinese character feature database. Using a 15×16 pixel Chinese dot-matrix font library, divide each dot matrix into 40 small matrices of 2×3 pixels, proceeding from left to right and from top to bottom, and record the number of pixels occupied by the Chinese character in the i-th 2×3 pixel matrix as p_i, i∈[1,40]. Traverse all p_i, i∈[1,40], to generate the Chinese character feature vector {p_1, p_2, ..., p_40}, and store all Chinese characters together with their feature vectors in the database to form the Chinese character feature database P: {P_1, P_2, ..., P_N};
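The block division in Step0 can be sketched as follows. This is a minimal illustration in Python, not the patent's implementation. It assumes each glyph is given as 15 rows of 16 binary pixels and that each 2×3 block spans 3 rows and 2 columns, which is the orientation that tiles such a bitmap into 5 × 8 = 40 blocks. The names `feature_vector` and `build_database` are hypothetical.

```python
from typing import Dict, List, Sequence

ROWS, COLS = 15, 16             # assumed orientation of the 15x16 glyph bitmap
BLOCK_ROWS, BLOCK_COLS = 3, 2   # assumed orientation of each 2x3 block


def feature_vector(bitmap: Sequence[Sequence[int]]) -> List[int]:
    """Return the 40 per-block pixel counts p_1..p_40 for one glyph bitmap."""
    if len(bitmap) != ROWS or any(len(row) != COLS for row in bitmap):
        raise ValueError(f"expected a {ROWS}x{COLS} bitmap")
    features = []
    # Visit blocks left to right, then top to bottom, as Step0 describes.
    for top in range(0, ROWS, BLOCK_ROWS):
        for left in range(0, COLS, BLOCK_COLS):
            count = sum(
                1
                for r in range(top, top + BLOCK_ROWS)
                for c in range(left, left + BLOCK_COLS)
                if bitmap[r][c]  # any non-zero value counts as a character pixel
            )
            features.append(count)
    return features


def build_database(
    glyphs: Dict[str, Sequence[Sequence[int]]]
) -> Dict[str, List[int]]:
    """Map each character to its feature vector (the database P)."""
    return {char: feature_vector(bitmap) for char, bitmap in glyphs.items()}
```

Each count lies between 0 and 6, so a fully blank glyph yields forty zeros and a fully inked glyph yields forty sixes.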

[0036] Step1: Using modern scanning technology and the shape characteristics of Chinese characters, extract the picture of the incomplete Chinese character X to be det...


Abstract

The invention relates to a method for identifying incomplete Chinese characters and belongs to the technical field of Chinese information processing. In the method, a Chinese character feature database is built from a Chinese dot-matrix font library; any incomplete Chinese character to be detected is converted into an image using modern scanning technology and Chinese character shape features; the image is converted to grayscale and binarized, character features are extracted, and a feature vector is generated; the cosine-based shape similarity and the Euclidean-distance-based shape similarity between this character and each existing Chinese character in the database are calculated separately; finally, the set of characters similar to the incomplete character is obtained through a similarity fusion algorithm and a similarity threshold. Compared with the prior art, the method mainly addresses the labor cost and poor accuracy of existing approaches and aims to improve the effectiveness and accuracy of computer-based identification of incomplete Chinese characters.
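The matching stage described in the abstract can be sketched roughly as below. The text visible here does not give the exact formulas, so several parts are assumptions made only for illustration: the Euclidean distance d is mapped to a similarity as 1 / (1 + d), the fusion is a weighted average with a hypothetical `weight` parameter, and the default `threshold` of 0.8 is arbitrary. All function names are hypothetical.

```python
import math
from typing import Dict, List, Sequence, Tuple


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two feature vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def euclidean_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Map Euclidean distance d into (0, 1] as 1 / (1 + d). Assumed mapping."""
    d = math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))
    return 1.0 / (1.0 + d)


def fused_similarity(
    a: Sequence[float], b: Sequence[float], weight: float = 0.5
) -> float:
    """Weighted average of the two similarities. Assumed fusion rule."""
    return weight * cosine_similarity(a, b) + (1.0 - weight) * euclidean_similarity(a, b)


def similar_characters(
    query: Sequence[float],
    database: Dict[str, Sequence[float]],
    threshold: float = 0.8,   # arbitrary illustrative value
    weight: float = 0.5,
) -> List[Tuple[str, float]]:
    """Return (character, score) pairs scoring at least threshold, best first."""
    scored = [
        (char, fused_similarity(query, vec, weight))
        for char, vec in database.items()
    ]
    hits = [(char, score) for char, score in scored if score >= threshold]
    return sorted(hits, key=lambda pair: pair[1], reverse=True)
```

Here `query` would be the 40-element feature vector of the incomplete character and `database` the stored feature vectors. In practice the threshold and the fusion weight would need tuning against real glyph data.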

Description

Technical field

[0001] The invention relates to a method for recognizing incomplete Chinese characters and belongs to the technical field of Chinese information processing.

Background technique

[0002] In the investigation of cultural relics and the identification of important documents, some Chinese characters may have been partially erased for various reasons. Correctly identifying these incomplete Chinese characters is of great significance for the study of modern history and the investigation of famous quotations.

[0003] At present, incomplete Chinese characters are recognized mainly through people's familiarity with Chinese characters and manual comparison against Chinese dictionaries, followed by reasoning from contextual information. However, because the set of Chinese characters is so large, this work is time-consuming and cumbersome. If the encoding is the basic character set of Unicode, there are 20,902 Chinese characters in total. Even if it is possible to fil...


Application Information

Patent Type & Authority: Applications (China)
IPC(8): G06K9/62, G06K9/38
CPC: G06V10/28, G06V10/757, G06F18/22
Inventor: 彭艺, 尹玉梅
Owner: KUNMING UNIV OF SCI & TECH