Method for classifying business card character clauses and device thereof

A method for classifying business card character entries, applied in the field of optical character recognition. It addresses problems in earlier approaches, namely degraded classification performance, no error-tolerant matching of semantic information, and no account of how differently individual keywords contribute to classification, and thereby improves classification accuracy and performance.

Active Publication Date: 2010-06-23
HANVON CORP
Cites: 4; Cited by: 14

AI Technical Summary

Problems solved by technology

[0005] The disadvantage of the above technologies is that the first two patent applications use only layout logic structure features, and the few layout logic structure templates given in the paper are too restrictive to adapt to the varied layout structures of business cards; the third patent application also mainly uses layout logic structure features, with semantic information used only for address entries; th




Embodiment Construction

[0028] To make the technical content of the present invention clearer, the following example is described in detail. The technical solution proposed by the present invention is applicable to business cards in any language and is not limited to the scope of this embodiment. The thresholds of the formulas used in this embodiment are set for a specific language, and they can be reset as needed for business cards in other languages. In this embodiment there are 12 categories of character entries: name, title, degree, department, unit, address, zip code, telephone, fax, mobile phone, email, and web page. In other implementations, the number of categories and their specific attributes can be set according to actual needs and are not limited by this embodiment.
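As a rough illustration of how the category set and the language-specific thresholds could be kept configurable, here is a minimal Python sketch. The class name `ClassifierConfig`, the threshold keys, and the numeric values are all hypothetical; the source does not give the formulas or their thresholds.

```python
from dataclasses import dataclass, field

# The 12 categories named in this embodiment.
DEFAULT_CATEGORIES = (
    "name", "title", "degree", "department", "unit", "address",
    "zip code", "telephone", "fax", "mobile phone", "email", "web page",
)

@dataclass
class ClassifierConfig:
    """Hypothetical per-language configuration; not taken from the patent text."""
    language: str
    categories: tuple = DEFAULT_CATEGORIES
    # Threshold names and values are assumptions made for illustration only.
    thresholds: dict = field(default_factory=dict)

# One configuration per language, each with its own thresholds.
english = ClassifierConfig("en", thresholds={"guide_term_fuzzy": 0.75, "keyword_fuzzy": 0.8})
# Another implementation may use a different number of categories.
reduced = ClassifierConfig("zh", categories=("name", "telephone", "email"))
```

Keeping thresholds in a per-language object means that supporting another language only requires a new configuration, not a change to the classification code.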

[0029] First, the terms involved in the present invention are defined, wherein "OCR result" refers to the computer-recognizable result obtain...


Abstract

The present invention relates to a method for classifying business card character clauses and a device thereof, belonging to the field of optical character recognition. The method comprises the following steps: (a) a complete-match classifying step according to guide terms, in which each character clause is tested, one by one, for a complete match against every guide term in a complete-match guide term table; character clauses that pass the test are sent to the classified results, and those that fail go to the next step; (b) a fault-tolerant match classifying step according to guide terms, in which character clauses that pass the test are sent to the classified results and those that fail go to the next step; (c) a fault-tolerant match classifying step according to keywords, in which character clauses that pass the test are sent to the classified results and those that fail go to the next step; (d) a classifying step according to the logic structure characteristics of the page layout. With the present invention, business card character clauses from a variety of page layout structures can be classified quickly and accurately.
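The four-step cascade in the abstract can be sketched in Python as follows. This is only an illustration under stated assumptions: the guide-term and keyword tables, the thresholds, the use of `difflib.SequenceMatcher` as the fault-tolerant match, and the one-rule layout fallback are all stand-ins, since the source does not give the actual tables, matching formulas, or layout features.

```python
from difflib import SequenceMatcher

# Hypothetical guide-term table: label before ':' -> category.
GUIDE_TERMS = {"tel": "telephone", "fax": "fax", "mobile": "mobile phone",
               "email": "email", "address": "address"}
# Hypothetical keyword table: word anywhere in the entry -> category.
KEYWORDS = {"university": "unit", "company": "unit", "manager": "title",
            "engineer": "title", "department": "department"}

def similarity(a, b):
    """Similarity in [0, 1]; an assumed stand-in for the fault-tolerant match."""
    return SequenceMatcher(None, a, b).ratio()

def guide_term(entry):
    """Lowercased text before the first ':', or None if there is no ':'."""
    head, sep, _ = entry.partition(":")
    return head.strip().lower() if sep else None

def match_guide_exact(entry):
    # Step (a): complete match against the guide-term table.
    return GUIDE_TERMS.get(guide_term(entry))

def match_guide_fuzzy(entry, threshold=0.75):
    # Step (b): tolerate OCR errors in the guide term (threshold is assumed).
    head = guide_term(entry)
    if not head:
        return None
    best = max(GUIDE_TERMS, key=lambda term: similarity(head, term))
    return GUIDE_TERMS[best] if similarity(head, best) >= threshold else None

def match_keyword_fuzzy(entry, threshold=0.8):
    # Step (c): tolerate OCR errors in keywords (threshold is assumed).
    best_score, best_category = 0.0, None
    for word in entry.lower().split():
        for keyword, category in KEYWORDS.items():
            score = similarity(word, keyword)
            if score > best_score:
                best_score, best_category = score, category
    return best_category if best_score >= threshold else None

def classify_by_layout(line_index):
    # Step (d): placeholder rule only; the real layout features are not given.
    return "name" if line_index == 0 else "unknown"

def classify(entry, line_index):
    """Run steps (a) to (c) in order; fall back to step (d) if all fail."""
    for step in (match_guide_exact, match_guide_fuzzy, match_keyword_fuzzy):
        category = step(entry)
        if category is not None:
            return category
    return classify_by_layout(line_index)
```

With these toy tables, `classify("Tel: 010-12345678", 3)` is resolved by step (a), `classify("Mobi1e: 13800000000", 4)` by step (b), `classify("Senior Enginer", 1)` by step (c), and `classify("Jane Doe", 0)` falls through to step (d). Ordering the steps from strictest to loosest lets cheap exact matches settle most entries before the more permissive matches run.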

Description

Technical field

[0001] The present invention relates to optical character recognition (hereinafter "OCR"), and in particular to a method and device for classifying the character entries of business cards.

Background technique

[0002] In today's business activities, business cards are an important information carrier for business partners, customers, and others. Faced with a large number of business cards, companies and individuals need a way to collect and process the information automatically and accurately. The current practice is generally to first obtain an image of the business card (for example, with a mobile phone, digital camera, or scanner), then analyze the physical layout structure of the image to locate the character areas, obtain a binary image of each character area through image processing, and then analyze the binary image [...]. The last and most important step is to understand the category attributes of the charac...


Application Information

IPC(8): G06F17/30
Inventor: 李永彬, 朱军民, 刘正珍
Owner: HANVON CORP