Document image classification method and device

A document image classification method, applied in the field of computer vision. It addresses the difficulty of collecting every document style for training and the complexity of image capture conditions, and it improves classification accuracy while remaining easy to extend and to deploy.

Active Publication Date: 2019-10-01
BEIJING YIDAO BOSHI TECH

AI Technical Summary

Problems solved by technology

[0004] 1. Many styles are hard to obtain: there are too many types of document images, and the document types differ from field to field. It is impossible to collect all of them for training; some types are added later and cannot be obtained in advance, and some documents are confidential and cannot be used for training without desensitization.
[0005] 2. Capture methods are complex: with the spread of capture devices such as mobile phones, tablets, cameras, and scanners, and mobile phones in particular, document image acquisition has shifted from traditional scanning to photographing. At present, more than 90% of document images are photographed rather than scanned. Photographed images have complex backgrounds and, compared with scanned images, are worse in conditions such as background, resolution, orientation, illumination, font, and character size, so they cannot be held to a uniform specification.


Embodiment Construction

[0060] Reference will now be made in detail to the exemplary embodiments, examples of which are illustrated in the accompanying drawings. When the following description refers to the accompanying drawings, the same numerals in different drawings refer to the same or similar elements unless otherwise indicated. The implementations described in the following exemplary examples do not represent all implementations consistent with the present disclosure. Rather, they are merely examples of apparatuses and methods consistent with aspects of the present disclosure as recited in the appended claims.

[0061] The terms "first", "second" and the like in the specification and claims of the present disclosure are used to distinguish similar objects, and are not necessarily used to describe a specific order or sequence. It is to be understood that the data so used are interchangeable under appropriate circumstances such that the embodiments of the disclosure described herein are, for exa...

Abstract

The invention discloses a document image classification method and device, belonging to the field of computer vision. The method comprises: training a text feature vector extraction model and an image feature vector extraction model separately; extracting a fused feature vector for a document image by an embedding approach that fuses its text feature vector with its image feature vector; and classifying the document image based on the similarity of fused feature vectors. With this method, many kinds of document images can be quickly registered and classified, the business process is greatly simplified, and the OCR API is simplified so that all documents can be recognized through a single API and one integration can be used permanently.
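The register-then-classify flow described in the abstract can be sketched roughly as follows. This is only an illustration under stated assumptions, not the patented implementation: the trained text and image models are replaced by precomputed vectors, fusion is assumed to be concatenation of L2-normalized vectors, similarity is assumed to be cosine similarity, and the names `fuse`, `DocumentRegistry`, and the `threshold` parameter are hypothetical.

```python
import math
from typing import Dict, List, Optional, Sequence, Tuple


def l2_normalize(vec: Sequence[float]) -> List[float]:
    """Scale a vector to unit length (an all-zero vector is returned unchanged)."""
    norm = math.sqrt(sum(x * x for x in vec))
    if norm == 0.0:
        return [0.0 for _ in vec]
    return [x / norm for x in vec]


def fuse(text_vec: Sequence[float], image_vec: Sequence[float]) -> List[float]:
    """Assumed fusion: normalize each modality, concatenate, normalize again."""
    return l2_normalize(l2_normalize(text_vec) + l2_normalize(image_vec))


def cosine(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity of two vectors that are already unit length."""
    return sum(x * y for x, y in zip(a, b))


class DocumentRegistry:
    """Hypothetical registry: one fused template vector per document category."""

    def __init__(self, threshold: float = 0.8) -> None:
        self.threshold = threshold  # assumed cut-off below which a document is "unknown"
        self.templates: Dict[str, List[float]] = {}

    def register(self, label: str, text_vec: Sequence[float],
                 image_vec: Sequence[float]) -> None:
        # Registering a new category needs no retraining, only a stored vector.
        self.templates[label] = fuse(text_vec, image_vec)

    def classify(self, text_vec: Sequence[float],
                 image_vec: Sequence[float]) -> Tuple[Optional[str], float]:
        query = fuse(text_vec, image_vec)
        best_label: Optional[str] = None
        best_score = -1.0
        for label, template in self.templates.items():
            score = cosine(query, template)
            if score > best_score:
                best_label, best_score = label, score
        if best_score < self.threshold:
            return None, best_score  # no registered category is similar enough
        return best_label, best_score


if __name__ == "__main__":
    registry = DocumentRegistry(threshold=0.8)
    # Toy 2-dimensional vectors stand in for real model outputs.
    registry.register("invoice", [1.0, 0.0], [0.9, 0.1])
    registry.register("id_card", [0.0, 1.0], [0.1, 0.9])
    print(registry.classify([0.9, 0.1], [1.0, 0.0]))
```

Because classification here is a nearest-template lookup, a new document type is added by calling `register` once, which is one way to read the abstract's claim that documents can be "quickly registered" without collecting a full training set for each type.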

Description

Technical field

[0001] The invention relates to the field of computer vision, in particular to a document image classification method and device.

Background technique

[0002] In all walks of life, there are still many paper documents that need to be saved, processed, retrieved, etc., especially in the financial field such as banking, securities, insurance, mutual funds, finance, taxation and other industries. In the past, the digitization of these paper documents was generally manually entered. With the continuous popularization of OCR technology, many industries have gradually adopted OCR recognition technology instead of manual entry, which has greatly improved work efficiency. But at present, the prerequisite for good OCR recognition and structuring is that the category of the document needs to be clearly known, otherwise it is difficult to have a good structured result. In addition, in many occasions such as bank counters, the current application is that the user must ...


Application Information

IPC(8): G06K9/20, G06K9/62, G06N3/04, G06N3/08
CPC: G06N3/08, G06V10/22, G06N3/045, G06F18/241
Inventor: 朱军民, 王勇, 康铁钢
Owner: BEIJING YIDAO BOSHI TECH