Document type identification method and device, equipment and computer readable storage medium

A document type identification method, applied in the computer field, that solves the problem of low accuracy in document type identification and achieves high identification accuracy.

Pending Publication Date: 2021-06-29
BEIJING GRIDSUM TECH CO LTD
4 Cites, 0 Cited by

AI Technical Summary

Problems solved by technology

[0006] The main purpose of the present invention is to provide a document type identification method, device, equipment and computer-readable storage medium that solve the problem of low accuracy when the document type of an electronic document is identified through the suffix of the electronic document name.

Examples

Embodiment Construction

[0024] In order to make the purpose, technical solution and advantages of the present invention clearer, the present invention will be further described in detail below in conjunction with the accompanying drawings and specific embodiments.

[0025] According to an embodiment of the present invention, a document type identification method is provided. Figure 1 shows a flowchart of the document type identification method according to an embodiment of the present invention.

[0026] Step S110, reading the electronic document in code form to obtain the document code of the electronic document.

[0027] In this embodiment, the electronic document is stored in the form of code in the computer.

[0028] Furthermore, the electronic document can be stored in the form of binary code or hexadecimal code in the computer, so the electronic document in the form of binary code or hexadecimal code can be read.
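Step S110 can be illustrated with a minimal sketch, assuming the electronic document is an ordinary file on disk. The function names `read_document_code`, `as_hex` and `as_binary` are hypothetical and are not taken from the patent; they only show what reading a document "in code form" might look like in practice.

```python
def read_document_code(path, max_bytes=None):
    """Read a file in binary mode and return its raw bytes.

    `max_bytes` is an assumed convenience parameter: leading bytes are
    often enough for signature matching, so the read can be truncated.
    """
    with open(path, "rb") as handle:
        return handle.read() if max_bytes is None else handle.read(max_bytes)


def as_hex(code):
    """Render bytes as space-separated upper-case hexadecimal pairs."""
    return code.hex(" ").upper()


def as_binary(code):
    """Render bytes as space-separated groups of eight binary digits."""
    return " ".join(f"{byte:08b}" for byte in code)
```

For a file that starts with the ASCII characters `%PDF`, `as_hex(read_document_code(path, 4))` yields `25 50 44 46`. The hexadecimal and binary renderings are two views of the same stored bytes.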

[0029] Step S120, among various preset code characteristics, identify t...

Abstract

The invention discloses a document type identification method and device, equipment and a computer readable storage medium. The method comprises the following steps: reading an electronic document in a code form to obtain a document code of the electronic document; identifying a code feature conforming to the document code in a plurality of preset code features; determining whether the electronic document is a structural document or not according to the code features conforming to the document code and the document structure information of the electronic document; if it is determined that the electronic document is the structural document, identifying the electronic document as a document type corresponding to the document structural information of the electronic document; and otherwise, identifying the electronic document as the document type corresponding to the code feature conforming to the document code. Compared with the prior art, the embodiment depends on the code features and the document structure information of the electronic document and is not limited by application scenes, so that the recognition accuracy of the document type is high, and the problem that the document type is inaccurately recognized according to the electronic document name suffix of the electronic document is avoided.
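The flow summarized in the abstract can be sketched as follows. This is an illustration under stated assumptions, not the patent's implementation: the "preset code features" are assumed to be well-known leading byte signatures, the "structural document" case is assumed to be a ZIP container holding an Office Open XML document, and all names (`CODE_FEATURES`, `STRUCTURE_MARKERS`, `identify_document_type`) are hypothetical.

```python
import io
import zipfile

# Assumed preset code features: well-known leading byte signatures.
CODE_FEATURES = [
    (b"%PDF", "pdf"),
    (b"Rar!\x1a\x07", "rar"),
    (b"7z\xbc\xaf\x27\x1c", "7z"),
    (b"PK\x03\x04", "zip"),
    # OLE compound file, the container used by legacy doc/xls/ppt files.
    (b"\xd0\xcf\x11\xe0\xa1\xb1\x1a\xe1", "ole-compound"),
]

# Assumed structure information: ZIP members that mark Office Open XML types.
STRUCTURE_MARKERS = {
    "word/document.xml": "docx",
    "xl/workbook.xml": "xlsx",
    "ppt/presentation.xml": "pptx",
}


def match_code_feature(code):
    """Return the name of the first signature the document code starts with."""
    for signature, name in CODE_FEATURES:
        if code.startswith(signature):
            return name
    return None


def identify_document_type(code):
    """Prefer structure information; fall back to the matched code feature."""
    feature = match_code_feature(code)
    if feature == "zip":
        try:
            with zipfile.ZipFile(io.BytesIO(code)) as archive:
                members = set(archive.namelist())
        except zipfile.BadZipFile:
            members = set()
        for member, doc_type in STRUCTURE_MARKERS.items():
            if member in members:
                return doc_type  # structural document: type comes from structure
    return feature or "unknown"  # otherwise: type comes from the code feature
```

A docx file and a plain zip archive share the same leading signature, which is why the sketch inspects the archive contents before falling back to the signature. Distinguishing doc, xls and ppt inside an OLE compound file would need a parser for that container and is omitted here.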

Description

Technical field

[0001] The present invention relates to the field of computer technology, in particular to a document type identification method, device, equipment and computer-readable storage medium.

Background technique

[0002] Electronic documents come in various formats, such as doc, docx, ppt, pptx, xls, xlsx, pdf, rar, zip and 7z. At present, the document type of an electronic document is usually identified according to the suffix of the electronic document name. For example, the .doc suffix indicates a doc document of Word 97-2003, and the .docx suffix indicates a docx document of Word 2007 and later versions. However, identifying the document type through the suffix of the electronic document name has the following disadvantages:

[0003] First, since the suffix of the electronic document name can be changed, identifying the document type through the suffix is error-prone. ...

Application Information

Patent Type & Authority: Applications (China)
IPC(8): G06F16/18
CPC: G06F16/18
Inventor: 童陈敏
Owner: BEIJING GRIDSUM TECH CO LTD