Document detection processing method and device, storage medium and electronic equipment

A document detection and processing technology, applied in the field of data processing, that addresses problems such as the inability to recognize excerpts from a small number of documents and the inability to calculate document overlap.

Pending Publication Date: 2022-05-06
北京明朝万达科技股份有限公司

AI Technical Summary

Problems solved by technology

[0005] Embodiments of the present invention provide a document detection and processing method, device, storage medium, and electronic equipment to at least solve the technical...


Examples

Embodiment 1

[0034] According to an embodiment of the present invention, an embodiment of a document detection and processing method is provided. It should be noted that the steps shown in the flow chart of the accompanying drawings can be executed in a computer system, for example as a set of computer-executable instructions, and that although a logical order is shown in the flow chart, in some cases the steps shown or described may be performed in a different order.

[0035] Figure 1 is a flow chart of a document detection processing method according to an embodiment of the present invention. As shown in Figure 1, the method includes the following steps:

[0036] Step S102: receiving a document detection request sent by the client, wherein the document detection request carries the document to be detected or relevant information used to obtain the document to be detected, and the document detection request is used to request the server to detect the d...

Embodiment 2

[0103] According to an embodiment of the present invention, an embodiment of an apparatus for implementing the above document detection and processing method is also provided. Figure 7 is a schematic structural diagram of a document detection and processing device according to an embodiment of the present invention. As shown in Figure 7, the document detection processing device includes a receiving unit 70, a processing unit 72, a searching unit 74 and a computing unit 76, wherein:

[0104] The receiving unit 70 is configured to receive a document detection request sent by the client, wherein the document detection request carries the document to be detected or relevant information used to obtain the document to be detected, and the document detection request is used to request the server to detect the coincidence value between the document content in the document to be detected and the ...


Abstract

The invention discloses a document detection processing method and device, a storage medium and electronic equipment. The method comprises the following steps: receiving a document detection request sent by a client; performing text processing and hash table generation processing on the document content read from the document to be detected, to obtain a first hash signature; searching a minimum hash signature list in a pre-stored document fingerprint database according to the first hash signature, to obtain the index value of a second hash signature that has elements coinciding with the first hash signature; and locating the sample document corresponding to the second hash signature through the index value, and calculating a similarity value between the first hash signature and the second hash signature to obtain an overlap ratio value between the document to be detected and the sample document. This solves the technical problems that, in the prior art, document detection processing methods cannot recognize excerpts from a small number of documents and cannot calculate the document overlap ratio.
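
The steps in the abstract can be illustrated with a minimal Python sketch. This is not the patented implementation: the word-shingle tokenization, the signature length, the salted BLAKE2b hash family, the positional inverted index, and the use of the fraction of matching signature positions as the similarity value are all assumptions chosen for illustration, and every name (`FingerprintDB`, `minhash_signature`, `NUM_HASHES`, `SHINGLE_SIZE`) is hypothetical.

```python
import hashlib
import re
from collections import defaultdict

NUM_HASHES = 64    # assumed signature length (not from the source)
SHINGLE_SIZE = 5   # assumed number of words per shingle (not from the source)


def shingles(text, k=SHINGLE_SIZE):
    """Text processing step: lowercase, split into words, build k-word shingles."""
    words = re.findall(r"\w+", text.lower())
    if len(words) < k:
        return {" ".join(words)} if words else set()
    return {" ".join(words[i:i + k]) for i in range(len(words) - k + 1)}


def _hash(seed, token):
    """One member of a family of 64-bit hash functions, selected by seed."""
    digest = hashlib.blake2b(
        token.encode("utf-8"), digest_size=8, salt=seed.to_bytes(8, "little")
    ).digest()
    return int.from_bytes(digest, "little")


def minhash_signature(text, num_hashes=NUM_HASHES):
    """Return the list of minimum hash values, one per hash function."""
    tokens = shingles(text)
    if not tokens:
        return []
    return [min(_hash(seed, t) for t in tokens) for seed in range(num_hashes)]


class FingerprintDB:
    """Hypothetical stand-in for the pre-stored document fingerprint database."""

    def __init__(self):
        self.names = []                    # index value -> sample document name
        self.signatures = []               # index value -> stored signature
        self.inverted = defaultdict(set)   # (position, value) -> index values

    def add(self, name, text):
        """Store a sample document's signature and return its index value."""
        sig = minhash_signature(text)
        idx = len(self.signatures)
        self.names.append(name)
        self.signatures.append(sig)
        for pos, value in enumerate(sig):
            self.inverted[(pos, value)].add(idx)
        return idx

    def detect(self, text):
        """Return (sample name, estimated overlap) pairs, highest overlap first."""
        first = minhash_signature(text)
        # Find index values of stored signatures sharing at least one element.
        candidates = set()
        for pos, value in enumerate(first):
            candidates |= self.inverted.get((pos, value), set())
        results = []
        for idx in candidates:
            second = self.signatures[idx]
            matches = sum(a == b for a, b in zip(first, second))
            results.append((self.names[idx], matches / len(first)))
        return sorted(results, key=lambda r: r[1], reverse=True)
```

Under these assumptions, calling `db.add("sample", sample_text)` for each protected document and then `db.detect(suspect_text)` returns only the samples whose signatures share at least one element with the suspect's signature, each paired with an estimated overlap between 0 and 1. The inverted index avoids comparing the incoming signature against every stored signature.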

Description

Technical field

[0001] The present invention relates to the technical field of data processing, in particular to a document detection and processing method, device, storage medium and electronic equipment.

Background

[0002] With the continuous development of science and technology, the amount of data held by each enterprise or user continues to increase. The convenience of the network intensifies data sharing and increases the risk of intentional or unintentional leakage of the confidential data of organizations and enterprises. Each enterprise therefore needs to prevent the leakage of secret documents, yet cannot disconnect from the network and cut itself off from the outside world. In the prior art, a locality-sensitive hashing algorithm and/or a text file fingerprint generation algorithm is usually used to obtain the minimum or maximum hash value, and the hash value is used as the file fingerprint representing the content of the text passage, and finally the file recognition effec...


Application Information

IPC(8): G06F40/194, G06F16/31
CPC: G06F40/194, G06F16/325
Inventor: 王奎举, 喻波, 王志海, 韩振国, 安鹏
Owner: 北京明朝万达科技股份有限公司