Method for determining similar character strings, method and system for file duplication checking
A technique for determining similar character strings and checking files for duplication, applied in the field of paper duplication checking. It addresses problems such as inconsistent subject-verb-object order, high algorithm complexity, and long calculation time, and improves accuracy.
Embodiment Construction
[0033] In order to have a clearer understanding of the technical features, purposes and effects of the present invention, the process of determining similar character strings based on the fuzzy matching method proposed by the present invention will be further described in detail with reference to the accompanying drawings.
[0034] Figure 1 shows a schematic flowchart of a method for determining similar character strings according to an embodiment of the present invention. The method specifically includes the following steps:
[0035] 1) Step S110: acquire the character arrays of the sample file and of the target file to be detected.
[0036] In this specification, a file to be detected is called a target file, and a file to be compared with the target file is called a sample file. The files may be of various types, for example, PDF files, Word files, or plain-text files.
[0037] The character array of the file is obtained by word-segmenting the text content of the file. The s...
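As a rough illustration of step S110, the sketch below turns the text content of a file into a character array. The source does not name a word-segmentation tool, so the regular-expression tokenizer, the function name `to_character_array`, and the two example strings are all assumptions made for illustration; a real system would probably use a dedicated segmenter.

```python
import re
from typing import List

# Hypothetical tokenizer (not from the source): each CJK character becomes
# one token, and each run of Latin letters or digits becomes one token.
# Punctuation and whitespace are dropped.
_TOKEN = re.compile(r"[\u4e00-\u9fff]|[A-Za-z0-9]+")


def to_character_array(text: str) -> List[str]:
    """Return the tokens ("character array") of a file's text content."""
    return _TOKEN.findall(text.lower())


# Illustrative inputs standing in for the extracted text of the two files.
sample_array = to_character_array("The cat sat on the mat.")
target_array = to_character_array("On the mat the cat sat.")
```

Here the two arrays contain the same tokens in a different order, which is the kind of reordering a later similarity step would need to tolerate.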