Computer file similarity identification system and method based on image analysis

An image-analysis and similarity technology, applied in the computer field, that addresses the low efficiency of existing file-similarity detection and the lack of effective identification methods for image files, achieving high recognition accuracy and high efficiency.

Pending Publication Date: 2020-09-15
宋国训

AI Technical Summary

Problems solved by technology

[0008] Existing methods for detecting file similarity therefore have the following defects: they generally need to match and identify the overall content of each file, which is inefficient; and for many image files there is no effective means of identification.

Examples

Embodiment 1

[0031] As shown in Figures 1 and 3, a computer file similarity recognition system based on image analysis includes: a file attribute data extraction unit, configured to extract the basic attributes of the two target files to be compared, namely a first target file and a second target file, where the basic attributes include at least the file name, file type, file size, file location, file creation time and file modification time; a file content extraction unit, configured to open the first target file and the second target file, extract the content of both files, and temporarily store the extracted content; a file content conversion unit, configured to convert the extracted file content into corresponding image content; and a file similarity recognition unit, which includes a first similarity recognition unit, a second similarity recognition unit, and a result generation unit. The first similarity recognition unit is configured to deter...
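The attribute extraction step can be illustrated with a minimal Python sketch. The names `FileAttributes`, `extract_attributes` and `matching_attributes` are hypothetical and do not come from the patent; the sketch only gathers the six basic attributes listed above and reports which of them are equal for two files.

```python
from dataclasses import dataclass, fields
from pathlib import Path


@dataclass(frozen=True)
class FileAttributes:
    """The six basic attributes named in the embodiment (hypothetical container)."""
    name: str
    file_type: str
    size: int
    location: str
    created: float
    modified: float


def extract_attributes(path):
    """Collect the basic attributes of one target file from the file system."""
    p = Path(path)
    st = p.stat()
    return FileAttributes(
        name=p.name,
        file_type=p.suffix.lower(),          # assumption: type is taken from the extension
        size=st.st_size,
        location=str(p.resolve().parent),
        # st_birthtime is not available on every platform; st_ctime is a fallback
        # and on Unix it is the metadata-change time, not the true creation time.
        created=getattr(st, "st_birthtime", st.st_ctime),
        modified=st.st_mtime,
    )


def matching_attributes(first, second):
    """Return, per attribute, whether the two target files have equal values."""
    return {f.name: getattr(first, f.name) == getattr(second, f.name)
            for f in fields(first)}
```

Exact equality per attribute is only one possible comparison; the embodiment's own matching rule for these attributes is described separately.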

Embodiment 2

[0034] On the basis of the previous embodiment, the first similarity recognition unit judges the similarity of the two files from their basic attributes and obtains the first judgment result by the following steps: the file name, file type, file size, file location, file creation time and file modification time of the first target file and the second target file are matched and recognized. The matching recognition method is: for each character in the item to be matched, obtain from the keyword set the keyword to which the character belongs and the index position of the character within that keyword; judge from the index position whether the character is the first character of the keyword; if it is, record the keyword to which the character belongs in the matching information set, and mark in the record that the first character of the key...
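The part of the matching method that survives in this text can be sketched as follows. This is an illustration under stated assumptions, not the patented procedure: the function names, the record layout and the `first_char_seen` flag are hypothetical, and the steps that follow the truncated clause are not covered.

```python
from collections import defaultdict


def build_char_index(keywords):
    """Map each character to the (keyword, index position) pairs where it occurs."""
    index = defaultdict(list)
    for keyword in keywords:
        for position, char in enumerate(keyword):
            index[char].append((keyword, position))
    return index


def record_first_characters(item, keywords):
    """Scan the item to be matched and record every keyword whose first character is seen.

    Only the steps stated in the text are implemented: look up the keyword and
    index position for each character, test whether it is the keyword's first
    character, and record the keyword in the matching information set.
    """
    index = build_char_index(keywords)
    matching_info = []
    for offset, char in enumerate(item):
        for keyword, position in index.get(char, ()):
            if position == 0:
                # Assumed record layout; the source text is truncated here.
                matching_info.append(
                    {"keyword": keyword, "offset": offset, "first_char_seen": True}
                )
    return matching_info
```

For example, `record_first_characters("report.doc", ["doc", "rep"])` records `rep` at offsets 0 and 4 and `doc` at offset 7.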

Embodiment 3

[0039] Referring to Figures 4 and 5, on the basis of the previous embodiment, the second similarity recognition unit includes: a local probability model estimation subunit, configured to calculate the probability of each local area of the image content using a formula [formula not reproduced in the source], in which i is the index of each local area, n is the number of local areas, σ(x_i) denotes the probability of local area x_i, each local area x_i is a matrix whose transpose is used in the formula, w_i is the preset template matrix, b_i is the adjustment value corresponding to the matrix, with a value range of 5-10, and m is the probability adjustment value, with a value range of 0.2-0.6; a local area weight calculation subunit, configured to calculate the weight value of each local area from the probability of that local area; an image segmentation subunit, configured to segment the image content of the second target file into unit domains; the unit...
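The segmentation of image content into unit domains can be pictured with a short Python sketch. The text does not say how unit domains are defined, so the fixed-size, non-overlapping rectangular blocks used here are an assumption, as are the function name `split_into_unit_domains` and its `block_h` and `block_w` parameters.

```python
def split_into_unit_domains(image, block_h, block_w):
    """Split a 2-D list of pixel values into non-overlapping blocks in row-major order.

    Assumption: a "unit domain" is a rectangular block of at most
    block_h x block_w pixels; blocks on the right and bottom edges may be smaller.
    """
    width = len(image[0]) if image else 0
    blocks = []
    for top in range(0, len(image), block_h):
        for left in range(0, width, block_w):
            block = [row[left:left + block_w] for row in image[top:top + block_h]]
            blocks.append(block)
    return blocks
```

A 4 x 4 image split with `block_h=2` and `block_w=2` yields four 2 x 2 blocks, which a later step could compare block by block.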

Abstract

The invention relates to the technical field of computers, in particular to a computer file similarity recognition system and method based on image analysis. The system comprises: a file attribute data extraction unit configured to extract basic attributes of two target files used for comparison, wherein the target files include a first target file and a second target file, and the basic attributes comprise at least a file name, a file type, a file size, a file location, a file creation time and a file modification time; and a file content extraction unit configured to open the first target file and the second target file, extract the contents of the two files and temporarily store the extracted file contents. By analyzing the file attribute data and converting the file content into image content for similarity matching analysis, the similarity of the files can be accurately identified, and the method has the advantages of high identification accuracy and high efficiency.

Description

Technical field

[0001] The invention belongs to the field of computer technology, and specifically relates to a computer file similarity recognition system and method based on image analysis.

Background art

[0002] A file similarity calculation method analyzes and calculates the similarity of files using information from the files themselves (file content and connection information). Such methods are now widely used in fields such as information retrieval, collaborative recommendation systems and library classification systems.

[0003] The existing methods for detecting file similarity generally include the following steps:

[0004] (1) After basic simplification processing is performed on each file in the submitted file set, each file is divided into continuous marker blocks; a certain number of representative marker blocks are reserved in the marker blocks; the repr...

Application Information

IPC(8): G06K9/00, G06K9/62, G06K9/34
CPC: G06V30/418, G06V10/26, G06V30/10, G06F18/22
Inventor 宋国训魏磊仲伟付杨秀红刘曌
Owner 宋国训