Instruction-set-irrelevant binary code similarity detection method based on neural network

A neural-network-based binary code similarity detection technique, applied in the fields of computer program detection, binary program vulnerability mining, and reverse analysis, that is simple, easy to adopt, and highly accurate

Active Publication Date: 2016-08-17
INST OF INFORMATION ENG CAS +1

AI Technical Summary

Problems solved by technology

[0005] At present, there is a lack of a simple, instruction

Examples

Embodiment Construction

[0029] To make the above objects, features, and advantages of the present invention clearer and easier to understand, the invention is further described below through specific embodiments and the accompanying drawings.

[0030] The neural-network-based, instruction-set-independent method for retrieving similar binary code according to the present invention comprises the following steps:

[0031] 1) Construct the training set samples. Compile the same source code with different compilers and different optimization options for different architectures to obtain binary executable files. Perform reverse analysis on each binary executable file and extract features in 9 aspects for every function. These 9 aspects are call relationship features, string features, stack space features, code scale features, path sequence features, path basic features, degree sequence features, degree basi...
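The sample construction in step 1) can be sketched roughly as follows. This is only an illustration under stated assumptions: the `FunctionSample` record, the `build_pairs` helper, the example function names, and the rule that two binaries of the same source function form a positive pair are all hypothetical, since the text above does not spell out how samples are labeled. The `features` field is a placeholder for whatever the reverse analysis extracts.

```python
from dataclasses import dataclass
from itertools import combinations


@dataclass(frozen=True)
class FunctionSample:
    """One compiled instance of a source function (hypothetical record layout)."""
    source_function: str  # name of the function in the shared source code
    compiler: str         # illustrative values such as "gcc" or "clang"
    optimization: str     # illustrative values such as "-O0" or "-O2"
    architecture: str     # illustrative values such as "x86", "arm", "mips"
    features: tuple       # placeholder for the extracted per-function features


def build_pairs(samples):
    """Pair up samples and label them.

    Assumption: label 1 when both samples were compiled from the same
    source function, label 0 otherwise.
    """
    pairs = []
    for a, b in combinations(samples, 2):
        label = 1 if a.source_function == b.source_function else 0
        pairs.append((a, b, label))
    return pairs


# Made-up samples: one function built twice under different settings,
# plus an unrelated function.
samples = [
    FunctionSample("parse_header", "gcc", "-O0", "x86", (3, 12)),
    FunctionSample("parse_header", "clang", "-O2", "arm", (3, 10)),
    FunctionSample("checksum", "gcc", "-O2", "mips", (1, 40)),
]
pairs = build_pairs(samples)
labels = [label for _, _, label in pairs]
```

Pairing every sample with every other grows quadratically, so a real training set would likely need sampling. That choice is outside what the text describes.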

Abstract

The invention relates to an instruction-set-independent binary code similarity detection method based on a neural network. The method reverse-analyzes binary files and extracts 24 features of each function, covering 9 aspects: call relationship features, string features, stack space features, code scale features, path sequence features, path basic features, degree sequence features, degree basic features, and graph scale features. Depending on how each feature is represented, one of 3 similarity calculation methods is used to compute the similarity of each of the 24 features between the two functions being compared. The resulting similarities form the input vector of an ensemble neural network classifier, which predicts the overall similarity of the two functions; the predicted values are then ranked. Compared with the prior art, the method does not depend on any specific instruction set, can detect similarity between binary files built for different instruction sets, is highly accurate, is technically simple, and is easy to adopt.
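The comparison pipeline in the abstract can be illustrated with a minimal sketch. Several parts are assumptions rather than the patented method: the abstract does not name its 3 similarity calculation methods, so a relative-difference score for numbers, the Jaccard index for sets, and `difflib.SequenceMatcher` for sequences stand in for them. The feature names, the three-feature schema, and all function names are hypothetical, and a plain average (`mean_score`) takes the place of the trained ensemble neural network classifier.

```python
from difflib import SequenceMatcher


def scalar_similarity(a, b):
    """Similarity of two non-negative numbers: 1.0 when equal, lower as they diverge."""
    if a == b:
        return 1.0
    return 1.0 - abs(a - b) / max(abs(a), abs(b))


def set_similarity(a, b):
    """Jaccard index of two collections treated as sets."""
    a, b = set(a), set(b)
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


def sequence_similarity(a, b):
    """Order-aware similarity of two sequences, in [0, 1]."""
    return SequenceMatcher(None, list(a), list(b)).ratio()


# Assumed mapping from feature representation to similarity method.
SIMILARITY_BY_KIND = {
    "scalar": scalar_similarity,
    "set": set_similarity,
    "sequence": sequence_similarity,
}


def similarity_vector(f, g, schema):
    """One similarity value per feature; this is the classifier's input vector."""
    return [SIMILARITY_BY_KIND[kind](f[name], g[name]) for name, kind in schema]


def mean_score(vector):
    """Placeholder for the ensemble neural network classifier (NOT a neural network)."""
    return sum(vector) / len(vector)


def rank_candidates(query, candidates, schema, score=mean_score):
    """Score every candidate against the query and sort by predicted similarity."""
    scored = [
        (score(similarity_vector(query, feats, schema)), name)
        for name, feats in candidates.items()
    ]
    return sorted(scored, reverse=True)


# Hypothetical three-feature schema; the method itself uses 24 features.
schema = [("stack_size", "scalar"), ("strings", "set"), ("degree_sequence", "sequence")]
query = {"stack_size": 64, "strings": {"usage: %s", "error"}, "degree_sequence": [1, 2, 2, 1]}
candidates = {
    "func_a": {"stack_size": 64, "strings": {"usage: %s", "error"}, "degree_sequence": [1, 2, 2, 1]},
    "func_b": {"stack_size": 16, "strings": {"ok"}, "degree_sequence": [3, 1]},
}
ranking = rank_candidates(query, candidates, schema)
```

Because each entry of the vector is a similarity between feature values rather than anything derived from raw instructions, the same comparison applies to functions compiled for different instruction sets.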

Description

Technical field

[0001] The invention relates to the field of binary program vulnerability mining and reverse analysis, in particular to a neural-network-based, instruction-set-independent binary code similarity detection method, and belongs to the technical field of computer program detection.

Background technique

[0002] With the rise of open source software, software plagiarism has become increasingly common, and the demand for detecting whether code has been plagiarized is growing accordingly. In practice, most commercial software is distributed as binary code, and its source code is difficult to obtain. Plagiarism is therefore mainly judged using binary code similarity detection technology.

[0003] Binary code similarity detection technology relies on various similarity calculation methods to measure the similarity of two binary codes to be compared, which can be divided into text-based similarity detection technology, gr...

Application Information

IPC(8): G06F11/36, G06K9/62
CPC: G06F11/3608, G06F18/214, G06F18/24
Inventor: 石志强, 刘中金, 常青, 陈昱, 孙利民, 朱红松, 王猛涛
Owner INST OF INFORMATION ENG CAS