Instruction Set Independent Binary Code Similarity Detection Method Based on Neural Network

A binary code similarity detection method using neural network technology, applied in the fields of computer program detection, binary program vulnerability mining and reverse analysis, which achieves high accuracy with a simple technique that is easy to popularize.

Active Publication Date: 2018-09-07
INST OF INFORMATION ENG CHINESE ACAD OF SCI +1

AI Technical Summary

Problems solved by technology

[0005] At present, there is no simple, instruction-set-independent binary code similarity detection technology.



Examples


Embodiment Construction

[0029] In order to make the above objects, features and advantages of the present invention more obvious and understandable, the present invention will be further described below through specific embodiments and accompanying drawings.

[0030] The neural-network-based, instruction-set-independent binary similar code retrieval method of the present invention specifically comprises the following steps:

[0031] 1) Construct the training set samples. Take the same source code and compile it with different compilers and different optimization options for different architectures to obtain binary executable files. Perform reverse analysis on the binary executable files and extract features in 9 aspects for each function. These 9 aspects are call relationship features, string features, stack space features, code size features, path sequence features, path basic features, degree sequence features, degree basi...
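The training-set construction in step 1) could be sketched roughly as follows. This is only an illustration under stated assumptions: the compiler, optimization-level and architecture lists are hypothetical (the text only says "different"), and labeling two binary functions as similar when they come from the same source function is an assumed convention that the visible text does not spell out.

```python
from itertools import combinations, product

# Hypothetical build matrix; the source names no specific
# compilers, optimization options or architectures.
COMPILERS = ["gcc", "clang"]
OPT_LEVELS = ["-O0", "-O2"]
ARCHS = ["x86", "arm", "mips"]


def build_configs():
    """Enumerate every (compiler, optimization, architecture) combination."""
    return list(product(COMPILERS, OPT_LEVELS, ARCHS))


def make_pairs(functions):
    """Pair up compiled functions and attach a similarity label.

    `functions` is a list of (source_function_name, config) tuples.
    Assumption: two binary functions are a positive pair (label 1) when
    they were compiled from the same source function, otherwise 0.
    """
    pairs = []
    for a, b in combinations(functions, 2):
        label = 1 if a[0] == b[0] else 0
        pairs.append((a, b, label))
    return pairs


if __name__ == "__main__":
    configs = build_configs()
    compiled = [(name, cfg) for name in ("parse", "hash") for cfg in configs[:2]]
    for a, b, label in make_pairs(compiled):
        print(a, b, label)
```

Each labeled pair would then have its per-function features extracted by reverse analysis before being used for training.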


Abstract

The invention relates to a neural-network-based, instruction-set-independent binary code similarity detection method. Its main steps include performing reverse analysis on binary files and extracting 24 features in 9 aspects: function call relationship features, string features, stack space features, code size features, path sequence features, path basic features, degree sequence features, degree basic features, and graph scale features. Depending on how each feature is represented, one of three similarity calculation methods is used to compute the similarity of each of the 24 features of the two functions being compared. These similarities form the input vector of the integrated neural network classifier, which outputs a predicted overall similarity for the two functions, and the results are sorted. Compared with the prior art, the invention does not rely on a specific instruction set, can detect similarity between binary files built for different instruction sets, has high accuracy, is technically simple, and is easy to popularize.
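The comparison stage described in the abstract can be illustrated with a minimal sketch. Several things here are assumptions rather than the patent's method: the visible text does not name the three similarity calculation methods, so a relative numeric difference, Jaccard set similarity and a sequence-matching ratio are used as stand-ins; `SCHEMA` lists only three hypothetical features instead of the full 24; and `score` is a placeholder for the trained integrated neural network classifier.

```python
from difflib import SequenceMatcher


def numeric_sim(a, b):
    """Stand-in for numeric features: 1.0 when equal, lower as values diverge."""
    if a == b:
        return 1.0
    return max(0.0, 1.0 - abs(a - b) / max(abs(a), abs(b)))


def set_sim(a, b):
    """Stand-in for set-valued features: Jaccard similarity."""
    a, b = set(a), set(b)
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


def sequence_sim(a, b):
    """Stand-in for sequence-valued features: difflib matching ratio."""
    return SequenceMatcher(None, a, b).ratio()


SIM_BY_KIND = {"numeric": numeric_sim, "set": set_sim, "sequence": sequence_sim}

# Hypothetical schema: three illustrative features, not the patent's 24.
SCHEMA = [
    ("stack_size", "numeric"),
    ("strings", "set"),
    ("degree_sequence", "sequence"),
]


def similarity_vector(f, g):
    """One similarity value per feature; this is the classifier's input vector."""
    return [SIM_BY_KIND[kind](f[name], g[name]) for name, kind in SCHEMA]


def rank(query, candidates, score):
    """Score every candidate against the query and sort, most similar first."""
    scored = [(score(similarity_vector(query, c)), c["name"]) for c in candidates]
    return sorted(scored, reverse=True)


def mean_score(vector):
    """Placeholder for the trained classifier: plain average of the vector."""
    return sum(vector) / len(vector)


if __name__ == "__main__":
    query = {"name": "q", "stack_size": 64, "strings": ["a", "b"], "degree_sequence": [3, 2, 1]}
    same = dict(query, name="same")
    other = {"name": "other", "stack_size": 16, "strings": ["c"], "degree_sequence": [1]}
    print(rank(query, [other, same], mean_score))
```

In a real system, `mean_score` would be replaced by the prediction of the trained classifier, so that the ranking reflects learned weights for each feature similarity rather than a uniform average.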

Description

Technical field

[0001] The invention relates to the field of binary program vulnerability mining and reverse analysis, in particular to a neural-network-based, instruction-set-independent binary code similarity detection method, and belongs to the technical field of computer program detection.

Background

[0002] With the rise of open source software, software plagiarism is becoming more common, and the demand for detecting whether code has been plagiarized is growing accordingly. In practice, most commercial software is distributed as binary code, and its source code is difficult to obtain. Plagiarism detection therefore relies mainly on binary code similarity detection technology.

[0003] Binary code similarity detection technology relies on various similarity calculation methods to measure the similarity of the two binary codes being compared, and can be divided into text-based similarity detection technology, gr...


Application Information

Patent Type & Authority: Patents (China)
IPC(8): G06F11/36, G06K9/62
CPC: G06F11/3608, G06F18/214, G06F18/24
Inventor: 石志强, 刘中金, 常青, 陈昱, 孙利民, 朱红松, 王猛涛, 何跃鹰
Owner: INST OF INFORMATION ENG CHINESE ACAD OF SCI