Binary code similarity detection method and system based on graph matching network

A binary code similarity detection technology, applied in the fields of neural learning methods, biological neural network models, and platform integrity maintenance. It addresses the problems that existing methods cannot meet the requirements of coarse-grained, program-level comparison and lack robustness to data-flow obfuscation, and it achieves improved detection accuracy and a richer semantic representation.

Active Publication Date: 2021-08-13
HUNAN UNIV

AI Technical Summary

Problems solved by technology

[0004] Although binary code similarity detection based on graph representation learning has many advantages, it has three main limitations:
1) Lexical representation. Existing instruction-level embeddings, whether built from manually engineered features or from pre-training methods borrowed from natural language processing, usually treat an entire instruction or a part of an instruction (opcode, operand) as a single word. This ignores the out-of-vocabulary (OOV) problem, which leaves instruction-level embeddings very close to the origin and not robust to data-flow obfuscation.
2) Scalability.
3) Existing methods cannot meet the requirements of coarse-grained, program-level comparison.
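The OOV problem named in limitation 1) can be illustrated with a small sketch. It is not taken from the patent: the toy instructions, the `tokenize` helper, and the `oov_rate` function are all assumptions chosen for illustration. It shows that an unseen immediate value makes a token fall outside the vocabulary whether the word is the whole instruction or a single opcode/operand.

```python
def tokenize(instruction):
    """Split an instruction into opcode and operand tokens (hypothetical helper)."""
    return instruction.replace(",", " ").split()


def oov_rate(tokens, vocab):
    """Fraction of tokens that do not appear in the vocabulary."""
    if not tokens:
        return 0.0
    missing = sum(1 for tok in tokens if tok not in vocab)
    return missing / len(tokens)


# Toy corpora (assumed data, not from the patent).
train = ["mov eax, 0x10", "add eax, ebx", "mov ecx, 0x20"]
test = ["mov eax, 0x30", "add eax, ebx"]

# Case 1: each whole instruction is one word.
whole_vocab = set(train)
whole_rate = oov_rate(test, whole_vocab)

# Case 2: each opcode or operand is one word.
part_vocab = {tok for ins in train for tok in tokenize(ins)}
part_tokens = [tok for ins in test for tok in tokenize(ins)]
part_rate = oov_rate(part_tokens, part_vocab)

print(f"whole-instruction OOV rate: {whole_rate:.2f}")
print(f"opcode/operand OOV rate:    {part_rate:.2f}")
```

In both cases the unseen immediate `0x30` produces an out-of-vocabulary word; splitting into opcodes and operands lowers the rate in this toy data but does not eliminate it.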


Examples

Embodiment Construction

[0052] As shown in Figure 1, the binary code similarity detection method based on a graph matching network in this embodiment includes:

[0053] 1) Obtain the program pair to be tested;

[0054] 2) Disassemble each program in the program pair to be tested, and obtain its interprocedural control flow graph (ICFG) and its instructions;

[0055] 3) Obtain the initial feature embeddings of the basic blocks in the ICFG of each program to be tested;

[0056] 4) Obtain the final embeddings h_G1 and h_G2 of the ICFGs of the program pair to be tested through the graph matching neural network;

[0057] 5) Compute the similarity between the final embeddings h_G1 and h_G2 in the vector space, and take it as the similarity detection result of the program pair to be tested.
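Step 5) can be sketched as follows. The text does not say which similarity measure is used in the vector space, so cosine similarity is an assumption here, and the function name and the example vectors are hypothetical. The graph matching network that produces the embeddings in step 4) is not modelled.

```python
import math


def cosine_similarity(h_g1, h_g2):
    """Cosine similarity between two graph embeddings (assumed metric)."""
    if len(h_g1) != len(h_g2):
        raise ValueError("embeddings must have the same dimension")
    dot = sum(a * b for a, b in zip(h_g1, h_g2))
    norm1 = math.sqrt(sum(a * a for a in h_g1))
    norm2 = math.sqrt(sum(b * b for b in h_g2))
    if norm1 == 0.0 or norm2 == 0.0:
        # A zero vector has no direction; treat the pair as dissimilar.
        return 0.0
    return dot / (norm1 * norm2)


# Hypothetical final embeddings of the two ICFGs.
h_g1 = [0.2, 0.7, 0.1]
h_g2 = [0.25, 0.65, 0.05]
score = cosine_similarity(h_g1, h_g2)
print(f"similarity: {score:.4f}")
```

A score near 1 would indicate that the two programs are embedded in nearly the same direction. Any decision threshold is a separate choice that the visible text does not specify.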

[0058] In this embodiment, step 2) includes: disassembli...


Abstract

The invention discloses a binary code similarity detection method and system based on a graph matching network. The method comprises the following steps: obtaining a program pair to be tested; disassembling each program in the pair to obtain its interprocedural control flow graph (ICFG) and its instructions; obtaining the initial feature embeddings of the basic blocks in the ICFG of each program; obtaining the final embeddings h_G1 and h_G2 of the ICFGs of the program pair through a graph matching neural network; and computing the similarity between the final embeddings h_G1 and h_G2 in the vector space, which is taken as the similarity detection result of the program pair. Because the final embeddings of the ICFGs are obtained through the graph matching neural network, a rich semantic representation is obtained and detection accuracy is effectively improved. The invention serves as an important foundation for binary-based code security analysis.

Description

Technical field

[0001] The invention belongs to the field of Internet of Things security, and in particular relates to a binary code similarity detection method and system based on a graph matching network.

Background

[0002] Binary code similarity detection has important applications in many areas of computer system security that bear on the national economy and people's livelihood, such as vulnerability detection, software plagiarism detection, malware detection, and code reconstruction. With the rapid adoption of the Internet of Things in intelligent manufacturing, the stable operation of modern military equipment, large-scale scientific research equipment, civil power, transportation, petrochemical, manufacturing, and other industries increasingly depends on information-based control systems. Issues such as code and vulnerabilities have become important challenges to information system security. Especially a single bug at the source code l...


Application Information

Patent Type & Authority: Applications (China)
IPC(8): G06F21/56, G06F8/75, G06N3/04, G06N3/08
CPC: G06F21/563, G06F8/75, G06N3/08, G06N3/044
Inventors: 刘玉玲, 张云
Owner: HUNAN UNIV