Cross-platform binary code similarity detection method based on semantic space alignment

A cross-platform binary code similarity detection technology based on semantic space alignment. It addresses the problem that existing basic block embedding methods are unsuitable for embedding across instruction architectures, and it supports intellectual property protection and security assurance.

Pending Publication Date: 2022-03-01
COMP APPL RES INST CHINA ACAD OF ENG PHYSICS
0 Cites, 2 Cited by

AI Technical Summary

Problems solved by technology

Existing methods based on static word representations, such as SAFE, Asm2vec and PalmTree, are not suitable for basic block embedding across instruction architectures, because each instruction architecture has its own independent semantic representation.

Method used

The method builds a cross-platform binary code function library, trains a single-platform code semantic embedding model with BERT, trains a semantic alignment model across platforms, and builds a fast search database based on locality-sensitive hashing.

Embodiment Construction

[0045] The technical solutions in the embodiments of the present invention are described clearly and completely below with reference to the accompanying drawings. The described embodiments are only some of the embodiments of the present invention, not all of them. All other embodiments obtained by persons of ordinary skill in the art on the basis of these embodiments without creative effort fall within the scope of protection of the present invention.

[0046] With reference to Figure 1 and Figure 2:

[0047] A cross-platform binary code similarity detection method based on semantic space alignment:

[0048] The method specifically comprises the following steps:

[0049] Step 1: Build a cross-platform binary code function library. Select a suitable open source library (such as OpenSSL or binutils), compile each function in the open source library with a compiler, and obtain disas...
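As a rough illustration of Step 1, the sketch below enumerates one compile-and-disassemble job per architecture and optimization level. The toolchain names (`gcc`, `arm-linux-gnueabi-gcc` and the matching `objdump` binaries), the four optimization levels, and the `build_plan` helper are assumptions made for this example; the patent text shown here does not name the compilers or flags it uses. The code only builds the command lists and does not run them.

```python
from itertools import product
from typing import Dict, List

# Hypothetical toolchains: (compiler, disassembler) per target architecture.
# The source does not say which compilers were actually used.
TOOLCHAINS = {
    "x86_64": ("gcc", "objdump"),
    "arm": ("arm-linux-gnueabi-gcc", "arm-linux-gnueabi-objdump"),
}
# Assumed set of optimization options.
OPT_LEVELS = ["-O0", "-O1", "-O2", "-O3"]


def build_plan(source_file: str) -> List[Dict]:
    """Return one job per (architecture, optimization level) pair."""
    stem = source_file.rsplit(".", 1)[0]
    jobs = []
    for (arch, (cc, objdump)), opt in product(TOOLCHAINS.items(), OPT_LEVELS):
        obj = f"{stem}.{arch}{opt}.o"
        jobs.append({
            "arch": arch,
            "opt": opt,
            # Compile the source to an object file for this architecture.
            "compile": [cc, opt, "-c", source_file, "-o", obj],
            # Disassemble the object file to obtain the text form of each function.
            "disassemble": [objdump, "-d", obj],
        })
    return jobs


if __name__ == "__main__":
    for job in build_plan("aes.c"):
        print(" ".join(job["compile"]), "&&", " ".join(job["disassemble"]))
```

Each job could be passed to `subprocess.run` on a machine where the cross toolchains are installed; functions with the same source name then form the cross-platform pairs used in later steps.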

Abstract

The invention provides a cross-platform binary code similarity detection method based on semantic space alignment. The method comprises the following steps: first, constructing a cross-platform binary code function library by compiling each function in an open source library with a compiler under different optimization options and obtaining the disassembled text code; then training a single-platform code semantic embedding model with a BERT model, so that the trained BERT model can identify the code semantics of each platform; next, training a semantic alignment model based on contrastive learning; and finally, constructing a fast search database based on locality-sensitive hashing. After the high-dimensional vectors are converted into low-dimensional vectors, similar vectors are retrieved with a matching method; the matching results are analyzed and the experimental model is evaluated. The method addresses similarity matching of code compiled from the same source on different platforms such as x86 and ARM, and identifies identical semantics across platforms.
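The fast search step can be pictured with a small locality-sensitive hashing index. The sketch below uses random-hyperplane (sign) hashing, which is one common LSH family; the abstract does not say which family the invention uses, so this choice, the `HyperplaneLSH` class, and the parameter names are assumptions for illustration only. Each embedding is reduced to a short bit signature, and only vectors sharing the query's signature are compared by cosine similarity.

```python
import math
import random
from collections import defaultdict


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


class HyperplaneLSH:
    """Random-hyperplane LSH: vectors separated by a small angle tend to share a bucket."""

    def __init__(self, dim, n_bits=8, seed=0):
        rng = random.Random(seed)
        # Each hyperplane is a random Gaussian normal vector.
        self.planes = [[rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(n_bits)]
        self.buckets = defaultdict(list)

    def signature(self, vec):
        # One bit per hyperplane: the side of the plane the vector falls on.
        return tuple(
            1 if sum(p * v for p, v in zip(plane, vec)) >= 0 else 0
            for plane in self.planes
        )

    def add(self, key, vec):
        self.buckets[self.signature(vec)].append((key, vec))

    def query(self, vec):
        """Return (key, vector) candidates from the matching bucket, best match first."""
        candidates = self.buckets.get(self.signature(vec), [])
        return sorted(candidates, key=lambda kv: cosine(kv[1], vec), reverse=True)


if __name__ == "__main__":
    rng = random.Random(1)
    index = HyperplaneLSH(dim=16)
    embedding = [rng.uniform(-1, 1) for _ in range(16)]
    index.add("func_x86", embedding)
    # A positively scaled copy points in the same direction, so it lands in the same bucket.
    print(index.query([2 * x for x in embedding])[0][0])
```

A single hash table can miss near neighbours that fall just across a hyperplane; practical systems usually query several independent tables, which this sketch omits for brevity.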

Description

Technical field

[0001] The invention belongs to the fields of vulnerability detection, copyright disputes, malware analysis and the like, and in particular relates to a cross-platform binary code similarity detection method based on semantic space alignment.

Background technique

[0002] Previous research on binary code similarity across instruction architectures, such as Gemini and Genius, usually requires manual selection of binary code features for basic block embedding. These features not only require professional knowledge to select but also carry little embedding information, so they cannot fully express the semantics of binary code.

[0003] To solve the above problems, SAFE, Asm2vec, PalmTree and other methods apply static word representation methods to binary code. Combined with NLP techniques, these methods greatly increase the information capacity of basic block embeddings by normalizing the content of basic blocks and feeding it into the model. Howeve...
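The normalization mentioned in paragraph [0003] can be sketched as follows. This is a simplified illustration rather than the scheme used by SAFE, Asm2vec, PalmTree, or the present invention: the `<IMM>` placeholder token and the `normalize_instruction` helper are hypothetical, and only bare numeric operands are replaced, while memory operands are left as they are.

```python
import re

# Matches a bare numeric operand such as 0x10, 42, -8 or the ARM-style #4.
NUMERIC_OPERAND = re.compile(r"^#?-?(0x[0-9a-fA-F]+|\d+)$")


def normalize_instruction(line):
    """Split one disassembly line into tokens, replacing numeric literals with <IMM>."""
    mnemonic, _, operands = line.strip().partition(" ")
    tokens = [mnemonic.lower()]
    for op in operands.split(","):
        op = op.strip()
        if not op:
            continue  # instruction without operands, e.g. "ret"
        tokens.append("<IMM>" if NUMERIC_OPERAND.match(op) else op.lower())
    return tokens


def normalize_block(lines):
    """Flatten a basic block into one token sequence, as a language model would consume it."""
    return [tok for line in lines for tok in normalize_instruction(line)]


if __name__ == "__main__":
    print(normalize_block(["mov eax, 0x10", "ADD r0, r1, #4", "ret"]))
```

Replacing concrete constants with a shared token keeps the vocabulary small, which is the usual motivation for normalizing operands before feeding instructions to an NLP-style model.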

Application Information

IPC(8): G06F8/41, G06F21/56, G06F21/57, G06K9/62, G06F8/36
CPC: G06F8/436, G06F8/36, G06F21/563, G06F21/577, G06F18/22, G06F18/214
Inventor: 张春瑞, 王莘, 姜训智, 殷明勇, 黄欣, 王振邦, 李冶天
Owner COMP APPL RES INST CHINA ACAD OF ENG PHYSICS