Reusable code base creation method, rapid tracing method and system for reusable codes

A technology for constructing reusable code bases and for rapidly tracing the source of reused code. It addresses the low judgment efficiency, high time cost, and low recall rate of existing approaches, achieving high accuracy and recall while improving automation and efficiency.

Active Publication Date: 2016-11-16
INST OF INFORMATION ENG CHINESE ACAD OF SCI

AI Technical Summary

Problems solved by technology

Current techniques for judging function similarity achieve high accuracy and recall, but their judgment efficiency is low, so they are not suitable for tracing the source of reused functions across massive amounts of code.
A small modification to a function's source code, different compilation options, or a different location in the binary changes the instruction order, registers, and jump addresses in the disassembled code. Tracing the source with methods such as exact hashing therefore yields a very low recall rate.
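The recall problem with exact hashing can be illustrated with a minimal sketch. The two instruction sequences below are invented for illustration and differ only in which scratch register the compiler happened to pick; the helper names `exact_hash` and `mnemonics` are hypothetical and not taken from the patent.

```python
import hashlib

def exact_hash(instructions):
    """Exact-match fingerprint: SHA-256 over the raw instruction text."""
    return hashlib.sha256("\n".join(instructions).encode()).hexdigest()

def mnemonics(instructions):
    """Keep only the opcode mnemonic of each instruction (operands dropped)."""
    return [line.split()[0] for line in instructions]

# Hypothetical disassembly of the same function built twice;
# only the scratch register differs (ecx vs. edx).
block_a = ["push ebp", "mov ebp, esp", "mov eax, [ebp+8]",
           "mov ecx, [ebp+12]", "add eax, ecx", "pop ebp", "retn"]
block_b = ["push ebp", "mov ebp, esp", "mov eax, [ebp+8]",
           "mov edx, [ebp+12]", "add eax, edx", "pop ebp", "retn"]

print(exact_hash(block_a) == exact_hash(block_b))   # exact fingerprints disagree
print(mnemonics(block_a) == mnemonics(block_b))     # opcode sequences agree
```

An exact hash treats the two builds as unrelated, although their opcode sequences are identical, which is why a fingerprint that tolerates small differences is needed.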
Within a function, the jump structure among code blocks is an important feature for similarity judgment, but extracting the jump relationships and comparing the structure graphs takes a lot of time. This is a major reason why current similarity judgment struggles to achieve accuracy, recall, and speed at the same time.
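As a rough sketch of what "code blocks" and "jump relationships" can look like in practice, the following splits a function body at jump instructions and labels, then records the jump edges between the resulting blocks. It is an illustration under stated assumptions, not the patent's algorithm: every label line is assumed to be a jump target, any x86 mnemonic starting with `j` is treated as a jump, and the names `split_blocks` and `jump_edges` are hypothetical.

```python
def split_blocks(lines):
    """Split a function body into (label, instructions) code blocks.

    Assumption: a block ends after any jump instruction and a new block
    starts at every label line (text ending with ':').
    """
    blocks, current, label = [], [], "entry"
    for raw in lines:
        line = raw.strip()
        if not line:
            continue
        if line.endswith(":"):            # label: close the running block
            if current:
                blocks.append((label, current))
            label, current = line[:-1], []
            continue
        current.append(line)
        if line.split()[0].startswith("j"):   # jmp, jz, jnz, ...
            blocks.append((label, current))
            label, current = f"after_{len(blocks)}", []
    if current:
        blocks.append((label, current))
    return blocks

def jump_edges(blocks):
    """Return the set of (source_label, target_label) edges between blocks."""
    labels = {label for label, _ in blocks}
    edges = set()
    for i, (label, body) in enumerate(blocks):
        last = body[-1].split()
        mnemonic = last[0]
        if mnemonic.startswith("j") and last[-1] in labels:
            edges.add((label, last[-1]))          # explicit jump target
        falls_through = mnemonic not in ("jmp", "ret", "retn")
        if falls_through and i + 1 < len(blocks):
            edges.add((label, blocks[i + 1][0]))  # fall-through successor
    return edges

# Hypothetical function body used only for illustration.
body = """
    cmp eax, 0
    jz short loc_10
    add eax, 1
    jmp short loc_20
loc_10:
    sub eax, 1
loc_20:
    retn
""".splitlines()

blocks = split_blocks(body)
edges = jump_edges(blocks)
```

Comparing such edge sets for whole functions is the expensive step the text refers to, which motivates narrowing the candidates before any structural comparison.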



Embodiment Construction

[0031] The present invention is described in detail below in conjunction with specific implementation examples.

[0032] Figure 1 shows the process flow of the code base construction method provided by the present invention. The specific implementation steps are as follows:

[0033] (1) Unpack the packed samples

[0034] 1) Use the PEiD packer-detection tool to determine which packer was applied to the sample;

[0035] 2) Unpack each sample with the unpacking tool appropriate to its packer;

[0036] 3) Discard samples that cannot be unpacked because they use special packers.

[0037] The remaining samples are all unpacked.

[0038] (2) Use a reverse-engineering tool to obtain the assembly code of each sample

[0039] The present invention takes IDA Pro as an example.

[0040] (3) Extract the functions from the assembly code

[0041] In the assembly code obtained by reverse engineering, "proc near" marks the beginning of a function and "endp" marks the end of a function, and the func...
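Step (3) can be sketched as a small parser over a disassembly listing. This is a minimal illustration, assuming an IDA-style `.asm` listing in which each function is delimited by `<name> proc near` and `<name> endp` lines and `;` starts a comment; the function name `extract_functions` and the sample listing are hypothetical.

```python
import re

# Delimiters named in the text: "proc near" opens a function, "endp" closes it.
PROC_START = re.compile(r"^\s*(\S+)\s+proc\s+near\b")
PROC_END = re.compile(r"^\s*(\S+)\s+endp\b")

def extract_functions(listing):
    """Return {function_name: [instruction, ...]} from a disassembly listing."""
    functions, name, body = {}, None, []
    for line in listing.splitlines():
        start = PROC_START.match(line)
        if start:
            name, body = start.group(1), []
            continue
        if name is not None and PROC_END.match(line):
            functions[name] = body
            name, body = None, []
            continue
        if name is not None:
            code = line.split(";", 1)[0].strip()   # drop trailing comments
            if code:
                body.append(code)
    return functions

# Hypothetical listing used only for illustration.
listing = """
sub_401000 proc near
    push ebp            ; prologue
    mov ebp, esp
    pop ebp
    retn
sub_401000 endp

sub_401010 proc near
    xor eax, eax
    retn
sub_401010 endp
"""

functions = extract_functions(listing)
```

Lines outside any `proc near` / `endp` pair are ignored, so data declarations and segment directives do not end up in a function body.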


Abstract

The invention discloses a method for creating a reusable code base, and a method and system for rapidly tracing the source of reused code. The system comprises a pre-processing module, a code base building module, and a function tracing module. The pre-processing module obtains the assembly code of each sample, extracts the functions from the assembly code, splits each function into multiple code blocks according to its jump instructions and jump addresses, and calculates the simhash value of each code block. The code base building module builds a three-level inverted index in which a simhash value indexes its corresponding code blocks, a code block indexes the functions that contain it, and a function indexes the samples that contain it. The function tracing module looks up, in the code base, the code blocks similar to those of the function to be traced; each similar code block corresponds to a potentially similar function. Then, according to the jump relations among the similar code blocks, it determines whether each potentially similar function is in fact similar to the function to be traced. The method and system improve the degree of automation of same-origin judgment tasks.
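The simhash fingerprint and the three-level inverted index described above can be sketched as follows. This is a minimal illustration, not the patented implementation: the tokenization of a code block, the 64-bit width, the MD5-based token hash, the Hamming threshold of 3, and the names `simhash`, `hamming`, and `CodeBase` are all assumptions, and the lookup is a plain linear scan rather than an optimized retrieval scheme.

```python
import hashlib
from collections import defaultdict

def simhash(tokens, bits=64):
    """Charikar-style simhash: similar token lists give fingerprints
    that differ in only a few bit positions."""
    weights = [0] * bits
    for token in tokens:
        h = int.from_bytes(hashlib.md5(token.encode()).digest()[:8], "big")
        for i in range(bits):
            weights[i] += 1 if (h >> i) & 1 else -1
    return sum(1 << i for i in range(bits) if weights[i] > 0)

def hamming(a, b):
    """Number of differing bits between two fingerprints."""
    return bin(a ^ b).count("1")

class CodeBase:
    """Three-level inverted index: simhash -> blocks -> functions -> samples."""

    def __init__(self):
        self.hash_to_blocks = defaultdict(set)    # level 1
        self.block_to_funcs = defaultdict(set)    # level 2
        self.func_to_samples = defaultdict(set)   # level 3

    def add(self, sample, func, block_id, tokens):
        self.hash_to_blocks[simhash(tokens)].add(block_id)
        self.block_to_funcs[block_id].add(func)
        self.func_to_samples[func].add(sample)

    def trace(self, tokens, max_distance=3):
        """Return (sample, function, block) triples whose block fingerprint
        is within max_distance bits of the query block (linear scan)."""
        query = simhash(tokens)
        hits = set()
        for h, block_ids in self.hash_to_blocks.items():
            if hamming(query, h) <= max_distance:
                for block_id in block_ids:
                    for func in self.block_to_funcs[block_id]:
                        for sample in self.func_to_samples[func]:
                            hits.add((sample, func, block_id))
        return hits

# Hypothetical sample, function, and block identifiers.
base = CodeBase()
tokens = ["push", "mov", "mov", "add", "pop", "retn"]
base.add("sample_A.exe", "sub_401000", "blk_1", tokens)
hits = base.trace(tokens)
```

Each hit is only a candidate: per the abstract, the jump relations among the matched blocks still have to be checked before a function is declared similar.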

Description

Technical field

[0001] The invention relates to the field of reverse analysis and malicious code analysis, in particular to a method for constructing a reusable code base based on simhash and an inverted index, and to a method and system for rapid source tracing.

Background

[0002] Code reuse usually takes the function as its basic unit, and even after heavy compiler optimization a large number of functions are retained. This document therefore traces sources and judges similarity at the function level, which better matches the reuse scenario. The main basis for judging that malicious code samples share an origin is the reuse, across different samples, of personal code written by the malicious code author. For example, the same-origin judgments for Sasser and Netsky, and for Flame and Gauss, are based on special functions that they share. However, in order to improve development speed, malicious code authors often reuse public or semi-public codes writt...


Application Information

Patent Type & Authority: Applications (China)
IPC(8): G06F9/44, G06F9/45
CPC: G06F8/36, G06F8/43
Inventor: 张永铮, 乔延臣, 云晓春
Owner: INST OF INFORMATION ENG CHINESE ACAD OF SCI