A cross-platform malicious code detection method and system
A cross-platform malicious code detection technology, applied in the field of software security protection
Examples
Embodiment 1
[0059] A cross-platform malicious code detection method, comprising the following steps:
[0060] (1) Use large-scale benign program samples from multiple platforms to train a pre-training model, so that it captures the structural and semantic correlations within the context of program instructions, as well as the structural and semantic commonalities between program instructions on different platforms;
[0061] (2) On top of the pre-training model, construct a cross-platform malicious code detection model using a limited number of benign and malicious program samples from multiple platforms, and fine-tune the parameters of the detection model so that the knowledge in the pre-training model is transferred to the cross-platform malicious code detection model;
[0062] (3) Use the constructed cross-platform malicious code detection model to detect unknown program samples on different platforms (including platforms not involved in pre-trainin...
Embodiment 2
[0065] The cross-platform malicious code detection method described in Embodiment 1, as shown in Figure 2, with the following difference:
[0066] The specific implementation process of step (1) is as follows:
[0067] 1.1: Collect large-scale benign program samples on Windows, Android, Linux, and localized platforms, and construct a multi-platform benign program dataset D. Each sample in D is represented as $U_i = [C_i, W_i]$, where $C_i = \{C_1, C_2, \ldots, C_n\}$ represents the program instructions of the i-th sample and n is the total number of program instructions (tokens), and $W_i = \{W_1, W_2, \ldots, W_m\}$ represents the annotation of the i-th sample and m is the total number of annotation words;
[0068] 1.2: As shown in Figure 3, the pre-training model M is constructed based on a multi-layer Transformer encoder, and the multi-platform benign program dataset D is used to pre-train the pre-training mode...
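The sample representation in step 1.1 can be illustrated with a small data structure. This is a minimal sketch under stated assumptions, not the patent's implementation: the class name `ProgramSample`, the `platform` field, the `[SEP]` separator, and the empty-sample filter in `build_dataset` are all hypothetical; only the pairing of instruction tokens C_i with annotation words W_i comes from the text.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class ProgramSample:
    """One sample U_i = [C_i, W_i] from the multi-platform dataset D."""

    platform: str            # hypothetical tag, e.g. "windows", "android", "linux"
    instructions: List[str]  # C_i: the n program-instruction tokens
    annotations: List[str]   # W_i: the m annotation words

    def as_sequence(self, sep: str = "[SEP]") -> List[str]:
        """Join C_i and W_i into one token sequence (assumed input layout)."""
        return self.instructions + [sep] + self.annotations


def build_dataset(samples: List[ProgramSample]) -> List[ProgramSample]:
    """Build D, dropping samples without instructions (an assumed sanity filter)."""
    return [s for s in samples if s.instructions]
```

A sequence produced by `as_sequence` is one plausible input format for a Transformer encoder; the source does not state how C_i and W_i are actually combined.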
Embodiment 3
[0083] The cross-platform malicious code detection method according to Embodiment 2, with the following difference:
[0084] As shown in Figure 4, the specific implementation process of step (2) is as follows:
[0085] 2.1: Build a malicious code detection model M' on top of the pre-trained model M; the malicious code detection model M' comprises the pre-trained model M and a linear classifier K;
[0086] The architecture of the malicious code detection model M' connects a linear classifier K to the output of the pre-trained model M. This architecture is shown in Figure 6.
[0087] 2.2: To better learn the structural and semantic features of malicious code on different platforms, build a dataset D' and use it to train the malicious code detection model M'. The dataset D' includes malicious code samples and benign code samples from various platforms. In order to avoid the malicious code detection model M' being biased towards the category with mo...
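The composition in step 2.1, a linear classifier K attached to the pre-trained model M, can be sketched as follows. Everything here is an assumption made for illustration: `toy_encoder` is a hashed bag-of-tokens stand-in and is not a Transformer, the names `DetectionModel`, `predict_proba`, and `train_step` are hypothetical, and the training step updates only K, whereas the source describes fine-tuning the parameters of the detection model M' as a whole.

```python
import math
import zlib
from typing import Callable, List, Sequence

Encoder = Callable[[Sequence[str]], List[float]]


def toy_encoder(tokens: Sequence[str], dim: int = 8) -> List[float]:
    """Stand-in for the pre-trained model M: a normalized hashed bag of tokens.

    This is NOT a Transformer; it only supplies a fixed-length feature vector.
    """
    vec = [0.0] * dim
    for tok in tokens:
        vec[zlib.crc32(tok.encode("utf-8")) % dim] += 1.0
    total = sum(vec) or 1.0
    return [v / total for v in vec]


class DetectionModel:
    """M': the encoder M followed by a linear classifier K."""

    def __init__(self, encoder: Encoder, dim: int = 8) -> None:
        self.encoder = encoder
        self.weights = [0.0] * dim  # parameters of K
        self.bias = 0.0

    def predict_proba(self, tokens: Sequence[str]) -> float:
        """Return the estimated probability that the sample is malicious."""
        feats = self.encoder(tokens)
        logit = sum(w * x for w, x in zip(self.weights, feats)) + self.bias
        return 1.0 / (1.0 + math.exp(-logit))

    def train_step(self, tokens: Sequence[str], label: int, lr: float = 0.5) -> None:
        """One gradient step of logistic loss on K only (M stays fixed here)."""
        feats = self.encoder(tokens)
        error = self.predict_proba(tokens) - label
        self.weights = [w - lr * error * x for w, x in zip(self.weights, feats)]
        self.bias -= lr * error
```

In use, `DetectionModel(toy_encoder)` would be trained with `train_step(tokens, label)` over samples from D', with label 1 for malicious and 0 for benign (an assumed encoding), and queried with `predict_proba`. Replacing `toy_encoder` with a real pre-trained encoder leaves the classifier head unchanged.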