Malicious code data imbalance processing method based on swarm intelligence algorithm and cGAN

A method combining a swarm intelligence algorithm with a cGAN, applied as a data balancing strategy for malicious code classification. It addresses imbalanced sample data sets and the poor model performance they cause, and its training is simple and fast.

Pending Publication Date: 2021-05-14
BEIJING UNIV OF TECH

AI Technical Summary

Problems solved by technology

[0010] To address the poor performance of trained models caused by imbalanced sample data sets in malicious code detection, the present invention proposes a new method for handling imbalanced data. First, a cGAN (Conditional Generative Adversarial Network) is used to augment the sample data of each family. Then, because swarm intelligence algorithms are well suited to combinatorial optimization problems, a typical swarm intelligence algorithm, Particle Swarm Optimization (PSO), is selected to calculate the sample ratio of each malicious code family, and data augmentation is performed according to that ratio. Finally, the original data set and the data generated according to the ratio are combined to construct a malicious code data set with relatively balanced samples.
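As an illustration of the final step, the sketch below turns a per-family sample ratio into the number of synthetic samples to generate for each family. It is a minimal sketch, not the patented procedure: the function name `synthetic_counts`, the family names, and the rule of sizing the final data set so that no original sample is discarded are all assumptions made for this example.

```python
def synthetic_counts(original, ratio):
    """Return how many synthetic samples to generate per family.

    original: dict mapping family name -> number of real samples.
    ratio:    dict mapping family name -> desired relative share
              (for example, a ratio produced by an optimizer).
    Assumption: the final set keeps every real sample and is just
    large enough for every family to reach its share.
    """
    total_ratio = sum(ratio.values())
    shares = {fam: r / total_ratio for fam, r in ratio.items()}
    # Smallest final size at which no family exceeds its target share.
    target_total = max(original[fam] / shares[fam] for fam in original)
    return {
        fam: max(0, round(target_total * shares[fam]) - original[fam])
        for fam in original
    }


# Hypothetical family sizes and target ratio, for illustration only.
counts = synthetic_counts(
    original={"family_a": 1000, "family_b": 100, "family_c": 50},
    ratio={"family_a": 2, "family_b": 1, "family_c": 1},
)
print(counts)
```

The largest family receives no synthetic samples, and the smaller families are topped up until the combined data set matches the requested ratio.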



Examples


Detailed Description of the Embodiments

[0040] The present invention is explained in detail below with reference to the accompanying drawings.

[0041] To make the object, technical solution, and features of the present invention clearer, the invention is described in further detail below through specific implementation examples and with reference to the accompanying drawings.

[0042] The flow for constructing the balanced malicious code data set is shown in Figure 1 and includes the following steps:

[0043] Step S10, constructing a malicious code generation model;

[0044] Step S20, using the swarm intelligence algorithm to calculate the acceptable optimal initial sample ratio of the malicious code;

[0045] Step S30, generating malicious code samples for each family, and constructing a relatively balanced malicious code data set.
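Step S20 can be illustrated with a generic particle swarm optimizer that searches over ratio vectors. The following is a minimal sketch under stated assumptions, not the implementation described in the patent: the fitness function `toy_fitness` is a hypothetical stand-in (in practice it might be a validation score of a classifier trained on data balanced at the candidate ratio), and the `normalize` step that keeps each particle a valid ratio vector is a choice made for this example. The inertia weight `w` and acceleration coefficients `c1` and `c2` follow the standard PSO velocity update.

```python
import random


def normalize(vec):
    """Clip to a small positive floor and rescale so entries sum to 1."""
    clipped = [max(v, 1e-6) for v in vec]
    total = sum(clipped)
    return [v / total for v in clipped]


def pso_ratio(fitness, n_families, n_particles=20, n_iters=50,
              w=0.7, c1=1.5, c2=1.5, seed=0):
    """Minimize `fitness` over ratio vectors with a basic PSO loop."""
    rng = random.Random(seed)
    pos = [normalize([rng.random() for _ in range(n_families)])
           for _ in range(n_particles)]
    vel = [[0.0] * n_families for _ in range(n_particles)]
    pbest = [p[:] for p in pos]
    pbest_score = [fitness(p) for p in pos]
    g = min(range(n_particles), key=lambda i: pbest_score[i])
    gbest, gbest_score = pbest[g][:], pbest_score[g]

    for _ in range(n_iters):
        for i in range(n_particles):
            for d in range(n_families):
                r1, r2 = rng.random(), rng.random()
                # Standard update: inertia + cognitive pull + social pull.
                vel[i][d] = (w * vel[i][d]
                             + c1 * r1 * (pbest[i][d] - pos[i][d])
                             + c2 * r2 * (gbest[d] - pos[i][d]))
            # Assumption: re-project onto valid ratios after each move.
            pos[i] = normalize([p + v for p, v in zip(pos[i], vel[i])])
            score = fitness(pos[i])
            if score < pbest_score[i]:
                pbest[i], pbest_score[i] = pos[i][:], score
                if score < gbest_score:
                    gbest, gbest_score = pos[i][:], score
    return gbest, gbest_score


def toy_fitness(ratio):
    """Hypothetical objective: squared distance to a made-up ratio."""
    target = [0.5, 0.3, 0.2]
    return sum((r - t) ** 2 for r, t in zip(ratio, target))


best_ratio, best_score = pso_ratio(toy_fitness, n_families=3)
print(best_ratio, best_score)
```

The returned vector always sums to 1, so it can be read directly as a per-family sample ratio for the generation step.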

[0046] In this embodiment, step S10 of constructing the malicious code generation model further includes the following steps: ...


Abstract

The invention discloses a method for processing imbalanced malicious code data based on a swarm intelligence algorithm and a cGAN. The method constructs a malicious code generation model, uses a swarm intelligence algorithm to calculate an acceptable optimal initial sample ratio for the malicious code, and then generates malicious code samples for each family to construct a relatively balanced malicious code data set. The swarm intelligence algorithm yields an acceptable optimal sample ratio for each malicious code family, and the cGAN learns the data distribution of the different families and generates samples. The imbalanced data set is thereby processed into a malicious code data set with relatively balanced samples, which optimizes the data set. The different types of malicious code reach an ideal ratio during selection, positive and negative samples carry equal weight during training, and the data imbalance problem is solved more effectively.
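As general background on how a cGAN produces samples for a chosen family, the generator input is typically a noise vector combined with an encoding of the class label. The sketch below shows only that conditioning step, assuming a one-hot label concatenated to Gaussian noise; the function name `generator_input`, the dimensions, and the family index are hypothetical, and the source does not state which conditioning scheme or network architecture is used.

```python
import random


def generator_input(family_index, n_families, noise_dim=8, rng=None):
    """Build a conditional generator input: noise followed by a one-hot label."""
    rng = rng or random.Random()
    noise = [rng.gauss(0.0, 1.0) for _ in range(noise_dim)]
    one_hot = [1.0 if i == family_index else 0.0 for i in range(n_families)]
    return noise + one_hot


# Request one input vector for the (hypothetical) family with index 2 of 5.
z = generator_input(family_index=2, n_families=5, noise_dim=8,
                    rng=random.Random(0))
print(len(z), z[8:])
```

Changing `family_index` while sampling fresh noise is what lets a single trained generator be asked for samples of a specific family.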

Description

Technical field

[0001] The invention belongs to the field of information security, and in particular relates to a method for processing imbalanced malicious code data based on a swarm intelligence algorithm and cGAN. It is a data balancing strategy for malicious code classification problems.

Background

[0002] With the rapid development of information technology, the Internet has become an important part of daily life. It brings many benefits to life, study, and work, but it also harbors many security threats such as Trojan horses, phishing websites, and malware, among which malicious code is one of the main threats. Driven by economic interests, the number of new malware samples is exploding, and anti-malware vendors face millions of potentially malicious code samples every year. To keep pace with the growth in malicious code samples, research needs to rely on large, high-quality samples to build efficient ma...


Application Information

IPC(8): G06F21/56, G06K9/62
CPC: G06F21/563, G06F18/24, G06F18/214
Inventor: 梁军淼, 宁振虎, 曹东芝, 公备
Owner BEIJING UNIV OF TECH