Malicious code detection method and system

A malicious code detection technology, applied in the fields of instruments, computing, and electrical digital data processing. It addresses the problems of fuzzy category boundaries, lost data features, and cluster merging that brings no obvious improvement.

Active Publication Date: 2020-05-12

GUANGZHOU UNIVERSITY

7 Cites, 10 Cited by


Problems solved by technology

[0004] However, the model proposed by Tian et al. considers only the number and frequency of Windows API calls. It ignores the context contained in the API call sequence and therefore loses some characteristics of the data.

The model proposed by Shifu Hou et al. merges all clusters that contain multiple categories of data into one large mixed cluster after clustering. Some of these clusters are mixed because the data are distributed as non-spherical clusters or have fuzzy category boundaries, so that data points of different categories are intermingled. Others are merely reduced in purity by a small number of outliers or noise points.

For the latter, the data subsets corresponding to the mixed clusters are often highly imbalanced, and merging the clusters brings no obvious improvement. The noise points generated in the process may also reduce classification accuracy.


Embodiment

[0121] This embodiment provides a malicious code detection method built on a detection model that combines feature integration with data partitioning. The model has two parts. The first part uses TF-IDF (Term Frequency-Inverse Document Frequency) and the Doc2vec algorithm to extract features from the action sequence of the malicious code. The second part builds on the first and uses an improved clustering-based ensemble classification model to classify the malicious code. As shown in Figure 1, the specific content of this embodiment is as follows:

[0122] S1: Extract and fuse malicious code family features using TF-IDF and Doc2vec

[0123] Treat the Windows API action sequence in the running process of each malicious code as a contextual text, and use TF-IDF and Doc2vec for feature extraction;

[0124] TF-IDF is a statistical method used to evaluate the importance of specific words in t...
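
The TF-IDF half of step S1 can be illustrated with a minimal sketch in plain Python. It is not the patent's implementation: the function name `tfidf`, the smoothed IDF formula, and the three sample API sequences are assumptions chosen for illustration, and the Doc2vec half is not shown.

```python
import math
from collections import Counter


def tfidf(sequences):
    """Return (vocabulary, matrix) of TF-IDF weights for API call sequences.

    Each sequence is a list of API names and is treated as one document.
    """
    vocab = sorted({api for seq in sequences for api in seq})
    n_docs = len(sequences)
    # Document frequency: number of sequences in which each API name appears.
    df = Counter(api for seq in sequences for api in set(seq))
    # Smoothed IDF (an assumption, not taken from the patent text).
    idf = {api: math.log((1 + n_docs) / (1 + df[api])) + 1.0 for api in vocab}
    matrix = []
    for seq in sequences:
        counts = Counter(seq)
        total = len(seq) or 1  # guard against an empty sequence
        # Term frequency (share of the sequence) times inverse document frequency.
        matrix.append([counts[api] / total * idf[api] for api in vocab])
    return vocab, matrix


# Hypothetical API action sequences, one per sample.
samples = [
    ["CreateFileW", "WriteFile", "CloseHandle"],
    ["CreateFileW", "ReadFile", "CloseHandle"],
    ["RegOpenKeyExW", "RegSetValueExW", "CloseHandle"],
]
vocab, matrix = tfidf(samples)
```

In this sketch, an API that appears in every sequence (here `CloseHandle`) receives a lower weight than one that appears in a single sequence (here `WriteFile`), which is the behavior that lets TF-IDF emphasize calls that distinguish one sample from the others.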


Abstract

The invention discloses a malicious code detection method and system. The method comprises the following steps: S1, treating the Windows API action sequence produced while each malicious code sample runs as a text with contextual relationships, and extracting features with TF-IDF and Doc2vec respectively; S2, after obtaining a TF-IDF feature matrix and a Doc2vec feature matrix, splicing the features extracted by TF-IDF and Doc2vec and, after dimensionality reduction, obtaining the feature matrix of the malicious code; S3, constructing an improved clustering-based ensemble classification model that classifies the data set with a plurality of base learners; and S4, in the prediction stage, inputting each sample into the nearest single-class cluster or SVM classifier in each base learner, outputting a predicted class, and finally, according to the voting principle, taking the class that holds the majority among the learners' outputs as the final predicted class. By combining TF-IDF and Doc2vec, the method considers both the API frequency in the malicious code action sequence and the contextual associations within the sequence, which improves malicious code detection accuracy.
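
The prediction stage in step S4 can be sketched as follows. This is a hedged illustration, not the patented model: the dictionary layout of a cluster, the function names `nearest_cluster` and `predict`, the Euclidean distance, and the toy centroids are all assumptions, and a plain Python callable stands in for the SVM classifier attached to a mixed cluster.

```python
import math
from collections import Counter


def nearest_cluster(sample, clusters):
    """Pick the cluster whose centroid is closest (Euclidean distance, assumed)."""
    return min(clusters, key=lambda c: math.dist(sample, c["centroid"]))


def predict(sample, base_learners):
    """Route the sample through every base learner, then take a majority vote."""
    votes = []
    for clusters in base_learners:
        cluster = nearest_cluster(sample, clusters)
        if "label" in cluster:
            # Single-class cluster: its class is the prediction.
            votes.append(cluster["label"])
        else:
            # Mixed cluster: defer to its classifier (stand-in for an SVM).
            votes.append(cluster["classifier"](sample))
    # Ties are resolved in favor of the class that was voted for first.
    return Counter(votes).most_common(1)[0][0]


# Three hypothetical base learners over 2-D feature vectors.
learner_a = [
    {"centroid": (0.0, 0.0), "label": "trojan"},
    {"centroid": (5.0, 5.0),
     "classifier": lambda x: "worm" if x[0] > x[1] else "ransomware"},
]
learner_b = [
    {"centroid": (1.0, 1.0), "label": "trojan"},
    {"centroid": (6.0, 4.0), "label": "worm"},
]
learner_c = [
    {"centroid": (0.0, 1.0), "label": "trojan"},
    {"centroid": (4.0, 6.0), "label": "ransomware"},
]
learners = [learner_a, learner_b, learner_c]

print(predict((0.5, 0.5), learners))  # prints "trojan"
print(predict((5.5, 4.5), learners))  # prints "worm"
```

For the second sample the three learners output worm, worm, and ransomware, so the majority class worm becomes the final prediction. How the base learners are trained and how their clusters are formed (step S3) is outside the scope of this sketch.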

Description

Technical field

[0001] The invention belongs to the technical field of network security, and in particular relates to a malicious code detection method and system.

Background

[0002] Malicious code detection has always been a focus of attention in the field of network security. Malicious programs such as Trojan horses, worms, mining viruses, and ransomware invade systems, tamper with files, and steal information by stealthily injecting and running malicious code, posing a huge threat to enterprises and to personal privacy and property security. As malicious code attack and defense technologies continue to confront each other and evolve, malicious code increasingly tends to be multi-variant, highly concealed, large in number, and quickly updated. At present, analysis techniques for malicious code can be divided into static analysis and dynamic analysis. Among them, the dynamic analysis technology pays attention t...


Application Information


Patent Type & Authority: Applications (China)

IPC(8): G06F21/56, G06K9/62

CPC: G06F21/563, G06F18/23213, G06F18/2411, G06F18/214

Inventor: 范美华, 李树栋, 吴晓波, 韩伟红, 杨航锋, 付潇鹏, 方滨兴, 田志宏, 殷丽华, 顾钊铨, 仇晶, 李默涵, 唐可可

Owner: GUANGZHOU UNIVERSITY
