Webshell detection method and apparatus based on deep learning and semi-supervised learning

A technique combining semi-supervised learning and deep learning, applied to webshell detection. It addresses human error in analysis results, high false positive rates, and the difficulty of building supervised learning models, thereby reducing the false negative and false positive rates and improving detection performance.

Active Publication Date: 2018-11-16
BEIJING WANGSIKEPING TECH
2 Cites, 13 Cited by

AI Technical Summary

Problems solved by technology

Because webshell operations leave no record in the system security log and webshell files are mixed in with normal web page files, it is difficult for administrators to spot traces of an intrusion.
[0003] In the field of web security detection, due to the lack of samples, it is difficult to establish an accurate supervised learning model, a



Examples


Embodiment Construction

[0039] The following examples are used to illustrate the present invention, but are not intended to limit the scope of the present invention.

[0040] Referring to Figure 1 and Figure 2, a webshell detection method based on deep learning and semi-supervised learning comprises the following steps:

[0041] S1: Obtain labeled and unlabeled samples, perform word segmentation on the labeled samples, analyze the correlation between feature words and labels with a chi-square test, and select the top K feature words with the highest correlation as the screening feature words;

[0042] S2: Use the screening feature words to filter the feature words of the unlabeled samples, and use the result as the unlabeled sample features;

[0043] S3: Train the neural network algorithm Doc2vec on the obtained unlabeled sample features to obtain a text vector for each unlabeled sample;

[0044] S4: Use an unsupervised learning method to train a one-class SVDD model on the text vector o...
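
Steps S1 and S2 can be illustrated with a minimal sketch. It is not the patent's implementation: the tokenizer, the function names (`tokenize`, `chi_square`, `select_top_k`, `filter_tokens`), the toy PHP samples, and the tie-breaking rule are all assumptions made for illustration. The chi-square score is the standard statistic for a 2x2 table of document counts (word present or absent, webshell or normal).

```python
import re
from collections import Counter


def tokenize(text):
    # Hypothetical word segmentation: lowercase identifier-like tokens.
    # The source does not say which segmenter is used.
    return re.findall(r"[A-Za-z_][A-Za-z0-9_]*", text.lower())


def chi_square(a, b, c, d):
    # 2x2 contingency table of document counts:
    #   a: webshell docs containing the word, b: normal docs containing it,
    #   c: webshell docs without it,          d: normal docs without it.
    n = a + b + c + d
    denom = (a + b) * (c + d) * (a + c) * (b + d)
    return 0.0 if denom == 0 else n * (a * d - b * c) ** 2 / denom


def select_top_k(labeled, k):
    # labeled: list of (text, label) pairs, label 1 = webshell, 0 = normal.
    pos_total = sum(1 for _, y in labeled if y == 1)
    neg_total = len(labeled) - pos_total
    pos_df, neg_df = Counter(), Counter()
    for text, y in labeled:
        for tok in set(tokenize(text)):  # document frequency, not term count
            (pos_df if y == 1 else neg_df)[tok] += 1
    scores = {}
    for tok in set(pos_df) | set(neg_df):
        a, b = pos_df[tok], neg_df[tok]
        scores[tok] = chi_square(a, b, pos_total - a, neg_total - b)
    # Highest score first; ties broken alphabetically (an assumption).
    return sorted(scores, key=lambda t: (-scores[t], t))[:k]


def filter_tokens(text, feature_words):
    # S2: keep only the tokens of a sample that are screening feature words.
    keep = set(feature_words)
    return [t for t in tokenize(text) if t in keep]


# Toy labeled samples, invented for this example.
labeled = [
    ("<?php eval($_POST['cmd']); ?>", 1),
    ("<?php system($_GET['c']); eval($x); ?>", 1),
    ("<?php echo 'hello'; ?>", 0),
    ("<?php echo date('Y'); ?>", 0),
]
features = select_top_k(labeled, 2)
unlabeled_features = filter_tokens(
    "<?php eval(base64_decode($p)); echo 1; ?>", features
)
```

In this toy set, `php` appears in every sample and therefore scores zero, while words that occur in only one class score highest. The filtered token lists are what step S3 would pass to Doc2vec.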



Abstract

The invention provides a webshell detection method and apparatus based on deep learning and semi-supervised learning. The method comprises the following steps: obtaining original training samples, performing word segmentation on the labeled samples, analyzing the correlation between feature words and labels with a chi-square test, and selecting the top K feature words with the strongest correlation as screening feature words; filtering the feature words of the unlabeled samples with the screening feature words to obtain the unlabeled sample features; training a neural network algorithm on the unlabeled sample features to obtain text vectors of the unlabeled samples; training a one-class SVDD model with an unsupervised method, minimizing the hypersphere radius while enclosing as many of the unlabeled samples as possible; for each new labeled sample, incrementally training the SVDD model with an online learning method to correct the one-class SVDD model; and applying the latest model to the prediction of new samples. The webshell detection method and apparatus provided by the invention can effectively reduce the false negative rate and the false positive rate of traditional webshell detection.
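
For reference, the trade-off between a small hypersphere and enclosing the training samples is usually written as the standard soft-margin SVDD problem. The formulation below is the textbook one, not a formula quoted from the patent; the symbols (center $a$, radius $R$, slack variables $\xi_i$, penalty $C$, feature map $\phi$) are assumptions introduced here, with $x_i$ standing for the text vector of the $i$-th unlabeled sample.

```latex
\begin{aligned}
\min_{R,\,a,\,\xi}\quad & R^{2} + C \sum_{i=1}^{n} \xi_{i} \\
\text{subject to}\quad & \lVert \phi(x_{i}) - a \rVert^{2} \le R^{2} + \xi_{i}, \qquad \xi_{i} \ge 0, \qquad i = 1, \dots, n .
\end{aligned}
```

Under this formulation a new sample $z$ falls inside the learned hypersphere when $\lVert \phi(z) - a \rVert^{2} \le R^{2}$. A larger $C$ penalizes samples left outside more heavily, which pushes the sphere to enclose more of them at the cost of a larger radius.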

Description

Technical field

[0001] The invention relates to the technical field of webshell detection, and in particular to a webshell detection method and device based on deep learning and semi-supervised learning.

Background

[0002] With the development of the Internet, Web applications based on the B/S architecture have spread rapidly, including applications in government, banks, operators, e-commerce, and major portal websites. Because the skill level of developers varies across Web systems, security is inevitably overlooked in some designs, and Web security incidents occur frequently. Common security threats include SQL injection vulnerabilities, file upload vulnerabilities, form submission vulnerabilities, and cross-site scripting attacks. After finding a vulnerability in a Web system, an intruder uploads a webshell to gain operating privileges on the web server. For intruders, Webshell is a b...


Application Information

IPC(8): H04L29/06, G06F21/57, G06N3/08
CPC: G06F21/577, G06N3/08, H04L63/1416, H04L63/1483
Inventor: 吴斌, 赵力, 朱和稳, 韩传富
Owner: BEIJING WANGSIKEPING TECH