Vertical malicious crawler traffic identification method based on deep learning

A deep learning and traffic identification technology, applied to the identification of malicious crawler traffic.

Inactive Publication Date: 2020-07-10
GUANGDONG POLYTECHNIC NORMAL UNIV
2 Cites, 4 Cited by

AI Technical Summary

Problems solved by technology

[0007] The purpose of the present invention is to accurately identify malicious crawler traffic within website access traffic. It proposes a deep-learning-based malicious crawler traffic identification method in which features are learned automatically during model building.

Method used



Examples


Embodiment Construction

[0021] Specific implementation

[0022] The present invention is described in detail below with reference to the accompanying drawings:

[0023] As shown in Figure 1, the overall process consists mainly of the following four steps:

[0024] Step 1: Build a training data set;

[0025] Step 2: Train the model with a three-dimensional convolutional neural network;

[0026] Step 3: Tune the model to obtain the optimal recognition model;

[0027] Step 4: Test the model on data and complete traffic identification.
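The core operation of the three-dimensional convolutional neural network named in Step 2 can be illustrated with a minimal sketch. It is not the patented network: the source does not give the layer configuration, so the function names `conv3d_valid` and `relu3d`, the single-channel layout, and the stride of 1 with no padding are all assumptions chosen for illustration.

```python
from typing import List

# A volume is indexed as [depth][height][width].
Volume = List[List[List[float]]]


def conv3d_valid(volume: Volume, kernel: Volume) -> Volume:
    """Single-channel 3D cross-correlation with stride 1 and no padding.

    This is the operation deep learning frameworks call a 3D convolution.
    The hypothetical kernel here stands in for learned weights.
    """
    d, h, w = len(volume), len(volume[0]), len(volume[0][0])
    kd, kh, kw = len(kernel), len(kernel[0]), len(kernel[0][0])
    out: Volume = []
    for z in range(d - kd + 1):
        plane = []
        for y in range(h - kh + 1):
            row = []
            for x in range(w - kw + 1):
                # Weighted sum over the kd x kh x kw window anchored at (z, y, x).
                acc = 0.0
                for dz in range(kd):
                    for dy in range(kh):
                        for dx in range(kw):
                            acc += volume[z + dz][y + dy][x + dx] * kernel[dz][dy][dx]
                row.append(acc)
            plane.append(row)
        out.append(plane)
    return out


def relu3d(volume: Volume) -> Volume:
    """Element-wise ReLU activation, a common choice after a convolution."""
    return [[[max(0.0, v) for v in row] for row in plane] for plane in volume]


if __name__ == "__main__":
    # A 3 x 4 x 4 input of ones and a 2 x 2 x 2 kernel of ones.
    ones = [[[1.0] * 4 for _ in range(4)] for _ in range(3)]
    kernel = [[[1.0] * 2 for _ in range(2)] for _ in range(2)]
    features = relu3d(conv3d_valid(ones, kernel))
    print(len(features), len(features[0]), len(features[0][0]))  # prints "2 3 3"
```

In practice a framework layer with many channels and learned kernels would replace these loops. The sketch only shows how one kernel slides across the depth, height, and width of an encoded traffic volume.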

[0028] Step 1 is implemented as follows:

[0029] (1) Set up a target machine in the experimental network, and deploy on it a target website that contains a certain amount of information and has no defense measures;

[0030] (2) To speed up sample collection, the target website is made fully static. To ensure sufficient sample data and collection efficiency, the crawler program is deployed on high-performance collection nodes and general perfo...


Abstract

The invention discloses a deep-learning-based method for identifying vertical malicious crawler traffic, belonging to the technical field of the Internet. Deep learning is used to identify the traffic of website access behaviors: traffic exhibiting malicious crawler behavior is identified by classifying access behavior features. The method comprises three parts: encoding website access traffic into a three-dimensional vector, performing recognition training with a three-dimensional convolutional neural network, and finally establishing a classification model for malicious crawler traffic by optimizing the network parameters. Classifying and identifying website access traffic through deep learning improves the accuracy of malicious crawler traffic identification, and the website can deploy a corresponding security policy based on the identification result to improve website performance and reduce redundant load.
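The first part of the method, encoding website access traffic into a three-dimensional vector, might look like the sketch below. The source does not state the encoding scheme, so everything here is an assumption for illustration: each request's raw bytes are truncated or zero-padded to fill a height × width grid of values scaled to [0, 1], and a fixed number of consecutive requests are stacked along the depth axis. The names `encode_request` and `encode_session` and the default sizes are hypothetical.

```python
from typing import List, Sequence

Plane = List[List[float]]
Volume = List[Plane]  # indexed as [depth][height][width]


def encode_request(raw: bytes, height: int = 8, width: int = 8) -> Plane:
    """Map one request's bytes onto a height x width grid (assumed layout).

    Bytes beyond height * width are dropped; short requests are zero-padded.
    """
    size = height * width
    padded = raw[:size].ljust(size, b"\x00")
    return [
        [padded[r * width + c] / 255.0 for c in range(width)]
        for r in range(height)
    ]


def encode_session(
    requests: Sequence[bytes], depth: int = 4, height: int = 8, width: int = 8
) -> Volume:
    """Stack up to `depth` encoded requests into one 3D volume.

    Sessions with fewer than `depth` requests are padded with all-zero planes
    so that every sample has the same shape.
    """
    planes = [encode_request(r, height, width) for r in requests[:depth]]
    while len(planes) < depth:
        planes.append([[0.0] * width for _ in range(height)])
    return planes


if __name__ == "__main__":
    session = [b"GET /index.html HTTP/1.1", b"GET /page/2 HTTP/1.1"]
    volume = encode_session(session)
    print(len(volume), len(volume[0]), len(volume[0][0]))  # prints "4 8 8"
```

A fixed output shape is the main design point: a three-dimensional convolutional neural network expects every training sample to have the same depth, height, and width, whatever the length of the underlying traffic.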

Description

Technical field

[0001] The invention belongs to the technical field of computer network security, and in particular relates to a method for identifying malicious crawler traffic based on deep learning.

[0002] Technical background

[0003] Crawlers are one of the most widely used technologies on the Internet today. They are used in many fields such as finance, trade, and information technology. The preliminary research and data collection for many tasks are carried out by crawler programs, and once the crawled content has been cleaned and processed, the resulting data is extremely valuable.

[0004] It is worth noting that, in order to obtain the largest amount of data in the shortest time, some crawlers use multi-threading, high concurrency, and even distributed technologies, which greatly increases the load on the server. We classify the traffic generated by such crawlers as malicious crawler traffic. This kind of traffic puts enormous pressure on the server. In or...

Claims


Application Information

IPC(8): H04L29/06, H04L29/08, G06N3/04, G06N3/08, G06K9/62
CPC: H04L63/1425, H04L67/02, G06N3/08, G06N3/048, G06N3/045, G06F18/24
Inventors: 刘兰, 刘浪洲, 王鹏铖
Owner: GUANGDONG POLYTECHNIC NORMAL UNIV