Malicious encrypted flow feature analysis method

A traffic feature analysis technology, applied at the intersection of network security and machine learning, that addresses the difficulty of effectively extracting features from malicious encrypted traffic and achieves good interpretability.

Inactive Publication Date: 2019-08-09
BEIJING UNIV OF TECH
5 Cites, 47 Cited by

AI Technical Summary

Problems solved by technology

[0009] Existing methods struggle to extract features of malicious encrypted traffic effectively and to optimize the selection of the extracted features. To address this, the present invention takes the connection quadruple as the basic unit and comprehensively analyzes encrypted traffic, especially HTTPS traffic, at four levels.



Examples


Embodiment 1

[0020] The present invention uses a traffic analysis and detection tool to clean the collected traffic data files and retain the encrypted traffic. The traffic is then labeled according to the existing traffic label files in the data set; the labels are malicious traffic, normal traffic, and background traffic. Background traffic is filtered out, and the remaining traffic data is preprocessed in units of connection quadruples: HTTPS flows that share the same source IP address, destination IP address, destination port number, and transport layer protocol, together with their corresponding context DNS flows, are added to the same connection quadruple structure.
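The grouping step described above can be sketched as follows. This is a minimal illustration rather than the patented implementation: the names `ConnKey`, `ConnRecord`, and `group_by_quadruple`, the dictionary field names, and the rule used to attach DNS flows (same client, and the DNS answer contains the quadruple's destination IP) are all assumptions, since the text does not state how context DNS flows are matched.

```python
from collections import defaultdict
from dataclasses import dataclass, field
from typing import NamedTuple


class ConnKey(NamedTuple):
    """Connection quadruple: the source port is deliberately not part of the key."""
    src_ip: str
    dst_ip: str
    dst_port: int
    proto: str


@dataclass
class ConnRecord:
    """All traffic collected for one connection quadruple."""
    https_flows: list = field(default_factory=list)
    dns_flows: list = field(default_factory=list)


def group_by_quadruple(https_flows, dns_flows):
    """Group HTTPS flows by quadruple and attach context DNS flows."""
    records = defaultdict(ConnRecord)
    for flow in https_flows:
        key = ConnKey(flow["src_ip"], flow["dst_ip"], flow["dst_port"], flow["proto"])
        records[key].https_flows.append(flow)

    # Assumed matching rule (not from the source): a DNS flow is context for a
    # quadruple when the same client received an answer equal to its dst_ip.
    for dns in dns_flows:
        for key, rec in records.items():
            if dns["src_ip"] == key.src_ip and key.dst_ip in dns["answers"]:
                rec.dns_flows.append(dns)
    return dict(records)
```

Because the source port is left out of the key, repeated connections from one client to the same service fall into a single record, which is what lets the later features aggregate over many flows.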

Embodiment 2

[0022] The invention analyzes traffic in units of connection quadruples and extracts the corresponding flow-level features, TLS handshake features, certificate features, and context features. The four feature categories are as follows:

[0023] (1) Flow-level features

[0024] The mean, maximum, and standard deviation of the flow inter-arrival time; the mean, maximum, and standard deviation of the connection duration; the mean data packet size; the mean number of data packets; the ratio of sent to received data packet sizes; the ratio of the numbers of data packets sent and received; the number of lost packets; and the ratio of connections in a normal connection state.

[0025] (2) TLS handshake features

[0026] TLS/SSL protocol version, cipher suites and extensions offered by the client during the TLS handshake phase, cipher suites and extensions sele...
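As an illustration of the flow-level features in (1), the sketch below aggregates the flows of one connection quadruple into a feature dictionary. It rests on several assumptions that the text does not settle: the per-flow field names (`start`, `duration`, `bytes_out`, `bytes_in`, `pkts_out`, `pkts_in`, `lost`, `state_ok`) are hypothetical, inter-arrival time is taken as the gap between consecutive flow start times, and the population standard deviation is used.

```python
from statistics import mean, pstdev


def flow_level_features(flows):
    """Aggregate the flows of one connection quadruple into flow-level features."""
    # Assumption: inter-arrival time = gap between consecutive flow start times.
    starts = sorted(f["start"] for f in flows)
    gaps = [b - a for a, b in zip(starts, starts[1:])] or [0.0]
    durations = [f["duration"] for f in flows]

    sent_bytes = sum(f["bytes_out"] for f in flows)
    recv_bytes = sum(f["bytes_in"] for f in flows)
    sent_pkts = sum(f["pkts_out"] for f in flows)
    recv_pkts = sum(f["pkts_in"] for f in flows)
    total_pkts = sent_pkts + recv_pkts

    return {
        "iat_mean": mean(gaps),
        "iat_max": max(gaps),
        "iat_std": pstdev(gaps),
        "dur_mean": mean(durations),
        "dur_max": max(durations),
        "dur_std": pstdev(durations),
        "pkt_size_mean": (sent_bytes + recv_bytes) / total_pkts if total_pkts else 0.0,
        "pkt_count_mean": total_pkts / len(flows),
        # Ratios fall back to 0.0 when nothing was received.
        "byte_ratio": sent_bytes / recv_bytes if recv_bytes else 0.0,
        "pkt_ratio": sent_pkts / recv_pkts if recv_pkts else 0.0,
        "lost_pkts": sum(f["lost"] for f in flows),
        "normal_state_ratio": sum(1 for f in flows if f["state_ok"]) / len(flows),
    }
```

The function expects at least one flow per quadruple; a quadruple with a single flow gets inter-arrival statistics of zero.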

Embodiment 3

[0033] The present invention applies a decision tree-based recursive feature elimination method to select features from the initial feature set described above. In each round, the method trains the model, ranks the features of the current set by importance, and removes the least important features to form a new feature set; training then repeats on the new set. The process continues until the number of remaining features reaches the threshold, which yields the optimal feature set. The optimal feature set and the sample labels form the training samples.
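The elimination loop described above can be sketched with the standard library alone. This is an illustrative toy and not the patent's implementation: it grows a small hand-written decision tree (Gini impurity, default depth 3), takes the sample-weighted impurity decrease as feature importance, and drops one feature per round. The function names and the depth value are assumptions; in practice a library implementation of recursive feature elimination with a decision tree estimator would normally be used instead.

```python
from collections import Counter


def gini(labels):
    """Gini impurity of a list of class labels."""
    n = len(labels)
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())


def best_split(X, y, features):
    """Return (gain, feature, threshold) for the best binary split, or None."""
    best, parent, n = None, gini(y), len(y)
    for f in features:
        # Exclude the largest value so the right branch is never empty.
        for t in sorted({row[f] for row in X})[:-1]:
            left = [y[i] for i in range(n) if X[i][f] <= t]
            right = [y[i] for i in range(n) if X[i][f] > t]
            gain = parent - (len(left) * gini(left) + len(right) * gini(right)) / n
            if best is None or gain > best[0]:
                best = (gain, f, t)
    return best


def tree_importances(X, y, features, depth=3, importances=None):
    """Grow a small tree and sum the sample-weighted Gini gain per feature."""
    if importances is None:
        importances = {f: 0.0 for f in features}
    if depth == 0 or len(set(y)) < 2:
        return importances
    split = best_split(X, y, features)
    if split is None or split[0] <= 0:
        return importances
    gain, f, t = split
    importances[f] += gain * len(y)
    for keep in (lambda v: v <= t, lambda v: v > t):
        idx = [i for i in range(len(y)) if keep(X[i][f])]
        tree_importances([X[i] for i in idx], [y[i] for i in idx],
                         features, depth - 1, importances)
    return importances


def rfe(X, y, n_keep):
    """Retrain, rank, and drop the least important feature until n_keep remain."""
    features = list(range(len(X[0])))
    while len(features) > n_keep:
        imp = tree_importances(X, y, features)
        features.remove(min(features, key=lambda f: imp[f]))
    return features
```

Here `n_keep` plays the role of the size threshold in the text, and the returned list holds the column indices of the retained features. Retraining after every removal matters because a feature's importance can change once a correlated feature is gone.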


Abstract

The invention discloses a malicious encrypted flow feature analysis method, which belongs to the intersecting field of network security and machine learning and is used for analyzing HTTPS flows and detecting malicious threats in them. The method comprises the following steps: modeling the flow data with the connection quadruple as the data structure; analyzing the encrypted flows at four levels in units of connection quadruples, and extracting the flow-level features, TLS handshake features, X.509 certificate features, and context features of each connection quadruple to obtain an initial feature set; and screening the initial feature set with a recursive feature elimination method based on a decision tree to obtain an optimal feature set, which is used to train a machine learning model and thereby obtain a malicious encrypted flow detection model.

Description

[0001] Technical field:

[0002] The invention belongs to the intersecting field of network security and machine learning, and relates to a method for analyzing the features of malicious encrypted traffic.

[0003] Background technique:

[0004] The global Internet is moving toward an era of comprehensive encryption. Encryption technology can guarantee communication security and user privacy, and more and more enterprise services and application software use it as the main means of ensuring information security. However, encrypted traffic also brings new challenges and threats to the security field: through encrypted channels, attackers can bypass detection systems to carry out malicious activity.

[0005] Since the machine learning algorithm using traffic metadata is less affected by whether the traffic is encrypted or not, the use of machine learning algorithms characterized by traffic metadata has become the focus of...


Application Information

Patent Type & Authority: Applications (China)
IPC(8): H04L29/06
CPC: H04L63/1425, H04L63/166, H04L63/168
Inventor: 刘静, 袁新雨, 赖英旭
Owner: BEIJING UNIV OF TECH