Encrypted traffic identification method based on ensemble learning

An ensemble-learning traffic identification technique, applied in character and pattern recognition, digital transmission systems, instruments, etc. It addresses problems such as imbalanced numbers of samples per class, uneven data-flow distribution, and underfitting.

Active Publication Date: 2020-07-07
NANJING UNIV OF INFORMATION SCI & TECH
2 Cites, 11 Cited by

AI Technical Summary

Problems solved by technology

However, the distribution of data streams across encrypted applications in real networks is very uneven. For example, audio and video streams carried over encrypted protocols are much larger in volume than instant-messaging and plain web encrypted streams, and encryption protocols such as SSH and IPsec carry far fewer data streams than HTTPS.
Class imbalance in network application traffic refers to unequal numbers of samples per class in the data set. After training on such data, classification algorithms may ignore the flow samples of minority classes, resulting in underfitting, or may attend too closely to the particularities of minority classes, resulting in overfitting.
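The imbalance described above can be quantified before any training. The following is a minimal sketch, assuming flows have already been labeled by application; the function name `class_distribution` and the sample labels are hypothetical and are not taken from the patent.

```python
from collections import Counter

def class_distribution(labels):
    """Return per-class counts and the majority/minority imbalance ratio."""
    counts = Counter(labels)
    largest = max(counts.values())
    smallest = min(counts.values())
    return counts, largest / smallest

# Hypothetical flow labels, for illustration only (not data from the patent).
labels = ["https"] * 90 + ["ssh"] * 6 + ["ipsec"] * 4
counts, ratio = class_distribution(labels)
print(dict(counts), ratio)
```

A ratio far above 1 indicates that minority classes such as SSH or IPsec are likely to be under-represented during training.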

Embodiment Construction

[0019] The technical solution of the present invention will be further described below in conjunction with the accompanying drawings.

[0020] The present invention provides a method for identifying encrypted traffic based on ensemble learning. To address class imbalance in the sample data set, the difficulty of feature extraction, and feature redundancy, the original data set is balanced with the SMOTE algorithm, the packet payload is extracted, a stacked autoencoder model extracts features automatically, and the features are finally fed into an ensemble-learning-based classifier for classification and evaluation.
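To make the balancing step concrete, here is a minimal sketch of the core SMOTE idea: each synthetic sample is an interpolation between a minority-class sample and one of its nearest minority-class neighbours. It is a simplified illustration under stated assumptions, not the patent's implementation; the function name `smote`, the parameters `n_new`, `k`, and `seed`, and the toy feature vectors are all hypothetical.

```python
import math
import random

def smote(minority, n_new, k=3, seed=0):
    """Generate n_new synthetic samples by interpolating between minority-class
    samples and their k nearest minority-class neighbours."""
    if len(minority) < 2:
        raise ValueError("need at least two minority samples")
    rng = random.Random(seed)
    k = min(k, len(minority) - 1)
    synthetic = []
    for _ in range(n_new):
        i = rng.randrange(len(minority))
        base = minority[i]
        # Rank the other minority samples by Euclidean distance to the base sample.
        others = [x for j, x in enumerate(minority) if j != i]
        others.sort(key=lambda x: math.dist(base, x))
        neighbour = rng.choice(others[:k])
        gap = rng.random()  # interpolation factor in [0, 1)
        synthetic.append([b + gap * (n - b) for b, n in zip(base, neighbour)])
    return synthetic

# Toy minority-class feature vectors (assumed, for illustration only).
minority = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
new_samples = smote(minority, n_new=6)
print(len(new_samples))
```

Because every synthetic point lies on a segment between two real minority samples, the new samples stay inside the region already occupied by that class rather than being exact duplicates.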

[0021] As shown in Figure 1, the encrypted traffic identification method based on ensemble learning comprises at least the following steps: data set collection, data preprocessing, data set balancing, automatic feature extraction, traffic identification, and analysis of the resulting metrics.
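The traffic identification step relies on an ensemble of classifiers. This excerpt does not state which combination scheme is used, so the sketch below assumes simple hard (majority) voting purely for illustration; the function name `majority_vote` and the example predictions are hypothetical.

```python
from collections import Counter

def majority_vote(predictions):
    """Combine per-classifier label lists into one label per flow by hard voting.

    predictions: list of lists, one inner list of labels per base classifier,
    all of the same length (one label per flow).
    """
    combined = []
    for votes in zip(*predictions):
        # Pick the most frequent label; ties resolve to the label seen first.
        combined.append(Counter(votes).most_common(1)[0][0])
    return combined

# Hypothetical outputs of three base classifiers for three flows.
predictions = [
    ["https", "ssh", "ipsec"],
    ["https", "https", "ipsec"],
    ["ssh", "ssh", "ipsec"],
]
final_labels = majority_vote(predictions)
print(final_labels)
```

Other ensemble schemes (for example weighted voting or stacking) follow the same pattern of combining several base predictions into one decision per flow.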

[0022] Data set collection is to use Wireshark to capt...

Abstract

The invention discloses an encrypted traffic identification method based on ensemble learning. The method comprises the following steps: (1) collecting a data set; (2) preprocessing the data; (3) balancing the data set; (4) automatically extracting features; (5) identifying the traffic; and (6) analyzing the resulting metrics, selecting suitable parameters, and optimizing the algorithm. The method addresses the model underfitting or overfitting caused by imbalanced sample classes, achieves a high recognition rate and a low false alarm rate, and is suited to encrypted traffic identification on data sets with class imbalance and difficult feature extraction.
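The abstract names two metrics, recognition rate and false alarm rate, without defining them in this excerpt. The sketch below assumes the common per-class definitions: recognition rate as TP / (TP + FN) and false alarm rate as FP / (FP + TN). The function name `rates` and the example labels are hypothetical.

```python
def rates(y_true, y_pred, target):
    """Return (recognition rate, false alarm rate) for one target class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == target and p == target for t, p in pairs)
    fn = sum(t == target and p != target for t, p in pairs)
    fp = sum(t != target and p == target for t, p in pairs)
    tn = sum(t != target and p != target for t, p in pairs)
    recognition = tp / (tp + fn) if (tp + fn) else 0.0
    false_alarm = fp / (fp + tn) if (fp + tn) else 0.0
    return recognition, false_alarm

# Hypothetical ground truth and predictions for five flows.
y_true = ["ssh", "ssh", "https", "https", "https"]
y_pred = ["ssh", "https", "https", "ssh", "https"]
recognition, false_alarm = rates(y_true, y_pred, target="ssh")
print(recognition, false_alarm)
```

Computing both rates per class, rather than a single overall accuracy, keeps poor performance on minority classes from being hidden by the majority class.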

Description

Technical field

[0001] The invention relates to a method for identifying encrypted traffic based on ensemble learning.

Background

[0002] Traffic classification and identification are the basis for improving network management, security monitoring, and service quality, and are also a prerequisite for network activities such as network design and planning. With the rapid development of network technology, more and more network applications use encryption protocols to ensure the safe transmission of information, and encrypted traffic accounts for a growing share of actual network traffic. However, because encrypted traffic conceals its contents, it often becomes a carrier of network attacks. In recent years, network security incidents have intensified. The reason for this is that network security issues have not received enough attention. Network attacks often use encrypted network traffic as a carrier to continuously attack the system netwo...

Application Information

Patent Type & Authority: Applications (China)
IPC(8): H04L12/24, H04L12/851, H04L29/06, G06K9/62
CPC: H04L63/0428, H04L63/0227, H04L63/1408, H04L41/145, H04L47/2441, H04L47/2483, G06F18/2193, G06F18/24143, G06F18/24323, Y02D30/50
Inventor: 翟江涛, 崔永富, 林鹏, 吉小鹏, 石怀峰, 张艳艳, 付章杰
Owner: NANJING UNIV OF INFORMATION SCI & TECH