Network traffic classification method based on semi-supervised learning and computer device

The invention applies semi-supervised learning to network traffic classification and to computer devices. It addresses two problems in prior methods, namely that labeled flows are underused and that unknown protocol clusters are not considered, both of which reduce the accuracy of protocol classification. The stated effect is more accurate classification and more accurate extraction of unknown protocols.

Status: Inactive; Publication Date: 2018-03-20
BEIJING UNIV OF POSTS & TELECOMM

AI Technical Summary

Problems solved by technology

[0004] First, the labeled data (the labeled flows) are not fully utilized. In the classic semi-supervised classification method and its later improvements, labeled flows are used only to identify clusters.
[0005] Second, in practice, unknown protocol clusters are often not considered when labeled flows are used to identify the clustering results. If a small amount of labeled data is mistakenly assigned to an unknown protocol cluster, that cluster is misidentified as one of the known protocol categories. An online classifier trained on such clustering results suffers seriously in both the classification accuracy of that protocol and the accuracy of unknown-protocol extraction, which lowers the accuracy of the online classifier overall.
[0006] No effective solution to these problems in the prior art has yet been proposed.



Detailed Description of the Embodiments

[0037] The principles and features of the present invention are described below with reference to the accompanying drawings. The examples given serve only to explain the invention and are not intended to limit its scope.

[0038] As shown in Figure 1, Embodiment 1 of the present invention provides a network traffic classification method based on semi-supervised learning. The method includes:

[0039] S1, obtaining type-labeled and unlabeled network flows, extracting a preset fixed number of flow features from each network flow, and obtaining the network flow feature vectors;

[0040] S2, calculating the information gain of each of the preset flow features from the type-labeled network flows, and weighting each flow feature according to its information gain;

[0041] S3, mixing the type-labeled and unlabeled network flows, ...
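Step S2 can be illustrated with a minimal sketch. The source does not give the exact weighting formula, so everything here is an assumption made for illustration: the feature values are taken to be already discretized, the gains are normalized to sum to 1, each feature is simply multiplied by its weight, and the function names (`entropy`, `information_gain`, `feature_weights`, `apply_weights`) are hypothetical.

```python
import math
from collections import Counter, defaultdict


def entropy(labels):
    """Shannon entropy, in bits, of a list of class labels."""
    total = len(labels)
    return -sum((n / total) * math.log2(n / total)
                for n in Counter(labels).values())


def information_gain(values, labels):
    """H(labels) - H(labels | feature) for one discretized feature."""
    groups = defaultdict(list)
    for value, label in zip(values, labels):
        groups[value].append(label)
    conditional = sum(len(g) / len(labels) * entropy(g)
                      for g in groups.values())
    return entropy(labels) - conditional


def feature_weights(labeled_vectors, labels):
    """One weight per feature, computed from the labeled flows only.

    Assumption: weights are the gains normalized to sum to 1; if every
    gain is zero, all features get the same weight.
    """
    n_features = len(labeled_vectors[0])
    gains = [information_gain([vec[j] for vec in labeled_vectors], labels)
             for j in range(n_features)]
    total = sum(gains)
    if total == 0:
        return [1.0 / n_features] * n_features
    return [g / total for g in gains]


def apply_weights(vectors, weights):
    """Scale every feature vector (labeled or unlabeled) by the weights."""
    return [[w * x for w, x in zip(weights, vec)] for vec in vectors]


# Toy data: feature 0 separates the two protocols, feature 1 does not.
labeled = [[0, 1], [0, 0], [1, 1], [1, 0]]
types = ["http", "http", "dns", "dns"]
weights = feature_weights(labeled, types)
weighted = apply_weights([[3, 5]], weights)
```

On this toy data the informative feature receives all of the weight, so a distance-based clustering run afterwards would ignore the uninformative one. Other weighting schemes built on the same gains are equally plausible.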


Abstract

The invention relates to a network traffic classification method based on semi-supervised learning. The method comprises: obtaining type-labeled and unlabeled network flows and extracting the flow features of each network flow to obtain network flow feature vectors; calculating the information gain of each flow feature from the labeled data and weighting the features accordingly; mixing and clustering the labeled and unlabeled network flows; obtaining the number of labeled network flows in each cluster and determining the proportion of each type in the cluster; when the total number of labeled network flows in a cluster is smaller than a preset threshold, judging the cluster to be an unknown protocol cluster, and otherwise assigning the cluster the type with the largest proportion among its labeled network flows; repeating these steps until the type of every flow cluster has been determined; and training an online real-time classifier using the flow clusters. The invention also relates to a computer device comprising a processor, a memory, and a computer program that is stored in the memory and can run on the processor.
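The cluster identification rule in the abstract can be sketched as follows. This is an illustration under stated assumptions, not the patented implementation: the clustering is assumed to have been done already, unlabeled flows carry the label `None`, and the names `label_clusters`, `min_labeled`, and `UNKNOWN` are hypothetical.

```python
from collections import Counter

UNKNOWN = "unknown"  # hypothetical marker for an unknown protocol cluster


def label_clusters(cluster_ids, flow_labels, min_labeled):
    """Map each cluster id to a protocol type or to UNKNOWN.

    cluster_ids[i] is the cluster of flow i; flow_labels[i] is its
    protocol type, or None if the flow is unlabeled.
    """
    per_cluster = {}
    for cid, label in zip(cluster_ids, flow_labels):
        counter = per_cluster.setdefault(cid, Counter())
        if label is not None:
            counter[label] += 1

    result = {}
    for cid, counter in per_cluster.items():
        n_labeled = sum(counter.values())
        if n_labeled == 0 or n_labeled < min_labeled:
            # Too few labeled flows: treat as an unknown protocol cluster.
            result[cid] = UNKNOWN
        else:
            # Otherwise take the type with the largest share of labeled flows.
            result[cid] = counter.most_common(1)[0][0]
    return result


cluster_ids = [0, 0, 0, 0, 1, 1, 1, 2, 2]
flow_labels = ["http", "http", "dns", None, "dns", "dns", None, "http", None]
assignment = label_clusters(cluster_ids, flow_labels, min_labeled=2)
# assignment == {0: "http", 1: "dns", 2: "unknown"}
```

Cluster 2 contains a single labeled `http` flow, which is below the threshold of 2, so it is kept as unknown rather than being absorbed into a known protocol. Ties between types are broken arbitrarily in this sketch; the source does not say how they should be handled.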

Description

Technical field

[0001] The invention belongs to the field of network traffic management, and in particular relates to a network traffic classification method based on semi-supervised learning and to a computer device.

Background

[0002] Most traditional flow-based methods use supervised or unsupervised machine learning algorithms to classify network traffic. In supervised traffic classification, a learning engine takes a set of labeled flow samples, trains against predefined protocol categories, and returns a trained classification model that can predict the protocol type of future network flows. However, with the rapid expansion of the network, many new applications are deployed on the Internet, and the unknown flows these applications generate cannot be handled by classification methods based on supervised learning. In this case, the unknown traffic will be wrongly classified into some predefined traffic class and affect the overall accur...


Application Information

Patent Type & Authority: Applications (China)
IPC(8): H04L12/851, H04L12/26, G06K9/62
CPC: H04L43/18, H04L47/2441, G06F18/23213
Inventor: 冉静, 孔晓晨, 刘元安, 胡鹤飞, 袁东明
Owner: BEIJING UNIV OF POSTS & TELECOMM