
Network traffic classification method and system based on federated semi-supervised learning

A semi-supervised learning and network traffic technology, applied to network traffic classification based on federated semi-supervised learning. It addresses problems such as data islands, the huge labor and time costs of labeling, and overfitted models that do not generalize, and has the effect of solving the data island problem.

Pending Publication Date: 2021-11-26
GUANGZHOU UNIVERSITY

AI Technical Summary

Problems solved by technology

Its accuracy is very high, but disadvantages remain, such as high computational complexity and an inability to handle encrypted traffic.
Users usually do not want this information to be disclosed. However, without enough user information, the application of deep learning to network traffic classification is seriously affected, and a usable model may not be trainable at all.
[0007] Second, the data island problem
A model overfitted to such specific training data does not generalize, and its classification accuracy drops sharply in practical applications.
[0009] Third, the problem of scarcity of labeled data
In reality, most collected user data is unlabeled, and because knowledge in the computer network field is complex, a large number of professionals are required to label traffic data, which consumes huge labor and time costs.



Examples


Embodiment

[0040] As shown in Figure 1, this embodiment provides a network traffic classification method based on federated semi-supervised learning. The method includes:

[0041] S1. K clients obtain local unlabeled network data, extract the time-related features of each network flow sample using the enhanced sampling method, and form a time-series-based unlabeled network dataset D_u; the federated server holds a small time-series-based labeled network dataset D_s;

[0042] Specifically, the unlabeled network dataset D_u is composed of the local unlabeled network flows of the K clients, where the k-th client has N training samples x_i, and the data distribution of each client's local dataset is the same. The labeled network dataset D_s is composed of N labeled data streams, where x_i is a training sample and y_i is the label corresponding to x_i;

[0043] The enhanced sampling...

Embodiment 2

[0072] As shown in Figure 5, based on the same inventive concept, the present invention also provides a network traffic classification system based on federated semi-supervised learning, including a data preprocessing module, a client pre-training module, and a server retraining module.

[0073] The data preprocessing module is used by the K clients to obtain local unlabeled network data. The enhanced sampling method is applied to the clients' local unlabeled network flows and to the small amount of labeled network flows in the server, yielding a large unlabeled dataset D_u and a small labeled dataset D_s. The enhanced sampling method has three important parameters (l, α, β): data packets are sampled from a network flow at a spacing of l, and after every α samples l is multiplied by β, so that l gradually increases. In addition, the head of each network flow is sampled several times (for example, 100 times), and the timing characteristics (arrival time and lengt...
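The sampling rule described above can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the patented implementation: the function names, the (arrival time, length) packet tuple, and the reading that the head packets are taken consecutively before spaced sampling begins are all hypothetical, since the source text is truncated at this point.

```python
from typing import List, Sequence, Tuple

# Hypothetical packet representation: (arrival_time_seconds, length_bytes).
Packet = Tuple[float, int]


def enhanced_sample_indices(n_packets: int, l: int, alpha: int, beta: int,
                            head: int = 100) -> List[int]:
    """Return the packet indices picked from a flow of n_packets packets.

    Assumption: the first `head` packets are all kept, then packets are
    picked at spacing l, and l is multiplied by beta after every alpha picks.
    """
    if l < 1 or alpha < 1 or beta < 1 or head < 0:
        raise ValueError("l, alpha, beta must be >= 1 and head must be >= 0")
    picked = list(range(min(head, n_packets)))  # dense sampling of the flow head
    pos, spacing, taken = head, l, 0
    while pos < n_packets:
        picked.append(pos)
        taken += 1
        if taken % alpha == 0:
            spacing *= beta  # the spacing l grows after every alpha picks
        pos += spacing
    return picked


def sample_flow(packets: Sequence[Packet], l: int, alpha: int, beta: int,
                head: int = 100) -> List[Packet]:
    """Return the timing features (arrival time, length) of the picked packets."""
    indices = enhanced_sample_indices(len(packets), l, alpha, beta, head)
    return [packets[i] for i in indices]
```

For example, with l = 1, α = 3, β = 2 and a head of 2 packets, a 40-packet flow yields the indices 0, 1, 2, 3, 4, 6, 8, 10, 14, 18, 22, 30, 38: the gaps widen from 1 to 2 to 4 to 8 as l is repeatedly multiplied by β.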



Abstract

The invention relates to the field of network traffic classification, in particular to a network traffic classification method and system based on federated semi-supervised learning. The system comprises: a data preprocessing module, which applies an enhanced sampling method to the clients' local unlabeled network flows and the labeled network flows in a server to obtain an unlabeled dataset and a labeled dataset; a client pre-training module, in which the clients perform local unsupervised training, learning the features of the local data on each client through an auto-encoder model, with the learned features then used to train a classifier; and a server retraining module, in which the server performs supervised training, retraining the pre-trained model with the labeled data on the federated server to obtain a universal classifier that the clients use to classify network traffic. While protecting user data privacy, the invention helps multiple parties jointly learn an accurate and universal network traffic classification model without disclosing or sharing local user datasets.
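The two training stages in the abstract can be illustrated with a small NumPy sketch. Several pieces here are assumptions rather than details from the patent: the auto-encoder is reduced to a linear one, the clients' encoders are combined by simple averaging (a FedAvg-style step that the abstract does not specify), the server trains only a binary logistic-regression head on the frozen encoder instead of retraining the whole model, and every function name is hypothetical.

```python
import numpy as np


def train_autoencoder(x, hidden, epochs=50, lr=0.01, seed=0):
    """Client side: fit a linear auto-encoder on unlabeled data, return the encoder."""
    rng = np.random.default_rng(seed)  # same seed gives every client the same start
    n, d = x.shape
    w_enc = rng.normal(scale=0.1, size=(d, hidden))
    w_dec = rng.normal(scale=0.1, size=(hidden, d))
    for _ in range(epochs):
        z = x @ w_enc                      # encoded features
        err = z @ w_dec - x                # reconstruction error
        grad_dec = z.T @ err / n
        grad_enc = x.T @ (err @ w_dec.T) / n
        w_enc -= lr * grad_enc
        w_dec -= lr * grad_dec
    return w_enc


def federated_pretrain(client_data, hidden):
    """Assumed aggregation: average the encoders; raw client data is never shared."""
    encoders = [train_autoencoder(x, hidden) for x in client_data]
    return np.mean(encoders, axis=0)


def retrain_classifier(w_enc, x_s, y_s, epochs=200, lr=0.1):
    """Server side: supervised training of a logistic head on labeled data."""
    z = x_s @ w_enc
    w, b = np.zeros(z.shape[1]), 0.0
    for _ in range(epochs):
        p = 1.0 / (1.0 + np.exp(-(z @ w + b)))  # predicted probability of class 1
        grad = p - y_s
        w -= lr * z.T @ grad / len(y_s)
        b -= lr * grad.mean()
    return w, b


def predict(w_enc, w, b, x):
    """Classify flows with the shared encoder and the server-trained head."""
    return ((x @ w_enc) @ w + b > 0).astype(int)
```

In this sketch only the encoder weights leave each client, which mirrors the abstract's point that local user datasets are not disclosed or shared. A real system would use a nonlinear auto-encoder and a multi-class classifier.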

Description

Technical field

[0001] The invention relates to the field of network traffic classification, in particular to a network traffic classification method and system based on federated semi-supervised learning.

Background technique

[0002] The goal of the network traffic classification task is to classify Internet traffic into predefined categories, such as normal or abnormal traffic, application type, or application name. Network traffic classification plays an important role in both network management and network security. Its main applications are as follows: 1. Network monitoring and management, traffic accounting, user behavior analysis, etc.; 2. Use in intrusion detection systems and firewalls to identify and block malicious traffic in a timely manner; 3. Determining the proportion of various network applications, predicting the development trend of network services, and planning the network reasonably.

[0003] In the early days o...


Application Information

IPC(8): G06K9/62, G06N3/04, G06N3/08
CPC: G06N3/08, G06N3/045, G06F18/241, G06F18/214, Y02D30/50
Inventor: 王宇, 彭瑶, 何美蓉, 崔田莹
Owner: GUANGZHOU UNIVERSITY