Network traffic classification method and system based on federated semi-supervised learning
A technology combining semi-supervised learning and network traffic analysis, applied in the field of network traffic classification based on federated semi-supervised learning. It addresses problems such as data islands, the high labor and time costs of labeling, and overfitted models that do not generalize.
Embodiment
[0040] As shown in Figure 1, this embodiment provides a network traffic classification method based on federated semi-supervised learning. The method includes:
[0041] S1. K clients obtain local unlabeled network data, extract the time-related features of each network flow sample using the enhanced sampling method, and form a time-series-based unlabeled network dataset D_u; the federated server holds a small time-series-based labeled network dataset D_s;
[0042] Specifically, the unlabeled network dataset D_u is composed of the local unlabeled network flows of the K clients, where the k-th client holds N training instances x_i, and the local datasets of all clients follow the same data distribution. The labeled network dataset D_s is composed of N labeled data streams, where x_i is a training sample and y_i is the label corresponding to x_i;
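One possible in-memory layout for the two datasets is sketched below in Python. It is only an illustration: the names ClientData, LabeledData and build_unlabeled_dataset are hypothetical and do not come from the patent, and each sample is assumed to be a flat sequence of numeric features.

```python
from dataclasses import dataclass
from typing import List, Sequence

# Assumption: one sample x_i is a flat sequence of numeric features.
Sample = Sequence[float]


@dataclass
class ClientData:
    """Unlabeled local flows held by one client (one part of D_u)."""
    client_id: int
    samples: List[Sample]


@dataclass
class LabeledData:
    """Small labeled dataset D_s held by the server: pairs (x_i, y_i)."""
    samples: List[Sample]
    labels: List[int]

    def __post_init__(self) -> None:
        # Every training sample needs exactly one label.
        if len(self.samples) != len(self.labels):
            raise ValueError("samples and labels must have the same length")


def build_unlabeled_dataset(per_client: Sequence[Sequence[Sample]]) -> List[ClientData]:
    """Group per-client sample lists into D_u, indexing clients from 0."""
    return [ClientData(k, list(s)) for k, s in enumerate(per_client)]
```

Keeping D_u partitioned by client mirrors the federated setting, in which raw flows stay on the client that collected them.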
[0043] The enhanced sampling...
Embodiment 2
[0072] As shown in Figure 5, based on the same inventive concept, the present invention also provides a network traffic classification system based on federated semi-supervised learning, including a data preprocessing module, a client pre-training module, and a server retraining module.
[0073] The data preprocessing module is used by the K clients to obtain local unlabeled network data. The enhanced sampling method is applied to the clients' local unlabeled network flows and to the small number of labeled network flows on the server, yielding a large unlabeled dataset D_u and a small labeled dataset D_s. The enhanced sampling method has three important parameters (l, α, β): data packets within a network flow are sampled at a spacing of l, and after every α samples l is multiplied by β, so that l gradually increases. In addition, the head of each network flow is sampled several times (for example, 100 times), and the timing characteristics (arrival time and lengt...
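The index selection described above could be sketched in Python as follows. This is a minimal sketch under stated assumptions, not the patented procedure: the function name enhanced_sample_indices is hypothetical, "sampling the head" is read as taking the first `head` packets of the flow, interval sampling is assumed to start right after the head, and β is assumed to be at least 1.

```python
from typing import List


def enhanced_sample_indices(n_packets: int, l: int, alpha: int,
                            beta: float, head: int = 100) -> List[int]:
    """Pick packet indices from one flow of n_packets packets.

    Assumptions (not stated in the source): the first `head` packets are
    always kept, and interval sampling begins at the packet after the head.
    """
    if l < 1 or alpha < 1 or beta < 1 or head < 0:
        raise ValueError("need l >= 1, alpha >= 1, beta >= 1, head >= 0")

    # Head of the flow: keep every packet.
    indices = list(range(min(head, n_packets)))

    spacing = float(l)   # current spacing between sampled packets
    pos = float(head)    # position of the next candidate packet
    taken = 0            # number of interval samples taken so far
    while int(pos) < n_packets:
        indices.append(int(pos))
        taken += 1
        if taken % alpha == 0:
            spacing *= beta  # after every alpha samples, widen the spacing
        pos += spacing
    return indices
```

For example, under these assumptions `enhanced_sample_indices(20, 2, 2, 2.0, head=3)` returns `[0, 1, 2, 3, 5, 9, 13]`: three head packets, then two samples at spacing 2, two at spacing 4, after which the spacing of 8 runs past the end of the flow. The timing features would then be read from the packets at the returned indices.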


