Traffic classification method and system based on federated semi-supervised learning
In this federated semi-supervised learning method, the clients and the central server train collaboratively: the model parameters are decomposed so that clients train on unlabeled data and the server trains on labeled data. This approach addresses the issues of data silos and high labeling costs in network traffic classification, achieving high-accuracy and low-cost traffic classification.
Patent Information
- Authority / Receiving Office
- CN · China
- Patent Type
- Patents(China)
- Current Assignee / Owner
- PLA STRATEGIC SUPPORT FORCE INFORMATION ENG UNIV PLA SSF IEU
- Filing Date
- 2022-09-15
- Publication Date
- 2026-06-12
Figure CN115563532B_ABST
Abstract
Description
Technical Field
[0001] This invention relates to the field of network security technology, and in particular to a traffic classification method and system based on federated semi-supervised learning.
Background Technology
[0002] Traffic classification is a crucial task in network management and security, playing an indispensable role in network control, planning, intrusion detection, and traffic trend analysis. With the explosive growth of traffic, traffic classification, as a core management component, requires more efficient and lower-cost models. However, traffic classification methods based on port identification, payload matching, and machine learning all rely on manually crafted statistical features, such as average packet length, flow duration, and average packet inter-arrival time, and obtaining these features requires observing all or most of a flow. Therefore, accurate and efficient real-time traffic classification is not feasible with these methods.
[0003] Compared to traffic classification methods that require manual feature extraction, deep learning-based classification methods integrate feature extraction and model training into a unified end-to-end model, automatically learning features from raw traffic and classifying them, thus gaining favor among researchers. While deep learning-based methods avoid cumbersome feature extraction operations, existing deep learning-based traffic classification models lack consideration for practical applications, and the following issues still exist in practical applications of deep learning methods.
[0004] First, the problem of data silos.
[0005] In the field of network traffic classification, traffic data collected from user devices often contains private information about user network behavior. Not only do users not want this information to be made public, but laws and regulations also prohibit commercial companies from disclosing or sharing this user data. This has led to data silos in the industry, where companies and institutions can only store and use their own internal data. Training with highly homogenized data in this way easily results in overfitting models, leading to traffic classification models that lack general applicability. Furthermore, the lack of user traffic information from certain perspectives severely impacts the application of deep learning technology in network traffic classification tasks, significantly reducing classification accuracy.
[0006] Second, the problem of scarce labeled data.
[0007] Currently, the mainstream deep learning-based traffic classification method is still supervised learning, which requires collecting a large amount of labeled data to train the model. However, in reality, most of the captured traffic data is unlabeled, and due to the complexity of knowledge in the field of computer networks, labeling traffic data requires expert experience. Labeling all of it would consume huge amounts of manpower and time.
[0008] Third, the issue of transmission costs.
[0009] Simply applying unsupervised learning and federated learning to traffic classification tasks is not feasible. Current network traffic is massive and the network environment is complex, so traffic classification models need to be trained and updated in real time, and transmitting many classification models concurrently consumes a large amount of service bandwidth. If too little bandwidth is allocated, the network stalls and uploads and downloads slow down; if too much is allocated, bandwidth resources are wasted and the economic cost is very high.
Summary of the Invention
[0010] To address at least some of the problems existing in deep learning for real-world network traffic classification tasks, such as data silos, scarcity of labeled traffic data, and communication costs, this invention provides a traffic classification method and system based on federated semi-supervised learning, which can obtain a traffic classification model with high accuracy, wide applicability, and full protection of user privacy.
[0011] In one aspect, this invention provides a traffic classification method based on federated semi-supervised learning, comprising:
[0012] Step 1: The client captures unlabeled network traffic from the local gateway and preprocesses it to form an unlabeled traffic dataset; the central server preprocesses the labeled network traffic to form a labeled traffic dataset.
[0013] Step 2: The central server selects the traffic classification model used by the global model, decomposes the global model into supervised learning parameters and unsupervised learning parameters, and initializes the two types of learning parameters; it also initializes the auxiliary agent; and sends the initialized two types of learning parameters and the auxiliary agent to each client.
[0014] Step 3: The client performs unsupervised training using the local unlabeled traffic dataset based on supervised learning parameters, unsupervised learning parameters, and auxiliary agents. It updates the unsupervised learning parameters, obtains the difference between the unsupervised learning parameters before and after the update, and then uploads the difference between the unsupervised learning parameters to the central server.
[0015] Step 4: The central server aggregates and updates the unsupervised learning parameters of each client and obtains the difference between the unsupervised learning parameters before and after the update; it performs supervised training using the local labeled traffic dataset, updates the supervised learning parameters, and obtains the difference between the supervised learning parameters before and after the update; then it sends the difference between the supervised learning parameters and the difference between the unsupervised learning parameters to each client; and it uses the nearest neighbor search to obtain the H local unsupervised learning parameters that are most similar to the current unsupervised learning parameters as new auxiliary agents, and sends the new auxiliary agents to each client when the set sending conditions are met.
[0016] Step 5: Iterate through steps 3 to 4 until the stopping condition is met. The global model at this point is the final traffic classification model.
[0017] Furthermore, in step 2, the ResNet9 network model is used as the traffic classification model.
[0018] Furthermore, in step 3, the unsupervised training process on the client side specifically includes:
[0019] Freeze the supervised learning parameters σ and perform unsupervised training on the local unlabeled traffic dataset u to obtain a new model, that is: [formula not reproduced]. At the same time, the updated unsupervised learning parameters ψ are obtained;
[0020] The consistency loss term minimized during unsupervised training is given by formula (1):
[0021]
[0022] Here, * denotes a frozen parameter. The remaining symbols in formula (1) denote, in order: the auxiliary agents; the step size η_u of the parameter shift; the unit direction vector of the parameter update; two parameters set to prevent unsupervised training from affecting the supervised learning parameters; and the hyperparameter λ_ICCS used to control unsupervised learning. Φ(·) is the consistency regularization between the local model and the auxiliary agents. Apart from η_u and λ_ICCS, the symbols themselves are not reproduced in this text.
[0023] Furthermore, Φ(.) is expressed using formula (2):
[0024]
[0025] Here, the symbols of formula (2), which are not reproduced in this text, denote in order: an auxiliary agent; the pseudo-label obtained by combining the outputs of the auxiliary agents; and the labels generated from the softmax output. MAX(·) outputs the label of the class with the greatest agreement, π(u) is a random augmentation applied to the unlabeled traffic dataset u, and the final term is the consistency loss among the auxiliary agents.
[0026] Furthermore, in step 4, the supervised training process of the central server specifically includes:
[0027] Supervised training is performed on the local labeled traffic dataset s to obtain a new model, that is: [formula not reproduced]. At the same time, the updated supervised learning parameters σ are obtained;
[0028] The loss term minimized during supervised training is shown in formula (3):
[0029]
[0030] Here, * denotes a frozen parameter, η_s denotes the step size of the parameter shift, a further symbol (not reproduced in this text) denotes the unit direction vector of the parameter update, and λ_s is the hyperparameter used to control supervised learning.
[0031] Furthermore, in step 1, data preprocessing includes: sequentially dividing, cleaning, standardizing the length of the traffic data, and visualizing it to obtain a traffic data image.
[0032] Furthermore, the aforementioned division of traffic data specifically includes: dividing the pcap file into different bidirectional sessions based on source IP, destination IP, source port, destination port, and transport layer protocol;
[0033] The cleaning of traffic data specifically includes: deleting duplicate data packets and empty data packets, iterating through all bidirectional session data packets, and deleting information unrelated to traffic classification;
[0034] The aforementioned standardization of traffic data length specifically includes: standardizing the length of each session to a fixed number of bytes, truncating the session if it is longer than the fixed number of bytes and padding the end of the session with zeros if it is shorter; and/or padding the end of the UDP segment header with zeros to make it equal in length to the TCP header, thereby making the transport-layer headers uniform.
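The session-splitting and cleaning steps described above can be sketched as follows. This is a minimal illustration, not the patented implementation: the `Packet` record, `session_key`, and `split_sessions` are hypothetical names, packets are assumed to be already parsed from the pcap file, and duplicate detection is simplified to exact equality of the whole record.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, NamedTuple, Tuple


class Packet(NamedTuple):
    """Hypothetical parsed packet; a real pipeline would fill this from a pcap reader."""
    src_ip: str
    dst_ip: str
    src_port: int
    dst_port: int
    proto: str      # transport-layer protocol, e.g. "TCP" or "UDP"
    payload: bytes


def session_key(pkt: Packet) -> Tuple:
    """Direction-independent 5-tuple key: both directions of a flow map to one session."""
    a = (pkt.src_ip, pkt.src_port)
    b = (pkt.dst_ip, pkt.dst_port)
    lo, hi = sorted([a, b])
    return (lo, hi, pkt.proto)


def split_sessions(packets: Iterable[Packet]) -> Dict[Tuple, List[Packet]]:
    """Group packets into bidirectional sessions, dropping empty and duplicate packets."""
    sessions = defaultdict(list)
    seen = set()
    for pkt in packets:
        if not pkt.payload or pkt in seen:
            continue  # cleaning step: skip empty or already-seen packets
        seen.add(pkt)
        sessions[session_key(pkt)].append(pkt)
    return dict(sessions)
```

Sorting the two endpoints before building the key is what makes the session bidirectional: a request and its reply produce the same key even though source and destination are swapped.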
[0035] Furthermore, for the r-th communication round, the central server aggregates the unsupervised learning parameters of the A clients based on model similarity; the aggregation formula, which is not reproduced in this text, is expressed in terms of the unsupervised learning parameters of client a in the r-th communication round.
[0036] Furthermore, in step 4, the sending condition is that the central server sends the new auxiliary agents to the clients at fixed intervals.
[0037] In another aspect, the present invention provides a traffic classification system based on federated semi-supervised learning, comprising:
[0038] The traffic preprocessing module is set up on the client and the central server respectively. It is used to preprocess the unlabeled network traffic captured by the local gateway on the client to form an unlabeled traffic dataset, and to preprocess the labeled network traffic on the central server to form a labeled traffic dataset.
[0039] The server initialization module, located on the central server, is used to select the traffic classification model adopted by the global model, decompose the global model into supervised learning parameters and unsupervised learning parameters and initialize the two types of learning parameters; initialize the auxiliary agent; and send the initialized two types of learning parameters and the auxiliary agent to each client.
[0040] The client-side training module, set up on the client side, is used to perform unsupervised training based on supervised learning parameters, unsupervised learning parameters, and an auxiliary agent using a local unlabeled traffic dataset. It updates the unsupervised learning parameters, obtains the difference between the unsupervised learning parameters before and after the update, and then uploads the difference between the unsupervised learning parameters to the central server.
[0041] The server retraining module, located on the central server, is used to aggregate the unsupervised learning parameters from each client and obtain the difference between the unsupervised learning parameters before and after aggregation. It performs supervised training using the local labeled traffic dataset, updates the supervised learning parameters, and obtains the difference between the supervised learning parameters before and after the update. Then, it sends the difference between the supervised learning parameters and the difference between the unsupervised learning parameters to each client. It also obtains H most similar local models based on nearest neighbor search as new auxiliary agents and sends the new auxiliary agents to each client when the set sending conditions are met.
[0042] The beneficial effects of this invention are:
[0043] 1. This invention trains a network traffic classification model using a federated semi-supervised learning architecture, which can assist multiple parties in jointly learning an accurate and universal neural network model without disclosing and sharing local user traffic data on the client side. Each participant, i.e., the client, can train independently on its own user dataset, and only needs to selectively share the model parameters learned from the local dataset during training. This training method, which assists multiple parties in training without collecting local data, solves the data silo problem in the traffic domain and cleverly resolves the problem of exposing user privacy data.
[0044] 2. This invention uses a classification model based on a convolutional neural network to perform semi-supervised learning in a federated environment. Semi-supervised learning uses a large amount of local unlabeled traffic while using a small amount of expert-labeled data to train the classification model, which can effectively solve the problem of high cost of data labeling in real-world network traffic classification tasks.
[0045] 3. This invention proposes a client-to-client consistency loss method for semi-supervised learning. It utilizes clients in similar network segments as the source of consistency perturbation in semi-supervised learning, maximizes the use of consensus among clients in similar network segments, and effectively accelerates the training speed.
[0046] 4. This invention uses a model update strategy based on parameter decomposition to perform federated semi-supervised learning. It decomposes the model into unsupervised learning parameters and supervised learning parameters, and preserves reliable knowledge from labeled data. This not only prevents the model from forgetting the knowledge learned from labeled data when the proportion of unsupervised learning is too large, but also effectively prevents interference between tasks. Furthermore, it can further reduce communication costs.
Attached Figure Description
[0047] Figure 1 is a schematic diagram of an application scenario provided by an embodiment of the present invention;
[0048] Figure 2 is a flowchart of the traffic classification method based on federated semi-supervised learning provided in an embodiment of the present invention;
[0049] Figure 3 is the first schematic framework diagram of a traffic classification system based on federated semi-supervised learning provided in an embodiment of the present invention;
[0050] Figure 4 is the second schematic framework diagram of a traffic classification system based on federated semi-supervised learning provided in an embodiment of the present invention.
Detailed Implementation
[0051] To make the objectives, technical solutions, and advantages of this invention clearer, the technical solutions of the embodiments of this invention will be clearly described below with reference to the accompanying drawings. Obviously, the described embodiments are only some, not all, of the embodiments of this invention. All other embodiments obtained by those skilled in the art based on the embodiments of this invention without creative effort are within the scope of protection of this invention.
[0052] Example 1
[0053] As shown in Figure 1, the embodiments of the present invention are mainly applied to communication scenarios between a central server and multiple clients in a federated environment. As shown in Figure 2, this embodiment of the invention provides a traffic classification method based on federated semi-supervised learning, including the following steps:
[0054] S101: The client captures unlabeled network traffic from the local gateway and preprocesses it to form an unlabeled traffic dataset; the central server preprocesses labeled network traffic to form a labeled traffic dataset.
[0055] S102: The central server selects the traffic classification model used by the global model, decomposes the global model into supervised learning parameters and unsupervised learning parameters, and initializes the two types of learning parameters; and initializes the auxiliary agent; and sends the initialized two types of learning parameters and the auxiliary agent to each client;
[0056] S103: The client performs unsupervised training using the local unlabeled traffic dataset based on supervised learning parameters, unsupervised learning parameters, and auxiliary agents. It updates the unsupervised learning parameters, obtains the difference between the unsupervised learning parameters before and after the update, and then uploads the difference between the unsupervised learning parameters to the central server.
[0057] S104: The central server aggregates the unsupervised learning parameters from each client and obtains the difference between the unsupervised learning parameters before and after aggregation; it performs supervised training using the local labeled traffic dataset, updates the supervised learning parameters, and obtains the difference between the supervised learning parameters before and after the update; then it sends the difference between the supervised learning parameters and the difference between the unsupervised learning parameters to each client; and it obtains the H local unsupervised learning parameters that are most similar to the current unsupervised learning parameters based on nearest neighbor search as new auxiliary agents, and sends the new auxiliary agents to each client when the set sending conditions are met.
[0058] S105: Iterate through steps S103 to S104 until the stopping condition is met. The global model at this point is the final traffic classification model.
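The message flow of S103 to S105 can be illustrated with a toy loop. This is only a sketch under loud assumptions: parameters are short lists of floats, the model is assumed to combine the two parameter groups additively (σ + ψ), "training" is replaced by a simple pull toward a target vector, aggregation is a plain mean, and the stopping condition is a fixed number of rounds. None of these choices is stated in the source; only the exchange of parameter differences follows the steps above.

```python
import random


def add(a, b):
    return [x + y for x, y in zip(a, b)]


def sub(a, b):
    return [x - y for x, y in zip(a, b)]


def mean(vectors):
    return [sum(col) / len(vectors) for col in zip(*vectors)]


def client_update(sigma, psi, target, lr=0.5):
    """Stand-in for S103: sigma is frozen, only psi moves; returns the psi difference."""
    new_psi = [p + lr * (t - (s + p)) for s, p, t in zip(sigma, psi, target)]
    return sub(new_psi, psi)


def server_update(sigma, psi, target, lr=0.5):
    """Stand-in for supervised training in S104: psi is frozen, only sigma moves."""
    new_sigma = [s + lr * (t - (s + p)) for s, p, t in zip(sigma, psi, target)]
    return sub(new_sigma, sigma)


def run(rounds=20, n_clients=3, dim=4, seed=0):
    rng = random.Random(seed)
    sigma, psi = [0.0] * dim, [0.0] * dim
    client_targets = [[rng.uniform(-1, 1) for _ in range(dim)] for _ in range(n_clients)]
    server_target = mean(client_targets)
    for _ in range(rounds):
        # S103: every client trains locally and uploads only its psi difference
        deltas = [client_update(sigma, psi, t) for t in client_targets]
        # S104: the server aggregates the differences and updates psi
        psi = add(psi, mean(deltas))
        # S104: the server trains sigma on its labeled data
        sigma = add(sigma, server_update(sigma, psi, server_target))
        # the server would now broadcast both differences to the clients
    return sigma, psi, server_target
```

Because only differences travel in each direction after the first round, both sides must keep their own copy of σ and ψ from the previous round and add the received differences to it.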
[0059] The traffic classification method based on federated semi-supervised learning provided in this invention can assist multiple parties in jointly learning an accurate and universal neural network model without disclosing and sharing local user traffic data on the client side. Each participant, i.e., the client, can train independently on its own user dataset, and only needs to selectively share the model parameters trained on the local dataset during training. This training method, which assists multiple parties in training without collecting local data, not only solves the data silo problem in the traffic domain, but also cleverly resolves the problem of exposing user privacy data.
[0060] Furthermore, this invention proposes a model update strategy based on parameter decomposition. This update strategy is used to perform federated semi-supervised learning to preserve reliable knowledge from labeled data. On the one hand, it can prevent the model from forgetting the knowledge learned from labeled data when the proportion of unsupervised learning is too large, thereby effectively preventing interference between tasks. On the other hand, it can further reduce communication costs.
[0061] Example 2
[0062] To maximize the use of consensus among clients in similar network segments, this invention provides another traffic classification method based on federated semi-supervised learning, building upon the above embodiments. This invention primarily proposes specific training methods for unsupervised and supervised training processes. Specifically, this invention includes the following steps:
[0063] S201: The client captures the local unlabeled data stream, preprocesses it, and then constructs a local unlabeled traffic dataset u (the index l in its symbol denotes "local"); the central server preprocesses the labeled network traffic to form a labeled dataset s.
[0064] Specifically, the unlabeled traffic datasets of each client are not independent and identically distributed; the labeled network dataset s is labeled traffic annotated by experts, containing several data pairs (x, y), where x is the data stream and y is its corresponding label.
[0065] Data preprocessing includes: sequentially segmenting, cleaning, standardizing the length of, and visualizing the traffic data to obtain a traffic data image. Segmenting the traffic data specifically includes: dividing the raw traffic in the pcap file into different bidirectional sessions according to source IP, destination IP, source port, destination port, and transport layer protocol. Cleaning the traffic data specifically includes: deleting duplicate and empty data packets, iterating through all bidirectional session data packets, and deleting information irrelevant to traffic classification (such as MAC addresses). Standardizing the length of the traffic data specifically includes: standardizing the length of each session to a fixed number of bytes, truncating the session if it is longer than the fixed number of bytes and padding its end with zeros if it is shorter. Furthermore, zeros are padded to the end of the UDP segment header (8 bytes) to make it equal to the length of the TCP header (20 bytes), ensuring uniform transport-layer headers.
[0066] S202: The central server uses the ResNet9 network model as the traffic classification model, and its parameters are denoted as the global model parameters θ_G. The global model parameters θ_G are decomposed into supervised learning parameters σ and unsupervised learning parameters ψ, which are initialized separately; parameter ψ will be trained locally on the clients, and parameter σ will be trained on the central server. The auxiliary agent is also initialized.
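The length standardization and visualization steps can be sketched as below. The 8-byte and 20-byte header lengths come from the text above; the default of 784 bytes, the 28 x 28 image shape, and the one-byte-per-pixel grayscale mapping are assumptions chosen for illustration, and the function names are hypothetical.

```python
from typing import List

TCP_HEADER_LEN = 20  # bytes, as stated in the text (TCP header without options)
UDP_HEADER_LEN = 8   # bytes


def pad_udp_header(segment: bytes) -> bytes:
    """Insert zeros after the 8-byte UDP header so it occupies 20 bytes like a TCP header."""
    header, payload = segment[:UDP_HEADER_LEN], segment[UDP_HEADER_LEN:]
    return header + bytes(TCP_HEADER_LEN - UDP_HEADER_LEN) + payload


def standardize_length(session: bytes, n_bytes: int = 784) -> bytes:
    """Truncate a session to n_bytes, or pad its end with zeros (n_bytes is an assumed value)."""
    return session[:n_bytes] + bytes(max(0, n_bytes - len(session)))


def to_image(session: bytes, side: int = 28) -> List[List[int]]:
    """Reshape a session into a side x side grayscale image, one byte (0..255) per pixel."""
    fixed = standardize_length(session, side * side)
    return [list(fixed[r * side:(r + 1) * side]) for r in range(side)]
```

Padding the UDP header means the payload starts at the same byte offset for both protocols, so the same pixel region of every image corresponds to payload data.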
[0067] Specifically, the ResNet9 traffic classification model in this embodiment includes multiple convolutional layers, pooling layers, residual connections, and a Softmax output layer; the auxiliary agent is one of the key design points of this embodiment, mainly used to maximize the use of consensus between clients in similar network segments.
[0068] S203: The central server randomly selects A clients from all clients to participate in the following model training task, and sends the relevant training parameters to the selected A clients;
[0069] Specifically, due to considerations such as controlling the training scale and the actual online status of clients, not all clients will participate in the training task in actual applications. Therefore, the central server selects the clients that need to participate in the training task.
[0070] During the first communication, the central server sends the following training parameters to each client: the initial supervised learning parameters σ and the unsupervised learning parameters ψ, as well as the initial auxiliary agent.
[0071] In the second and subsequent communications, the training parameters sent by the central server to each client mainly include: differences in supervised learning parameters and differences in unsupervised learning parameters; in addition, depending on the situation, new auxiliary agents may also be included.
[0072] S204: The client performs unsupervised training using the local unlabeled traffic dataset based on the supervised learning parameters, the unsupervised learning parameters, and the auxiliary agents. It updates the unsupervised learning parameters, obtains the difference between the updated unsupervised learning parameters and those before the update, and then uploads this difference to the central server.
[0073] Specifically, during the first communication (also known as the first round of training), the client performs unsupervised training based on the initial supervised learning parameters, unsupervised learning parameters, and auxiliary agent using the local unlabeled traffic dataset, and updates the unsupervised learning parameters.
[0074] In the second and subsequent communication processes, the client needs to calculate new supervised learning parameters and unsupervised learning parameters based on the differences in the received supervised learning parameters and unsupervised learning parameters, as well as the supervised learning parameters and unsupervised learning parameters stored locally from the previous communication process. Then, based on the new supervised learning parameters and unsupervised learning parameters, the client performs unsupervised training using the local unlabeled traffic dataset with the auxiliary agent.
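The difference-based exchange described above might be modelled as follows. The `ServerMessage` and `ClientState` classes and their field names are hypothetical, and parameters are represented as flat lists of floats for brevity; the sketch only shows how a client could rebuild its parameters from the stored copy plus the received differences.

```python
from dataclasses import dataclass, field
from typing import List, Optional

Vector = List[float]


@dataclass
class ServerMessage:
    round_no: int
    sigma: Optional[Vector] = None          # full parameters, first round only
    psi: Optional[Vector] = None
    d_sigma: Optional[Vector] = None        # parameter differences, later rounds
    d_psi: Optional[Vector] = None
    helpers: Optional[List[Vector]] = None  # new auxiliary agents, only when sent


@dataclass
class ClientState:
    sigma: Vector = field(default_factory=list)
    psi: Vector = field(default_factory=list)
    helpers: List[Vector] = field(default_factory=list)

    def apply(self, msg: ServerMessage) -> None:
        """Update the locally stored parameters from a server message."""
        if msg.sigma is not None and msg.psi is not None:
            # first communication: take the initial parameters as they are
            self.sigma, self.psi = list(msg.sigma), list(msg.psi)
        else:
            # later communications: add the differences to the stored copy
            self.sigma = [s + d for s, d in zip(self.sigma, msg.d_sigma)]
            self.psi = [p + d for p, d in zip(self.psi, msg.d_psi)]
        if msg.helpers is not None:
            self.helpers = [list(h) for h in msg.helpers]
```

A client that misses a round cannot apply later differences correctly, so a practical system would need some way to resynchronize it with full parameters; the source does not describe this case.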
[0075] As one possible implementation method, the unsupervised training process on the client side specifically includes:
[0076] Freeze the supervised learning parameters σ and perform unsupervised training on the local unlabeled traffic dataset u to obtain a new model, that is: [formula not reproduced]. At the same time, the updated unsupervised learning parameters ψ are obtained;
[0077] The consistency loss term minimized during unsupervised training is given by formula (1):
[0078]
[0079] Here, * denotes a frozen parameter. The remaining symbols in formula (1) denote, in order: the auxiliary agents; the step size η_u of the parameter shift; the unit direction vector of the parameter update; two parameters set to prevent unsupervised training from affecting the supervised learning parameters; and the hyperparameter λ_ICCS used to control unsupervised learning. Φ(·) is the consistency regularization between the local model and the auxiliary agents. Apart from η_u and λ_ICCS, the symbols themselves are not reproduced in this text.
[0080] Furthermore, in this embodiment, formula (2) is used to represent Φ(.):
[0081]
[0082] Here, the symbols of formula (2), which are not reproduced in this text, denote in order: an auxiliary agent; the pseudo-label obtained by combining the outputs of the auxiliary agents; and the labels generated from the softmax output. MAX(·) outputs the label of the class with the greatest agreement, π(u) is a random augmentation applied to the unlabeled traffic dataset u, and the final term is the consistency loss among the auxiliary agents.
[0083] S205: The central server aggregates and updates the unsupervised learning parameters of the A clients, and obtains the difference between the unsupervised learning parameters before and after the update; it then performs supervised training using the local labeled traffic dataset, updates the supervised learning parameters, and obtains the difference between the supervised learning parameters before and after the update; finally, it sends the difference between the supervised learning parameters and the difference between the unsupervised learning parameters to each client. Each difference is taken between the updated parameters and the parameters before the update.
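Since formula (2) is not reproduced in this text, the following is only one plausible reading of the description, not the patented formula. It assumes that each model (the local one and every auxiliary agent) votes for its top class, that the pseudo-label is the class with the most votes, and that Φ combines a cross-entropy term on the augmented input with an averaged KL divergence between the auxiliary agents and the local model. All function names are hypothetical, and predictions are given directly as probability vectors.

```python
import math


def argmax(values):
    return max(range(len(values)), key=values.__getitem__)


def pseudo_label(p_local, p_helpers):
    """Agreement-based pseudo-label: every model votes for its top class (ties: lowest index)."""
    votes = [0] * len(p_local)
    for p in [p_local, *p_helpers]:
        votes[argmax(p)] += 1
    return argmax(votes)


def cross_entropy(label, p, eps=1e-12):
    return -math.log(p[label] + eps)


def kl(p, q, eps=1e-12):
    return sum(pi * math.log((pi + eps) / (qi + eps)) for pi, qi in zip(p, q))


def consistency(p_local, p_local_aug, p_helpers):
    """Assumed form of the consistency term: pseudo-label cross-entropy on the
    augmented input plus the mean KL divergence from each helper to the local model."""
    y_hat = pseudo_label(p_local, p_helpers)
    agreement = sum(kl(ph, p_local) for ph in p_helpers) / len(p_helpers)
    return cross_entropy(y_hat, p_local_aug) + agreement
```

Here `p_local` stands for the local model's softmax output on u, `p_local_aug` for its output on π(u), and `p_helpers` for the outputs of the auxiliary agents on u.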
[0084] Specifically, the central server can obtain new unsupervised learning parameters based on the differences in the received unsupervised learning parameters and the unsupervised learning parameters from the previous communication process stored locally.
[0085] As one possible implementation, for the r-th communication round, the central server aggregates the unsupervised learning parameters of the A clients based on model similarity; the aggregation formula, which is not reproduced in this text, is expressed in terms of the unsupervised learning parameters of client a in the r-th communication round.
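The aggregation formula itself is missing from this text, so the sketch below should be read as a guess at what "aggregation based on model similarity" could look like rather than as the patented rule. It assumes a weighted average in which each client's weight is a softmax over the cosine similarity between its parameters and a reference vector (for example the current global ψ); both the similarity measure and the weighting are assumptions.

```python
import math


def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0


def aggregate_by_similarity(client_psis, reference):
    """Weighted average of client parameters; weights are a softmax over cosine similarity."""
    sims = [cosine(p, reference) for p in client_psis]
    exps = [math.exp(s) for s in sims]
    total = sum(exps)
    weights = [e / total for e in exps]
    aggregated = [
        sum(w * p[i] for w, p in zip(weights, client_psis))
        for i in range(len(reference))
    ]
    return aggregated, weights
```

With this weighting, a client whose parameters point away from the reference still contributes, but less than clients that agree with it.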
[0086] As one possible implementation method, the supervised training process of the central server specifically includes:
[0087] Supervised training is performed on the local labeled traffic dataset s to obtain a new model, that is: [formula not reproduced]. At the same time, the updated supervised learning parameters σ are obtained;
[0088] The loss term minimized during supervised training is shown in formula (3):
[0089]
[0090] Here, * denotes a frozen parameter, η_s denotes the step size of the parameter shift, a further symbol (not reproduced in this text) denotes the unit direction vector of the parameter update, and λ_s is the hyperparameter used to control supervised learning.
[0091] S206: The server is pre-set to send a new auxiliary agent to the client every 10 rounds of communication; the central server determines whether the number of rounds of communication r is a multiple of 10. If so, it obtains H most similar local models based on nearest neighbor search as new auxiliary agents, and then sends the new auxiliary agents to each client.
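The periodic selection of auxiliary agents in S206 can be sketched with a brute-force nearest-neighbor search. The 10-round interval comes from the text; the use of Euclidean distance over flattened parameter vectors, and the function names, are assumptions. A production system would more likely use an index structure such as a k-d tree.

```python
import math


def nearest_helpers(current_psi, client_psis, h):
    """Indices of the h stored client parameter vectors closest to current_psi."""
    order = sorted(
        range(len(client_psis)),
        key=lambda i: math.dist(client_psis[i], current_psi),
    )
    return order[:h]


def maybe_select_helpers(round_no, current_psi, client_psis, h=2, interval=10):
    """Return new auxiliary agents when round_no is a multiple of interval, else None."""
    if round_no % interval != 0:
        return None
    return [client_psis[i] for i in nearest_helpers(current_psi, client_psis, h)]
```

The sketch does not exclude a client's own parameters from its helper set; whether the method does so is not stated in the source.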
[0092] S207: S203 to S206 are iterated, with the central server performing an aggregation update in each round, until the global model converges; the iteration then stops, and the final global model parameters θ_G are obtained.
[0093] Example 3
[0094] Corresponding to the above method, and as shown in Figures 3 and 4, this embodiment of the invention also provides a traffic classification system based on federated semi-supervised learning, including a traffic preprocessing module, a server initialization module, a client training module, and a server retraining module;
[0095] The traffic preprocessing module, located on both the client and the central server, preprocesses the unlabeled network traffic captured by the client from the local gateway to form an unlabeled traffic dataset, and preprocesses the labeled network traffic on the central server to form a labeled traffic dataset. The server initialization module, located on the central server, selects the traffic classification model used by the global model, decomposes the global model into supervised learning parameters and unsupervised learning parameters, initializes both types of learning parameters, initializes the auxiliary agent, and sends the initialized learning parameters and auxiliary agent to each client. The client training module, located on the client, performs unsupervised training using the local unlabeled traffic dataset based on the supervised learning parameters, unsupervised learning parameters, and auxiliary agent. It updates the unsupervised learning parameters, obtains the difference between the unsupervised learning parameters before and after the update, and then uploads this difference to the central server. The server retraining module, located on the central server, aggregates the unsupervised learning parameters from each client and obtains the difference between the unsupervised learning parameters before and after aggregation. It performs supervised training using the local labeled traffic dataset, updates the supervised learning parameters, and obtains the difference between the supervised learning parameters before and after the update. It then sends the difference between the supervised learning parameters and the difference between the unsupervised learning parameters to each client. It also obtains the H most similar local models by nearest neighbor search as new auxiliary agents and sends the new auxiliary agents to each client when the set sending conditions are met.
[0096] It should be noted that the system provided in this embodiment of the invention implements the above method embodiments; for its specific functions, refer to the above method embodiments, which are not repeated here.
[0097] Finally, it should be noted that the above embodiments are only used to illustrate the technical solutions of the present invention, and not to limit them; although the present invention has been described in detail with reference to the foregoing embodiments, those skilled in the art should understand that modifications can still be made to the technical solutions described in the foregoing embodiments, or equivalent substitutions can be made to some of the technical features; and these modifications or substitutions do not cause the essence of the corresponding technical solutions to deviate from the spirit and scope of the technical solutions of the embodiments of the present invention.
Claims
1. A traffic classification method based on federated semi-supervised learning, characterized in that it comprises:
Step 1: The client captures unlabeled network traffic from the local gateway and preprocesses it to form an unlabeled traffic dataset; the central server preprocesses the labeled network traffic to form a labeled traffic dataset.
Step 2: The central server selects the traffic classification model used as the global model, decomposes the global model into supervised learning parameters and unsupervised learning parameters, and initializes the two types of learning parameters; it also initializes the auxiliary agent, and sends the initialized two types of learning parameters and the auxiliary agent to each client.
Step 3: The client performs unsupervised training using the local unlabeled traffic dataset based on the supervised learning parameters, the unsupervised learning parameters, and the auxiliary agent. It updates the unsupervised learning parameters, obtains the difference between the unsupervised learning parameters before and after the update, and then uploads the difference between the unsupervised learning parameters to the central server. The unsupervised training process on the client side specifically includes: freezing the supervised learning parameters and performing unsupervised training using the local unlabeled traffic dataset u to obtain a new model and, at the same time, the updated unsupervised learning parameters. The consistency loss term minimized during unsupervised training is given by formula (1), whose terms denote, in order: the frozen parameter; the auxiliary agent; the step size of the parameter shift; the unit direction vector of the parameter update; the parameters set to prevent unsupervised training from affecting the supervised learning parameters; the hyperparameter used to control unsupervised learning; and the consistency regularization between the local model and the auxiliary agent.
Step 4: The central server aggregates and updates the unsupervised learning parameters of each client and obtains the difference between the unsupervised learning parameters before and after the update; it performs supervised training using the local labeled traffic dataset, updates the supervised learning parameters, and obtains the difference between the supervised learning parameters before and after the update; it then sends the difference between the supervised learning parameters and the difference between the unsupervised learning parameters to each client; and it uses nearest neighbor search to obtain the H local unsupervised learning parameters that are most similar to the current unsupervised learning parameters as new auxiliary agents, and sends the new auxiliary agents to each client when the set sending condition is met.
Step 5: Steps 3 to 4 are repeated until the stopping condition is met. The global model at this point is the final traffic classification model.
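To illustrate the exchange of parameter differences in steps 3 and 4, the following minimal sketch runs one communication round on plain Python lists. It assumes an additive decomposition (model = supervised parameters + unsupervised parameters), plain gradient descent, and a simple mean for aggregation. These choices, the function names `client_unsupervised_step` and `server_step`, and the gradient callbacks are hypothetical stand-ins for the losses of formulas (1) and (3), which are not reproduced in this text.

```python
from typing import Callable, List, Sequence, Tuple

Vector = List[float]
GradFn = Callable[[Vector], Vector]  # hypothetical: gradient of a loss at the full model


def _add(a: Sequence[float], b: Sequence[float]) -> Vector:
    return [x + y for x, y in zip(a, b)]


def _sub(a: Sequence[float], b: Sequence[float]) -> Vector:
    return [x - y for x, y in zip(a, b)]


def client_unsupervised_step(sigma: Vector, psi: Vector, unsup_grad: GradFn,
                             lr: float = 0.1, steps: int = 5) -> Vector:
    """Step 3: sigma (supervised) stays frozen, only psi (unsupervised) moves.

    Returns the difference between psi after and before the update,
    which is what the client uploads.
    """
    psi_new = list(psi)
    for _ in range(steps):
        grad = unsup_grad(_add(sigma, psi_new))  # assumed additive model
        psi_new = [p - lr * g for p, g in zip(psi_new, grad)]
    return _sub(psi_new, psi)


def server_step(sigma: Vector, psi: Vector, client_deltas: List[Vector],
                sup_grad: GradFn, lr: float = 0.1,
                steps: int = 5) -> Tuple[Vector, Vector]:
    """Step 4: aggregate client updates, then train sigma with psi frozen.

    Returns (difference of sigma, difference of psi) to send to every client.
    The plain mean used here is an assumption, not the claimed aggregation.
    """
    n = len(client_deltas)
    mean_delta = [sum(col) / n for col in zip(*client_deltas)]
    psi_new = _add(psi, mean_delta)
    sigma_new = list(sigma)
    for _ in range(steps):
        grad = sup_grad(_add(sigma_new, psi_new))
        sigma_new = [s - lr * g for s, g in zip(sigma_new, grad)]
    return _sub(sigma_new, sigma), _sub(psi_new, psi)
```

Each client would then add the two received differences to its local copies of the supervised and unsupervised parameters, so that only differences, never raw traffic, cross the network.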
2. The traffic classification method based on federated semi-supervised learning according to claim 1, characterized in that, in step 2, the ResNet9 network model is used as the traffic classification model.
3. The traffic classification method based on federated semi-supervised learning according to claim 1, characterized in that the consistency regularization term is expressed by formula (2), whose terms denote, in order: the auxiliary agent; the pseudo-labels output by the ensemble of auxiliary agents; the labels generated based on softmax; the operation that outputs the label on the class with the greatest agreement; the random augmentation operation performed on the unlabeled traffic dataset u; and the consistency loss among the auxiliary agents.
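The pseudo-labeling idea described in claim 3 (softmax-based labels, then the class with the greatest agreement) can be sketched as a simple vote. This is only an illustration: the function names are hypothetical, and counting the local model's own vote alongside the auxiliary agents is an assumption rather than something the claim states.

```python
import math
from typing import List, Sequence


def softmax(logits: Sequence[float]) -> List[float]:
    """Numerically stable softmax over one vector of class scores."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def hard_label(probs: Sequence[float]) -> List[int]:
    """One vote for the most probable class, zero elsewhere."""
    k = max(range(len(probs)), key=probs.__getitem__)
    return [1 if i == k else 0 for i in range(len(probs))]


def agreement_pseudo_label(local_logits: Sequence[float],
                           helper_logits: Sequence[Sequence[float]]) -> int:
    """Return the class index that receives the most votes.

    Assumption: the local model votes together with the auxiliary agents.
    Ties are resolved in favor of the lowest class index.
    """
    votes = hard_label(softmax(local_logits))
    for logits in helper_logits:
        votes = [v + h for v, h in zip(votes, hard_label(softmax(logits)))]
    return max(range(len(votes)), key=votes.__getitem__)
```

The resulting class index would serve as the target when the local model is evaluated on a randomly augmented copy of the same unlabeled sample.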
4. The traffic classification method based on federated semi-supervised learning according to claim 1, characterized in that, in step 4, the supervised training process of the central server specifically includes: performing supervised training using the local labeled traffic dataset s to obtain a new model and, at the same time, the updated supervised learning parameters. The loss term minimized during supervised training is given by formula (3), whose terms denote, in order: the frozen parameter; the step size of the parameter shift; the unit direction vector of the parameter update; and the hyperparameters used to control supervised learning.
5. The traffic classification method based on federated semi-supervised learning according to claim 1, characterized in that, in step 1, the data preprocessing includes, in sequence: segmenting the traffic data, cleaning it, standardizing its length, and visualizing it to obtain a traffic data image.
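As a sketch of the final visualization step, a session that has already been standardized to a fixed length can be laid out row by row as a square grayscale image, one byte per pixel. The 28 x 28 size (784 bytes) and the name `bytes_to_image` are assumptions chosen for illustration; the claim does not fix the image dimensions.

```python
from typing import List


def bytes_to_image(session: bytes, side: int = 28) -> List[List[int]]:
    """Reshape side*side bytes into a side-by-side grid of values in 0..255.

    The default side of 28 is an assumption, not a value from the claims.
    """
    if len(session) != side * side:
        raise ValueError(f"expected {side * side} bytes, got {len(session)}")
    return [list(session[r * side:(r + 1) * side]) for r in range(side)]
```

For example, `bytes_to_image(bytes(784))` returns a 28 x 28 grid of zeros, which a model such as a convolutional network could consume as a single-channel image.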
6. The traffic classification method based on federated semi-supervised learning according to claim 5, characterized in that the segmentation of the traffic data specifically includes: dividing the pcap file into different bidirectional sessions based on source IP, destination IP, source port, destination port, and transport layer protocol; the cleaning of the traffic data specifically includes: deleting duplicate data packets and empty data packets, iterating through the data packets of all bidirectional sessions, and deleting information unrelated to traffic classification; and the standardization of the traffic data length specifically includes: standardizing the length of each session to a fixed number of bytes, truncating the session if the session length is greater than the fixed number of bytes, and padding the end of the session with zeros if the session length is less than the fixed number of bytes; and / or padding the end of the UDP segment header with zeros to make it equal in length to the TCP header, thereby making the transport layer more uniform.
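The segmentation, cleaning, and length standardization described in claim 6 can be sketched as follows. The `Packet` record and its fields are hypothetical (a real pipeline would parse them from the pcap file), the fixed length of 784 bytes is an assumed value, and the header sizes used are the standard 8-byte UDP header and the 20-byte TCP header without options.

```python
from collections import defaultdict
from typing import Dict, List, NamedTuple, Tuple


class Packet(NamedTuple):
    """Hypothetical parsed packet; 'payload' holds transport header plus data."""
    src_ip: str
    dst_ip: str
    src_port: int
    dst_port: int
    proto: str
    payload: bytes


def session_key(p: Packet) -> Tuple:
    """Direction-independent 5-tuple, so both directions share one session."""
    lo, hi = sorted([(p.src_ip, p.src_port), (p.dst_ip, p.dst_port)])
    return (lo, hi, p.proto)


def clean(packets: List[Packet]) -> List[Packet]:
    """Drop empty packets and exact duplicates, keeping the original order."""
    seen, kept = set(), []
    for p in packets:
        if p.payload and p not in seen:
            seen.add(p)
            kept.append(p)
    return kept


def split_sessions(packets: List[Packet]) -> Dict[Tuple, List[Packet]]:
    """Group packets into bidirectional sessions."""
    sessions = defaultdict(list)
    for p in packets:
        sessions[session_key(p)].append(p)
    return dict(sessions)


def pad_udp_header(segment: bytes, udp_len: int = 8, tcp_len: int = 20) -> bytes:
    """Insert zeros after the UDP header so it matches the TCP header length."""
    return segment[:udp_len] + bytes(tcp_len - udp_len) + segment[udp_len:]


def standardize_length(data: bytes, fixed_len: int = 784) -> bytes:
    """Truncate long sessions and zero-pad short ones to fixed_len bytes."""
    return data[:fixed_len].ljust(fixed_len, b"\x00")
```

Sorting the two endpoints before building the key is one simple way to make the 5-tuple symmetric; other canonical orderings would work equally well.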
7. The traffic classification method based on federated semi-supervised learning according to claim 1, characterized in that, for the r-th round of communication, the central server aggregates the unsupervised learning parameters of the A clients based on model similarity; in the aggregation formula, the per-client term denotes the unsupervised learning parameters of client a during the r-th round of communication.
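Because the aggregation formula of claim 7 is not reproduced in this text, the following is only one plausible reading of "aggregation based on model similarity". It assumes that similarity is measured as the exponential of the negative Euclidean distance to the current global unsupervised parameters, and that the normalized similarities act as averaging weights. Both the weighting rule and the function names are hypothetical.

```python
import math
from typing import List, Sequence


def similarity_weights(global_psi: Sequence[float],
                       client_psis: Sequence[Sequence[float]]) -> List[float]:
    """Assumed rule: weight_a is proportional to exp(-||psi_a - psi_global||)."""
    sims = [math.exp(-math.dist(global_psi, psi)) for psi in client_psis]
    total = sum(sims)
    return [s / total for s in sims]


def aggregate(global_psi: Sequence[float],
              client_psis: Sequence[Sequence[float]]) -> List[float]:
    """Similarity-weighted average of the clients' unsupervised parameters."""
    weights = similarity_weights(global_psi, client_psis)
    return [
        sum(w * psi[i] for w, psi in zip(weights, client_psis))
        for i in range(len(global_psi))
    ]
```

Under this assumed rule, a client whose parameters drift far from the global ones contributes less to the aggregate than a client that stays close.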
8. The traffic classification method based on federated semi-supervised learning according to claim 1, characterized in that, in step 4, the sending condition is that the central server sends data to the clients at fixed intervals.
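A fixed-interval sending condition of the kind described in claim 8 could be checked once per communication round, as in the sketch below. Measuring the interval in rounds, the default value of 10, and the name `should_send_helpers` are all assumptions; the claim only states that sending happens at fixed intervals.

```python
def should_send_helpers(round_idx: int, interval: int = 10) -> bool:
    """Return True on every interval-th round (rounds are counted from 1).

    The interval unit (rounds) and its default value are assumptions.
    """
    if interval <= 0:
        raise ValueError("interval must be positive")
    return round_idx > 0 and round_idx % interval == 0
```

With `interval=5`, new auxiliary agents would go out on rounds 5, 10, 15, and so on, while the parameter differences are still exchanged every round.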
9. A traffic classification system based on federated semi-supervised learning, characterized in that it comprises:
A traffic preprocessing module, deployed on the client and on the central server respectively, used to preprocess the unlabeled network traffic that the client captures from the local gateway to form an unlabeled traffic dataset, and to preprocess the labeled network traffic on the central server to form a labeled traffic dataset.
A server initialization module, deployed on the central server, used to select the traffic classification model adopted as the global model, decompose the global model into supervised learning parameters and unsupervised learning parameters and initialize the two types of learning parameters; initialize the auxiliary agent; and send the initialized two types of learning parameters and the auxiliary agent to each client.
A client training module, deployed on the client, used to perform unsupervised training using the local unlabeled traffic dataset based on the supervised learning parameters, the unsupervised learning parameters, and the auxiliary agent. It updates the unsupervised learning parameters, obtains the difference between the unsupervised learning parameters before and after the update, and then uploads the difference between the unsupervised learning parameters to the central server. The unsupervised training process on the client side specifically includes: freezing the supervised learning parameters and performing unsupervised training using the local unlabeled traffic dataset u to obtain a new model and, at the same time, the updated unsupervised learning parameters. The consistency loss term minimized during unsupervised training is given by formula (1), whose terms denote, in order: the frozen parameter; the auxiliary agent; the step size of the parameter shift; the unit direction vector of the parameter update; the parameters set to prevent unsupervised training from affecting the supervised learning parameters; the hyperparameter used to control unsupervised learning; and the consistency regularization between the local model and the auxiliary agent.
A server retraining module, deployed on the central server, used to aggregate the unsupervised learning parameters from each client and obtain the difference between the unsupervised learning parameters before and after aggregation. It performs supervised training using the local labeled traffic dataset, updates the supervised learning parameters, and obtains the difference between the supervised learning parameters before and after the update. It then sends the difference between the supervised learning parameters and the difference between the unsupervised learning parameters to each client. It also uses nearest neighbor search to obtain the H most similar local models as new auxiliary agents and sends the new auxiliary agents to each client when the set sending condition is met.
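The nearest neighbor selection of the H most similar local models can be sketched with a brute-force search. Euclidean distance between parameter vectors as the similarity measure, the dictionary of client parameters, and the name `select_helpers` are assumptions made for illustration; a real deployment might use an indexed search structure or a different metric.

```python
import math
from typing import Dict, List, Optional, Sequence


def select_helpers(current_psi: Sequence[float],
                   client_psis: Dict[str, Sequence[float]],
                   h: int,
                   exclude: Optional[str] = None) -> List[str]:
    """Return the ids of the h clients whose parameters are closest.

    Assumption: 'most similar' means smallest Euclidean distance.
    'exclude' lets the caller skip the client the helpers are chosen for.
    """
    ranked = sorted(
        (math.dist(current_psi, psi), cid)
        for cid, psi in client_psis.items()
        if cid != exclude
    )
    return [cid for _, cid in ranked[:h]]
```

The parameters of the selected clients would then be sent out as the new auxiliary agents once the sending condition is met.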