Network traffic classification method, apparatus, device, and storage medium based on multi-dimensional flow-level features

A network traffic classification method based on multi-dimensional flow-level features groups packets by 5-tuple and a time threshold, and combines multi-model feature screening with stacked classifiers. It addresses the difficulty existing technologies have in identifying new attack variants and unknown threats, and aims at efficient and accurate traffic classification.

CN121567434B · Active · Publication Date: 2026-06-19 · HANGZHOU DBAPPSECURITY CO LTD

Patent Information

Authority / Receiving Office
CN · China
Patent Type
Patents(China)
Current Assignee / Owner
HANGZHOU DBAPPSECURITY CO LTD
Filing Date
2025-12-03
Publication Date
2026-06-19

AI Technical Summary

Technical Problem

Existing technologies are insufficient to effectively identify new attack variants and unknown threats. Rule bases require continuous manual maintenance and updates, and traditional feature engineering relies on domain expert knowledge, making it difficult to adapt to the identification of diverse attack behaviors. This results in low model training efficiency and impaired generalization performance.

Method used

A network traffic classification method based on multi-dimensional flow-level features groups raw data packets into network flows by 5-tuple and a preset time interval threshold, then extracts static attribute, temporal behavior, protocol interaction, and network flow context features from each flow. It uses the maximum correlation minimum redundancy algorithm to filter features, and combines random forest, extreme random tree, lightweight gradient boosting, and XGBoost models to further filter and evaluate features. Finally, the probability outputs of these models are passed to support vector machine, logistic regression, and multilayer perceptron meta-classifiers, whose outputs are combined by a gradient boosting decision tree to produce the classification result.

Benefits of technology

It achieves accurate differentiation between normal and malicious traffic, significantly improving the accuracy and efficiency of traffic classification, and can identify encrypted malicious traffic in complex network environments.

✦ Generated by Eureka AI based on patent content.


Abstract

This application discloses a network traffic classification method, apparatus, device, and storage medium based on multi-dimensional flow-level features, relating to the field of network security technology. The method includes: grouping raw data packets based on a 5-tuple and a preset time interval threshold; determining the target network flow using the grouped data packets; constructing a feature vector based on the target flow-level features in the target network flow; performing preliminary screening of the feature vector based on the maximum correlation minimum redundancy algorithm; evaluating the obtained first-screened features using a target base model; ranking them based on the obtained comprehensive importance scores; inputting the obtained second-screened features into the target base model; concatenating the output traffic category probability vectors; inputting the obtained meta-probability features into a first target meta-classifier; and inputting the obtained probability vectors into a second target meta-classifier to obtain the traffic classification result corresponding to the target network flow. This method achieves malicious traffic classification at low cost and high efficiency.

Description

Technical Field

[0001] This invention relates to the field of network security technology, and in particular to a method, apparatus, device, and storage medium for classifying network traffic based on multi-dimensional flow-level characteristics.

Background Technology

[0002] With the continuous evolution of network infrastructure and the widespread deployment of encrypted communication technologies, network attacks are exhibiting highly organized, covert, and persistent characteristics. Against this backdrop, malicious traffic detection, as the first line of defense in proactive network defense, is crucial for ensuring the security of critical information infrastructure. Current mainstream malicious traffic detection methods are mainly based on three technical routes: rule matching, machine learning, and deep learning. Rule matching methods rely on precise descriptions of known attack characteristics, and their detection capabilities are inherently limited by the completeness of predefined rules. They cannot effectively identify new attack variants and unknown threats, and the rule base requires continuous manual maintenance and updates, lacking adaptability in the face of highly dynamic attack methods. While machine learning methods based on traditional feature engineering can perform classification detection through feature extraction, their construction process has inherent limitations. They typically rely heavily on the prior knowledge of domain experts for manual feature design. This process is not only highly subjective and costly in terms of manpower and time, but also makes it difficult to guarantee the optimality of the feature set. Furthermore, manually constructed features often contain redundant and noisy components, increasing model complexity and diluting the discriminative power of effective features, leading to decreased model training efficiency and impaired generalization performance. Finally, at the model architecture design level, a single classifier often struggles to fully capture the inherent complex distribution and dynamic changes in network traffic data. Its limited representational capabilities and relatively fixed decision boundaries make it unable to effectively adapt to the needs of discriminating diverse attack behaviors, making it difficult to achieve both high-precision classification and high robustness in real-world network environments.

[0003] At the feature representation level, existing methods mostly focus on traffic statistics or packet-level features, lacking a systematic integration of multi-dimensional features, which limits the ability to characterize complex attack behaviors. In addition, traditional feature selection methods are usually separated from classification models, relying solely on statistical indicators or heuristic rules. This makes it difficult to optimize in conjunction with model structure, and also fails to effectively evaluate the generalization ability and effectiveness of features across different models.

[0004] As can be seen from the above, how to achieve malicious traffic classification in a low-cost and efficient manner is an urgent problem to be solved.

Summary of the Invention

[0005] In view of this, the purpose of this invention is to provide a network traffic classification method, apparatus, device, and storage medium based on multi-dimensional flow-level characteristics, which can achieve malicious traffic classification at low cost and high efficiency. The specific solution is as follows:

[0006] Firstly, this application provides a network traffic classification method based on multi-dimensional flow-level features, including:

[0007] Raw data packets are captured from the target network environment. The raw data packets are grouped based on a 5-tuple and a preset time interval threshold. The target network flow is determined using the grouped data packets. Multiple dimensions of target flow-level features are extracted from each target network flow. A feature vector is constructed based on the target flow-level features. The target flow-level features include static attribute features, temporal behavior features, protocol interaction features, and network flow context features.

[0008] The feature vectors are initially screened using the maximum relevance minimum redundancy algorithm to obtain first-screened features. The first-screened features are then evaluated using a target base model to obtain comprehensive importance scores for each feature. These comprehensive importance scores are then used to rank the features to obtain second-screened features. The second-screened features are then input into the target base model to obtain probability vectors for each traffic category. These probability vectors are then concatenated to obtain meta-probability features. The target base model includes Random Forest, Extremely Random Tree, Lightweight Gradient Boosting, and XGBoost.

[0009] The meta-probability features are input into a first target meta-classifier to obtain probability vectors, and the probability vectors are input into a second target meta-classifier to obtain the traffic classification result corresponding to the target network flow; the first target meta-classifier includes support vector machine, logistic regression and multilayer perceptron; the second target meta-classifier includes gradient boosting decision tree.

[0010] Optionally, the step of capturing raw data packets from the target network environment, grouping the raw data packets based on a 5-tuple and a preset time interval threshold, and using the grouped data packets to determine the target network flow includes:

[0011] Unprocessed network communication data packets are captured from the target network environment to obtain raw data packets. The raw data packets are then grouped based on their source IP address, destination IP address, source port, destination port, and protocol type to obtain an initial network flow.

[0012] If the arrival interval between two adjacent raw data packets in the initial network flow exceeds a preset time interval threshold, the two adjacent raw data packets are deemed not to belong to the same network flow, and the initial network flow is split between them to obtain the target network flow; the preset time interval threshold is a time interval threshold determined based on preset industry standards and actual traffic patterns.

[0013] Optionally, the step of extracting multiple dimensions of target flow-level features from each of the target network flows and constructing a feature vector based on the target flow-level features includes:

[0014] The target network flow is extracted to obtain the total number of data packets, total number of bytes, duration, and upload and download traffic ratio corresponding to the target network flow. Static attribute features are constructed based on the total number of data packets, total number of bytes, duration, and upload and download traffic ratio corresponding to the target network flow.

[0015] The target network flow is extracted to obtain the arrival time interval of the original data packets in the target network flow and the variation pattern of the original data packets. Based on the arrival time interval of the original data packets in the target network flow and the variation pattern of the original data packets, a time-series behavioral feature is constructed. The variation pattern of the original data packets includes the statistical characteristics of the size sequence of the original data packets, the burst traffic index corresponding to the original data packets, and the initial transmission behavior feature corresponding to the original data packets.

[0016] The target network flow is extracted to obtain the TCP flag distribution, time-to-live (TTL) changes, and connection state transition data corresponding to the original data packets. Based on the TCP flag distribution, TTL changes, and connection state transition data corresponding to the original data packets, protocol interaction features are constructed to describe the transport protocol corresponding to the original data packets.

[0017] Based on a preset sliding time window, the number of times the source IP address of the original data packet of the target network flow initiates a complete network flow, the number of times the destination IP address receives a complete network flow, and the historical occurrence frequency of the five-tuple of the original data packet are statistically analyzed. Network flow context features are then constructed using the number of times the source IP address of the original data packet of the target network flow initiates a complete network flow, the number of times the destination IP address receives a complete network flow, and the historical occurrence frequency of the five-tuple of the original data packet.

[0018] The static attribute features, temporal behavior features, protocol interaction features, and network flow context features are arranged in a preset order to construct a feature vector for each target dimension corresponding to the target network flow.
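As an illustration of the static attribute and temporal behavior features described above, the following is a minimal sketch, not the implementation disclosed in this application. The function name `flow_features`, the per-packet `"up"`/`"down"` direction labels, the choice of mean and standard deviation as the statistics, and the guard `max(down, 1)` against a zero download volume are all assumptions made for the example. The flow is assumed to be non-empty with timestamps in ascending order.

```python
from statistics import mean, pstdev


def flow_features(timestamps, sizes, directions):
    """Build a small feature list for one non-empty flow.

    timestamps: packet arrival times in seconds, ascending
    sizes:      packet sizes in bytes
    directions: "up" or "down" for each packet (hypothetical labelling)
    """
    # Static attribute features
    total_packets = len(sizes)
    total_bytes = sum(sizes)
    duration = timestamps[-1] - timestamps[0]
    up = sum(s for s, d in zip(sizes, directions) if d == "up")
    down = total_bytes - up
    up_down_ratio = up / max(down, 1)  # assumption: avoid division by zero

    # Temporal behavior features: inter-arrival times and size statistics
    gaps = [b - a for a, b in zip(timestamps, timestamps[1:])]
    iat_mean = mean(gaps) if gaps else 0.0
    iat_std = pstdev(gaps) if gaps else 0.0
    size_mean = mean(sizes)
    size_std = pstdev(sizes)

    # Fixed (preset) order so every flow yields a comparable vector
    return [total_packets, total_bytes, duration, up_down_ratio,
            iat_mean, iat_std, size_mean, size_std]
```

Keeping the output order fixed mirrors the requirement that features be arranged in a preset order before being passed to feature screening.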

[0019] Optionally, the preliminary screening of the feature vectors based on the maximum correlation minimum redundancy algorithm to obtain the first screened features includes:

[0020] Initialize the selected features in the selected feature set, and determine candidate features based on the selected feature set and the feature vector. Use the maximum correlation minimum redundancy algorithm to determine the mutual information between each candidate feature and the preset traffic category label.

[0021] The redundancy of the candidate features and the selected features is determined. Based on the maximum value of the difference between the mutual information and the redundancy, a new selected feature is determined in the selected feature set. Then, the process jumps to the step of determining candidate features based on the selected feature set and the feature vector, until the selected features in the selected feature set reach a first target number, so as to obtain the first filtered features.
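The greedy selection loop described above can be sketched as follows. This is a hedged illustration only: it assumes discrete (or pre-binned) feature values, estimates mutual information from empirical counts, and takes the redundancy of a candidate as its mean mutual information with the already selected features, which is a common convention rather than something confirmed by the text above. The names `mutual_info` and `mrmr` are hypothetical.

```python
from collections import Counter
from math import log2


def mutual_info(xs, ys):
    """Empirical mutual information (in bits) between two discrete sequences."""
    n = len(xs)
    px, py, pxy = Counter(xs), Counter(ys), Counter(zip(xs, ys))
    # p(x,y) * log( p(x,y) / (p(x) p(y)) ), written with raw counts
    return sum((c / n) * log2(c * n / (px[x] * py[y]))
               for (x, y), c in pxy.items())


def mrmr(columns, labels, k):
    """Greedily pick k feature names maximising relevance minus redundancy.

    columns: dict mapping feature name -> list of discrete values
    labels:  list of traffic category labels
    """
    relevance = {name: mutual_info(col, labels) for name, col in columns.items()}
    selected = []
    while len(selected) < min(k, len(columns)):
        def score(name):
            if not selected:          # empty selected set: redundancy is 0
                return relevance[name]
            redundancy = sum(mutual_info(columns[name], columns[s])
                             for s in selected) / len(selected)
            return relevance[name] - redundancy

        candidates = [n for n in columns if n not in selected]
        selected.append(max(candidates, key=score))
    return selected
```

In this sketch a duplicated feature is penalised by its full mutual information with the copy that was already selected, so a weaker but independent feature can be preferred over it.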

[0022] Optionally, the step of evaluating the first filtered features using the target basis model to obtain a comprehensive importance score for each of the first filtered features, and ranking them based on the comprehensive importance scores to obtain the second filtered features, includes:

[0023] The first filtered features are respectively input into Random Forest, Extremely Random Tree, Lightweight Gradient Boosting and XGBoost to obtain the target feature importance score corresponding to each first filtered feature;

[0024] Based on the importance scores of each feature in the target feature importance scores, determine the importance ranking of each feature after the first screening, and determine the comprehensive importance score by the average value of each importance ranking of the feature after the first screening.

[0025] The first filtered features are sorted in descending order using the comprehensive importance score to obtain sorted features, and the second filtered features for the second target quantity are determined based on the sorted features.
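A minimal sketch of the rank-averaging step follows. It assumes the per-model importance scores have already been computed elsewhere and are supplied as plain dictionaries; the model keys and feature names in the usage note are hypothetical. Because the text sorts the comprehensive score in descending order, the sketch assigns rank points so that the most important feature in a model receives the highest value; that orientation is an assumption.

```python
def comprehensive_scores(importances):
    """Average per-model rank points for each feature.

    importances: dict mapping model name -> dict of feature -> importance.
    Within each model the least important feature gets 1 point and the
    most important gets n points (assumed orientation).
    """
    features = list(next(iter(importances.values())))
    totals = dict.fromkeys(features, 0.0)
    for scores in importances.values():
        ascending = sorted(features, key=lambda f: scores[f])
        for points, feature in enumerate(ascending, start=1):
            totals[feature] += points
    return {f: totals[f] / len(importances) for f in features}


def select_top(importances, top_n):
    """Sort by comprehensive score, descending, and keep the first top_n."""
    scores = comprehensive_scores(importances)
    return sorted(scores, key=scores.get, reverse=True)[:top_n]
```

For example, with `{"rf": {"x": 0.5, "y": 0.3, "z": 0.2}, "xgb": {"x": 0.4, "y": 0.1, "z": 0.5}}`, feature `x` averages 2.5 points, `z` 2.0, and `y` 1.5, so `select_top(..., 2)` keeps `x` and `z`. Averaging ranks rather than raw importances avoids comparing scores that different model families report on different scales.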

[0026] Optionally, the step of inputting the meta-probability features into a first target meta-classifier to obtain probability vectors, and inputting each probability vector into a second target meta-classifier to obtain the traffic classification result corresponding to the target network flow, includes:

[0027] The support vector machine and logistic regression in the first target meta-classifier are used to capture the linear relationship in the meta-probability features to obtain the first probability vector and the second probability vector.

[0028] The third probability vector is obtained by capturing the nonlinear relationship in the meta-probability features based on the multilayer perceptron in the first target meta-classifier;

[0029] The first probability vector, the second probability vector, and the third probability vector are concatenated to obtain a concatenated vector. Based on the concatenated vector and using a second target meta-classifier, the target traffic category corresponding to the target network flow is determined.
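The data flow through the stacked layers can be sketched as below. This is only an outline of the concatenation steps under stated assumptions: every model is represented by a hypothetical callable that returns a list of class probabilities, standing in for a fitted base model or meta-classifier. No training is shown.

```python
def concat_probabilities(models, inputs):
    """Call each model on the same inputs and concatenate the probability lists."""
    combined = []
    for predict_proba in models:
        combined.extend(predict_proba(inputs))
    return combined


def classify_flow(features, base_models, first_meta, second_meta, classes):
    """Two-stage stacking for one flow.

    base_models: callables, features -> class probabilities (four in the text)
    first_meta:  callables, meta-probability features -> class probabilities
    second_meta: callable, concatenated vector -> class probabilities
    """
    meta_features = concat_probabilities(base_models, features)
    concatenated = concat_probabilities(first_meta, meta_features)
    final_probs = second_meta(concatenated)
    best = max(range(len(classes)), key=final_probs.__getitem__)
    return classes[best]
```

With four base models and C traffic categories, the meta-probability feature has 4C entries, and the vector passed to the second target meta-classifier has 3C entries, one block per first-level meta-classifier.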

[0030] Optionally, the network traffic classification method based on multi-dimensional flow-level features further includes:

[0031] Using a target genetic algorithm, all hyperparameter combinations in the target base model, the first target meta-classifier, and the second target meta-classifier are divided into different non-dominated front layers, and the crowding distance corresponding to the hyperparameter combinations in the non-dominated front layer is determined.

[0032] Based on the crowding distance and using an elite retention strategy and a tournament selection mechanism, a target hyperparameter combination is determined, and the target base model, the first target meta-classifier, and the second target meta-classifier are updated and optimized using the target hyperparameter combination.
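Layering candidates into non-dominated fronts and computing crowding distance, as described above, is the pattern used by NSGA-II-style multi-objective genetic algorithms; the text does not name a specific algorithm, so that association is an assumption. The sketch below shows only these two building blocks. It assumes each hyperparameter combination has already been evaluated into a tuple of objectives to be minimised, for example (error rate, inference time), which is a hypothetical choice of objectives.

```python
def dominates(a, b):
    """True if a is no worse than b in every objective and better in one."""
    return (all(x <= y for x, y in zip(a, b))
            and any(x < y for x, y in zip(a, b)))


def non_dominated_fronts(objs):
    """Split candidate indices into successive non-dominated fronts."""
    remaining = set(range(len(objs)))
    fronts = []
    while remaining:
        front = sorted(
            i for i in remaining
            if not any(dominates(objs[j], objs[i]) for j in remaining if j != i)
        )
        fronts.append(front)
        remaining -= set(front)
    return fronts


def crowding_distance(objs, front):
    """Crowding distance of each index in one front (boundaries get infinity)."""
    dist = {i: 0.0 for i in front}
    for m in range(len(objs[0])):
        ordered = sorted(front, key=lambda i: objs[i][m])
        low, high = objs[ordered[0]][m], objs[ordered[-1]][m]
        dist[ordered[0]] = dist[ordered[-1]] = float("inf")
        if high == low:
            continue
        for prev, cur, nxt in zip(ordered, ordered[1:], ordered[2:]):
            dist[cur] += (objs[nxt][m] - objs[prev][m]) / (high - low)
    return dist
```

A tournament selection step would then prefer the candidate in the earlier front and, within the same front, the one with the larger crowding distance, which keeps the retained combinations spread along the trade-off curve.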

[0033] Secondly, this application provides a network traffic classification device based on multi-dimensional flow-level features, comprising:

[0034] The feature extraction module is used to capture raw data packets from the target network environment, group the raw data packets based on a 5-tuple and a preset time interval threshold, determine the target network flow using the grouped data packets, extract target flow-level features of multiple dimensions from each target network flow, and construct feature vectors based on the target flow-level features; the target flow-level features include static attribute features, temporal behavior features, protocol interaction features, and network flow context features;

[0035] The feature filtering module is used to initially filter the feature vectors based on the maximum relevance and minimum redundancy algorithm to obtain first-filtered features. The first-filtered features are then evaluated using a target base model to obtain comprehensive importance scores for each feature. These comprehensive importance scores are then used to rank the features to obtain second-filtered features. The second-filtered features are then input into the target base model to obtain probability vectors for each traffic category. These traffic category probability vectors are then concatenated to obtain meta-probability features. The target base model includes Random Forest, Extremely Random Tree, Lightweight Gradient Boosting, and XGBoost.

[0036] The traffic classification module is used to input the meta-probability features into a first target meta-classifier to obtain each probability vector, and input each probability vector into a second target meta-classifier to obtain the traffic classification result corresponding to the target network flow; the first target meta-classifier includes support vector machine, logistic regression and multilayer perceptron; the second target meta-classifier includes gradient boosting decision tree.

[0037] Thirdly, this application provides an electronic device, comprising:

[0038] Memory, used to store computer programs;

[0039] A processor is used to execute the computer program to implement the aforementioned network traffic classification method based on multi-dimensional flow-level features.

[0040] Fourthly, this application provides a computer-readable storage medium for storing a computer program, wherein the computer program, when executed by a processor, implements the aforementioned network traffic classification method based on multi-dimensional flow-level features.

[0041] This application captures raw data packets from a target network environment, groups the raw data packets based on a 5-tuple and a preset time interval threshold, determines the target network flow using the grouped data packets, extracts multiple dimensions of target flow-level features from each target network flow, and constructs feature vectors based on the target flow-level features. The target flow-level features include static attribute features, temporal behavior features, protocol interaction features, and network flow context features. The feature vectors are initially screened using a maximum relevance minimum redundancy algorithm to obtain first-screened features. The first-screened features are then evaluated using a target base model to obtain a comprehensive importance score for each of the first-screened features, and the features are ranked based on the comprehensive importance scores to obtain second-screened features. These second-screened features are then input into the target base model to obtain probability vectors for each traffic category. The probability vectors for each traffic category are concatenated to obtain meta-probability features. The target base model includes random forest, extreme random tree, lightweight gradient boosting, and XGBoost. The meta-probability features are input into a first target meta-classifier to obtain probability vectors. These probability vectors are then input into a second target meta-classifier to obtain the traffic classification result corresponding to the target network flow. The first target meta-classifier includes support vector machine, logistic regression, and multilayer perceptron. The second target meta-classifier includes gradient boosting decision tree.

[0042] As can be seen from the above, this application utilizes five-tuple information and a preset time threshold to group the original data packets and determine the target network flow, ensuring the integrity of each network flow. Then, it extracts multi-dimensional flow-level features of the target network flow, overcoming the limitations of single-dimensional traffic classification. Next, it uses the maximum correlation minimum redundancy algorithm to initially screen the feature vectors, and then uses four target base models for further feature screening, optimizing feature quality and effectively eliminating redundant features. Finally, based on the screened features and using the target base models, it obtains the output traffic category probability vectors, thus yielding the meta-probability features. In this way, the traffic classification results based on the meta-probability features and using the first and second target meta-classifiers can not only distinguish between normal and malicious traffic, but also achieve precise segmentation of different types of malicious traffic, effectively improving the accuracy of distinguishing between normal traffic and various types of malicious traffic. It can accurately identify encrypted malicious traffic in complex network environments, significantly improving the accuracy and efficiency of traffic classification.

Attached Figure Description

[0043] To more clearly illustrate the technical solutions in the embodiments of the present invention or the prior art, the drawings used in the description of the embodiments or the prior art will be briefly introduced below. Obviously, the drawings described below are only embodiments of the present invention. For those skilled in the art, other drawings can be obtained based on the provided drawings without creative effort.

[0044] Figure 1 is a flowchart of a network traffic classification method based on multi-dimensional flow-level features disclosed in this application;

[0045] Figure 2 is a schematic diagram of target flow-level features provided in this application;

[0046] Figure 3 is a flowchart of a specific network traffic classification method based on multi-dimensional flow-level features disclosed in this application;

[0047] Figure 4 is a schematic diagram of output vector construction provided in this application;

[0048] Figure 5 is a schematic diagram of meta-probability feature construction provided in this application;

[0049] Figure 6 is a schematic diagram of traffic classification provided in this application;

[0050] Figure 7 is a schematic diagram of a network traffic classification device based on multi-dimensional flow-level features disclosed in this application;

[0051] Figure 8 is a structural diagram of an electronic device disclosed in this application.

Detailed Implementation

[0052] The technical solutions of the embodiments of the present invention will be clearly and completely described below with reference to the accompanying drawings. Obviously, the described embodiments are only some embodiments of the present invention, and not all embodiments. Based on the embodiments of the present invention, all other embodiments obtained by those skilled in the art without creative effort are within the scope of protection of the present invention.

[0053] Currently, mainstream malicious traffic detection methods cannot effectively identify new attack variants and unknown threats. Rule bases require continuous manual maintenance and updates, lacking adaptability to highly dynamic attack methods. Manual feature design is highly subjective and consumes significant manpower and time, making it difficult to guarantee the optimality of feature sets. Manually constructed features often contain redundancy and noise, increasing model complexity and diluting the discriminative power of effective features, leading to decreased model training efficiency and impaired generalization performance. Additionally, at the model architecture design level, single classifiers often fail to fully capture the inherent complex distribution and dynamic changes in network traffic data. Their limited representational capabilities and relatively fixed decision boundaries make them unable to effectively adapt to the discrimination needs of diverse attack behaviors, making it difficult to achieve both high-precision classification and high robustness in real-world network environments. To this end, this application provides a network traffic classification method based on multi-dimensional flow-level features. Based on meta-probability features and using the traffic classification results obtained by the first target meta-classifier and the second target meta-classifier, it can not only distinguish between normal traffic and malicious traffic, but also achieve accurate subdivision of different types of malicious traffic, effectively improving the accuracy of distinguishing between normal traffic and various types of malicious traffic. It can accurately identify encrypted malicious traffic in complex network environments, and significantly improve the accuracy and efficiency of traffic classification.

[0054] As shown in Figure 1, this embodiment of the invention discloses a network traffic classification method based on multi-dimensional flow-level features, including:

[0055] Step S11: Capture raw data packets from the target network environment, group the raw data packets based on the five-tuple and a preset time interval threshold, determine the target network flow using the grouped data packets, extract target flow-level features of multiple dimensions in each target network flow, and construct a feature vector based on the target flow-level features; the target flow-level features include static attribute features, temporal behavior features, protocol interaction features, and network flow context features.

[0056] In this embodiment, unprocessed raw data packets are acquired from the target network environment. These raw data packets are then grouped based on their source IP address, destination IP address, source port, destination port, and protocol type, so that data packets belonging to the same network interaction are grouped together, thus obtaining an initial network flow. The source IP address is the IP address of the sending device that sends the raw data packets; the destination IP address is the IP address of the receiving device that receives the raw data packets; the source port is the communication port number corresponding to the sending device; the destination port is the communication port number corresponding to the receiving device; and the protocol type includes transport layer protocols such as TCP (Transmission Control Protocol) and UDP (User Datagram Protocol). In one specific implementation, if the arrival interval between two adjacent raw data packets in the initial network flow exceeds 60 seconds, the two adjacent raw data packets are deemed not to belong to the same network flow. That is, the preceding raw data packet marks the end of a network flow, and the following raw data packet marks the beginning of a new network flow, thereby obtaining the target network flow. The preset time interval threshold needs to meet preset industry standard conditions, such as RFC (Request For Comments) related recommendations and the session timeout conventions of NetFlow (a network monitoring function) and sFlow (a network monitoring technology); the time interval threshold is also obtained by analyzing the behavior patterns of historical network flows.

[0057] Specifically, the step of capturing raw data packets from the target network environment, grouping the raw data packets based on a 5-tuple and a preset time interval threshold, and determining the target network flow using the grouped data packets includes: capturing unprocessed network communication data packets from the target network environment to obtain raw data packets; grouping the raw data packets based on the source IP address, destination IP address, source port, destination port, and protocol type of the raw data packets to obtain an initial network flow; if the arrival interval between two adjacent raw data packets in the initial network flow exceeds the preset time interval threshold, the two adjacent raw data packets are deemed not to belong to the same network flow, and the initial network flow is split between them to obtain the target network flow; the preset time interval threshold is a time interval threshold determined based on preset industry standards and actual traffic patterns.
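The grouping and splitting steps above can be sketched as follows. This is an illustrative outline rather than the disclosed implementation: the `Packet` record and the `split_flows` function are hypothetical names, packets are assumed to be already parsed into fields, the 5-tuple is treated as directional, and the 60-second default simply follows the specific implementation mentioned in this embodiment.

```python
from collections import defaultdict
from typing import NamedTuple


class Packet(NamedTuple):
    ts: float        # arrival time in seconds
    src_ip: str
    dst_ip: str
    src_port: int
    dst_port: int
    proto: str       # e.g. "TCP" or "UDP"
    size: int        # bytes


def split_flows(packets, timeout=60.0):
    """Group packets by 5-tuple, then cut each group at gaps above timeout."""
    # Step 1: initial network flows, one per 5-tuple, in arrival order
    initial = defaultdict(list)
    for p in sorted(packets, key=lambda p: p.ts):
        key = (p.src_ip, p.dst_ip, p.src_port, p.dst_port, p.proto)
        initial[key].append(p)

    # Step 2: a gap above the threshold ends one flow and starts the next
    flows = []
    for group in initial.values():
        current = [group[0]]
        for prev, nxt in zip(group, group[1:]):
            if nxt.ts - prev.ts > timeout:
                flows.append(current)
                current = []
            current.append(nxt)
        flows.append(current)
    return flows
```

Each returned list is one target network flow. A bidirectional variant would normalise the key so that both directions of a connection map to the same group.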

[0058] It is understandable that after obtaining the target network flow, multiple dimensions of target flow-level features are extracted from each target network flow. Figure 2 is a schematic diagram of target flow-level features provided in this embodiment. The target flow-level features include static attribute features (i.e., basic statistical features), temporal behavior features, protocol interaction features, and network flow context features (i.e., connection context features), and a 58-dimensional feature vector is constructed based on these features. Specifically, extracting multiple dimensions of target flow-level features from each target network flow and constructing a feature vector based on these features includes: extracting the target network flow to obtain the total number of data packets, total number of bytes, duration, and upload / download traffic ratio corresponding to the target network flow, and constructing static attribute features based on the total number of data packets, total number of bytes, duration, and upload / download traffic ratio corresponding to the target network flow; extracting the target network flow to obtain the arrival time interval of the original data packets in the target network flow and the variation pattern of the original data packets, and constructing temporal behavior features based on the arrival time interval of the original data packets in the target network flow and the variation pattern of the original data packets, where the variation pattern of the original data packets includes the statistical characteristics of the size sequence of the original data packets, the burst traffic index corresponding to the original data packets, and the initial transmission behavior features corresponding to the original data packets; extracting the target network flow to obtain the distribution of TCP flag bits, changes in time-to-live (TTL), and connection state transition data corresponding to the original data packets, and constructing, based on this data, protocol interaction features describing the transport protocol corresponding to the original data packets; using a preset sliding time window to count the number of times the source IP address of the original data packets of the target network flow initiates a complete network flow, the number of times the destination IP address receives a complete network flow, and the historical frequency of the five-tuple in the original data packets, and constructing network flow context features from these counts; and finally arranging the static attribute features, temporal behavior features, protocol interaction features, and network flow context features in a preset order to construct a feature vector for each target dimension of the target network flow.
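The sliding-window statistics behind the network flow context features can be sketched as below. This is a minimal illustration under stated assumptions: the class name `ContextWindow`, the 300-second default window, and the convention that flows are reported in non-decreasing order of their end time are all hypothetical choices for the example.

```python
from collections import Counter, deque


class ContextWindow:
    """Counts of recent completed flows per source IP, destination IP, 5-tuple."""

    def __init__(self, window=300.0):
        self.window = window
        self.events = deque()          # (end_time, five_tuple), oldest first
        self.src_counts = Counter()
        self.dst_counts = Counter()
        self.tuple_counts = Counter()

    def _expire(self, now):
        # Drop flows that ended more than `window` seconds before `now`
        while self.events and now - self.events[0][0] > self.window:
            _, ft = self.events.popleft()
            self.src_counts[ft[0]] -= 1
            self.dst_counts[ft[1]] -= 1
            self.tuple_counts[ft] -= 1

    def observe(self, end_time, five_tuple):
        """Return context features for this flow, then record the flow.

        five_tuple: (src_ip, dst_ip, src_port, dst_port, proto)
        """
        self._expire(end_time)
        features = (
            self.src_counts[five_tuple[0]],   # flows initiated by the source IP
            self.dst_counts[five_tuple[1]],   # flows received by the destination IP
            self.tuple_counts[five_tuple],    # earlier occurrences of this 5-tuple
        )
        self.events.append((end_time, five_tuple))
        self.src_counts[five_tuple[0]] += 1
        self.dst_counts[five_tuple[1]] += 1
        self.tuple_counts[five_tuple] += 1
        return features
```

Computing the features before recording the flow means each flow is described only by traffic that preceded it inside the window.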

[0059] Step S12: The feature vectors are initially screened based on the maximum relevance minimum redundancy algorithm to obtain the first screened features. The first screened features are evaluated using the target base model to obtain the comprehensive importance scores of each feature. The features are then sorted based on the comprehensive importance scores to obtain the second screened features. The second screened features are input into the target base model to obtain the probability vectors of each traffic category. The probability vectors of each traffic category are concatenated to obtain the meta-probability features. The target base model includes Random Forest, Extremely Random Tree, Lightweight Gradient Boosting, and XGBoost.

[0060] In this embodiment, the mutual information between the feature vector and the preset traffic category label is determined. The preset traffic category label includes normal traffic and malicious traffic. Malicious traffic includes types such as DDoS attacks, botnet traffic, phishing traffic, and malware traffic. The formula corresponding to the mutual information is as follows:

[0061] $$I(X;Y)=\sum_{x\in X}\sum_{y\in Y}p(x,y)\log\frac{p(x,y)}{p(x)\,p(y)}$$;

[0062] where $X$ is a feature of the feature vector; $Y$ is the preset traffic category label; $p(x)$ is the marginal probability that the feature takes the value $x$; $p(y)$ is the marginal probability that the traffic category label takes the value $y$; and $p(x,y)$ is the joint probability that the feature takes the value $x$ and the traffic category label takes the value $y$. In one specific implementation, the selected feature set S is initialized as an empty set, so the redundancy is 0. The feature with the largest mutual information with the preset traffic category label is chosen as the first selected feature and added to the selected feature set. Candidate features are the features in the feature vector other than the selected features in the selected feature set. The mutual information between each candidate feature and each selected feature is determined using the mutual information formula above, and the redundancy between the candidate feature and the selected features is determined from it. The formula corresponding to the redundancy is as follows:

[0063] $$R(f_j,S)=\frac{1}{|S|}\sum_{f_i\in S}I(f_i;f_j)$$;

[0064] where $f_i$ is the i-th selected feature in the selected feature set; $f_j$ is the candidate feature to be evaluated; $S$ is the selected feature set; $I(f_i;f_j)$ is the mutual information between the selected feature and the candidate feature; and $|S|$ is the number of features in the selected feature set. A new selected feature of the selected feature set is determined based on the difference between the mutual information and the redundancy, and the corresponding formula is as follows:

[0065] $$f^{*}=\arg\max_{f_j\in F\setminus S}\left[I(f_j;Y)-\frac{1}{|S|}\sum_{f_i\in S}I(f_i;f_j)\right]$$;

[0066] where $F$ is the feature vector; $Y$ is the preset traffic category label; $f_i$ is the i-th selected feature in the selected feature set; $f_j$ is the candidate feature to be evaluated; $S$ is the selected feature set; and $I(f_i;f_j)$ is the mutual information between the selected feature and the candidate feature. Specifically, the candidate feature with the maximum difference between the mutual information and the redundancy is selected as the new selected feature of the selected feature set, and the process returns to the step of determining candidate features based on the selected feature set and the feature vector, until the number of selected features in the selected feature set reaches 35, thus obtaining the first filtered features.

[0067] Specifically, the preliminary screening of the feature vectors based on the maximum correlation minimum redundancy algorithm to obtain the first filtered features includes: initializing the selected features in the selected feature set, determining candidate features based on the selected feature set and the feature vectors, determining the mutual information between each candidate feature and a preset traffic category label using the maximum correlation minimum redundancy algorithm; determining the redundancy of the candidate features and the selected features, determining a new selected feature in the selected feature set based on the maximum difference between the mutual information and the redundancy, and then proceeding to the step of determining candidate features based on the selected feature set and the feature vectors, until the number of selected features in the selected feature set reaches a first target number, thereby obtaining the first filtered features. It is worth noting that the first target number can be adjusted according to actual circumstances and is not specifically limited here.
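The greedy selection loop described above can be sketched as follows. This is a minimal illustration that assumes every feature has already been discretized into a small number of values (the source does not state how continuous flow features are binned), and the function and column names are hypothetical.

```python
from collections import Counter
from math import log


def mutual_information(xs, ys):
    """I(X;Y) in nats for two equal-length sequences of discrete values."""
    n = len(xs)
    px, py, pxy = Counter(xs), Counter(ys), Counter(zip(xs, ys))
    return sum(
        (c / n) * log((c / n) / ((px[x] / n) * (py[y] / n)))
        for (x, y), c in pxy.items()
    )


def mrmr_select(columns, labels, k):
    """Greedy mRMR; columns maps a feature name to its list of discrete values."""
    relevance = {name: mutual_information(col, labels) for name, col in columns.items()}
    selected = []
    while len(selected) < min(k, len(columns)):
        best_name, best_score = None, float("-inf")
        for name, col in columns.items():
            if name in selected:
                continue
            # Redundancy is 0 while the selected set is still empty.
            redundancy = (
                sum(mutual_information(columns[s], col) for s in selected) / len(selected)
                if selected
                else 0.0
            )
            score = relevance[name] - redundancy
            if score > best_score:  # ties keep the earlier feature
                best_name, best_score = name, score
        selected.append(best_name)
    return selected


# Toy data: the label is a + b, "a_copy" duplicates "a", "noise" is uninformative.
columns = {
    "a": [0, 0, 0, 0, 1, 1, 1, 1],
    "a_copy": [0, 0, 0, 0, 1, 1, 1, 1],
    "b": [0, 0, 1, 1, 0, 0, 1, 1],
    "noise": [0, 1, 0, 1, 0, 1, 0, 1],
}
labels = [0, 0, 1, 1, 1, 1, 2, 2]
selected = mrmr_select(columns, labels, k=2)
```

On this toy data the duplicate column is as relevant as the original but is penalized by its redundancy, so the second pick is the complementary feature rather than the copy.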

[0068] Understandably, after obtaining the first filtered features, these features are input into Random Forest, Extremely Random Tree, Lightweight Gradient Boosting, and XGBoost respectively to obtain the feature importance score corresponding to each first filtered feature; that is, a first filtered feature has four feature importance scores, and the comprehensive importance score is determined based on the average of the feature importance scores; the corresponding formula is as follows:

[0069] $$\bar{R}_j=\frac{1}{4}\sum_{i=1}^{4}r_{i,j}$$;

[0070] where $\bar{R}_j$ is the comprehensive importance score of the j-th first filtered feature; $j$ indexes the first filtered features; $i$ indexes the base models; and $r_{i,j}$ is the importance ranking position assigned by the i-th base model to the j-th first filtered feature. The 25 first filtered features with the highest comprehensive importance scores are selected as the second filtered features; if several first filtered features have the same comprehensive importance score, the one with the smaller standard deviation is preferentially selected as a second filtered feature.

[0071] Specifically, the step of evaluating the first filtered features using a target base model to obtain a comprehensive importance score for each of the first filtered features, and then ranking them based on the comprehensive importance scores to obtain the second filtered features, includes: inputting the first filtered features into a random forest, an extreme random tree, a lightweight gradient boosting, and an XGBoost to obtain a target feature importance score for each of the first filtered features; determining the importance ranking of each of the first filtered features based on the feature importance scores, and determining the comprehensive importance score by averaging the importance rankings; sorting the first filtered features in descending order of comprehensive importance score to obtain the ranked features, and determining a second target number of second filtered features from the ranked features. It is worth noting that the second target number can be adjusted according to actual circumstances and is not specifically limited here.
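The rank averaging and tie-breaking described above can be sketched as below. The sketch assumes a ranking convention in which the most important feature receives the largest rank value, so that a higher average means a more important feature; the source does not state the convention, and the per-model importance values here are made up for illustration.

```python
from statistics import mean, pstdev


def rank_positions(importances):
    """Map feature -> rank value; least important gets 1, most important gets N
    (assumed convention, so that a higher value means more important)."""
    ordered = sorted(importances, key=importances.get)
    return {name: pos for pos, name in enumerate(ordered, start=1)}


def select_by_average_rank(per_model_importances, k):
    """Average each feature's rank over the models and keep the top k."""
    ranks = [rank_positions(imp) for imp in per_model_importances]
    features = list(per_model_importances[0])
    score = {f: mean([r[f] for r in ranks]) for f in features}
    spread = {f: pstdev([r[f] for r in ranks]) for f in features}
    # Highest average rank first; ties go to the smaller standard deviation.
    ordered = sorted(features, key=lambda f: (-score[f], spread[f]))
    return ordered[:k]


# Hypothetical importance scores from four base models for five features.
model_a = {"A": 0.30, "B": 0.20, "C": 0.40, "D": 0.07, "E": 0.03}
model_b = {"A": 0.40, "B": 0.20, "C": 0.05, "D": 0.25, "E": 0.10}
top3 = select_by_average_rank([model_a, model_b, model_a, model_b], k=3)
```

Here features B, C, and D share the same average rank, and the tie is resolved in favor of the features on which the four models agree most.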

[0072] Furthermore, the second filtered features are input into each of the four target base models, and each base model outputs a C-dimensional traffic category probability vector (4×C dimensions in total), where C is the total number of traffic category labels; the corresponding formula is as follows:

[0073] $$P_i=\left[p_{i,1},p_{i,2},\ldots,p_{i,C}\right]^{T}$$;

[0074] where $P_i$ is the traffic category probability vector output by the i-th base model; $C$ is the total number of traffic category labels; $T$ denotes the transpose; and $i$ indexes the base models. After the traffic category probabilities are obtained, the probability vectors of the traffic categories are concatenated to obtain the set of meta-probability features, as shown in the following formula:

[0075] $$Z=P_1\oplus P_2\oplus P_3\oplus P_4=\left[p_{1,1},\ldots,p_{1,C},\;\ldots,\;p_{4,1},\ldots,p_{4,C}\right]^{T}$$;

[0076] where $Z$ is the set of meta-probability features; $P_i$ is the traffic category probability vector output by the i-th base model; $\oplus$ denotes vector concatenation; $p_{i,j}$ is the probability, predicted by the i-th base model, that the second filtered features belong to the j-th traffic category; and $T$ denotes the transpose.
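To make the dimensions concrete, the following minimal sketch concatenates four C-dimensional probability vectors into one 4×C-dimensional meta-probability feature vector. The probability values and the class order are made-up examples, not outputs of trained models.

```python
def build_meta_features(base_model_probs):
    """Concatenate the C-dimensional probability vectors of the base models,
    in a fixed model order, into one flat meta-probability feature vector."""
    n_classes = len(base_model_probs[0])
    meta = []
    for probs in base_model_probs:
        if len(probs) != n_classes:
            raise ValueError("every base model must output C probabilities")
        meta.extend(probs)
    return meta


# Hypothetical outputs for C = 3 classes (e.g. normal, DDoS, botnet), listed in
# the order random forest, extreme random tree, lightweight gradient boosting,
# XGBoost.
probs = [
    [0.7, 0.2, 0.1],
    [0.6, 0.3, 0.1],
    [0.8, 0.1, 0.1],
    [0.5, 0.4, 0.1],
]
meta_features = build_meta_features(probs)
```

Keeping the model order fixed matters because the downstream meta-classifiers learn a meaning for each position of the concatenated vector.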

[0077] Step S13: Input the meta-probability features into the first target meta-classifier to obtain each probability vector, and input each probability vector into the second target meta-classifier to obtain the traffic classification result corresponding to the target network flow; the first target meta-classifier includes support vector machine, logistic regression and multilayer perceptron; the second target meta-classifier includes gradient boosting decision tree.

[0078] In this embodiment, the meta-probability features are input to the first target meta-classifier, and each of its three classifiers outputs a C-dimensional probability vector. The probability vectors of the three classifiers in the first target meta-classifier are concatenated sequentially to obtain a 3×C-dimensional concatenated vector. Based on the concatenated vector and a second target meta-classifier, the target traffic category corresponding to the target network flow is determined. Specifically, inputting the meta-probability features to the first target meta-classifier to obtain probability vectors, and inputting each probability vector to the second target meta-classifier to obtain the traffic classification result corresponding to the target network flow, includes: using the support vector machine and the logistic regression in the first target meta-classifier to capture the linear relationships in the meta-probability features to obtain a first probability vector and a second probability vector; capturing the non-linear relationships in the meta-probability features with the multilayer perceptron in the first target meta-classifier to obtain a third probability vector; concatenating the first probability vector, the second probability vector, and the third probability vector to obtain a concatenated vector; and determining the target traffic category corresponding to the target network flow based on the concatenated vector and the second target meta-classifier.
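The data flow through the two meta-classifier levels can be sketched as follows. The three first-level functions and the final voting function are toy stand-ins chosen only so that the example runs without a machine learning library; they are not trained support vector machine, logistic regression, multilayer perceptron, or gradient boosting decision tree models.

```python
C = 2  # number of traffic categories in this toy example


def stack_predict(meta_features, first_level, second_level):
    """first_level: callables mapping the meta-probability features to a
    C-dimensional probability vector; second_level: callable mapping the
    concatenated vector to a class index."""
    concatenated = []
    for predict_proba in first_level:
        concatenated.extend(predict_proba(meta_features))
    return second_level(concatenated), concatenated


def average_per_class(meta):  # stand-in for the support vector machine
    chunks = [meta[i:i + C] for i in range(0, len(meta), C)]
    return [sum(col) / len(chunks) for col in zip(*chunks)]


def first_model_only(meta):  # stand-in for logistic regression
    return meta[:C]


def last_model_only(meta):  # stand-in for the multilayer perceptron
    return meta[-C:]


def vote(concatenated):  # stand-in for the gradient boosting decision tree
    totals = [sum(concatenated[j::C]) for j in range(C)]
    return max(range(C), key=totals.__getitem__)


meta = [0.9, 0.1, 0.8, 0.2, 0.6, 0.4, 0.7, 0.3]  # 4 base models x C classes
label, concat = stack_predict(
    meta, [average_per_class, first_model_only, last_model_only], vote
)
```

With real models, each stand-in would be replaced by the corresponding trained classifier's probability output, and the final step by the trained second-level model's prediction.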

[0079] Understandably, the non-dominated sorting genetic algorithm is used to perform collaborative global optimization of all hyperparameters in the target base model, the first target meta-classifier, and the second target meta-classifier, aiming to simultaneously maximize classification accuracy and minimize the variance of the macro F1 score (averaged over the F1 scores of each class). The corresponding formula is as follows:

[0080] $$\max_{\theta}\ \mathrm{Acc}(\theta),\qquad \min_{\theta}\ \mathrm{Var}_{F1}(\theta)$$;

[0081] where $\theta$ is the set of all hyperparameters to be optimized; $\mathrm{Acc}(\theta)$ is the classification accuracy obtained with the hyperparameters $\theta$; and $\mathrm{Var}_{F1}(\theta)$ is the variance of the macro F1 score obtained with the hyperparameters $\theta$. A balance between classification accuracy and macro F1 score is achieved using the Pareto optimal solution set of the non-dominated sorting genetic algorithm. The formula corresponding to the Pareto optimal solution set is as follows:

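[0082] $$\mathcal{P}^{*}=\left\{\theta\in\Theta\ \middle|\ \nexists\,\theta'\in\Theta\setminus\{\theta\}:\ \mathrm{Acc}(\theta')\ge\mathrm{Acc}(\theta)\ \wedge\ \mathrm{Var}_{F1}(\theta')\le\mathrm{Var}_{F1}(\theta)\right\}$$;

[0083] where $\mathcal{P}^{*}$ is the Pareto optimal solution set; $\Theta$ is the search space of the hyperparameters, i.e., the range of values that the hyperparameters can take; $\theta'$ is any hyperparameter combination in the search space other than $\theta$; $\mathrm{Acc}(\cdot)$ is the classification accuracy; and $\mathrm{Var}_{F1}(\cdot)$ is the variance of the macro F1 score. If there exists a $\theta'$ whose classification accuracy is no less than that of $\theta$ and whose macro F1 score variance is not higher than that of $\theta$, then $\theta$ does not belong to the Pareto optimal solution set; conversely, if no such $\theta'$ exists, then $\theta$ belongs to the Pareto optimal solution set. To ensure a uniform distribution of hyperparameter combinations within the Pareto optimal solution set, the crowding distance of each hyperparameter combination is determined. The formula for the crowding distance is as follows: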

[0084] $$d(\theta)=\sum_{m=1}^{M}\frac{f_m^{+}(\theta)-f_m^{-}(\theta)}{f_m^{\max}-f_m^{\min}}$$;

[0085] where $d(\theta)$ is the crowding distance of the hyperparameter combination $\theta$; $M$ is the number of objective functions, here 2, the two optimization objectives being classification accuracy and macro F1 score; $f_m^{+}(\theta)$ and $f_m^{-}(\theta)$ are the values of the m-th objective function for the two hyperparameter combinations adjacent to the current hyperparameter combination $\theta$; and $f_m^{\max}$ and $f_m^{\min}$ are the maximum and minimum values of the m-th objective function among all hyperparameter combinations. After the crowding distance is obtained, a set of jointly optimal target hyperparameter combinations is determined through an elite retention strategy and a tournament selection mechanism, and these combinations are used to update and optimize the target base model, the first target meta-classifier, and the second target meta-classifier.
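The Pareto filtering and crowding distance described above can be sketched for the two objectives as follows. Two details are assumptions added for the sketch and are not stated in the source: dominance requires a strict improvement in at least one objective (so that identical points do not eliminate each other), and boundary points receive an infinite distance, as is conventional in NSGA-II. The minimum and maximum are taken over the points passed in, and the objective values are made up.

```python
def dominates(a, b):
    """a, b: (accuracy, f1_variance). a dominates b if it is no worse in both
    objectives and strictly better in at least one (assumed strictness)."""
    return a[0] >= b[0] and a[1] <= b[1] and (a[0] > b[0] or a[1] < b[1])


def pareto_front(points):
    """Keep the points that no other point dominates."""
    return [p for p in points if not any(dominates(q, p) for q in points)]


def crowding_distances(front):
    """Crowding distance of every point in a front with two objectives."""
    n = len(front)
    if n == 0:
        return []
    dist = [0.0] * n
    for m in range(2):
        order = sorted(range(n), key=lambda i: front[i][m])
        lo, hi = front[order[0]][m], front[order[-1]][m]
        # Boundary solutions get infinite distance so they are always kept.
        dist[order[0]] = dist[order[-1]] = float("inf")
        if hi == lo:
            continue
        for k in range(1, n - 1):
            gap = front[order[k + 1]][m] - front[order[k - 1]][m]
            dist[order[k]] += gap / (hi - lo)
    return dist


# Hypothetical (accuracy, macro-F1 variance) pairs for five configurations.
points = [(0.90, 0.020), (0.92, 0.030), (0.95, 0.050), (0.91, 0.040), (0.88, 0.025)]
front = pareto_front(points)
distances = crowding_distances(front)
```

A larger crowding distance marks a configuration in a sparsely populated part of the front, which is why it is preferred when two candidates are otherwise equally ranked.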

[0086] Specifically, the network traffic classification method based on multi-dimensional flow-level features further includes: using a target genetic algorithm to divide all hyperparameter combinations of the target base model, the first target meta-classifier, and the second target meta-classifier into different non-dominated fronts, and determining the crowding distance corresponding to the hyperparameter combinations in each non-dominated front; and determining target hyperparameter combinations based on the crowding distance, using an elite retention strategy and a tournament selection mechanism, so as to update and optimize the target base model, the first target meta-classifier, and the second target meta-classifier using the target hyperparameter combinations.
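A minimal sketch of the layering into non-dominated fronts and of a binary tournament follows. It uses a simple repeated-filtering sort rather than the faster bookkeeping of NSGA-II, the comparison rule (lower front first, then larger crowding distance) is the conventional crowded-comparison operator and is assumed here, and the elite retention step is not shown.

```python
import random


def dominates(a, b):
    """a, b: (accuracy, f1_variance); higher accuracy and lower variance win."""
    return a[0] >= b[0] and a[1] <= b[1] and (a[0] > b[0] or a[1] < b[1])


def non_dominated_fronts(points):
    """Split the indices of points into successive non-dominated fronts."""
    remaining = set(range(len(points)))
    fronts = []
    while remaining:
        front = sorted(
            i for i in remaining
            if not any(dominates(points[j], points[i]) for j in remaining)
        )
        fronts.append(front)
        remaining -= set(front)
    return fronts


def crowded_tournament(rank, crowding, rng):
    """Binary tournament: prefer the lower front rank, then the larger
    crowding distance."""
    i, j = rng.sample(range(len(rank)), 2)
    if rank[i] != rank[j]:
        return i if rank[i] < rank[j] else j
    return i if crowding[i] >= crowding[j] else j


# Hypothetical (accuracy, macro-F1 variance) pairs for five configurations.
points = [(0.90, 0.020), (0.92, 0.030), (0.95, 0.050), (0.91, 0.040), (0.88, 0.025)]
fronts = non_dominated_fronts(points)
rank = [0] * len(points)
for level, members in enumerate(fronts):
    for idx in members:
        rank[idx] = level
```

The winner of each tournament would then serve as a parent for crossover and mutation in the next generation of hyperparameter combinations.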

[0087] As can be seen from the above, this application uses five-tuple information and a preset time threshold to group the original data packets and determine the target network flows, ensuring the integrity of each network flow. It then extracts multi-dimensional flow-level features of the target network flow, overcoming the limitations of single-dimensional traffic classification. Next, it uses the maximum correlation minimum redundancy algorithm to initially screen the feature vectors, and then uses the four target base models for further feature screening, optimizing feature quality and effectively eliminating redundant features. Finally, based on the screened features and using the target base models, it obtains the traffic category probability vectors, thus yielding the meta-probability features. In this way, the traffic classification results obtained from the meta-probability features using the first and second target meta-classifiers can not only distinguish between normal and malicious traffic, but also separate the different types of malicious traffic from one another, effectively improving the accuracy of distinguishing between normal traffic and various types of malicious traffic. Encrypted malicious traffic in complex network environments can be identified accurately, significantly improving the accuracy and efficiency of traffic classification.

[0088] As can be seen from the above embodiments, this application determines the traffic classification result based on meta-probability features using a first target meta-classifier and a second target meta-classifier. This process is therefore described in detail below.

[0089] As shown in Figure 3, this embodiment of the invention discloses a specific network traffic classification method based on multi-dimensional flow-level features, including:

[0090] Figure 4 is a schematic diagram of output vector construction provided in this embodiment. Unprocessed raw data packets are acquired from the target network environment. These raw data packets are grouped based on their source IP address, destination IP address, source port, destination port, and protocol type, so that data packets from the same network interaction are grouped together to obtain an initial network flow. If the arrival interval between two adjacent raw data packets in the initial network flow exceeds a preset time interval threshold, the two adjacent raw data packets are considered not to belong to the same initial network flow, and the adjacent raw data packets are grouped accordingly to obtain the target network flow. Multiple dimensions of target flow-level features are extracted from each target network flow. These features include static attribute features, temporal behavior features, protocol interaction features, and network flow context features. A feature vector is constructed based on these target flow-level features.
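The grouping step described above can be sketched as follows. The packet representation (a dictionary with hypothetical keys), the 60-second threshold, and the treatment of each direction as its own five-tuple are assumptions for illustration; the source leaves the threshold to industry standards and observed traffic patterns.

```python
from collections import defaultdict


def group_into_flows(packets, gap_threshold):
    """Group packets by five-tuple, then split each group wherever the gap
    between two adjacent packets exceeds gap_threshold (seconds).

    packets: iterable of dicts with the hypothetical keys src_ip, dst_ip,
    src_port, dst_port, proto and ts (arrival time in seconds).
    """
    by_tuple = defaultdict(list)
    for pkt in sorted(packets, key=lambda p: p["ts"]):
        key = (pkt["src_ip"], pkt["dst_ip"], pkt["src_port"], pkt["dst_port"], pkt["proto"])
        by_tuple[key].append(pkt)

    flows = []
    for pkts in by_tuple.values():
        current = [pkts[0]]
        for prev, pkt in zip(pkts, pkts[1:]):
            if pkt["ts"] - prev["ts"] > gap_threshold:
                # The gap is too long: close this flow and start a new one.
                flows.append(current)
                current = []
            current.append(pkt)
        flows.append(current)
    return flows


def make_packet(ts, dst_port):
    """Build a toy packet; only the timestamp and destination port vary."""
    return {"src_ip": "10.0.0.1", "dst_ip": "10.0.0.9", "src_port": 5000,
            "dst_port": dst_port, "proto": "TCP", "ts": ts}


packets = [make_packet(0.0, 443), make_packet(1.0, 443),
           make_packet(2.0, 80), make_packet(100.0, 443)]
flows = group_into_flows(packets, gap_threshold=60.0)
```

In this toy input the packet arriving at 100 seconds shares a five-tuple with the first two packets but is separated from them by more than the threshold, so it starts a new flow.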

[0091] Figure 5 is a schematic diagram of meta-probability feature construction provided in this embodiment. After obtaining the feature vector, the feature vector is initially screened using the Minimum Redundancy Maximum Relevance (MRMR) algorithm to obtain 35-dimensional first-stage filtered features. The first-stage filtered features are then evaluated using the target base model to obtain a comprehensive importance score for each feature. The comprehensive importance score is determined based on the mean of the feature importance scores. The first-stage filtered features with the highest comprehensive importance scores are selected as the second-stage filtered features. If multiple first-stage filtered features have the same comprehensive importance score, the standard deviation of each such feature's rankings across the four base models is determined, and the first-stage filtered feature with the smaller standard deviation is preferentially selected as a second-stage filtered feature. The second-stage filtered features are input into the target base model to obtain the output probability vectors for each traffic category. The target base model includes Random Forest, Extremely Random Tree, Lightweight Gradient Boosting, and XGBoost. The probability vectors of each traffic category are concatenated to obtain a meta-probability feature set including each meta-probability feature.

[0092] Furthermore, Figure 6 is a schematic diagram of traffic classification provided in this embodiment. After obtaining the meta-probability feature set, the meta-probability features are input into a support vector machine, logistic regression, and a multilayer perceptron to obtain probability vectors. The probability vectors are then concatenated in sequence to obtain a concatenated vector. Based on the concatenated vector and using a gradient boosting decision tree (GBDT), the target traffic category corresponding to the target network flow is determined. The target traffic category includes normal traffic and various types of malicious traffic.

[0093] As shown above, this application first groups the original data packets to obtain the target network flows, ensuring the integrity of each network flow. It then extracts features from the target network flow to obtain multi-dimensional target flow-level features and constructs feature vectors. The feature vectors are initially screened using the maximum correlation minimum redundancy algorithm, and then further screened using the four target base models to effectively eliminate redundant features. Based on the screened features, the traffic category probability vectors are obtained using the target base models, thus yielding the meta-probability features. In this way, the final traffic classification result is determined from the meta-probability features using two target meta-classifiers, enabling normal and malicious traffic to be identified and distinguished. It also allows the different types of malicious traffic to be separated from one another, so that encrypted traffic attacks in complex network environments are identified accurately and network security is safeguarded.

[0094] Accordingly, as shown in Figure 7, this application also provides a network traffic classification device based on multi-dimensional flow-level features, comprising:

[0095] The feature extraction module 11 is used to capture raw data packets from the target network environment, group the raw data packets based on a 5-tuple and a preset time interval threshold, determine the target network flow using the grouped data packets, extract multiple dimensions of target flow-level features from each target network flow, and construct a feature vector based on the target flow-level features; the target flow-level features include static attribute features, temporal behavior features, protocol interaction features, and network flow context features;

[0096] The feature filtering module 12 is used to initially filter the feature vectors based on the maximum relevance and minimum redundancy algorithm to obtain first-filtered features. The first-filtered features are then evaluated using a target base model to obtain comprehensive importance scores for each feature. These comprehensive importance scores are then used to rank the features to obtain second-filtered features. The second-filtered features are then input into the target base model to obtain probability vectors for each traffic category. These traffic category probability vectors are then concatenated to obtain meta-probability features. The target base model includes Random Forest, Extremely Random Tree, Lightweight Gradient Boosting, and XGBoost.

[0097] The traffic classification module 13 is used to input the meta-probability features into a first target meta-classifier to obtain each probability vector, and input each probability vector into a second target meta-classifier to obtain the traffic classification result corresponding to the target network flow; the first target meta-classifier includes support vector machine, logistic regression and multilayer perceptron; the second target meta-classifier includes gradient boosting decision tree.

[0098] In some specific embodiments, the feature extraction module 11 may specifically include:

[0099] The data packet grouping unit is used to capture unprocessed network communication data packets from the target network environment to obtain raw data packets, and to group the raw data packets based on the source IP address, destination IP address, source port, destination port and protocol type to obtain an initial network flow;

[0100] The target network flow determination unit is used to determine, if the arrival interval between two adjacent raw data packets in the initial network flow exceeds a preset time interval threshold, that the two adjacent raw data packets do not belong to the same initial network flow, and to group the adjacent raw data packets accordingly to obtain the target network flow; the preset time interval threshold is a time interval threshold determined based on preset industry standards and actual traffic patterns.

[0101] In some specific embodiments, the feature extraction module 11 may specifically include:

[0102] The first feature construction unit is used to extract the target network flow to obtain the total number of data packets, total number of bytes, duration, and upload and download traffic ratio corresponding to the target network flow, and to construct static attribute features based on the total number of data packets, total number of bytes, duration, and upload and download traffic ratio corresponding to the target network flow.

[0103] The second feature construction unit is used to extract the target network flow to obtain the arrival time interval of the original data packets in the target network flow and the variation pattern of the original data packets, and to construct time-series behavioral features based on the arrival time interval of the original data packets in the target network flow and the variation pattern of the original data packets; the variation pattern of the original data packets includes the statistical characteristics of the size sequence of the original data packets, the burst traffic index corresponding to the original data packets, and the initial transmission behavior features corresponding to the original data packets;

[0104] The third feature construction unit is used to extract the target network flow to obtain the TCP flag distribution, time-to-live change and connection state transition data corresponding to the original data packet, and to construct protocol interaction features to describe the transmission protocol corresponding to the original data packet based on the TCP flag distribution, time-to-live change and connection state transition data corresponding to the original data packet.

[0105] The fourth feature construction unit is used to count the number of times the source IP address of the original data packet of the target network flow initiates a complete network flow, the number of times the destination IP address receives a complete network flow, and the historical occurrence frequency of the five-tuple of the original data packet based on a preset sliding time window, and to construct network flow context features using the number of times the source IP address of the original data packet of the target network flow initiates a complete network flow, the number of times the destination IP address receives a complete network flow, and the historical occurrence frequency of the five-tuple of the original data packet;

[0106] The feature vector construction unit is used to arrange the static attribute features, the temporal behavior features, the protocol interaction features and the network flow context features in a preset order to construct the feature vector of the target dimension corresponding to each target network flow.
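The sliding-window statistics handled by the fourth feature construction unit above can be sketched as follows. The class name, the 10-second window, and the assumption that flows are processed in order of completion time are all illustrative choices not specified in the source.

```python
from collections import Counter, deque


class ContextWindow:
    """Counts completed flows seen within the last `window` seconds.
    Assumes flows are fed in non-decreasing order of end time."""

    def __init__(self, window):
        self.window = window
        self.events = deque()  # (end_ts, src_ip, dst_ip, five_tuple)
        self.by_src = Counter()
        self.by_dst = Counter()
        self.by_tuple = Counter()

    def _expire(self, now):
        # Drop flows that have fallen out of the sliding window.
        while self.events and now - self.events[0][0] > self.window:
            _, src, dst, key = self.events.popleft()
            self.by_src[src] -= 1
            self.by_dst[dst] -= 1
            self.by_tuple[key] -= 1

    def features_then_add(self, end_ts, five_tuple):
        """Return (flows from this source, flows to this destination, prior
        occurrences of this five-tuple) in the window, then record the flow."""
        src, dst = five_tuple[0], five_tuple[1]
        self._expire(end_ts)
        feats = (self.by_src[src], self.by_dst[dst], self.by_tuple[five_tuple])
        self.events.append((end_ts, src, dst, five_tuple))
        self.by_src[src] += 1
        self.by_dst[dst] += 1
        self.by_tuple[five_tuple] += 1
        return feats


t1 = ("10.0.0.1", "10.0.0.9", 5000, 443, "TCP")
t2 = ("10.0.0.1", "10.0.0.8", 5001, 80, "TCP")
ctx = ContextWindow(window=10.0)
history = [ctx.features_then_add(ts, key)
           for ts, key in [(0.0, t1), (5.0, t1), (8.0, t2), (12.0, t1)]]
```

By the time the last flow is scored, the flow that ended at 0 seconds has left the window, so only the two more recent flows from the same source are counted.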

[0107] In some specific embodiments, the feature filtering module 12 may specifically include:

[0108] The mutual information determination unit is used to initialize the selected features in the selected feature set, determine candidate features based on the selected feature set and the feature vector, and determine the mutual information between each candidate feature and the preset traffic category label using the maximum correlation minimum redundancy algorithm.

[0109] The first feature determination unit is used to determine the redundancy of the candidate features and the selected features, determine a new selected feature in the selected feature set based on the maximum value of the difference between the mutual information and the redundancy, and jump to the step of determining candidate features based on the selected feature set and the feature vector, until the selected features in the selected feature set reach a first target number, so as to obtain the first filtered features.

[0110] In some specific embodiments, the feature filtering module 12 may specifically include:

[0111] The target score determination unit is used to input the first filtered features into random forest, extreme random tree, lightweight gradient boosting and XGBoost respectively to obtain the target feature importance score corresponding to each first filtered feature;

[0112] The comprehensive score determination unit is used to determine the importance ranking of each feature after the first screening based on the importance scores of each feature in the target feature importance score, and to determine the comprehensive importance score by the average value of each importance ranking of the first screening feature.

[0113] The feature sorting unit is used to sort the first filtered features in descending order using the comprehensive importance score to obtain sorted features, and to determine the second filtered features with a second target quantity based on the sorted features.

[0114] In some specific embodiments, the traffic classification module 13 may specifically include:

[0115] The linear relationship capture unit is used to capture the linear relationship in the meta-probability features by using the support vector machine and logistic regression in the first target meta-classifier, respectively, to obtain the first probability vector and the second probability vector.

[0116] A nonlinear relationship capture unit is used to capture the nonlinear relationship in the meta-probability features based on the multilayer perceptron in the first target meta-classifier to obtain a third probability vector;

[0117] The vector concatenation unit is used to concatenate the first probability vector, the second probability vector, and the third probability vector to obtain a concatenated vector, and to determine the target traffic category corresponding to the target network flow based on the concatenated vector and using a second target meta-classifier.

[0118] In some specific embodiments, the network traffic classification device based on multi-dimensional flow-level features may further include:

[0119] The distance determination unit is used to divide, using a target genetic algorithm, all hyperparameter combinations of the target base model, the first target meta-classifier, and the second target meta-classifier into different non-dominated fronts, and to determine the crowding distance corresponding to the hyperparameter combinations in each non-dominated front.

[0120] The hyperparameter optimization unit is used to determine a target hyperparameter combination based on the crowding distance and using an elite retention strategy and a tournament selection mechanism, so as to update and optimize the target base model, the first target meta-classifier and the second target meta-classifier using the target hyperparameter combination.

[0121] Furthermore, embodiments of this application also disclose an electronic device. Figure 8 is a structural diagram of an electronic device 20 according to an exemplary embodiment. The content of the diagram should not be construed as limiting the scope of this application. Specifically, the electronic device 20 may include: at least one processor 21, at least one memory 22, a power supply 23, a communication interface 24, an input/output interface 25, and a communication bus 26. The memory 22 stores a computer program, which is loaded and executed by the processor 21 to implement the relevant steps in the network traffic classification method based on multi-dimensional flow-level features disclosed in any of the foregoing embodiments. Furthermore, the electronic device 20 in this embodiment may specifically be a computer.

[0122] In this embodiment, the power supply 23 is used to provide operating voltage for each hardware device on the electronic device 20; the communication interface 24 can create a data transmission channel between the electronic device 20 and external devices, and the communication protocol it follows can be any communication protocol applicable to the technical solution of this application, and is not specifically limited here; the input / output interface 25 is used to acquire external input data or output data to the outside world, and its specific interface type can be selected according to specific application needs, and is not specifically limited here.

[0123] In addition, the memory 22, as a carrier for resource storage, can be a read-only memory, random access memory, disk or optical disk, etc. The resources stored thereon can include operating system 221, computer program 222, etc., and the storage method can be temporary storage or permanent storage.

[0124] The operating system 221, which may be Windows Server, NetWare, Unix, Linux, etc., is used to manage and control the various hardware devices on the electronic device 20 and the computer program 222. In addition to a computer program capable of performing the network traffic classification method based on multi-dimensional flow-level features executed by the electronic device 20 as disclosed in any of the foregoing embodiments, the computer program 222 may further include computer programs capable of performing other specific tasks.

[0125] Furthermore, this application also discloses a computer-readable storage medium for storing a computer program; wherein, when the computer program is executed by a processor, it implements the aforementioned network traffic classification method based on multi-dimensional flow-level features. Specific steps of this method can be found in the corresponding content disclosed in the foregoing embodiments, and will not be repeated here.

[0126] The various embodiments in this specification are described in a progressive manner, with each embodiment focusing on its differences from other embodiments. Similar or identical parts between embodiments can be referred to interchangeably. For the apparatus disclosed in the embodiments, since it corresponds to the method disclosed in the embodiments, the description is relatively simple; relevant parts can be referred to in the method section.

[0127] Those skilled in the art will further recognize that the units and algorithm steps of the various examples described in conjunction with the embodiments disclosed herein can be implemented in electronic hardware, computer software, or a combination of both. To clearly illustrate the interchangeability of hardware and software, the components and steps of the various examples have been generally described in terms of functionality in the foregoing description. Whether these functions are implemented in hardware or software depends on the specific application and design constraints of the technical solution. Those skilled in the art can use different methods to implement the described functions for each specific application, but such implementation should not be considered beyond the scope of this application.

[0128] The steps of the methods or algorithms described in conjunction with the embodiments disclosed herein can be implemented directly by hardware, a software module executed by a processor, or a combination of both. The software module can be located in random access memory (RAM), main memory, read-only memory (ROM), electrically programmable ROM, electrically erasable programmable ROM, registers, hard disk, removable disk, CD-ROM, or any other form of storage medium known in the art.

[0129] Finally, it should be noted that in this document, relational terms such as "first" and "second" are used only to distinguish one entity or operation from another, and do not necessarily require or imply any such actual relationship or order between these entities or operations. Furthermore, the terms "comprising," "including," or any other variations thereof are intended to cover non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements includes not only those elements but also other elements not expressly listed, or elements inherent to such a process, method, article, or apparatus. Without further limitations, an element defined by the phrase "comprising one..." does not exclude the presence of other identical elements in the process, method, article, or apparatus that includes said element.

[0130] The technical solutions provided in this application have been described in detail above. Specific examples have been used to illustrate the principles and implementations of this application, and the descriptions of the above embodiments are intended only to help in understanding the methods and core ideas of this application. Those skilled in the art may, based on the ideas of this application, make changes to the specific implementations and the scope of application. Therefore, the content of this specification should not be construed as limiting this application.

Claims

1. A network traffic classification method based on multi-dimensional flow-level features, characterized in that it comprises:
capturing raw data packets from a target network environment, grouping the raw data packets based on a five-tuple and a preset time interval threshold, determining target network flows from the grouped data packets, extracting target flow-level features of multiple dimensions from each target network flow, and constructing a feature vector based on the target flow-level features, wherein the target flow-level features include static attribute features, temporal behavior features, protocol interaction features, and network flow context features;
performing a preliminary screening of the feature vectors using the maximum relevance minimum redundancy algorithm to obtain first-screened features; evaluating the first-screened features using a target base model to obtain a comprehensive importance score for each first-screened feature; ranking the first-screened features by the comprehensive importance scores to obtain second-screened features; inputting the second-screened features into the target base model to obtain a probability vector for each traffic category; and concatenating these probability vectors to obtain meta-probability features, wherein the target base model includes Random Forest, Extremely Randomized Trees, Light Gradient Boosting Machine (LightGBM), and XGBoost;
inputting the meta-probability features into a first target meta-classifier to obtain probability vectors, and inputting the probability vectors into a second target meta-classifier to obtain the traffic classification result corresponding to the target network flow, wherein the first target meta-classifier includes a support vector machine, logistic regression, and a multilayer perceptron, and the second target meta-classifier includes a gradient boosting decision tree.

2. The network traffic classification method based on multi-dimensional flow-level features according to claim 1, characterized in that capturing raw data packets from the target network environment, grouping the raw data packets based on the five-tuple and the preset time interval threshold, and determining the target network flows from the grouped data packets comprises:
capturing unprocessed network communication data packets from the target network environment to obtain the raw data packets;
grouping the raw data packets by source IP address, destination IP address, source port, destination port, and protocol type to obtain initial network flows;
if the inter-arrival time between two adjacent raw data packets in an initial network flow exceeds the preset time interval threshold, determining that the two adjacent raw data packets do not belong to the same network flow, and splitting the initial network flow between them to obtain the target network flows, wherein the preset time interval threshold is determined based on preset industry standards and actual traffic patterns.
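As an illustrative sketch only (not part of the claims), the grouping in claim 2 could look like the following. The `Packet` type, the `group_flows` helper, and the 60-second default threshold are assumptions made for this example; the claim does not fix a threshold value, and this sketch treats the five-tuple as unidirectional.

```python
from collections import defaultdict
from typing import NamedTuple


class Packet(NamedTuple):
    """Hypothetical packet record; field names are assumptions for this sketch."""
    ts: float       # arrival time in seconds
    src_ip: str
    dst_ip: str
    src_port: int
    dst_port: int
    proto: str
    size: int       # bytes


def group_flows(packets, idle_timeout=60.0):
    """Group packets by five-tuple, then split a group wherever the gap
    between two adjacent packets exceeds idle_timeout (assumed value)."""
    by_key = defaultdict(list)
    for p in sorted(packets, key=lambda p: p.ts):
        key = (p.src_ip, p.dst_ip, p.src_port, p.dst_port, p.proto)
        by_key[key].append(p)

    flows = []
    for key, pkts in by_key.items():
        current = [pkts[0]]
        for prev, nxt in zip(pkts, pkts[1:]):
            if nxt.ts - prev.ts > idle_timeout:
                # Gap too large: close the current flow and start a new one.
                flows.append((key, current))
                current = []
            current.append(nxt)
        flows.append((key, current))
    return flows


packets = [
    Packet(0.0, "10.0.0.1", "10.0.0.2", 5000, 443, "TCP", 120),
    Packet(1.0, "10.0.0.1", "10.0.0.2", 5000, 443, "TCP", 800),
    Packet(100.0, "10.0.0.1", "10.0.0.2", 5000, 443, "TCP", 90),
    Packet(2.0, "10.0.0.3", "10.0.0.2", 6000, 53, "UDP", 60),
]
flows = group_flows(packets)
flow_sizes = sorted(len(pkts) for _, pkts in flows)
```

Here the third TCP packet arrives 99 seconds after the second, so the TCP five-tuple yields two flows and the UDP packet a third.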

3. The network traffic classification method based on multi-dimensional flow-level features according to claim 2, characterized in that extracting target flow-level features of multiple dimensions from each target network flow and constructing a feature vector based on the target flow-level features comprises:
extracting, from the target network flow, its total number of data packets, total number of bytes, duration, and upload-to-download traffic ratio, and constructing the static attribute features from these values;
extracting, from the target network flow, the inter-arrival times of its raw data packets and the variation pattern of the raw data packets, and constructing the temporal behavior features from them, wherein the variation pattern includes statistical characteristics of the size sequence of the raw data packets, a burst traffic index corresponding to the raw data packets, and initial transmission behavior features corresponding to the raw data packets;
extracting, from the target network flow, the TCP flag distribution, time-to-live (TTL) changes, and connection state transition data of its raw data packets, and constructing from them the protocol interaction features, which describe the transport protocol corresponding to the raw data packets;
counting, within a preset sliding time window, the number of complete network flows initiated by the source IP address of the raw data packets of the target network flow, the number of complete network flows received by the destination IP address, and the historical occurrence frequency of the five-tuple of the raw data packets, and constructing the network flow context features from these counts;
arranging the static attribute features, temporal behavior features, protocol interaction features, and network flow context features in a preset order to construct a feature vector of the target dimensionality for the target network flow.
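A minimal sketch of how some of the features in claim 3 might be computed for a single flow is given below, for illustration only. The dictionary keys (`ts`, `size`, `outbound`, `flags`, `ttl`), the `flow_features` helper, and the particular statistics chosen are assumptions; the network flow context features are omitted because they require state across many flows in a sliding window.

```python
from statistics import mean, pstdev


def flow_features(pkts):
    """Build a fixed-order feature list for one flow.

    pkts: time-ordered list of dicts with keys ts, size, outbound, flags, ttl
    (a hypothetical layout chosen for this sketch).
    """
    # Static attribute features.
    total_packets = len(pkts)
    total_bytes = sum(p["size"] for p in pkts)
    duration = pkts[-1]["ts"] - pkts[0]["ts"]
    up_bytes = sum(p["size"] for p in pkts if p["outbound"])
    down_bytes = total_bytes - up_bytes
    # Assumption: report 0.0 when nothing was downloaded, to avoid dividing by zero.
    up_down_ratio = up_bytes / down_bytes if down_bytes else 0.0

    # Temporal behavior features: inter-arrival times and packet-size statistics.
    gaps = [b["ts"] - a["ts"] for a, b in zip(pkts, pkts[1:])] or [0.0]
    sizes = [p["size"] for p in pkts]

    # Protocol interaction features: share of SYN packets and number of TTL changes.
    syn_ratio = sum("S" in p["flags"] for p in pkts) / total_packets
    ttl_changes = sum(a["ttl"] != b["ttl"] for a, b in zip(pkts, pkts[1:]))

    # Preset order: static, temporal, protocol.
    return [total_packets, total_bytes, duration, up_down_ratio,
            mean(gaps), pstdev(gaps), mean(sizes), pstdev(sizes),
            syn_ratio, ttl_changes]


sample = [
    {"ts": 0.0, "size": 100, "outbound": True, "flags": "S", "ttl": 64},
    {"ts": 0.5, "size": 300, "outbound": False, "flags": "SA", "ttl": 64},
    {"ts": 1.5, "size": 100, "outbound": True, "flags": "A", "ttl": 63},
]
vector = flow_features(sample)
```

Keeping the output order fixed matters because every later stage (screening, base models) addresses features by position.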

4. The network traffic classification method based on multi-dimensional flow-level features according to claim 1, characterized in that performing the preliminary screening of the feature vectors using the maximum relevance minimum redundancy algorithm to obtain the first-screened features comprises:
initializing a selected feature set, and determining candidate features based on the selected feature set and the feature vector;
using the maximum relevance minimum redundancy algorithm, determining the mutual information between each candidate feature and a preset traffic category label, and determining the redundancy between each candidate feature and the selected features;
adding to the selected feature set, as a new selected feature, the candidate feature that maximizes the difference between the mutual information and the redundancy;
returning to the step of determining candidate features based on the selected feature set and the feature vector, until the number of selected features in the selected feature set reaches a first target number, so as to obtain the first-screened features.
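The greedy loop of claim 4 can be sketched as follows, for illustration only. This version assumes discrete-valued features and defines redundancy as the mean mutual information between a candidate and the already-selected features; the claim itself does not specify how redundancy is computed, and the names `mutual_info` and `mrmr` are hypothetical.

```python
from collections import Counter
from math import log


def mutual_info(x, y):
    """Mutual information (in nats) between two discrete sequences."""
    n = len(x)
    px, py, pxy = Counter(x), Counter(y), Counter(zip(x, y))
    return sum((c / n) * log(c * n / (px[a] * py[b]))
               for (a, b), c in pxy.items())


def mrmr(columns, labels, k):
    """Greedily pick k feature names maximizing relevance minus redundancy.

    columns: dict mapping feature name -> list of discrete values.
    """
    relevance = {name: mutual_info(col, labels) for name, col in columns.items()}
    selected = []

    def score(name):
        if not selected:
            return relevance[name]
        # Assumed redundancy: mean mutual information with selected features.
        redundancy = sum(mutual_info(columns[name], columns[s])
                         for s in selected) / len(selected)
        return relevance[name] - redundancy

    while len(selected) < min(k, len(columns)):
        candidates = [name for name in columns if name not in selected]
        selected.append(max(candidates, key=score))
    return selected


# Toy data: the label depends on both a and c; b is an exact duplicate of a.
a = [0, 0, 0, 0, 1, 1, 1, 1]
c = [0, 0, 1, 1, 0, 0, 1, 1]
labels = [ai + ci for ai, ci in zip(a, c)]
chosen = mrmr({"a": a, "b": list(a), "c": c}, labels, k=2)
```

In this toy data the duplicate feature carries the same relevance as the original but maximal redundancy with it, so the second pick goes to the complementary feature rather than to the duplicate.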

5. The network traffic classification method based on multi-dimensional flow-level features according to claim 1, characterized in that evaluating the first-screened features using the target base model to obtain a comprehensive importance score for each first-screened feature, and ranking the first-screened features by the comprehensive importance scores to obtain the second-screened features, comprises:
inputting the first-screened features into Random Forest, Extremely Randomized Trees, Light Gradient Boosting Machine (LightGBM), and XGBoost respectively to obtain, from each model, a target feature importance score for each first-screened feature;
determining, from the target feature importance scores of each model, an importance ranking of each first-screened feature, and taking the average of a feature's importance rankings across the models as its comprehensive importance score;
sorting the first-screened features in descending order of comprehensive importance score to obtain sorted features, and selecting a second target number of second-screened features from the sorted features.
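The rank averaging of claim 5 can be sketched as below, for illustration only. The importance numbers are made up, and the `aggregate_importance` helper is hypothetical; in practice the scores might come from the `feature_importances_` attribute of fitted tree ensembles. One assumption is made to keep the descending sort meaningful: within each model the least important feature gets rank 1 and the most important gets rank n, so a higher mean rank means a more important feature.

```python
def aggregate_importance(scores_by_model, feature_names, top_k):
    """Average per-model importance ranks and keep the top_k features.

    scores_by_model: dict mapping model name -> list of importance scores,
    aligned with feature_names.
    """
    n = len(feature_names)
    n_models = len(scores_by_model)
    mean_rank = {name: 0.0 for name in feature_names}

    for scores in scores_by_model.values():
        # Ascending importance: rank 1 is least important, rank n is most.
        order = sorted(range(n), key=lambda i: scores[i])
        for rank, i in enumerate(order, start=1):
            mean_rank[feature_names[i]] += rank / n_models

    ranked = sorted(feature_names, key=lambda f: mean_rank[f], reverse=True)
    return ranked[:top_k], mean_rank


features = ["duration", "iat_mean", "syn_ratio"]
scores = {  # hypothetical importance scores, one list per base model
    "random_forest": [0.20, 0.50, 0.30],
    "extra_trees": [0.10, 0.60, 0.30],
    "lightgbm": [0.25, 0.45, 0.30],
    "xgboost": [0.40, 0.35, 0.25],
}
kept, mean_rank = aggregate_importance(scores, features, top_k=2)
```

Averaging ranks rather than raw scores sidesteps the fact that the four model families report importance on different scales.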

6. The network traffic classification method based on multi-dimensional flow-level features according to claim 1, characterized in that inputting the meta-probability features into the first target meta-classifier to obtain probability vectors, and inputting the probability vectors into the second target meta-classifier to obtain the traffic classification result corresponding to the target network flow, comprises:
capturing linear relationships in the meta-probability features using the support vector machine and the logistic regression in the first target meta-classifier to obtain a first probability vector and a second probability vector;
capturing nonlinear relationships in the meta-probability features using the multilayer perceptron in the first target meta-classifier to obtain a third probability vector;
concatenating the first probability vector, the second probability vector, and the third probability vector to obtain a concatenated vector, and determining, from the concatenated vector and using the second target meta-classifier, the target traffic category corresponding to the target network flow.
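The two-level meta-classification in claim 6 could be prototyped with scikit-learn roughly as follows. This is a sketch under several assumptions: synthetic data stands in for the meta-probability features, the SVM uses a linear kernel, the hyperparameters are arbitrary, and out-of-fold predictions are used to train the second level (a common stacking practice that the claim does not mention).

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_predict, train_test_split
from sklearn.neural_network import MLPClassifier
from sklearn.svm import SVC

# Synthetic stand-in for the meta-probability features (assumption).
X, y = make_classification(n_samples=300, n_features=8, n_informative=5,
                           random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=0, stratify=y)

# First target meta-classifier: SVM, logistic regression, multilayer perceptron.
first_level = [
    SVC(kernel="linear", probability=True, random_state=0),
    LogisticRegression(max_iter=1000),
    MLPClassifier(hidden_layer_sizes=(16,), max_iter=2000, random_state=0),
]

# Out-of-fold probabilities for training the second level, concatenated per sample.
train_meta = np.hstack([
    cross_val_predict(model, X_train, y_train, cv=5, method="predict_proba")
    for model in first_level
])

# Refit each first-level model on all training data before scoring new flows.
for model in first_level:
    model.fit(X_train, y_train)
test_meta = np.hstack([model.predict_proba(X_test) for model in first_level])

# Second target meta-classifier: gradient boosting decision trees.
second_level = GradientBoostingClassifier(random_state=0)
second_level.fit(train_meta, y_train)
predictions = second_level.predict(test_meta)
```

With two classes and three first-level models, each concatenated vector has six entries; with more traffic categories the width grows to three times the number of categories.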

7. The network traffic classification method based on multi-dimensional flow-level features according to claim 1, characterized in that it further comprises:
using a target genetic algorithm, dividing all hyperparameter combinations of the target base model, the first target meta-classifier, and the second target meta-classifier into different non-dominated fronts, and determining the crowding distance of the hyperparameter combinations within each non-dominated front;
determining a target hyperparameter combination based on the crowding distance, using an elite retention strategy and a tournament selection mechanism, and updating and optimizing the target base model, the first target meta-classifier, and the second target meta-classifier with the target hyperparameter combination.
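Non-dominated sorting, crowding distance, elitism, and tournament selection are the building blocks of NSGA-II-style multi-objective genetic algorithms. The sketch below illustrates the first two steps plus the usual crowded comparison used in tournaments; it is not taken from the patent. The two objectives (error rate and inference latency, both minimized) and all function names are assumptions for this example.

```python
def dominates(a, b):
    """True if a is no worse than b on every objective and strictly better
    on at least one (all objectives are minimized)."""
    return (all(x <= y for x, y in zip(a, b))
            and any(x < y for x, y in zip(a, b)))


def non_dominated_fronts(objs):
    """Peel off successive fronts of mutually non-dominated candidates."""
    remaining = set(range(len(objs)))
    fronts = []
    while remaining:
        front = [i for i in remaining
                 if not any(dominates(objs[j], objs[i])
                            for j in remaining if j != i)]
        fronts.append(sorted(front))
        remaining -= set(front)
    return fronts


def crowding_distance(objs, front):
    """Sum, over objectives, of the normalized gap between each candidate's
    two neighbors; boundary candidates get infinite distance."""
    dist = {i: 0.0 for i in front}
    for m in range(len(objs[front[0]])):
        ordered = sorted(front, key=lambda i: objs[i][m])
        lo, hi = objs[ordered[0]][m], objs[ordered[-1]][m]
        dist[ordered[0]] = dist[ordered[-1]] = float("inf")
        if hi == lo:
            continue
        for prev, cur, nxt in zip(ordered, ordered[1:], ordered[2:]):
            dist[cur] += (objs[nxt][m] - objs[prev][m]) / (hi - lo)
    return dist


def crowded_better(i, j, rank, dist):
    """Tournament comparison: lower front rank wins, then larger distance."""
    return (rank[i], -dist[i]) < (rank[j], -dist[j])


# Hypothetical (error rate, latency in ms) for five hyperparameter combinations.
objs = [(0.10, 5.0), (0.08, 9.0), (0.12, 3.0), (0.11, 6.0), (0.09, 7.0)]
fronts = non_dominated_fronts(objs)
rank = {i: r for r, front in enumerate(fronts) for i in front}
dist = crowding_distance(objs, fronts[0])
```

Combination 3 is worse than combination 0 on both objectives, so it falls into the second front; within the first front, the crowding distance favors candidates in sparsely populated regions, which helps keep the surviving hyperparameter combinations diverse.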

8. A network traffic classification device based on multi-dimensional flow-level features, characterized in that it comprises:
a feature extraction module, configured to capture raw data packets from a target network environment, group the raw data packets based on a five-tuple and a preset time interval threshold, determine target network flows from the grouped data packets, extract target flow-level features of multiple dimensions from each target network flow, and construct a feature vector based on the target flow-level features, wherein the target flow-level features include static attribute features, temporal behavior features, protocol interaction features, and network flow context features;
a feature screening module, configured to perform a preliminary screening of the feature vectors using the maximum relevance minimum redundancy algorithm to obtain first-screened features, evaluate the first-screened features using a target base model to obtain a comprehensive importance score for each first-screened feature, rank the first-screened features by the comprehensive importance scores to obtain second-screened features, input the second-screened features into the target base model to obtain a probability vector for each traffic category, and concatenate these probability vectors to obtain meta-probability features, wherein the target base model includes Random Forest, Extremely Randomized Trees, Light Gradient Boosting Machine (LightGBM), and XGBoost;
a traffic classification module, configured to input the meta-probability features into a first target meta-classifier to obtain probability vectors, and input the probability vectors into a second target meta-classifier to obtain the traffic classification result corresponding to the target network flow, wherein the first target meta-classifier includes a support vector machine, logistic regression, and a multilayer perceptron, and the second target meta-classifier includes a gradient boosting decision tree.

9. An electronic device, characterized in that it comprises:
a memory, configured to store a computer program;
a processor, configured to execute the computer program to implement the network traffic classification method based on multi-dimensional flow-level features as described in any one of claims 1 to 7.

10. A computer-readable storage medium, characterized in that it is used to store a computer program, wherein the computer program, when executed by a processor, implements the network traffic classification method based on multi-dimensional flow-level features as described in any one of claims 1 to 7.