A cloud platform-based data flow monitoring method and system

By deploying traffic probes at the cloud platform entry point for deep packet analysis, and by combining an improved isolation forest algorithm with a pre-trained classification model, the method addresses metadata extraction and abnormal traffic identification in cloud platform network traffic monitoring. This enables accurate classification of abnormal traffic and dynamic deployment of defense rules, thereby improving the security and stability of network transmission.

CN122247769A · Pending · Publication Date: 2026-06-19 · 陕西安康玮创达信息技术有限公司

Patent Information

Authority / Receiving Office: CN · China
Patent Type: Applications(China)
Current Assignee / Owner: 陕西安康玮创达信息技术有限公司
Filing Date: 2026-05-22
Publication Date: 2026-06-19

Application Information

IPC: H04L9/40; G06F18/10; G06F18/214; G06F18/24

AI Tagging

Application Domain

Securing communication

Technology Topics

Data pack Data set

Abstract

This invention relates to the field of network traffic monitoring technology, specifically a data traffic monitoring method and system based on a cloud platform. The method includes: deploying traffic probes at the cloud platform's entry network node to collect raw traffic data packets; performing protocol identification and deep parsing on the data packets to extract multiple types of traffic metadata sets; aggregating the metadata over time windows to generate a traffic time series; analyzing the time series with an improved isolation forest algorithm that dynamically adjusts its isolation threshold based on periodic traffic behavior patterns, in order to identify abnormal traffic segments and their feature vectors; inputting the abnormal traffic information into a pre-trained classification model to determine the attack type; matching the corresponding set of defense rules from a protection strategy library; and dynamically distributing the rules to a traffic scrubbing device, which then filters and processes subsequent traffic according to the defense rules.

Description

Technical Field

[0001] This invention relates to the field of network traffic monitoring technology, and in particular to a data traffic monitoring method and system based on a cloud platform.

Background Technology

[0002] The concurrent operation of multiple services on cloud platforms results in network traffic characterized by increased volume, complex behavior, and significant temporal fluctuations. Existing network traffic monitoring architectures often employ decentralized and independent traffic monitoring methods, only performing basic protocol identification on the collected raw traffic data packets. They lack the capability for deep packet analysis and cannot fully extract refined metadata such as source address, destination address, protocol type, and payload characteristics. Furthermore, the traffic monitoring process lacks a time window aggregation mechanism, making it impossible to organize discrete traffic metadata into a standardized traffic time series, thus hindering subsequent time-series anomaly analysis.

[0003] Traditional anomaly detection algorithms use fixed values for traffic threshold settings and do not dynamically adjust them based on the periodic changes in traffic behavior. The judgment criteria cannot adapt to the temporal fluctuations of daily business traffic on cloud platforms, making it difficult to accurately locate abnormal traffic segments hidden in normal traffic and to stably output corresponding feature vector information.

[0004] The existing protection system lacks a mechanism for automatically distinguishing abnormal traffic types, and cannot automatically categorize the corresponding network attack categories after anomaly signals are generated. There is no automated correlation and matching process between protection strategies and attack types, defense rules cannot be automatically pushed to cloud platform traffic scrubbing devices, and traffic scrubbing devices cannot perform targeted traffic filtering according to attack characteristics. The overall monitoring system can only detect anomalies and cannot form a complete closed loop from traffic collection, anomaly identification, and type determination to rule distribution and traffic handling. It is necessary to build a fully automated cloud platform traffic monitoring and protection architecture.

Summary of the Invention

[0005] The purpose of this invention is to address the shortcomings of existing technologies by proposing a data traffic monitoring method and system based on a cloud platform.

[0006] To achieve the above objectives, the present invention adopts the following technical solution: a data traffic monitoring method based on a cloud platform, comprising:
deploying traffic probes at the entry network nodes of the cloud platform to collect raw traffic data packets in real time;
performing protocol identification and deep parsing on the raw traffic data packets to extract a metadata set containing source address, destination address, protocol type, and payload characteristics;
aggregating the metadata set based on a time window to generate a traffic time series;
invoking an improved isolation forest algorithm to analyze the traffic time series, wherein the improved isolation forest algorithm dynamically adjusts the isolation threshold based on the periodic pattern of traffic behavior;
identifying abnormal traffic segments and their corresponding feature vectors based on the output of the improved isolation forest algorithm;
inputting the abnormal traffic segments and their feature vectors into a pre-trained classification model to determine the attack type to which the abnormal traffic segments belong;
matching, based on the attack type, the corresponding set of defense rules from the cloud platform's protection policy library;
dynamically distributing the set of defense rules to the traffic scrubbing device on the cloud platform; and
controlling the traffic scrubbing device to filter and process subsequent traffic according to the set of defense rules.

[0007] 2. Preferably, the original traffic data packets are subjected to protocol identification and deep parsing to extract a metadata set containing source address, destination address, protocol type, and payload characteristics, including: The original traffic data packets are decapsulated to remove the data link layer and network layer header information and obtain the transport layer protocol header. Parse the transport layer protocol header to determine the protocol type, source port, and destination port used for the connection, thus forming the basic information of the connection 5-tuple; For application layer protocols, pattern matching is performed on the payload based on the protocol feature library to identify the specific application protocol. From the identified application protocol payloads, predefined payload features are extracted, including request method, user agent, Uniform Resource Locator, and payload length statistical distribution. The basic information of the connection quintuple, the specific application protocol, and the extracted payload features are associated and integrated to form a structured metadata record; All the metadata records collected over a continuous period of time are sorted by timestamp to form the metadata set.

[0008] 3. Preferably, the step of aggregating the metadata set based on a time window to generate a traffic time series includes: Set a sliding time window of fixed or variable length; Within each sliding time window, multidimensional statistics are performed on the metadata records in the metadata set; The multidimensional statistics include: calculating connection request frequency by source address, calculating received connection frequency by destination address, calculating the traffic share of each protocol by protocol type, and calculating a histogram of feature value distribution by load characteristics. The multidimensional statistical results within each time window are expanded according to statistical dimensions and arranged in chronological order to form a multidimensional traffic time series. The traffic flow time series is standardized to eliminate the differences in dimensions between different statistical dimensions, generating a standardized traffic flow time series for subsequent analysis.

[0009] 4. Preferably, the step of invoking the improved isolation forest algorithm to analyze the traffic time series includes: receiving the standardized traffic time series as input data; constructing, in the improved isolation forest algorithm, multiple binary isolation trees, each built by recursive random partitioning of the input data space; during the recursive partitioning, dynamically calculating the selection probability of the partitioning feature and the partitioning value based on the temporal periodicity intensity, in the traffic time series, of the data points contained in the current node; selecting, based on the selection probability, a partitioning feature and a partitioning value to divide the data of the current node into two child nodes; executing the partitioning recursively until a preset stopping condition is met, the stopping condition including the tree reaching its maximum depth or a node containing only a single data point; traversing all constructed binary isolation trees and calculating, for each data point, the average path length from the root to the corresponding leaf node; comparing the average path length with an isolation threshold dynamically calculated from the temporal periodicity intensity, and marking data points whose path length is shorter than the isolation threshold as outliers; and outputting all marked outliers and their corresponding time segments in the traffic time series as the abnormal traffic segments.

[0010] 5. Preferably, the improved isolation forest algorithm dynamically adjusts the isolation threshold based on the periodic pattern of traffic behavior, and its working principle includes: during the model training phase, historical normal traffic time series are analyzed to detect and quantify their periodic patterns at different time scales, including days, hours, and minutes. For each time scale, a corresponding periodicity intensity coefficient is calculated, which reflects the degree to which the traffic follows regular fluctuations over time. A periodic context vector is maintained for each internal node of each binary isolation tree, and is calculated from the periodic phase of the timestamps of the data points contained in the node. When a node is divided, the selection probability of the partitioning feature and the partitioning value is calculated dynamically as follows: the partitioning feature is selected at random from all features, but the selection range of the partitioning value is narrowed according to the expected normal traffic range indicated by the periodic context vector, so that the division tends to place points within the expected range into the same subtree. When an anomaly is determined, the isolation threshold is calculated dynamically as follows: for the data point under test, the expected baseline path length is obtained according to the periodic phase of its timestamp, where the expected baseline path length is derived statistically from the path lengths of normal data in the same historical phase. The expected baseline path length is multiplied by a coefficient adjusted according to the current periodicity intensity coefficient, and the product is used as the dynamic isolation threshold of the data point under test.
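The per-phase threshold in [0010] can be illustrated with a minimal Python sketch. It is only an illustration under stated assumptions and not the patented implementation: the hourly phase bucketing, the function names, and the coefficient mapping `0.5 + 0.3 * periodic_strength` are hypothetical choices, since the text does not specify how the coefficient is derived from the periodicity intensity.

```python
from statistics import fmean

def phase_of(ts, period=86400, n_phases=24):
    """Map a Unix-style timestamp (seconds) to a phase index within the period."""
    return int((ts % period) // (period / n_phases))

def fit_baselines(normal_samples, period=86400, n_phases=24):
    """normal_samples: iterable of (timestamp, average_path_length) pairs
    taken from historical normal traffic. Returns the mean path length per phase."""
    buckets = {}
    for ts, path_len in normal_samples:
        buckets.setdefault(phase_of(ts, period, n_phases), []).append(path_len)
    return {phase: fmean(values) for phase, values in buckets.items()}

def dynamic_threshold(ts, baselines, periodic_strength):
    """Baseline path length of the phase multiplied by an adjustment coefficient.
    The linear mapping below is an assumption made for illustration only."""
    coefficient = 0.5 + 0.3 * periodic_strength
    return baselines[phase_of(ts)] * coefficient

def is_anomalous(ts, avg_path_length, baselines, periodic_strength):
    """A point is flagged when its average path length is below the threshold."""
    return avg_path_length < dynamic_threshold(ts, baselines, periodic_strength)
```

With this mapping, a stronger periodicity moves the threshold closer to the historical baseline of the same phase, so a point must deviate less from the expected pattern to be flagged.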

[0011] 6. Preferably, the abnormal traffic segment and its feature vector are input into a pre-trained classification model to determine the attack type to which the abnormal traffic segment belongs, including: Extract the complete original traffic data packet corresponding to the abnormal traffic segment from the traffic probe; Session reassembly is performed on the complete original traffic data packets to restore the complete network session flow; Deep session features are extracted from the network session stream, including packet arrival time interval sequences, payload size variation patterns, and specific attack payload fingerprints. The feature vector corresponding to the abnormal traffic segment is combined with the extracted deep session features to form a comprehensive feature vector for this abnormal event; The comprehensive feature vector is input into the pre-trained classification model, which is a multi-classification model trained based on labeled historical attack traffic samples. The pre-trained classification model outputs the probability distribution of the comprehensive feature vector belonging to each preset attack type; The attack type with the highest probability value is determined as the attack type to which the abnormal traffic segment belongs.
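The final decision step in [0011] (combine the feature vectors, obtain a probability distribution, take the most probable attack type) might look like the following sketch. The attack labels and the `score_fn` callable standing in for the pre-trained multi-class model are hypothetical; the source does not name the model family or the label set.

```python
import math

# Hypothetical label set; the source does not enumerate the attack types.
ATTACK_TYPES = ["ddos_flood", "port_scan", "sql_injection"]

def softmax(scores):
    """Convert raw scores to a probability distribution (numerically stable)."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def classify(anomaly_features, session_features, score_fn):
    """Concatenate both feature vectors, score them with the model,
    and return the most probable attack type plus the full distribution."""
    combined = list(anomaly_features) + list(session_features)
    probs = softmax(score_fn(combined))
    best = max(range(len(probs)), key=probs.__getitem__)
    return ATTACK_TYPES[best], dict(zip(ATTACK_TYPES, probs))
```

Here `score_fn` is assumed to return one raw score per entry of `ATTACK_TYPES`; any trained multi-class model exposing such scores could be plugged in.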

[0012] 7. Preferably, based on the attack type, a corresponding set of defense rules is matched from the cloud platform's protection policy library, including: The cloud platform protection strategy library stores the mapping relationship between various attack types and defense rule templates; Based on the determined attack type, retrieve one or more corresponding defense rule templates; Extract key identification information from the abnormal traffic segments and the network session stream. The key identification information includes the malicious source address, the target port of the attack, and the attack payload signature. The key identification information is used as a parameter to fill the retrieved defense rule template, thereby generating specific and executable defense rules; The defense rules include: access control list rules, used to block traffic from a specific source address on a firewall or router; intrusion prevention system rules, used to match and drop packets containing specific attack signatures on deep inspection devices; and traffic shaping rules, used to limit the rate of traffic sent to a specific target port. All the defense rules generated for this abnormal event are combined to form the defense rule set.
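The template lookup and parameter filling described in [0012] can be sketched as below. The rule syntax, the library contents, and the key names (`src_ip`, `dst_port`, `signature`) are hypothetical placeholders chosen for illustration; real devices use their own rule formats.

```python
from string import Template

# Hypothetical protection policy library: attack type -> defense rule templates.
POLICY_LIBRARY = {
    "ddos_flood": [
        Template("acl deny src $src_ip"),                   # access control list rule
        Template("shape dst-port $dst_port rate 1000pps"),  # traffic shaping rule
    ],
    "sql_injection": [
        Template("ips drop signature $signature"),          # intrusion prevention rule
    ],
}

def build_rule_set(attack_type, key_info):
    """Fill every template mapped to the attack type with the key
    identification information extracted from the abnormal event."""
    templates = POLICY_LIBRARY.get(attack_type, [])
    return [template.substitute(key_info) for template in templates]
```

For example, `build_rule_set("ddos_flood", {"src_ip": "203.0.113.7", "dst_port": 443})` would yield one blocking rule and one shaping rule for that event.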

[0013] 8. Preferably, dynamically distributing the set of defense rules to the traffic scrubbing device on the cloud platform includes: converting the set of defense rules into a configuration instruction format that can be recognized by the target traffic scrubbing device in the cloud platform, where the traffic scrubbing device includes a distributed firewall, a web application firewall, and an intrusion prevention system; sending the formatted configuration instructions asynchronously to the relevant traffic scrubbing devices through a unified configuration management channel on the cloud platform; monitoring the deployment status of the configuration instructions on the target traffic scrubbing device to ensure that the rules are successfully activated; and, after the rules are successfully issued, updating the global policy status table of the cloud platform to record the effective time, scope of application, and expected validity period of the defense rule set.

[0014] 9. Preferably, controlling the traffic scrubbing device to filter and process subsequent traffic according to the set of defense rules includes: the traffic scrubbing device matches and checks, in real time, the data packets flowing through the data forwarding path against the set of effective defense rules; packets that match a blocking rule are dropped and a security event log is generated; data streams that match a rate limiting rule have their traffic rate constrained to below the threshold set by the rule; the security event logs and traffic handling statistics are reported to the cloud platform's monitoring center in real time; and, at the monitoring center, the reported handling information is correlated with the initially detected abnormal traffic segments to verify the effectiveness of the defense rules.
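The drop and rate-limit handling in [0014] can be illustrated with a short sketch. The token bucket is one common way to enforce a rate threshold and is an assumption here, as are the packet dictionary layout and the function names; the source does not state which limiter the scrubbing device uses.

```python
class TokenBucket:
    """Simple token bucket: `rate` tokens per second, at most `burst` tokens."""

    def __init__(self, rate, burst):
        self.rate = rate
        self.capacity = burst
        self.tokens = burst
        self.last = 0.0

    def allow(self, now):
        # Refill according to elapsed time, capped at the bucket capacity.
        self.tokens = min(self.capacity, self.tokens + (now - self.last) * self.rate)
        self.last = now
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        return False

def handle_packet(pkt, blocked_sources, port_limits, now, event_log):
    """Apply blocking rules first, then rate-limit rules, else forward."""
    if pkt["src"] in blocked_sources:
        # Blocking rule matched: drop and record a security event.
        event_log.append({"ts": now, "action": "drop", "src": pkt["src"]})
        return "drop"
    bucket = port_limits.get(pkt["dst_port"])
    if bucket is not None and not bucket.allow(now):
        return "rate_limited"
    return "forward"
```

The `event_log` list stands in for the security event log that would be reported to the monitoring center.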

[0015] As a further aspect of the present invention, the present invention also includes a cloud platform-based data traffic monitoring system, the system including a memory, a processor, and a computer program stored in the memory and running on the processor, wherein when the processor executes the computer program, it implements the steps of the cloud platform-based data traffic monitoring method described above.

[0016] Compared with the prior art, the advantages and positive effects of the present invention are as follows: This method collects raw traffic data packets from cloud platform entry network nodes, performs protocol identification and deep parsing, and extracts source address, destination address, protocol type, and payload characteristics to form a metadata set. The metadata set is then aggregated using a time window to generate a traffic time series, which preserves the signature and behavioral characteristics of the traffic data, unifies the temporal structure of discrete traffic metadata, and standardizes the formal representation of multi-dimensional network traffic data. An improved isolation forest algorithm, which dynamically adjusts the isolation threshold based on the periodic patterns of traffic behavior, is used for time series analysis. This allows the threshold to adapt flexibly under normal network traffic conditions, moving away from the rigid logic of fixed thresholds and following traffic fluctuation cycles to complete state identification and delineate the boundary between normal and abnormal traffic.

[0017] The identified abnormal traffic segments and their feature vectors are input into a pre-trained classification model to precisely classify different abnormal traffic attack types. This refines the category hierarchy of abnormal traffic, distinguishes the attack attributes corresponding to different network behaviors, and establishes a correspondence between abnormal features and attack types. Based on the attack type, the model matches the corresponding set of defense rules in the protection policy library, completing the adaptation of policy configuration to attack features and forming a standardized rule matching logic.

[0018] The matched set of defense rules is dynamically distributed to the cloud platform traffic scrubbing device. The traffic scrubbing device filters and processes subsequent network traffic according to the defense rules, enabling remote deployment and immediate effect of the rule-based protection strategy. This connects the complete execution chain from anomaly detection, type determination, and rule matching to traffic scrubbing, achieves coordinated operation of network traffic monitoring and security protection, and maintains the standardized operation of the cloud platform network data transmission process.

Attached Figure Description

[0019] Figure 1 is a state diagram of the cloud platform-based data traffic monitoring method according to the present invention; Figure 2 is a flowchart of metadata set generation; Figure 3 is a flowchart of the analysis performed by the improved Isolation Forest algorithm.

Detailed Implementation

[0020] To make the objectives, technical solutions, and advantages of this invention clearer, the invention will be further described in detail below with reference to the accompanying drawings and embodiments. It should be understood that the specific embodiments described herein are merely illustrative and not intended to limit the invention.

[0021] In the description of this invention, it should be understood that the terms "length," "width," "upper," "lower," "front," "rear," "left," "right," "vertical," "horizontal," "top," "bottom," "inner," and "outer," etc., indicating orientation or positional relationships, are based on the orientation or positional relationships shown in the accompanying drawings and are only for the convenience of describing the invention and simplifying the description, and do not indicate or imply that the device or element referred to must have a specific orientation, or be constructed and operated in a specific orientation, and therefore should not be construed as a limitation of the invention. Furthermore, in the description of this invention, "a plurality of" means two or more, unless otherwise explicitly specified.

[0022] Referring to Figure 1, this invention provides a data traffic monitoring method based on a cloud platform. The method includes: deploying traffic probes at the entry network node of the cloud platform to collect raw traffic data packets in real time; performing protocol identification and deep parsing on the collected raw traffic data packets to extract a metadata set containing source address, destination address, protocol type, and payload characteristics; aggregating and performing multi-dimensional statistics on the metadata set based on a preset time window to generate a standardized traffic time series; invoking an improved isolation forest algorithm to analyze the traffic time series, which can dynamically adjust its isolation threshold according to the periodic pattern of traffic behavior to improve the sensitivity of identifying real anomalies in periodic background traffic; identifying abnormal traffic segments and their corresponding feature vectors based on the output of the improved isolation forest algorithm; inputting the abnormal traffic segments and their feature vectors into a pre-trained classification model, which determines the attack type of the abnormal traffic segments; matching the corresponding set of defense rules from the cloud platform's protection policy library based on the determined attack type; dynamically distributing the generated set of defense rules to the traffic scrubbing device on the cloud platform; and controlling the traffic scrubbing device to filter and process subsequent traffic in real time based on the received set of defense rules.

[0023] In one embodiment of the present invention, protocol identification and deep parsing are performed on the raw traffic data packets to extract metadata sets (see also Figure 2). The raw traffic data packets are decapsulated, stripping away the data link layer and network layer headers to obtain the transport layer protocol header. The transport layer protocol header is parsed to determine the protocol type, source port, and destination port used for the connection, forming the basic information for the connection's five-tuple. For application layer protocols, pattern matching is performed on the payload based on a pre-built protocol feature library to identify the specific application protocol. From the identified application protocol payload, predefined payload features are extracted, including request method, user agent, Uniform Resource Locator (URL), and payload length statistical distribution. The basic information of the connection's five-tuple, the identified specific application protocol, and the extracted payload features are correlated and integrated to form a structured metadata record. All metadata records collected over continuous time are sorted according to their timestamps to form a metadata set for subsequent analysis.
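The decapsulation step can be sketched in Python using only the standard library. This is a simplified illustration, assuming untagged Ethernet II frames carrying IPv4 with TCP or UDP; the function name `parse_five_tuple` is hypothetical, and VLAN tags, IPv6, and fragmentation are deliberately ignored.

```python
import socket
import struct

def parse_five_tuple(frame: bytes):
    """Return (src_ip, dst_ip, protocol, src_port, dst_port, payload),
    or None when the frame is not IPv4 TCP/UDP or is truncated."""
    if len(frame) < 14:
        return None
    ethertype = struct.unpack("!H", frame[12:14])[0]
    if ethertype != 0x0800:            # this sketch handles IPv4 only
        return None
    ip = frame[14:]                    # strip the data link layer header
    if len(ip) < 20:
        return None
    ihl = (ip[0] & 0x0F) * 4           # IPv4 header length in bytes
    proto = ip[9]
    src_ip = socket.inet_ntoa(ip[12:16])
    dst_ip = socket.inet_ntoa(ip[16:20])
    l4 = ip[ihl:]                      # strip the network layer header
    if proto == 6 and len(l4) >= 20:   # TCP
        src_port, dst_port = struct.unpack("!HH", l4[:4])
        data_offset = (l4[12] >> 4) * 4
        return (src_ip, dst_ip, "TCP", src_port, dst_port, l4[data_offset:])
    if proto == 17 and len(l4) >= 8:   # UDP
        src_port, dst_port = struct.unpack("!HH", l4[:4])
        return (src_ip, dst_ip, "UDP", src_port, dst_port, l4[8:])
    return None
```

The returned payload bytes are what the later pattern matching against the protocol feature library would operate on.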

[0024] The metadata set is aggregated based on time windows to generate a traffic time series. A fixed-length or variable-length sliding time window is set. Within each sliding time window, multidimensional statistics are performed on the metadata records in the metadata set. These multidimensional statistics include: calculating connection request frequency by source address, calculating received connection frequency by destination address, calculating the traffic share of each protocol by protocol type, and generating a histogram of feature value distribution by payload characteristics. The multidimensional statistical results within each time window are expanded according to statistical dimensions and arranged strictly in chronological order to form a multidimensional traffic time series. This multidimensional traffic time series is then standardized to eliminate differences caused by different units and magnitudes between different statistical dimensions, generating a standardized traffic time series for subsequent analysis algorithms.
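A minimal sketch of the sliding-window aggregation follows. It assumes metadata records are dictionaries with hypothetical keys `ts`, `src`, `dst`, and `proto`, already sorted by timestamp; the payload histogram is omitted for brevity, and the quadratic scan is chosen for clarity rather than efficiency.

```python
from collections import Counter

def aggregate_windows(records, window=300, step=60):
    """Slide a window of `window` seconds in steps of `step` seconds over
    the records and compute per-window statistics."""
    if not records:
        return []
    series = []
    t = records[0]["ts"]
    end = records[-1]["ts"]
    while t <= end:
        in_window = [r for r in records if t <= r["ts"] < t + window]
        total = len(in_window)
        proto_counts = Counter(r["proto"] for r in in_window)
        series.append({
            "window_start": t,
            # connection request frequency by source address
            "requests_by_src": dict(Counter(r["src"] for r in in_window)),
            # received connection frequency by destination address
            "received_by_dst": dict(Counter(r["dst"] for r in in_window)),
            # share of each protocol among all connections in the window
            "proto_share": {p: c / total for p, c in proto_counts.items()} if total else {},
        })
        t += step
    return series
```

The default values of 300 and 60 seconds are illustrative parameters, not values fixed by this paragraph.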

[0025] In practical implementation, the raw traffic data packets undergo protocol identification and deep parsing to extract metadata sets. The raw traffic data packets are collected in real time by traffic probes deployed at the cloud platform's entry network node. The raw traffic data packets are decapsulated, stripping the data link layer and network layer headers to obtain the transport layer protocol header. The transport layer protocol header is parsed to determine the protocol type, source port, and destination port used for the connection, forming the basic information for the connection's five-tuple. For application layer protocols, pattern matching is performed on the payload portion based on a protocol feature library to identify the specific application protocol. Predefined payload features are extracted from the identified application protocol payload. These payload features include request methods, user agents, Uniform Resource Locators (URLs), and payload length statistical distributions. The basic information of the connection's five-tuple, the specific application protocol, and the extracted payload features are correlated and integrated to form structured metadata records. All metadata records collected over consecutive time are sorted by timestamp to form a metadata set. In some embodiments, the protocol feature library contains signatures for various application protocols, such as Hypertext Transfer Protocol (HTTP) signatures, Domain Name System (DNS) signatures, and File Transfer Protocol (FTP) signatures. Pattern matching is achieved by comparing the payload's starting bytes with predefined patterns in the protocol feature library; a successful match identifies the corresponding application protocol. Optionally, the payload length statistical distribution is characterized by calculating the mean, variance, and quantiles of the payload length. The payload length statistical distribution is used to describe the size variation pattern of application layer data.
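The prefix-based protocol identification and the payload length statistics can be sketched as follows. The signature table is a small hypothetical example rather than a real feature library, and DNS is left out because its binary header does not lend itself to a simple text prefix.

```python
import statistics

# Hypothetical protocol feature library: protocol -> payload start patterns.
SIGNATURES = {
    "HTTP": (b"GET ", b"POST ", b"HEAD ", b"PUT ", b"DELETE ", b"HTTP/"),
    "FTP": (b"220 ", b"USER ", b"PASS "),
}

def identify_protocol(payload: bytes) -> str:
    """Compare the payload's starting bytes with each known prefix."""
    for name, prefixes in SIGNATURES.items():
        if payload.startswith(prefixes):
            return name
    return "UNKNOWN"

def payload_length_stats(lengths):
    """Mean, population variance, and quartiles of payload lengths.
    Requires at least two observations."""
    q1, median, q3 = statistics.quantiles(lengths, n=4)
    return {
        "mean": statistics.fmean(lengths),
        "variance": statistics.pvariance(lengths),
        "quartiles": (q1, median, q3),
    }
```

Quartiles are used here as one possible choice of quantiles; the text does not say which quantiles are computed.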

[0026] In practice, the metadata set is aggregated based on time windows to generate a traffic time series. A sliding time window of fixed or variable length is set. Within each sliding time window, multidimensional statistics are performed on the metadata records in the metadata set. The multidimensional statistics include calculating the frequency of connection requests by source address, calculating the frequency of received connections by destination address, calculating the traffic share of each protocol by protocol type, and calculating the distribution histogram of feature values by payload characteristics. The multidimensional statistical results within each time window are expanded according to statistical dimensions and arranged in chronological order to form a multidimensional traffic time series. The traffic time series is then standardized to eliminate the differences in dimensions between different statistical dimensions, generating a standardized traffic time series for subsequent analysis. In some embodiments, the length of the sliding time window is set to five minutes, and the sliding step size of the time window is set to one minute. When calculating the frequency of connection requests aggregated by source address, for each source address, the number of connection requests initiated by the source address within the time window is counted. When calculating the frequency of received connections aggregated by destination address, for each destination address, the number of connection requests received by the destination address within the time window is counted. When calculating the proportion of traffic for each protocol type, the proportion of connections of each protocol type to the total number of connections is calculated. 
When calculating the distribution histogram of feature values by payload characteristics, for the payload length feature, the payload length is divided into multiple intervals, and the number of payloads falling in each interval is counted. It can be understood that the standardization process uses a min-max scaling method to map the value of each statistical dimension to the interval between zero and one. The formula for standardization is expressed as: $x' = \dfrac{x - x_{\min}}{x_{\max} - x_{\min}}$, where $x$ represents the original statistical value, $x_{\min}$ represents the minimum value of the original statistical value in the training dataset, $x_{\max}$ represents the maximum value of the original statistical value in the training dataset, and $x'$ represents the standardized value. Optionally, the standardization process can also use the Z-score standardization method, with the training dataset consisting of historical normal traffic time series. It can be understood that the generated standardized traffic time series is a multi-dimensional sequence of real-valued vectors, with each time point corresponding to a vector, and each dimension of the vector corresponding to a statistical feature. The standardized traffic time series is used for subsequent analysis using the improved Isolation Forest algorithm.
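The min-max scaling step can be sketched as below, assuming each row is a list of per-window statistics and the bounds are fitted on historical normal traffic. The function names are hypothetical, and mapping a constant dimension to 0.0 is a convention chosen here to avoid division by zero.

```python
def fit_min_max(train_rows):
    """Per-dimension (min, max) computed over the training rows."""
    return [(min(col), max(col)) for col in zip(*train_rows)]

def min_max_scale(row, bounds):
    """Scale one row with the fitted bounds. Values outside the training
    range fall outside [0, 1]; they are left unclipped on purpose."""
    scaled = []
    for value, (lo, hi) in zip(row, bounds):
        scaled.append(0.0 if hi == lo else (value - lo) / (hi - lo))
    return scaled
```

Leaving out-of-range values unclipped keeps unusually large statistics visible to the anomaly detector instead of flattening them to 1.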

[0027] In one embodiment of the invention, an improved isolation forest algorithm is invoked to analyze the traffic time series (see also Figure 3). The algorithm receives the standardized traffic time series as input data. In the improved Isolation Forest algorithm, multiple binary isolation trees are constructed, each based on a recursive random partition of the input data space. During the recursive partitioning process, the selection probability of partitioning features and partition values is dynamically calculated based on the temporal periodicity of the data points contained in the current node in the traffic time series. Based on the calculated selection probabilities, a partitioning feature and partition value are selected, dividing the current node's data into two child nodes. The partitioning process is recursively executed until a preset stopping condition is met, including the tree reaching its maximum depth or a node containing only a single data point. All constructed binary isolation trees are traversed, and the average path length from the root to the corresponding leaf node for each data point is calculated. This average path length is compared with an isolation threshold dynamically calculated based on the temporal periodicity; data points with path lengths shorter than this threshold are marked as outliers. All marked outliers and their corresponding time segments in the traffic time series are output as abnormal traffic segments.
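The tree construction and average path length computation can be illustrated with a plain isolation forest in Python. This sketch uses uniform random splits and leaves out the periodicity-aware selection probabilities and the dynamic threshold, so it shows only the baseline mechanism; the depth cap of ceil(log2(sample size)) follows the usual isolation forest convention and is an assumption here, as are all function names.

```python
import math
import random

def build_tree(points, depth, max_depth, rng):
    """Recursively partition points with a random feature and split value."""
    if depth >= max_depth or len(points) <= 1:
        return {"size": len(points)}              # leaf node
    dim = rng.randrange(len(points[0]))
    lo = min(p[dim] for p in points)
    hi = max(p[dim] for p in points)
    if lo == hi:
        return {"size": len(points)}              # cannot split further
    split = rng.uniform(lo, hi)
    left = [p for p in points if p[dim] < split]
    right = [p for p in points if p[dim] >= split]
    return {"dim": dim, "split": split,
            "left": build_tree(left, depth + 1, max_depth, rng),
            "right": build_tree(right, depth + 1, max_depth, rng)}

def build_forest(points, n_trees=100, sample_size=64, seed=0):
    """Build n_trees isolation trees, each on a random subsample."""
    rng = random.Random(seed)
    n = min(sample_size, len(points))
    max_depth = max(1, math.ceil(math.log2(n)))
    return [build_tree(rng.sample(points, n), 0, max_depth, rng)
            for _ in range(n_trees)]

def path_length(node, x):
    """Number of edges from the root to the leaf that x falls into."""
    depth = 0
    while "dim" in node:
        node = node["left"] if x[node["dim"]] < node["split"] else node["right"]
        depth += 1
    return depth

def average_path_length(forest, x):
    return sum(path_length(tree, x) for tree in forest) / len(forest)
```

Points that are isolated after few splits get a short average path length, which is the quantity compared against the isolation threshold.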

[0028] In practical implementation, an improved Isolation Forest algorithm is invoked to analyze the traffic time series. A standardized traffic time series is received as input data, which is a real matrix containing N time points, each with D dimensions. In some embodiments, a specific example of the input data is as follows: the vector dimension D for a given time point is 8, where the statistical dimensions include the connection frequency of a specific source address within a unit time window, the connection frequency of a specific destination port, the proportion of Hypertext Transfer Protocol (HTTP) traffic, the average payload length, the payload length variance, and the histogram counts of the payload distribution across three different length intervals. The standardized traffic time series is represented as $X = \{x_1, x_2, \ldots, x_N\}$, where $x_i \in \mathbb{R}^D$ is the multidimensional statistical vector of the i-th time point.

[0029] In practical implementation, the improved Isolation Forest algorithm constructs multiple binary isolation trees. The construction of each binary isolation tree is based on a recursive random partitioning of the input data space. During this recursive partitioning process, the selection probability of the partitioning feature and partition value is dynamically calculated based on the temporal periodicity intensity of the data points contained in the current node in the traffic time series. The temporal periodicity intensity can be understood as a parameter pre-calculated from historical data analysis, used to quantify the fluctuation pattern of traffic in a corresponding periodic phase (e.g., 9 AM on a weekday). Based on the calculated selection probability, a partitioning feature and a partition value are selected to divide the current node's data into two child nodes. For example, for a node containing data points from a given range of time points, the algorithm may, based on the periodicity intensity, tend to randomly select a partition value (e.g., 0.42) within the normal range of the "Hypertext Transfer Protocol traffic share" dimension (e.g., between 0.3 and 0.5), rather than randomly selecting from the entire feature value range (0 to 1). The partitioning process is executed recursively until a preset stopping condition is met. The stopping condition is that the tree reaches its maximum depth or that a node contains only a single data point. Optionally, the maximum depth is set to $\lceil \log_2 \psi \rceil$, where $\psi$ is the subsampling size.
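The recursive construction and its two stopping conditions can be sketched as below. This is a plain isolation tree with uniform random splits, not the improved variant: the periodicity-dependent selection probabilities are deliberately left out, and a depth limit of ceil(log2(subsample size)) is the conventional Isolation Forest choice, assumed here. The names `build_tree` and `path_length` are hypothetical.

```python
import random
from typing import Dict, List, Sequence


def build_tree(points: List[Sequence[float]], depth: int, max_depth: int,
               rng: random.Random) -> Dict:
    """Recursively partition points until max depth or a single point remains."""
    if depth >= max_depth or len(points) <= 1:
        return {"size": len(points)}  # leaf node
    feature = rng.randrange(len(points[0]))
    lo = min(p[feature] for p in points)
    hi = max(p[feature] for p in points)
    if lo == hi:
        return {"size": len(points)}  # cannot split identical values
    value = rng.uniform(lo, hi)
    left = [p for p in points if p[feature] < value]
    right = [p for p in points if p[feature] >= value]
    if not left or not right:
        return {"size": len(points)}  # degenerate split, treat as leaf
    return {
        "feature": feature,
        "value": value,
        "left": build_tree(left, depth + 1, max_depth, rng),
        "right": build_tree(right, depth + 1, max_depth, rng),
    }


def path_length(tree: Dict, point: Sequence[float]) -> int:
    """Count edges from the root to the leaf that the point falls into."""
    edges = 0
    while "feature" in tree:
        branch = "left" if point[tree["feature"]] < tree["value"] else "right"
        tree = tree[branch]
        edges += 1
    return edges
```

A point far from the bulk of the data tends to be separated by one of the first splits, so its path is short; points inside a dense cluster usually reach the depth limit.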

[0030] In practice, all constructed binary isolation trees are traversed, and for each data point the average path length from the root to the corresponding leaf node is calculated. The average path length is the arithmetic mean of the path lengths of the data point across all binary isolation trees, and can be understood as a measure of how easily the data point is isolated. For the i-th point $x_i$ in the standardized traffic time series, the anomaly criterion in the improved Isolation Forest is obtained by comparing its average path length with a dynamic isolation threshold. In some embodiments, the average path length $\bar{h}(x_i)$ is calculated as $\bar{h}(x_i) = \frac{1}{T}\sum_{j=1}^{T} h_j(x_i)$, where $T$ is the total number of binary isolation trees constructed and $h_j(x_i)$ is the number of edges traversed from the root node to the leaf node that isolates $x_i$ in the j-th binary isolation tree. The average path length is compared with an isolation threshold dynamically calculated from the temporal periodicity intensity. Data points whose average path length is shorter than the isolation threshold are marked as outliers. For example, a traffic data point occurring at 3 AM might have a smaller dynamic isolation threshold than a data point with the same statistical characteristics occurring at 3 PM, because the periodicity of traffic is more stable and the expected fluctuation range is narrower in the early morning, thus a relatively short path length may trigger an anomaly labeling. All marked outliers and their corresponding time-series segments in the traffic time series are output as outlier traffic segments. Optionally, an outlier traffic segment is defined as a time interval containing consecutively marked outliers, and this interval is extended forward and backward by a time window to capture the complete context of the outlier event.
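The averaging, the threshold comparison, and the optional segment extension can be sketched as follows. This is an illustration only: the function names are hypothetical, time points are addressed by integer index, and merging padded intervals that touch or overlap is an assumption not stated in the text.

```python
from typing import List, Sequence, Tuple


def average_path_length(path_lengths: Sequence[float]) -> float:
    """Arithmetic mean of one point's path lengths over all trees."""
    return sum(path_lengths) / len(path_lengths)


def mark_outliers(avg_lengths: Sequence[float],
                  thresholds: Sequence[float]) -> List[bool]:
    """Flag points whose average path length is below their own threshold."""
    return [h < t for h, t in zip(avg_lengths, thresholds)]


def outlier_segments(flags: Sequence[bool], pad: int) -> List[Tuple[int, int]]:
    """Group consecutive flagged indices and extend each run by pad on both sides."""
    segments: List[Tuple[int, int]] = []
    n = len(flags)
    i = 0
    while i < n:
        if not flags[i]:
            i += 1
            continue
        j = i
        while j + 1 < n and flags[j + 1]:
            j += 1
        start, end = max(0, i - pad), min(n - 1, j + pad)
        if segments and start <= segments[-1][1] + 1:
            # Assumption: padded intervals that touch or overlap are merged.
            segments[-1] = (segments[-1][0], end)
        else:
            segments.append((start, end))
        i = j + 1
    return segments
```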

[0031] In one embodiment of the invention, the improved isolated forest algorithm dynamically adjusts the isolation threshold based on the periodic patterns of traffic behavior. During the model training phase, historical normal traffic time series are analyzed to detect and quantify their periodic patterns at different time scales, including days, hours, and minutes. For each time scale, a corresponding periodicity intensity coefficient is calculated, which reflects the degree to which the traffic follows regular fluctuations at that time scale. A periodic context vector is maintained for each internal node of each binary isolation tree. The periodic context vector is calculated based on the periodic phase of the timestamps of the data points contained in the node. When partitioning nodes, the dynamic calculation of the partitioning features and the selection probability of the partitioning values is specifically manifested as follows: the partitioning features are randomly selected from all features, but the selection range of the partitioning values is narrowed according to the expected normal traffic range indicated by the periodic context vector, making the partitioning more inclined to group points within the expected range into the same subtree. When identifying anomalies, the dynamically calculated isolation threshold is as follows: For the data point to be tested, the expected baseline path length is obtained according to the periodic stage of its timestamp. The expected baseline path length is obtained by statistically analyzing the path length of normal data in the same historical stage. The expected baseline path length is multiplied by a relaxation or contraction coefficient adjusted according to the current periodic intensity coefficient, which is used as the dynamic isolation threshold of the data point to be tested.

[0032] In its implementation, the improved Isolation Forest algorithm dynamically adjusts the isolation threshold based on the periodic patterns of traffic behavior. During the model training phase, it analyzes historical normal traffic time series to detect and quantify the periodic patterns of these time series at different time scales. Historical normal traffic time series are standardized traffic time series data labeled as normal over a past period, with time scales including days, hours, and minutes. For each time scale, a corresponding periodicity intensity coefficient is calculated, reflecting the degree to which traffic follows a regular fluctuation over the time scale. In some embodiments, the periodicity intensity coefficient is calculated by analyzing the autocorrelation or spectral peaks of the historical normal traffic time series at the corresponding time scale. A higher periodicity intensity coefficient indicates that the traffic has a strong and predictable recurring pattern at that time scale. For example, analysis of the "day" time scale might reveal that weekday traffic always peaks between 9 AM and 10 AM and always troughs between 2 AM and 4 AM; the strength of this pattern is quantified by the periodicity intensity coefficient at the "day" scale. Optionally, the periodicity intensity coefficient can be obtained by calculating the ratio of the power spectral density to the background noise corresponding to a specific period length.
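One of the options named above, autocorrelation at the lag of the candidate period, can be sketched as follows. This is a minimal illustration, assuming a one-dimensional series sampled at a fixed interval; the function names are hypothetical, and clamping negative correlation to 0 to obtain a coefficient in [0, 1] is an assumption.

```python
from typing import Sequence


def autocorrelation(series: Sequence[float], lag: int) -> float:
    """Sample autocorrelation of the series at the given lag."""
    n = len(series)
    if lag <= 0 or lag >= n:
        return 0.0
    mean = sum(series) / n
    variance = sum((x - mean) ** 2 for x in series)
    if variance == 0:
        return 0.0  # a constant series carries no periodic signal
    covariance = sum((series[i] - mean) * (series[i + lag] - mean)
                     for i in range(n - lag))
    return covariance / variance


def periodicity_intensity(series: Sequence[float], period: int) -> float:
    """Periodicity intensity coefficient for one time scale, in [0, 1]."""
    # Assumption: negative correlation counts as no periodicity at this scale.
    return max(0.0, autocorrelation(series, period))
```

For a time series sampled once per hour, `period=24` would probe the "day" scale; a value near 1 indicates a strongly repeating daily pattern.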

[0033] A periodic context vector is maintained for each internal node of each binary isolation tree. The periodic context vector is calculated based on the periodic phase of the timestamps of the data points contained in the node. The periodic context vector is a multi-dimensional vector with the same dimension as the feature dimension of the traffic time series. Each element in the vector represents the median or expected value of the historical normal range of the corresponding feature in a specific periodic phase. During node partitioning, the dynamic calculation of the partitioning features and the probability of selecting partitioning values is specifically manifested as follows: the partitioning features are randomly selected from all features, but the selection range of partitioning values is narrowed according to the expected normal traffic range indicated by the periodic context vector, making the partitioning more inclined to group points within the expected range into the same subtree. For example, for a node whose data point timestamps all correspond to "10 AM on a weekday," the periodic context vector indicates that the expected normal range for the "requests per second" dimension is [100, 200]. When partitioning, the algorithm randomly selects a partition value in the dimension of "requests per second". Instead of uniformly selecting from the entire feature value range [0,1000], it selects from the interval [100,200] with a higher probability, or completely restricts the selection to this interval. This makes it easier for traffic points that belong to the expected normal range to stay in the same subtree, with a relatively longer path length. Points that fall outside the range [100,200] are easier to isolate quickly, with a shorter path length. This enhances the sensitivity to data points that deviate from the periodic normal pattern.
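The narrowed choice of partition value in the "requests per second" example can be sketched as follows. This shows only the selection rule; `choose_split_value` and the parameter `p_expected` (the probability of drawing from the expected range) are hypothetical, and setting `p_expected=1.0` corresponds to restricting the selection to the expected interval entirely.

```python
import random
from typing import Optional, Tuple


def choose_split_value(feature_min: float, feature_max: float,
                       expected_range: Tuple[float, float],
                       p_expected: float = 0.8,
                       rng: Optional[random.Random] = None) -> float:
    """Draw a partition value, favouring the expected normal range."""
    if rng is None:
        rng = random.Random()
    # Keep the expected range inside the values actually present in the node.
    lo = max(expected_range[0], feature_min)
    hi = min(expected_range[1], feature_max)
    if lo < hi and rng.random() < p_expected:
        return rng.uniform(lo, hi)
    # Fall back to the conventional uniform draw over the full range.
    return rng.uniform(feature_min, feature_max)
```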

[0034] In practical implementation, the dynamic isolation threshold is calculated as follows. For the data point to be tested, the expected baseline path length is obtained based on the periodic phase of its timestamp. The expected baseline path length is obtained by statistically analyzing the path lengths of normal data in the same historical phase; it represents the average path length that a normal data point obtains in the improved Isolation Forest during a specific periodic phase (e.g., "Wednesday at 3 PM"). The expected normal path length differs between periodic phases because baseline traffic levels vary. The expected baseline path length is multiplied by a coefficient adjusted according to the current periodicity intensity coefficient to obtain the dynamic isolation threshold for the data point to be tested. The formula can be expressed as $\theta(x_i) = \beta(\rho) \cdot \bar{H}(\phi(t_i))$, where $\theta(x_i)$ is the dynamic isolation threshold for data point $x_i$; $\phi$ is a function that maps the timestamp $t_i$ of data point $x_i$ to its specific periodic phase; $\bar{H}(\phi(t_i))$ is the baseline path length expectation for that periodic phase; and $\beta(\rho)$ is the adjustment coefficient associated with the periodicity intensity coefficient $\rho$ of the current dominant time scale. The adjustment coefficient $\beta$ is a monotonically decreasing function of $\rho$. When the periodicity intensity coefficient is high, the traffic pattern is highly regular and the expected range is narrow, so $\beta$ takes a smaller value, which yields a more stringent (smaller) dynamic isolation threshold, making it more sensitive to deviations; when the periodicity intensity coefficient is low, $\beta$ takes a larger value, which yields a more lenient (larger) dynamic isolation threshold.
See Table 1, which illustrates the mapping relationship between a periodic intensity coefficient and an adjustment coefficient.

[0035] Table 1: Mapping Table of Periodicity Intensity Coefficient Range and Adjustment Coefficient β

[0036] In some embodiments, the baseline path length expectation for a periodic phase is obtained by querying a pre-calculated lookup table that stores the average path length of historical normal data for each periodic phase. The average path length of a data point calculated in the improved Isolation Forest is compared with its dynamic isolation threshold; if the average path length is less than the threshold, the data point is marked as an outlier. Optionally, the adjustment coefficient β can also be designed as a continuous function of the periodicity intensity coefficient normalized to the [0,1] interval.
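The lookup and comparison can be sketched as below. Everything numeric here is hypothetical: the phase is assumed to be a (weekday, hour) pair, and the linear `adjustment_coefficient` is only one possible monotonically decreasing choice, since the actual values of Table 1 are not reproduced in this text.

```python
from datetime import datetime
from typing import Dict, Tuple

Phase = Tuple[int, int]


def periodic_phase(ts: datetime) -> Phase:
    """Assumed phase definition: (day of week, hour of day)."""
    return (ts.weekday(), ts.hour)


def adjustment_coefficient(rho: float) -> float:
    """Hypothetical monotonically decreasing beta for rho in [0, 1]."""
    rho = min(1.0, max(0.0, rho))
    return 1.0 - 0.5 * rho


def dynamic_threshold(ts: datetime, rho: float,
                      baseline: Dict[Phase, float]) -> float:
    """Baseline path length of the phase, scaled by the adjustment coefficient."""
    return adjustment_coefficient(rho) * baseline[periodic_phase(ts)]


def is_outlier(avg_path_length: float, ts: datetime, rho: float,
               baseline: Dict[Phase, float]) -> bool:
    """A point is an outlier when its average path length is below the threshold."""
    return avg_path_length < dynamic_threshold(ts, rho, baseline)
```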

[0037] In one embodiment of the present invention, abnormal traffic fragments and their feature vectors are input into a pre-trained classification model to determine the attack type to which the abnormal traffic fragments belong. Complete original traffic data packets corresponding to the abnormal traffic fragments are extracted from the historical cache of the traffic probe. Session reassembly is performed on the complete original traffic data packets to reconstruct the complete network session flow. Deep session features are extracted from the reassembled network session flow, including data packet arrival time interval sequences, payload size variation patterns, and specific attack payload fingerprints. The feature vectors corresponding to the abnormal traffic fragments and the extracted deep session features are combined to form a comprehensive feature vector for this abnormal event. This comprehensive feature vector is input into a pre-trained classification model, which is a multi-classification model trained based on labeled historical attack traffic samples. The pre-trained classification model outputs the probability distribution of the comprehensive feature vector belonging to various preset attack types. The attack type with the highest probability value is determined as the attack type to which the abnormal traffic fragment belongs.

[0038] Based on the attack type, the corresponding set of defense rules is matched from the cloud platform's protection policy library. The cloud platform's protection policy library stores mappings between various attack types and defense rule templates. Based on the determined attack type, one or more corresponding defense rule templates are retrieved. Key identifying information is extracted from abnormal traffic segments and network session flows, including malicious source addresses, target ports, and attack payload signatures. This key identifying information is used as parameters to populate the retrieved defense rule templates, generating specific, executable defense rules. These defense rules include: access control list rules, used to block traffic from specific source addresses on firewalls or routers; intrusion prevention system rules, used to match and drop packets containing specific attack signatures on deep inspection devices; and traffic shaping rules, used to limit the rate of traffic sent to specific target ports. All generated defense rules for this abnormal event are combined to form a defense rule set.

[0039] In practice, abnormal traffic fragments and their feature vectors are input into a pre-trained classification model to determine the attack type of the abnormal traffic fragments. The complete original traffic data packets corresponding to the abnormal traffic fragments are extracted from the historical cache or persistent storage of the traffic probe. Abnormal traffic fragments are identified by time windows, and all original traffic data packets flowing through the probe within that time range are retrieved from storage using timestamp ranges. Session reassembly is performed on the complete original traffic data packets to reconstruct the complete network session flow. Session reassembly sorts, deduplicates, and concatenates data packets belonging to the same communication process based on network layer and transport layer information (such as quintuples) to form a complete, ordered bidirectional data stream sequence, i.e., the network session flow. Deep session features are extracted from the network session flow, including data packet arrival time interval sequences, payload size variation patterns, and specific attack payload fingerprints. The data packet arrival time interval sequence records the time difference between the arrival of consecutive data packets in the session; the payload size variation pattern describes the variation pattern of the payload length of consecutive data packets; and the specific attack payload fingerprint is obtained by matching predefined attack feature patterns, such as matching machine code sequences of known buffer overflow attacks or specific string patterns of structured query language injection attacks. By combining the feature vectors corresponding to the abnormal traffic segments with the extracted deep session features, a comprehensive feature vector for this abnormal event is constructed. 
This comprehensive feature vector is a high-dimensional vector composed of two parts: the first part comes from the feature vectors corresponding to the abnormal traffic segments, and the second part comes from the deep session features (such as the mean and variance of time intervals, statistical moments of payload size, and scalar representations of fingerprint matching results). In essence, the comprehensive feature vector integrates anomaly indicators from the traffic statistics level with micro-level evidence from the session behavior level, providing more comprehensive information for accurate classification.
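Session reassembly and the construction of the comprehensive feature vector can be sketched as follows. The `Packet` fields, the duplicate test (same endpoints, protocol, and sequence number), and the choice of five deep session features are assumptions for illustration; real reassembly would also need to handle retransmissions with differing payloads and out-of-order sequence numbers.

```python
from collections import defaultdict
from dataclasses import dataclass
from statistics import mean, pvariance
from typing import Dict, List, Sequence, Tuple


@dataclass(frozen=True)
class Packet:
    ts: float
    src: str
    sport: int
    dst: str
    dport: int
    proto: str
    seq: int
    payload: bytes


def session_key(p: Packet) -> Tuple:
    """Direction-independent five-tuple, so both directions share one session."""
    a, b = sorted([(p.src, p.sport), (p.dst, p.dport)])
    return (p.proto, a, b)


def reassemble(packets: Sequence[Packet]) -> Dict[Tuple, List[Packet]]:
    """Sort by time, drop duplicates, and group packets into sessions."""
    sessions: Dict[Tuple, List[Packet]] = defaultdict(list)
    seen = set()
    for p in sorted(packets, key=lambda pkt: pkt.ts):
        ident = (p.src, p.sport, p.dst, p.dport, p.proto, p.seq)
        if ident in seen:
            continue
        seen.add(ident)
        sessions[session_key(p)].append(p)
    return dict(sessions)


def deep_features(session: Sequence[Packet],
                  fingerprints: Sequence[bytes]) -> List[float]:
    """Inter-arrival and payload-size statistics plus a fingerprint-match flag."""
    gaps = [b.ts - a.ts for a, b in zip(session, session[1:])] or [0.0]
    sizes = [len(p.payload) for p in session]
    hit = float(any(f in p.payload for p in session for f in fingerprints))
    return [mean(gaps), pvariance(gaps), mean(sizes), pvariance(sizes), hit]


def comprehensive_vector(stat_vector: Sequence[float], session: Sequence[Packet],
                         fingerprints: Sequence[bytes]) -> List[float]:
    """Concatenate the statistical part with the deep session part."""
    return list(stat_vector) + deep_features(session, fingerprints)
```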

[0040] In practice, the comprehensive feature vector is input into a pre-trained classification model, which is a multi-class classification model trained on labeled historical attack traffic samples. The labeled historical attack traffic samples include network traffic data of known attack types and their manually or automatically assigned attack type labels. The pre-trained classification model can be a machine learning model such as a gradient boosting decision tree, a deep neural network, or a support vector machine. The pre-trained classification model outputs a probability distribution of the comprehensive feature vector over the preset attack types. The probability distribution is a vector in which each element corresponds to a preset attack type, and the element value represents the posterior probability that the comprehensive feature vector belongs to that attack type. The sum of all element values is 1. The attack type with the highest probability value is determined as the attack type to which the abnormal traffic segment belongs. For example, if the preset attack types include Distributed Denial-of-Service (DDoS) attacks, Structured Query Language (SQL) injection attacks, Cross-Site Scripting (XSS) attacks, and brute-force attacks, the probability distribution calculated by the pre-trained classification model for a comprehensive feature vector might be: DDoS attack probability 0.05, SQL injection attack probability 0.90, XSS attack probability 0.03, and brute-force attack probability 0.02. In this case, "SQL injection attack" is determined as the attack type of the abnormal traffic segment. Optionally, a probability threshold can be set for the classification process. Only when the highest probability value exceeds this threshold is a definitive classification made; otherwise, the attack type is marked as "unknown."
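The decision rule applied to the model output, including the optional probability threshold, can be sketched as follows. The function name `decide_attack_type` and the default threshold of 0.5 are hypothetical; the example probabilities are the ones given in the paragraph above.

```python
from typing import Dict


def decide_attack_type(probabilities: Dict[str, float],
                       threshold: float = 0.5) -> str:
    """Return the most probable attack type, or "unknown" below the threshold."""
    if abs(sum(probabilities.values()) - 1.0) > 1e-6:
        raise ValueError("probabilities must sum to 1")
    label, p = max(probabilities.items(), key=lambda kv: kv[1])
    return label if p > threshold else "unknown"


example = {
    "DDoS": 0.05,
    "SQL injection": 0.90,
    "XSS": 0.03,
    "brute force": 0.02,
}
```

With the example distribution, the rule selects "SQL injection"; raising the threshold above 0.90 turns the same output into "unknown".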

[0041] In practice, based on the attack type, the corresponding set of defense rules is matched from the cloud platform's protection policy library. This library stores mappings between various attack types and defense rule templates. A defense rule template is a parameterized rule description containing placeholder variables to be filled. See Table 2 for a simplified mapping table of attack types and defense rule templates.

[0042] Table 2: Mapping Table of Attack Types and Defense Rule Templates

[0043] Based on the determined attack type, one or more corresponding defense rule templates are retrieved. Key identification information is extracted from abnormal traffic segments and network session flows, including the malicious source address, the target port, and the attack payload signature. In some embodiments, the malicious source address can be extracted from frequently occurring source addresses in the metadata set corresponding to the abnormal traffic segments, or obtained directly from the session that triggered the anomaly. The target port is obtained from the destination port field of the network session flow, and the attack payload signature is obtained from a specific attack payload fingerprint matched in the deep session features. The key identification information is used as parameters to populate the retrieved defense rule templates, generating specific, executable defense rules. These defense rules include: access control list rules for blocking traffic from specific source addresses on firewalls or routers; intrusion prevention system rules for matching and discarding packets containing specific attack signatures on deep inspection devices; and traffic shaping rules for limiting the rate of traffic destined for specific target ports. For example, for an event identified as a "Structured Query Language injection attack," if the extracted attack payload signature is "OR '1'='1", the generated intrusion prevention system rule might be: DROP HTTP REQUEST WHERE PAYLOAD CONTAINS "OR '1'='1". It can be understood that the generated defense rule is a configuration instruction that the device can parse. All generated defense rules for this abnormal event are combined to form a defense rule set. This set may contain multiple rules targeting different defense points or employing different defense actions.
For example, for a distributed denial-of-service attack, the defense rule set might simultaneously include an access control list rule that blocks the source address range on the border router and a rule that performs traffic shaping for the target address on the load balancer. Optionally, the extraction of key identification information can follow a predefined extraction rule, which can be expressed as $E(A, c) = \{\mathrm{src}_{\max}(A),\ \mathrm{port}_{\max}(A),\ \mathrm{sig}(A, c)\}$, where $E$ is the extraction function; $A$ is the abnormal event, including the abnormal traffic segments and network session flows; $c$ is the determined attack type; $\mathrm{src}_{\max}(A)$ is the source address that appears most frequently in event $A$; $\mathrm{port}_{\max}(A)$ is the destination port that appears most frequently in event $A$; and $\mathrm{sig}(A, c)$ is the attack payload signature corresponding to attack type $c$, extracted from the deep session features. The output of the extraction function is the set of key identification information. In some embodiments, for attack events from which a valid attack payload signature cannot be extracted, $\mathrm{sig}(A, c)$ can be set to empty or to a generic identifier.
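Extraction of key identification information and template filling can be sketched as follows. The template strings and dictionary keys are hypothetical stand-ins, since the contents of Table 2 are not reproduced in this text, and the rule syntax is illustrative rather than that of any real device.

```python
from collections import Counter
from string import Template
from typing import Dict, List, Optional, Sequence

# Hypothetical policy library: attack type -> parameterized rule templates.
TEMPLATES: Dict[str, List[str]] = {
    "sql_injection": ['DROP HTTP REQUEST WHERE PAYLOAD CONTAINS "$sig"'],
    "ddos": ["DENY FROM $src", "LIMIT RATE TO PORT $dport"],
}


def extract_key_info(sources: Sequence[str], ports: Sequence[int],
                     signature: Optional[str] = None) -> Dict[str, object]:
    """Most frequent source address and destination port, plus the signature."""
    return {
        "src": Counter(sources).most_common(1)[0][0],
        "dport": Counter(ports).most_common(1)[0][0],
        "sig": signature or "",  # empty when no valid signature was extracted
    }


def build_rule_set(attack_type: str, info: Dict[str, object]) -> List[str]:
    """Fill every template registered for the attack type with the key info."""
    return [Template(t).substitute(info) for t in TEMPLATES.get(attack_type, [])]
```

An attack type with no registered template yields an empty rule set, which leaves the decision about a fallback action to the caller.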

[0044] In one embodiment of the present invention, a set of defense rules is dynamically distributed to traffic scrubbing devices on a cloud platform. The set of defense rules is converted into a configuration instruction format recognizable by the target traffic scrubbing devices on the cloud platform. These traffic scrubbing devices include distributed firewalls, web application firewalls, and intrusion prevention systems. The formatted configuration instructions are asynchronously distributed to the relevant traffic scrubbing devices through the unified configuration management channel of the cloud platform. The deployment status of the configuration instructions on the target traffic scrubbing devices is monitored to ensure successful rule activation. After successful rule distribution, the global policy status table of the cloud platform is updated to record the effective time, scope, and expected validity period of the defense rule set. The traffic scrubbing devices are controlled to filter and process subsequent traffic according to the set of defense rules. On the data forwarding path, the traffic scrubbing devices perform real-time matching checks on the data packets flowing through them based on the effective set of defense rules. For data packets matching blocking rules, a drop operation is performed and a security event log is generated. For data streams matching rate limiting rules, their traffic rate is constrained to below the threshold set by the rule. The security event logs and traffic processing statistics are reported to the monitoring center of the cloud platform in real time. At the monitoring center, the reported handling information is correlated with the initially detected abnormal traffic segments to verify the effectiveness of the defense rules.

[0045] In practice, the set of defense rules is dynamically distributed to the traffic scrubbing devices on the cloud platform. This set consists of multiple specific, executable defense rules, such as an access control list rule and an intrusion prevention system rule. The set of defense rules is then converted into a configuration command format recognizable by the target traffic scrubbing devices on the cloud platform. The target traffic scrubbing devices are the network or security devices on the cloud platform that actually perform traffic filtering and processing, including distributed firewalls, web application firewalls, and intrusion prevention systems. The configuration command format is a command or data format natively supported by the target traffic scrubbing devices or that can be parsed through their management interfaces. For example, for a distributed firewall, the configuration command format might be a command-line interface command; for a web application firewall, the configuration command format might be a JSON object conforming to its application programming interface specification. The formatted configuration commands are asynchronously distributed to the relevant traffic scrubbing devices through the unified configuration management channel of the cloud platform. The cloud platform's unified configuration management channel is a centralized and secure communication and configuration distribution service responsible for reliably transmitting and pushing configuration commands to designated target devices. Asynchronous delivery means that the command delivery operation will not block the main process of the monitoring system. After the delivery request is submitted, the system continues to execute subsequent tasks, and the delivery result is obtained through callback or status query mechanisms. The system monitors the deployment status of configuration commands on the target traffic scrubbing device to ensure successful rule activation. 
In some embodiments, deployment status monitoring is accomplished by sending configuration query commands to the target traffic scrubbing device or parsing the configuration confirmation message returned by the target traffic scrubbing device. Deployment status includes "delivering," "effective," and "failed." When the status is detected as "failed," an alarm or retry mechanism can be triggered. After successful rule delivery, the cloud platform's global policy status table is updated to record the effective time, scope, and expected validity period of the defense rule set. The global policy status table is a database table that centrally stores all effective defense rules and their metadata. The scope describes the network address, port, or protocol to which the rule applies, and the expected validity period defines the time when the rule automatically expires to prevent outdated rules from remaining in the system for a long time and affecting business operations.
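Asynchronous delivery with status tracking, and the global policy status table with automatic expiry, can be sketched as below. The `send` callable stands in for the unified configuration management channel and is hypothetical, as are the retry count and the dictionary layout of the table.

```python
import asyncio
import time
from typing import Awaitable, Callable, Dict, Optional

Send = Callable[[str, str], Awaitable[bool]]


async def distribute(commands: Dict[str, str], send: Send,
                     retries: int = 1) -> Dict[str, str]:
    """Push one command per device concurrently and report each final status."""
    status = {device: "delivering" for device in commands}

    async def push(device: str, command: str) -> None:
        for _ in range(retries + 1):
            if await send(device, command):
                status[device] = "effective"
                return
        status[device] = "failed"  # caller may raise an alarm here

    await asyncio.gather(*(push(d, c) for d, c in commands.items()))
    return status


def record_policy(table: Dict[str, dict], rule_set_id: str, scope: str,
                  ttl_seconds: float, now: Optional[float] = None) -> None:
    """Store effective time, scope, and expiry for a delivered rule set."""
    now = time.time() if now is None else now
    table[rule_set_id] = {
        "effective_at": now,
        "scope": scope,
        "expires_at": now + ttl_seconds,
    }


def purge_expired(table: Dict[str, dict], now: Optional[float] = None) -> None:
    """Remove rule sets whose expected validity period has ended."""
    now = time.time() if now is None else now
    for key in [k for k, v in table.items() if v["expires_at"] <= now]:
        del table[key]
```

Because `distribute` is a coroutine, the monitoring pipeline can schedule it and continue with other work, then read the returned status map once delivery has finished.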

[0046] In practice, the traffic scrubbing device filters and processes subsequent traffic based on a set of defense rules. Along the data forwarding path, the device performs real-time matching checks on incoming data packets based on the effective set of defense rules. The data forwarding path is the internal processing pipeline that network traffic must traverse; for example, inbound traffic passes through the traffic scrubbing device for inspection before reaching the protected server. For data packets matching blocking rules, a drop operation is performed, and a security event log is generated. Blocking rules are typically access control list rules or intrusion prevention system rules defined as "deny" or "drop." Dropping means directly discarding the data packet without forwarding it. The security event log records key information about the dropped data packets (such as timestamp, source address, destination address, and matching rule identifier) and the drop action. For data flows matching rate limiting rules, their traffic rate is constrained to below the threshold set by the rule. The traffic scrubbing device uses algorithms such as token bucket or leaky bucket to measure and shape the rate of data flows matching specific characteristics (such as target port). Traffic exceeding the threshold is dropped or delayed. It is understandable that the filtering and processing operations are executed in real time on the traffic scrubbing device based on the specific instructions in the set of defense rules, without the need for additional intervention from the cloud monitoring system.
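The token bucket mentioned above can be sketched as follows. This is a generic illustration rather than the behaviour of any particular scrubbing device: the class name is hypothetical, timestamps are passed in explicitly, and traffic that exceeds the rate is dropped rather than delayed.

```python
class TokenBucket:
    """Rate limiter: tokens refill at `rate` per second up to `capacity`."""

    def __init__(self, rate: float, capacity: float, now: float = 0.0) -> None:
        self.rate = rate
        self.capacity = capacity
        self.tokens = capacity  # start with a full bucket
        self.last = now

    def allow(self, size: float, now: float) -> bool:
        """Return True and consume tokens if a packet of `size` may pass."""
        elapsed = now - self.last
        self.tokens = min(self.capacity, self.tokens + elapsed * self.rate)
        self.last = now
        if size <= self.tokens:
            self.tokens -= size
            return True
        return False  # over the limit: drop the packet
```

The capacity bounds the burst that can pass at once, while the refill rate bounds the sustained throughput for the matched flow.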

[0047] Security event logs and traffic handling statistics are reported to the cloud platform's monitoring center in real time. Security event logs record detailed entries for each rule match and handling action. Traffic handling statistics are periodically aggregated data, such as the total number of dropped packets and the number of connections subject to rate limiting per unit time. The reporting is performed either by the log sending function built into the traffic scrubbing device or by an agent of the cloud platform's configuration management channel. At the monitoring center, the reported handling information is correlated with the initially detected abnormal traffic segments to verify the effectiveness of the defense rules. In some embodiments, the correlation analysis is achieved by comparing the time window, source address, and destination address of the abnormal traffic segments with records in the security event logs. The effectiveness of the defense rules can be verified by analyzing whether subsequent attack traffic from the same malicious source is successfully intercepted after the rules take effect, or whether the abnormal traffic indicators decrease significantly. Optionally, the results of the correlation analysis can be used to calculate a defense effectiveness evaluation index. The defense effectiveness evaluation index (η) can be defined as the ratio of the number of malicious connections intercepted within a specific time window after the rules take effect to the number of malicious connections detected before the rules take effect, expressed by the formula $\eta = \frac{N_{\text{blocked}}}{N_{\text{detected}}}$, where $\eta$ is the defense effectiveness evaluation index; $N_{\text{blocked}}$ is the number of interceptions reported by the traffic scrubbing device that match the characteristics of this attack within a specified time window after the defense rules take effect; and $N_{\text{detected}}$ is the number of malicious connections or requests detected in the abnormal traffic segment that triggered this alert before the defense rules were generated. A value of η close to 1 indicates that the defense rule effectively intercepted subsequent attacks; a value close to 0 may indicate that the attack characteristics have changed or that the rule does not match correctly. In some embodiments, the verification results can be fed back to the policy generation module to guide the optimization of future defense rule templates or the adjustment of key identification information extraction strategies. Optionally, the expected validity period can be dynamically calculated based on the historical duration of the attack type and the results of correlation analysis. For example, for short-duration burst scanning attacks, the expected validity period can be set to a few minutes; for persistent distributed denial-of-service attacks, the expected validity period can be set to several hours.
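The correlation step and the effectiveness index can be sketched as follows. The log entry layout (a dictionary with `src` and `ts` keys) and the function names are assumptions for illustration; a fuller correlation would also compare destination addresses, as the paragraph above describes.

```python
from typing import Dict, Iterable, Sequence


def count_intercepts(logs: Iterable[Dict], sources: Sequence[str],
                     start: float, end: float) -> int:
    """Count log entries from the malicious sources within [start, end]."""
    wanted = set(sources)
    return sum(1 for e in logs if e["src"] in wanted and start <= e["ts"] <= end)


def defense_effectiveness(intercepted_after: int, detected_before: int) -> float:
    """Ratio of interceptions after the rules took effect to earlier detections."""
    if detected_before <= 0:
        raise ValueError("detected_before must be positive")
    return intercepted_after / detected_before
```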

[0048] The above are merely preferred embodiments of the present invention and do not limit the present invention in any form. Any person skilled in the art may use the technical content disclosed above to make changes or modifications that yield equivalent embodiments applicable to other fields. However, any simple modification, equivalent change, or adaptation made to the above embodiments in accordance with the technical essence of the present invention, without departing from the scope of the present invention, still falls within the protection scope of the present invention.

Claims

1. A data traffic monitoring method based on a cloud platform, characterized in that it comprises: deploying traffic probes at the entry network nodes of the cloud platform to collect raw traffic data packets in real time; performing protocol identification and deep parsing on the raw traffic data packets to extract a metadata set containing source address, destination address, protocol type, and payload characteristics; aggregating the metadata set based on a time window to generate a traffic time series; invoking an improved isolation forest algorithm to analyze the traffic time series, wherein the improved isolation forest algorithm dynamically adjusts the isolation threshold based on the periodic pattern of traffic behavior; identifying abnormal traffic segments and their corresponding feature vectors based on the output of the improved isolation forest algorithm; inputting the abnormal traffic segments and their feature vectors into a pre-trained classification model to determine the attack type to which the abnormal traffic segments belong; matching, based on the attack type, a corresponding defense rule set from the protection policy library of the cloud platform; dynamically distributing the defense rule set to the traffic scrubbing device of the cloud platform; and controlling the traffic scrubbing device to filter subsequent traffic according to the defense rule set.

2. The data traffic monitoring method based on a cloud platform according to claim 1, characterized in that performing protocol identification and deep parsing on the raw traffic data packets to extract a metadata set containing source address, destination address, protocol type, and payload characteristics comprises: decapsulating the raw traffic data packets to strip the data link layer and network layer header information and obtain the transport layer protocol header; parsing the transport layer protocol header to determine the protocol type, source port, and destination port used by the connection, thereby forming the basic connection 5-tuple information; for application layer protocols, performing pattern matching on the payload against a protocol feature library to identify the specific application protocol; extracting predefined payload features from the identified application protocol payload, including request method, user agent, Uniform Resource Locator, and the statistical distribution of payload length; associating and integrating the basic connection 5-tuple information, the specific application protocol, and the extracted payload features to form a structured metadata record; and sorting all metadata records collected over a continuous period by timestamp to form the metadata set.
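
For illustration only, the decapsulation and 5-tuple step of claim 2 might be sketched in Python as follows. The sketch assumes Ethernet II framing carrying IPv4 with TCP or UDP; the function name `parse_five_tuple` is hypothetical, and application-layer identification is omitted.

```python
import socket
import struct


def parse_five_tuple(frame: bytes):
    """Return (src_ip, dst_ip, src_port, dst_port, protocol) or None.

    Assumes an Ethernet II frame carrying IPv4 with a TCP or UDP segment.
    """
    if len(frame) < 14:
        return None
    # Ethernet II header: 6-byte dst MAC, 6-byte src MAC, 2-byte EtherType.
    (ethertype,) = struct.unpack("!H", frame[12:14])
    if ethertype != 0x0800:  # not IPv4
        return None
    ip = frame[14:]
    if len(ip) < 20:
        return None
    header_len = (ip[0] & 0x0F) * 4  # IHL field counts 32-bit words
    protocol = ip[9]
    if protocol not in (6, 17) or header_len < 20 or len(ip) < header_len + 4:
        return None
    src_ip = socket.inet_ntoa(ip[12:16])
    dst_ip = socket.inet_ntoa(ip[16:20])
    # TCP and UDP both start with 16-bit source and destination ports.
    src_port, dst_port = struct.unpack("!HH", ip[header_len:header_len + 4])
    return (src_ip, dst_ip, src_port, dst_port, "TCP" if protocol == 6 else "UDP")
```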

3. The data traffic monitoring method based on a cloud platform according to claim 1, characterized in that aggregating the metadata set based on a time window to generate a traffic time series comprises: setting a sliding time window of fixed or variable length; within each sliding time window, computing multidimensional statistics over the metadata records in the metadata set, the multidimensional statistics including: connection request frequency by source address, received connection frequency by destination address, traffic share of each protocol by protocol type, and a histogram of feature value distribution by payload characteristic; expanding the multidimensional statistical results of each time window by statistical dimension and arranging them in chronological order to form a multidimensional traffic time series; and standardizing the traffic time series to eliminate scale differences between statistical dimensions, generating a standardized traffic time series for subsequent analysis.
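
A minimal Python sketch of the windowed aggregation and standardization in claim 3 is given below. It assumes fixed-length windows advanced by a fixed step, z-score standardization, and only three illustrative statistics per window; the record layout `(timestamp, src, dst, protocol)` and the function names are hypothetical.

```python
from collections import Counter
from statistics import mean, pstdev


def window_features(records, start, end, length, step):
    """Build one feature row per window from (timestamp, src, dst, protocol) records.

    Each row holds: peak per-source request count, peak per-destination
    connection count, and the share of TCP records in the window.
    """
    rows = []
    t = start
    while t < end:
        in_window = [r for r in records if t <= r[0] < t + length]
        by_src = Counter(r[1] for r in in_window)
        by_dst = Counter(r[2] for r in in_window)
        tcp = sum(1 for r in in_window if r[3] == "TCP")
        n = len(in_window)
        rows.append([
            max(by_src.values(), default=0),
            max(by_dst.values(), default=0),
            tcp / n if n else 0.0,
        ])
        t += step
    return rows


def standardize(rows):
    """Z-score each column so that dimensions with different scales are comparable."""
    columns = []
    for col in zip(*rows):
        mu, sd = mean(col), pstdev(col)
        columns.append([(v - mu) / sd if sd > 0 else 0.0 for v in col])
    return [list(row) for row in zip(*columns)]
```

Choosing `step` smaller than `length` gives overlapping (sliding) windows; setting them equal gives non-overlapping windows.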

4. The data traffic monitoring method based on a cloud platform according to claim 3, characterized in that invoking the improved isolation forest algorithm to analyze the traffic time series comprises: receiving the standardized traffic time series as input data; constructing, in the improved isolation forest algorithm, multiple binary isolation trees, each tree being built by recursive random partitioning of the input data space; during the recursive partitioning, dynamically calculating the selection probabilities of the partitioning feature and the partitioning value based on the temporal periodicity intensity, in the traffic time series, of the data points contained in the current node; selecting a partitioning feature and a partitioning value according to the selection probabilities to divide the data of the current node into two child nodes; executing the partitioning recursively until a preset stopping condition is met, the stopping condition including the tree reaching its maximum depth or a node containing only a single data point; traversing all constructed binary isolation trees and calculating, for each data point, the average path length from the root node to its leaf node; comparing the average path length with an isolation threshold dynamically calculated based on the temporal periodicity intensity, and marking data points whose average path length is shorter than the isolation threshold as outliers; and outputting all marked outliers and their corresponding time segments in the traffic time series as the abnormal traffic segments.
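
The tree construction and path-length computation in claim 4 can be illustrated with the following Python sketch. It implements a plain isolation forest with uniformly random splits, not the periodicity-weighted selection probabilities of the claim, and it omits the usual path-length correction for unresolved leaves; all function names and defaults are hypothetical.

```python
import random


def build_tree(points, rng, depth=0, max_depth=8):
    """Recursively partition points; a leaf is represented by None."""
    if len(points) <= 1 or depth >= max_depth:
        return None
    dim = rng.randrange(len(points[0]))
    lo = min(p[dim] for p in points)
    hi = max(p[dim] for p in points)
    if lo == hi:  # cannot split on this feature; stop (simplification)
        return None
    split = rng.uniform(lo, hi)
    left = [p for p in points if p[dim] < split]
    right = [p for p in points if p[dim] >= split]
    return (dim, split,
            build_tree(left, rng, depth + 1, max_depth),
            build_tree(right, rng, depth + 1, max_depth))


def path_length(tree, point):
    """Number of splits traversed from the root until a leaf is reached."""
    depth = 0
    while tree is not None:
        dim, split, left, right = tree
        tree = left if point[dim] < split else right
        depth += 1
    return depth


def average_path_lengths(points, n_trees=100, seed=0):
    """Average path length of every point over a forest of random trees."""
    rng = random.Random(seed)
    trees = [build_tree(points, rng) for _ in range(n_trees)]
    return [sum(path_length(t, p) for t in trees) / n_trees for p in points]
```

Points that are isolated after few splits receive short average path lengths, which is the property the threshold comparison in the claim relies on.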

5. The data traffic monitoring method based on a cloud platform according to claim 4, characterized in that the improved isolation forest algorithm dynamically adjusting the isolation threshold based on the periodic pattern of traffic behavior comprises: during the model training phase, analyzing historical normal traffic time series to detect and quantify their periodic patterns at different time scales, including days, hours, and minutes; for each time scale, calculating a corresponding periodicity intensity coefficient, which reflects the degree to which the traffic follows regular fluctuations over time; maintaining a periodic context vector for each internal node of each binary isolation tree, the periodic context vector being calculated from the periodic phase of the timestamps of the data points contained in the node; when partitioning a node, dynamically calculating the selection probabilities of the partitioning feature and the partitioning value as follows: the partitioning feature is selected at random from all features, while the selection range of the partitioning value is narrowed according to the expected normal traffic range indicated by the periodic context vector, so that the partition tends to place points within the expected range in the same subtree; and when determining an anomaly, dynamically calculating the isolation threshold as follows: for a data point under test, an expected baseline path length is obtained according to the periodic phase of its timestamp, the expected baseline path length being obtained by statistically analyzing the path lengths of normal data in the same phase historically; the expected baseline path length is multiplied by a coefficient adjusted according to the current periodicity intensity coefficient, and the product is used as the dynamic isolation threshold of the data point under test.
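
The dynamic threshold in claim 5 might look like the Python sketch below. The claim states only that the baseline path length is multiplied by a coefficient adjusted by the periodicity intensity; the linear form `base + gain * strength`, the daily period, the 24 phase bins, and all names here are assumptions made for illustration.

```python
from statistics import mean


def phase_of(timestamp, period=86400, bins=24):
    """Map a timestamp (seconds) to a phase bin within the period."""
    return min(bins - 1, int((timestamp % period) / period * bins))


def baseline_by_phase(normal_samples, period=86400, bins=24):
    """Mean path length of historical normal data, grouped by phase bin.

    normal_samples is an iterable of (timestamp, path_length) pairs.
    """
    buckets = {}
    for ts, length in normal_samples:
        buckets.setdefault(phase_of(ts, period, bins), []).append(length)
    return {phase: mean(values) for phase, values in buckets.items()}


def is_anomaly(timestamp, path_length, baselines, strength,
               base=0.6, gain=0.3, period=86400, bins=24):
    """Flag a point whose path length falls below its phase-specific threshold.

    strength is the periodicity intensity coefficient, assumed to lie in [0, 1].
    The linear coefficient base + gain * strength is an assumed form.
    Raises KeyError if no baseline exists for the point's phase.
    """
    baseline = baselines[phase_of(timestamp, period, bins)]
    threshold = baseline * (base + gain * strength)
    return path_length < threshold
```

Under this assumed form, stronger periodicity raises the threshold toward the baseline, so deviations from a highly regular pattern are flagged more readily.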

6. The data traffic monitoring method based on a cloud platform according to claim 1, characterized in that inputting the abnormal traffic segments and their feature vectors into a pre-trained classification model to determine the attack type to which the abnormal traffic segments belong comprises: extracting, from the traffic probe, the complete raw traffic data packets corresponding to the abnormal traffic segment; performing session reassembly on the complete raw traffic data packets to restore the complete network session stream; extracting deep session features from the network session stream, including packet inter-arrival time sequences, payload size variation patterns, and specific attack payload fingerprints; combining the feature vector corresponding to the abnormal traffic segment with the extracted deep session features to form a comprehensive feature vector for the abnormal event; inputting the comprehensive feature vector into the pre-trained classification model, which is a multi-class model trained on labeled historical attack traffic samples; outputting, by the pre-trained classification model, the probability distribution of the comprehensive feature vector over the preset attack types; and determining the attack type with the highest probability as the attack type to which the abnormal traffic segment belongs.
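
As an illustration of the final steps of claim 6, the Python sketch below concatenates the two feature vectors, produces a probability distribution, and selects the most probable attack type. The claim does not specify the model family; a linear softmax classifier with given `weights` and `biases` is assumed here purely as a stand-in, and all names are hypothetical.

```python
import math


def softmax(scores):
    """Convert raw scores to probabilities (shifted by the max for stability)."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def classify(anomaly_features, session_features, weights, biases, labels):
    """Return (predicted_label, {label: probability}).

    weights has one row per label; each row has one weight per feature
    of the concatenated (comprehensive) feature vector.
    """
    x = list(anomaly_features) + list(session_features)
    scores = [sum(w * v for w, v in zip(row, x)) + b
              for row, b in zip(weights, biases)]
    probs = softmax(scores)
    best = max(range(len(labels)), key=probs.__getitem__)
    return labels[best], dict(zip(labels, probs))
```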

7. The data traffic monitoring method based on a cloud platform according to claim 6, characterized in that matching, based on the attack type, a corresponding defense rule set from the protection policy library of the cloud platform comprises: storing, in the protection policy library of the cloud platform, the mapping between attack types and defense rule templates; retrieving one or more corresponding defense rule templates based on the determined attack type; extracting key identification information from the abnormal traffic segment and the network session stream, the key identification information including the malicious source address, the target port of the attack, and the attack payload signature; filling the retrieved defense rule templates with the key identification information as parameters to generate specific, executable defense rules, the defense rules including: access control list rules, used to block traffic from a specific source address on a firewall or router; intrusion prevention system rules, used to match and drop packets containing specific attack signatures on deep inspection devices; and traffic shaping rules, used to limit the rate of traffic sent to a specific target port; and combining all defense rules generated for the abnormal event to form the defense rule set.
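
The template lookup and parameter filling in claim 7 can be sketched in Python as follows. The attack type names, the contents of `POLICY_LIBRARY`, and the rule syntax are invented for illustration and do not correspond to any real device's configuration language.

```python
from string import Template

# Hypothetical policy library: attack type -> defense rule templates.
POLICY_LIBRARY = {
    "syn_flood": [
        Template("acl deny src $src_ip"),
        Template("shape dst-port $dst_port rate 100pps"),
    ],
    "sql_injection": [
        Template("ips drop signature $signature"),
    ],
}


def build_rule_set(attack_type, key_info):
    """Fill every template mapped to attack_type with the extracted key information.

    Returns an empty list for an unknown attack type; raises KeyError if a
    template needs a parameter that key_info does not provide.
    """
    templates = POLICY_LIBRARY.get(attack_type, [])
    return [t.substitute(key_info) for t in templates]
```

Using `substitute` rather than `safe_substitute` makes a missing parameter fail loudly, so an incomplete rule is never emitted.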

8. The data traffic monitoring method based on a cloud platform according to claim 7, characterized in that dynamically distributing the defense rule set to the traffic scrubbing device of the cloud platform comprises: converting the defense rule set into a configuration instruction format recognizable by the target traffic scrubbing device in the cloud platform, the traffic scrubbing device including a distributed firewall, a web application firewall, and an intrusion prevention system; asynchronously sending the formatted configuration instructions to the relevant traffic scrubbing devices through a unified configuration management channel of the cloud platform; monitoring the deployment status of the configuration instructions on the target traffic scrubbing device to ensure that the rules are successfully activated; and, after the rules are successfully distributed, updating the global policy status table of the cloud platform to record the effective time, scope of application, and expected validity period of the defense rule set.

9. The data traffic monitoring method based on a cloud platform according to claim 8, characterized in that controlling the traffic scrubbing device to filter subsequent traffic according to the defense rule set comprises: performing, by the traffic scrubbing device, real-time matching of data packets flowing through the data forwarding path against the defense rule set in effect; dropping packets that match a blocking rule and generating a security event log; constraining the traffic rate of data streams that match a rate limiting rule to below the threshold set by the rule; reporting the security event logs and traffic handling statistics to the monitoring center of the cloud platform in real time; and, at the monitoring center, correlating the reported handling information with the initially detected abnormal traffic segments to verify the effectiveness of the defense rules.
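
The rate limiting step of claim 9 does not name a mechanism. One common choice is a token bucket, sketched below in Python as an assumption rather than as the claimed method; the class name `TokenBucket` and its parameters are hypothetical.

```python
class TokenBucket:
    """Token bucket limiter: sustained rate of `rate` units per second,
    with bursts of at most `capacity` units."""

    def __init__(self, rate, capacity, now=0.0):
        self.rate = rate
        self.capacity = capacity
        self.tokens = capacity
        self.last = now

    def allow(self, now, cost=1.0):
        """Return True if a packet of the given cost may pass at time `now`."""
        elapsed = max(0.0, now - self.last)
        # Refill tokens for the elapsed time, never exceeding capacity.
        self.tokens = min(self.capacity, self.tokens + elapsed * self.rate)
        self.last = max(self.last, now)
        if self.tokens >= cost:
            self.tokens -= cost
            return True
        return False
```

Packets for which `allow` returns False would be dropped or queued by the scrubbing device, and each such decision could be counted in the traffic handling statistics that the claim reports to the monitoring center.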

10. A data traffic monitoring system based on a cloud platform, comprising a memory, a processor, and a computer program stored in the memory and executable on the processor, characterized in that, when the processor executes the computer program, the steps of the data traffic monitoring method based on a cloud platform according to any one of claims 1 to 9 are implemented.