A network traffic generation method based on a conditional diffusion model and a hurdle model

By combining a conditional diffusion model with a hurdle model, the method decomposes network traffic data into periodic and burst components and generates each separately, addressing the network management latency and inaccurate generation of existing technology and achieving high-fidelity network traffic generation.

CN122247848A · Pending · Publication Date: 2026-06-19 · NANJING UNIV OF AERONAUTICS & ASTRONAUTICS

Patent Information

Authority / Receiving Office
CN · China
Patent Type
Applications(China)
Current Assignee / Owner
NANJING UNIV OF AERONAUTICS & ASTRONAUTICS
Filing Date
2026-03-12
Publication Date
2026-06-19

AI Technical Summary

Technical Problem

Existing network management methods rely on static models, which cannot effectively adapt to the high dynamism of modern mobile networks, leading to management delays and inaccurate generation.

Method used

A Diffusion-Hurdle Ensemble Model (DHEM), which combines a conditional diffusion model (CDDPM) with a hurdle model (HM), decomposes network traffic data into periodic and burst components. The CDDPM generates the periodic traffic and the HM generates the burst traffic; a cross-attention mechanism and classifier-free guidance are used to improve generation accuracy.

Benefits of technology

The method improves the accuracy and fidelity of network traffic generation, captures the characteristics of both periodic patterns and burst events, and supports more efficient network optimization and management.

✦ Generated by Eureka AI based on patent content.


Abstract

This invention discloses a network traffic generation method based on a conditional diffusion model and a hurdle model. The method proposes an integrated diffusion-hurdle framework for generating network traffic data with burst characteristics. The framework combines the conditional diffusion model and the hurdle model to generate mobile traffic with burst patterns. The method first decomposes the traffic into periodic and burst components. The conditional diffusion model learns the characteristics of the periodic component, while the hurdle model captures the sparsity and randomness of the burst component. The final traffic is obtained by aggregating the outputs of the two models. Experimental results show that this method outperforms baseline models in generating high-fidelity traffic containing bursts.

Description

Technical Field

[0001] This invention belongs to the field of conditional generation of network traffic data, and specifically relates to a method for the conditional generation of network traffic with burst patterns.

Background Technology

[0002] Mobile network traffic data contains a wealth of useful information, revealing network performance and potential patterns in human communication needs. For network operators, interpreting this traffic data as user demand helps in effective network planning and optimization. Network management strategies driven by network traffic data, such as optimizing the location of new base stations and dynamically adjusting base station sleep schedules to improve energy efficiency, can improve network performance and enhance user experience.

[0003] Traditional network management is reactive, relying on static models to detect changes before optimization. This approach suffers from significant management latency and cannot adapt to the highly dynamic nature of modern mobile networks. To overcome this problem, it is necessary to develop proactive intelligent network management. This approach requires a high-fidelity virtual environment as a reliable platform for developing and testing advanced management strategies.

[0004] The accuracy of network traffic models is fundamental to this proactive network management strategy, determining the reliability of all simulations and the effectiveness of derived strategies. The need for accuracy has driven current research towards high-fidelity generative modeling, moving away from traditional traffic prediction.

[0005] Advances in generative artificial intelligence have shifted current research focus to learning the complete statistical distribution of traffic data, rather than simply generating copies. Generative Adversarial Networks (GANs) are among the earliest deep generative techniques applied to this field. For example, hierarchical GANs are used for knowledge-enhanced traffic generation. Recently, diffusion models have become a very successful research direction in generating network traffic.

[0006] The hurdle model is a statistical tool used to analyze zero-inflated count data with covariates. Traffic surges in mobile networks exhibit similar behavior: while networks typically follow a stable, periodic baseline, traffic surges appear as sporadic and aperiodic events. This inherent two-regime characteristic makes the hurdle model a powerful technique for accurately capturing these aperiodic features. In turn, this promises to significantly improve the generation fidelity of burst events.

[0007] Inspired by the above analysis, this invention proposes a novel generative framework designed to capture both the periodic patterns and the bursts in mobile traffic data. The proposed Diffusion-Hurdle Ensemble Model addresses the complexity of network traffic by decomposing total traffic data into periodic and burst components. To reconstruct the periodic component, a conditional diffusion model is used, which progressively denoises Gaussian noise into traffic data, thus achieving context-aware generation. Given the inherent sparsity and randomness of the burst component, a hurdle model is designed to generate it accurately. The final synthetic traffic is obtained by superimposing the outputs of the conditional diffusion model and the hurdle model.

Summary of the Invention

[0008] The purpose of this invention is to address the problem of high-fidelity conditional network traffic generation by proposing a network traffic generation method based on a conditional diffusion model and a hurdle model. This invention proposes a Diffusion-Hurdle Ensemble Model (DHEM), which combines the Conditional Denoising Diffusion Probabilistic Model (CDDPM) with a hurdle model (HM) to generate mobile traffic with burst patterns. This aims to enhance the generation model's ability to capture the characteristics of the periodic and burst components and to improve generation accuracy. To achieve this objective, the steps employed in this invention are as follows:

[0009] Step 1: Divide the raw network traffic data into periodic traffic and burst traffic. Traffic values above the 99th percentile are defined as burst points, and the burst traffic is quantified as the positive residual between the raw traffic and a linear-interpolation baseline built from the remaining non-burst data points. This baseline is the periodic traffic; subtracting it from the raw network traffic data yields the burst traffic data.

[0010] Let $x = \{x_1, x_2, \ldots, x_N\}$ denote the original network traffic data, and let $T_r$ be the 99th percentile of the traffic data. Define the set of non-burst points $S = \{x_i \mid x_i < T_r\}$. The periodic traffic component $x_p$ is obtained by linear interpolation over the points in $S$:

[0011] $x_p = L(S)$, (1)

[0012] where $L(\cdot)$ denotes the linear interpolation function. Subtracting the periodic component from the original traffic data gives the burst traffic data $x_b$:

[0013] $x_b = x - x_p$. (2)
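As an illustrative sketch of the decomposition in Step 1 (not part of the patent text), the following NumPy code applies the 99th-percentile threshold and a linear interpolation baseline to a toy series. The function name `decompose_traffic`, the toy sinusoid, and the injected spikes are assumptions made for the example; `np.interp` stands in for the interpolation function $L(\cdot)$.

```python
import numpy as np

def decompose_traffic(x, q=99.0):
    """Split traffic x into a periodic baseline x_p and a burst residual x_b."""
    x = np.asarray(x, dtype=float)
    idx = np.arange(x.size)
    tr = np.percentile(x, q)      # threshold T_r (99th percentile)
    non_burst = x < tr            # membership in the non-burst set S
    # L(S): linear interpolation through the non-burst points only
    x_p = np.interp(idx, idx[non_burst], x[non_burst])
    x_b = x - x_p                 # Eq. (2): burst component
    return x_p, x_b

# Toy week of 168 slots: a daily sinusoid plus two injected spikes (assumed data)
t = np.arange(168)
x = 10.0 + 5.0 * np.sin(2 * np.pi * t / 24)
x[50] += 40.0
x[120] += 30.0
x_p, x_b = decompose_traffic(x)
```

In this toy case the burst component is non-zero only at the two injected spikes, and `x_p + x_b` reproduces the input exactly.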

[0014] Step 2: A CDDPM model is trained on, and then used to generate, periodic network traffic data. The CDDPM model guides the generation process by introducing external conditional information, thereby achieving controllable data synthesis. Its forward process is mathematically identical to that of the standard denoising diffusion model. Its reverse process, however, uses the conditional data in each iteration step, so that the model's output is not only similar to the training data but also conforms to the specified conditions.

[0015] (1) Forward Process: Given periodic traffic data $x_0 \sim q(x)$, the forward diffusion process of the CDDPM model can be described as a Markov chain that progressively adds Gaussian noise to $x_0$. The forward process can be expressed as:

[0016]

[0017] where $T$ is the total number of diffusion steps, $\beta_t \in (0,1)$ is the noise variance introduced at step $t$, whose value increases with $t$, i.e., $\beta_t < \beta_{t+1}$, $\mathcal{N}$ denotes the normal distribution, and $I$ is the identity matrix.

[0018] To obtain the data at any step $t$ directly from $x_0$ without iterative computation, $x_t$ can be expressed as:

[0019]

[0020] where $\alpha_t = 1 - \beta_t$,
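For orientation, the forward process can be sketched in code under the assumption that it follows the standard DDPM closed form, in which $x_t = \sqrt{\bar{\alpha}_t}\,x_0 + \sqrt{1-\bar{\alpha}_t}\,\epsilon$ with $\bar{\alpha}_t = \prod_{s \le t} \alpha_s$ and $\epsilon \sim \mathcal{N}(0, I)$. The symbol $\bar{\alpha}_t$, the linear schedule, and its end points are assumptions made for the example; the patent only requires that $\beta_t$ increase with $t$.

```python
import numpy as np

def make_schedule(T=1000, beta_start=1e-4, beta_end=2e-2):
    """Increasing noise schedule (linear schedule is an assumption)."""
    betas = np.linspace(beta_start, beta_end, T)
    alphas = 1.0 - betas              # alpha_t = 1 - beta_t
    alpha_bars = np.cumprod(alphas)   # cumulative product of alpha_s for s <= t
    return betas, alphas, alpha_bars

def q_sample(x0, t, alpha_bars, rng):
    """Draw x_t given x_0 in one shot; t is a 0-based step index."""
    eps = rng.standard_normal(x0.shape)
    x_t = np.sqrt(alpha_bars[t]) * x0 + np.sqrt(1.0 - alpha_bars[t]) * eps
    return x_t, eps

rng = np.random.default_rng(0)
betas, alphas, alpha_bars = make_schedule()
x0 = np.sin(2 * np.pi * np.arange(168) / 24)   # toy periodic traffic
x_mid, _ = q_sample(x0, 499, alpha_bars, rng)  # partially noised
x_end, _ = q_sample(x0, 999, alpha_bars, rng)  # close to pure Gaussian noise
```

With this schedule the signal weight $\sqrt{\bar{\alpha}_t}$ decays towards zero at the last step, so `x_end` is almost indistinguishable from Gaussian noise.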

[0021] (2) Reverse Process: The reverse denoising process of the CDDPM model forms a reverse Markov chain that, guided by the conditional information, gradually recovers the original data $x_0$ from Gaussian noise. A parameterized denoising network is trained to approximate the true reverse process by learning to predict the noise component at each step. The reverse process of the CDDPM model can therefore be expressed as:

[0022]

[0023] where $\theta_d$ denotes the parameters of the denoising network. The predicted mean and variance are obtained by maximizing the lower bound of the log-likelihood function:

[0024]

[0025] The denoising network model is trained using the following objective function:

[0026]

[0027] where $\epsilon_{\theta_d}$ denotes the noise estimated by the denoising network.
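For reference, the standard conditional DDPM expressions that match the definitions of the reverse process and training objective are sketched below. They are given under the assumption that the method follows the usual noise-prediction parameterization; the condition symbol $c$, the cumulative product $\bar{\alpha}_t$, and the fixed variance $\sigma_t^2$ are assumed notation, and the exact forms used in the patent may differ.

```latex
% Reverse transition, conditioned on c
p_{\theta_d}(x_{t-1} \mid x_t, c)
  = \mathcal{N}\!\left(x_{t-1};\ \mu_{\theta_d}(x_t, t, c),\ \sigma_t^2 I\right)

% Mean and (fixed) variance in the noise-prediction parameterization
\mu_{\theta_d}(x_t, t, c)
  = \frac{1}{\sqrt{\alpha_t}}
    \left( x_t - \frac{\beta_t}{\sqrt{1-\bar{\alpha}_t}}\,
           \epsilon_{\theta_d}(x_t, t, c) \right),
\qquad
\sigma_t^2 = \frac{1-\bar{\alpha}_{t-1}}{1-\bar{\alpha}_t}\,\beta_t

% Simplified training objective
\mathcal{L}(\theta_d)
  = \mathbb{E}_{x_0,\, c,\, \epsilon \sim \mathcal{N}(0, I),\, t}
    \left[ \left\lVert \epsilon - \epsilon_{\theta_d}(x_t, t, c) \right\rVert^2 \right]
```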

[0028] The CDDPM model uses multiple residual layers as its denoising network. To reduce the risk of overfitting and enhance the controllability of generation, classifier-free guidance is used: during training, the conditional information is replaced with a zero vector with a certain probability $p$. In the reverse process, the final noise prediction is computed as a linear combination of the conditional and unconditional outputs of the denoising network, controlled by an adjustable guidance scale $w$, as follows:

[0029]
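A minimal sketch of classifier-free guidance is given below. It assumes one common convention for the linear combination, $(1+w)\,\epsilon_{\text{cond}} - w\,\epsilon_{\text{uncond}}$, which may differ from the exact form in the patent. The denoiser `toy_eps_model` is a hypothetical stand-in for the trained residual network, and the helper names are illustrative.

```python
import numpy as np

def drop_condition(c, p, rng):
    """Training-time condition dropout: replace c by a zero vector with probability p."""
    return np.zeros_like(c) if rng.random() < p else c

def guided_noise(eps_model, x_t, t, c, w):
    """Combine conditional and unconditional noise predictions with scale w."""
    eps_cond = eps_model(x_t, t, c)
    eps_uncond = eps_model(x_t, t, np.zeros_like(c))  # zero vector = no condition
    return (1.0 + w) * eps_cond - w * eps_uncond      # assumed convention

def toy_eps_model(x_t, t, c):
    """Hypothetical stand-in for the trained denoising network."""
    return 0.1 * x_t + c.mean()

rng = np.random.default_rng(0)
x_t = np.ones(4)
c = np.array([1.0, 3.0])
guided = guided_noise(toy_eps_model, x_t, t=10, c=c, w=2.0)
```

Setting `w = 0` recovers the purely conditional prediction, while larger `w` pushes the sample further towards the condition.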

[0030] In addition, a fusion module based on a cross-attention mechanism is employed to extract the implicit correlation between the guidance information and the network traffic. The latent vector $x_t$ is projected to the query $Q_t$, and the conditional information $c_i$ is projected to both the key $K_t$ and the value $V_t$:

[0031] $Q_t = W_Q x_t,\quad K_t = W_K c_i,\quad V_t = W_V c_i$, (11)

[0032] where $W_Q$, $W_K$, and $W_V$ are weight matrices. The output of the cross-attention mechanism is calculated as follows:

[0033]

[0034] The denoising network first calculates a cross-attention score between the noisy traffic features and the conditional embedding, thereby achieving feature fusion. These scores are then fed into a feedforward network to output modulated features. This mechanism allows the denoising network to selectively focus on the most relevant parts of the conditional signal when extracting noisy traffic features.
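The fusion step can be sketched as follows, assuming the usual scaled dot-product form $\mathrm{softmax}(QK^\top/\sqrt{d})\,V$ for the attention output. The shapes, the random weights, and the row-vector convention (`x @ W` rather than `W x`) are assumptions made for the example.

```python
import numpy as np

def softmax(z, axis=-1):
    z = z - z.max(axis=axis, keepdims=True)  # subtract max for numerical stability
    e = np.exp(z)
    return e / e.sum(axis=axis, keepdims=True)

def cross_attention(x_t, c, W_Q, W_K, W_V):
    """Queries come from noisy traffic features, keys and values from the condition."""
    Q = x_t @ W_Q                              # (n_tokens, d)
    K = c @ W_K                                # (n_cond, d)
    V = c @ W_V                                # (n_cond, d)
    scores = Q @ K.T / np.sqrt(Q.shape[-1])    # scaled dot-product scores
    weights = softmax(scores, axis=-1)         # each row sums to 1
    return weights @ V, weights

rng = np.random.default_rng(0)
x_t = rng.standard_normal((168, 8))   # noisy traffic features (assumed shape)
c = rng.standard_normal((5, 4))       # condition embeddings (assumed shape)
W_Q = rng.standard_normal((8, 16))
W_K = rng.standard_normal((4, 16))
W_V = rng.standard_normal((4, 16))
fused, attn = cross_attention(x_t, c, W_Q, W_K, W_V)
```

Each row of `attn` shows how strongly one traffic position attends to each condition embedding, which is the selective focusing described above.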

[0035] Step 3: Train an HM model to generate bursty network traffic data. The HM model consists of two independent decision-making phases. First, a binary model is used to distinguish between zero and positive values. Then, for positive results, a count distribution truncated at zero is used for modeling. Given a set of covariates c, the HM model can be represented as:

[0036]

[0037] where $y_t$ is the observed value at time $t$, and the probability mass function $f(y_t; \lambda(c))$ describes the distribution of burst traffic events; $\lambda(c)$ is the parameter of the zero-truncated distribution and is defined as a function of the covariates $c$.

[0038] An indicator function is used to indicate whether a burst occurs at time $t$:

[0039]

[0040] Based on this, the probability distribution of $y_t$ given the covariates $c$ is:

[0041]
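The two-part distribution can be illustrated with a short sketch that assumes the zero-truncated count distribution is a zero-truncated Poisson; the patent only specifies "a count distribution truncated at zero", so this choice and the function names are assumptions. Here `pi0` plays the role of $\pi_t(c)$ and `lam` of $\lambda(c)$.

```python
import math

def poisson_pmf(k, lam):
    """Ordinary Poisson probability mass f(k; lam)."""
    return math.exp(-lam) * lam ** k / math.factorial(k)

def hurdle_pmf(y, pi0, lam):
    """P(y | c): mass pi0 at zero, zero-truncated Poisson mass on y >= 1."""
    if y == 0:
        return pi0
    # Renormalize the positive part by 1 - f(0; lam) so that it sums to 1
    return (1.0 - pi0) * poisson_pmf(y, lam) / (1.0 - math.exp(-lam))

# Sanity check: the probabilities over all counts should sum to 1
total = sum(hurdle_pmf(k, pi0=0.99, lam=3.0) for k in range(60))
```

Because the zero mass and the positive part are modelled separately, the probability of a burst can be made very small without distorting the distribution of burst magnitudes.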

[0042] The non-occurrence probability $\pi_t(c) = P(y_t = 0 \mid c)$ is predicted by the classification neural network $h_{\theta_c}$, where $\theta_c$ denotes the parameters of the classification network. For positive burst amplitudes, the mean parameter $\lambda(c)$ of the zero-truncated distribution is predicted by the regression neural network $g_{\theta_r}$, where $\theta_r$ denotes the parameters of the regression network. The model is trained by minimizing the following overall loss function:

[0043]

[0044] The first two terms constitute the binary cross-entropy loss, which optimizes the parameters of the classification network $h_{\theta_c}$ to predict the probability $\pi_t$ that a burst does not occur. The third term is the zero-truncated negative log-likelihood loss, which optimizes the regression network $g_{\theta_r}$ to predict the magnitude of positive burst traffic.
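A sketch of a loss with this three-term structure is shown below, again assuming a zero-truncated Poisson for the positive part and a mean over time steps; the exact weighting and distribution in the patent may differ. The inputs `pi0` and `lam` stand for the outputs of the classification and regression networks, which are not modelled here.

```python
import math
import numpy as np

def hurdle_loss(y, pi0, lam, eps=1e-12):
    """Binary cross-entropy on burst occurrence plus zero-truncated Poisson NLL."""
    y = np.asarray(y, dtype=float)
    pi0 = np.clip(np.asarray(pi0, dtype=float), eps, 1.0 - eps)
    lam = np.asarray(lam, dtype=float)
    burst = y > 0                                   # indicator of a burst at time t
    # Terms 1 and 2: cross-entropy for "no burst" vs. "burst"
    bce = -np.where(burst, np.log(1.0 - pi0), np.log(pi0))
    # Term 3: log-probability of y under a zero-truncated Poisson(lam)
    log_fact = np.array([math.lgamma(v + 1.0) for v in y])
    log_ztp = y * np.log(lam) - lam - log_fact - np.log1p(-np.exp(-lam))
    nll = -np.where(burst, log_ztp, 0.0)            # only burst steps contribute
    return float(np.mean(bce + nll))

y = [0, 0, 3, 0]
good = hurdle_loss(y, [0.99, 0.99, 0.01, 0.99], [3.0] * 4)  # confident and correct
bad = hurdle_loss(y, [0.01, 0.01, 0.99, 0.01], [3.0] * 4)   # confident and wrong
```

The regression term is masked by the burst indicator, so the regression network is only penalized on time steps where a burst actually occurred.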

[0045] Step 4: The final network traffic data is obtained by adding the outputs of the CDDPM model and the HM model. The complete generation process consists of two stages. First, Gaussian noise and the guiding conditions are fed into the trained denoising network with parameters $\theta_d$ to generate the periodic traffic. Then, the trained classification and regression networks, with parameters $\theta_c$ and $\theta_r$, synthesize the burst traffic from the conditional information. The final traffic output is defined as the sum of the periodic traffic component and the burst traffic component.

Description of the Drawings

[0046] Figure 1 is the architecture diagram of the DHEM model used to generate bursty network traffic;

[0047] Figure 2 is a block diagram of the training and noise-prediction generation process of the conditional diffusion model proposed in this invention;

[0048] Figure 3 is a block diagram of the training and burst traffic generation process of the hurdle model proposed in this invention;

[0049] Figure 4 shows the network traffic generated by this invention in the time domain;

[0050] Figure 5 shows the network traffic generated by this invention in the frequency domain.

Detailed Implementation

[0051] The present invention will now be described in further detail with reference to the accompanying drawings and embodiments.

[0052] The following describes the network traffic generation method based on a conditional diffusion model and a hurdle model proposed in this invention. As shown in Figure 1, this invention proposes a DHEM model that combines the CDDPM model with the HM model to generate mobile traffic with burst patterns. The framework consists of two components, each handling different traffic data characteristics. The CDDPM model generates the predictable periodic traffic, while the HM model generates the bursty traffic component. The final synthesized traffic is obtained by adding the outputs of these two models.

[0053] The DHEM model first sets the following training and generation conditions:

[0054] In the experiment, the continuous time domain of one week was discretized into 168 fine-grained time segments. Each dataset was randomly divided into training, validation, and test sets in a 7:1:2 ratio: 70% of the data for training, 10% for validation, and the remaining 20% for testing. All datasets were normalized using min-max normalization.
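The preprocessing described above can be sketched as follows. The synthetic gamma-distributed dataset, the number of samples, and the choice to normalize with global minimum and maximum are assumptions made for the example; the patent does not state whether the statistics are computed globally or per series.

```python
import numpy as np

def min_max_normalize(data):
    """Scale values to [0, 1] using the global minimum and maximum (assumed scope)."""
    lo, hi = data.min(), data.max()
    return (data - lo) / (hi - lo)

def split_7_1_2(samples, rng):
    """Random 7:1:2 split into training, validation, and test sets."""
    n = len(samples)
    perm = rng.permutation(n)
    n_train = n * 7 // 10
    n_val = n // 10
    train = samples[perm[:n_train]]
    val = samples[perm[n_train:n_train + n_val]]
    test = samples[perm[n_train + n_val:]]
    return train, val, test

rng = np.random.default_rng(0)
# Hypothetical dataset: 100 weekly series, each with 168 time segments
raw = rng.gamma(shape=2.0, scale=50.0, size=(100, 168))
norm = min_max_normalize(raw)
train, val, test_set = split_7_1_2(norm, rng)
```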

[0055] Based on the above conditions, the DHEM model proposed in this invention has been implemented, and experimental results demonstrate the effectiveness of the method. The specific implementation steps of the DHEM model are as follows:

[0056] Step 1: Divide the raw network traffic data into periodic traffic and burst traffic. Traffic values above the 99th percentile are defined as burst points, and the burst traffic is quantified as the positive residual between the raw traffic and a linear-interpolation baseline built from the remaining non-burst data points. This baseline is the periodic traffic; subtracting it from the raw network traffic data yields the burst traffic data.

[0057] Let $x = \{x_1, x_2, \ldots, x_N\}$ denote the original network traffic data, and let $T_r$ be the 99th percentile of the traffic data. Define the set of non-burst points $S = \{x_i \mid x_i < T_r\}$. The periodic traffic component $x_p$ is obtained by linear interpolation over the points in $S$:

[0058] $x_p = L(S)$, (17)

[0059] where $L(\cdot)$ denotes the linear interpolation function. Subtracting the periodic component from the original traffic data gives the burst traffic data $x_b$:

[0060] $x_b = x - x_p$. (18)

[0061] Step 2: A CDDPM model is trained on, and then used to generate, periodic network traffic data. As shown in Figure 2, the CDDPM model achieves controllable data synthesis by introducing external conditional information to guide the generation process. Its forward process is mathematically identical to that of the standard denoising diffusion model. Its reverse process, however, uses the conditional data in each iteration step, so that the model's output is not only similar to the training data but also conforms to the specified conditions.

[0062] (1) Forward Process: Given periodic traffic data $x_0 \sim q(x)$, the forward diffusion process of the CDDPM model can be described as a Markov chain that progressively adds Gaussian noise to $x_0$. The forward process can be expressed as:

[0063]

[0064] where $T$ is the total number of diffusion steps, $\beta_t \in (0,1)$ is the noise variance introduced at step $t$, whose value increases with $t$, i.e., $\beta_t < \beta_{t+1}$, $\mathcal{N}$ denotes the normal distribution, and $I$ is the identity matrix.

[0065] To obtain the data at any step $t$ directly from $x_0$ without iterative computation, $x_t$ can be expressed as:

[0066]

[0067] where $\alpha_t = 1 - \beta_t$,

[0068] (2) Reverse Process: The reverse denoising process of the CDDPM model forms a reverse Markov chain that, guided by the conditional information, gradually recovers the original data $x_0$ from Gaussian noise. A parameterized denoising network is trained to approximate the true reverse process by learning to predict the noise component at each step. The reverse process of the CDDPM model can therefore be expressed as:

[0069]

[0070]

[0071] where $\theta_d$ denotes the parameters of the denoising network. The predicted mean and variance are obtained by maximizing the lower bound of the log-likelihood function:

[0072]

[0073] The denoising network model is trained using the following objective function:

[0074]

[0075] where $\epsilon_{\theta_d}$ denotes the noise estimated by the denoising network.

[0076] The CDDPM model uses multiple residual layers as its denoising network. To reduce the risk of overfitting and enhance the controllability of generation, classifier-free guidance is used: during training, the conditional information is replaced with a zero vector with a certain probability $p$. In the reverse process, the final noise prediction is computed as a linear combination of the conditional and unconditional outputs of the denoising network, controlled by an adjustable guidance scale $w$, as follows:

[0077]

[0078] In addition, a fusion module based on a cross-attention mechanism is employed to extract the implicit correlation between the guidance information and the network traffic. The latent vector $x_t$ is projected to the query $Q_t$, and the conditional information $c_i$ is projected to both the key $K_t$ and the value $V_t$:

[0079] $Q_t = W_Q x_t,\quad K_t = W_K c_i,\quad V_t = W_V c_i$,

[0080] where $W_Q$, $W_K$, and $W_V$ are weight matrices. The output of the cross-attention mechanism is calculated as follows:

[0081]

[0082] The denoising network first calculates a cross-attention score between the noisy traffic features and the conditional embedding, thereby achieving feature fusion. These scores are then fed into a feedforward network to output modulated features. This mechanism allows the denoising network to selectively focus on the most relevant parts of the conditional signal when extracting noisy traffic features.

[0083] Step 3: An HM model is trained on, and then used to generate, bursty network traffic data. As shown in Figure 3, the HM model consists of two independent decision-making stages. First, a binary model is used to distinguish between zero and positive values. Then, for positive outcomes, a count distribution truncated at zero is used for modeling. Given a set of covariates $c$, the HM model can be represented as:

[0084]

[0085] where $y_t$ is the observed value at time $t$, and the probability mass function $f(y_t; \lambda(c))$ describes the distribution of burst traffic events; $\lambda(c)$ is the parameter of the zero-truncated distribution and is defined as a function of the covariates $c$.

[0086] An indicator function is used to indicate whether a burst occurs at time $t$:

[0087]

[0088] Based on this, the probability distribution of $y_t$ given the covariates $c$ is:

[0089]

[0090] The non-occurrence probability $\pi_t(c) = P(y_t = 0 \mid c)$ is predicted by the classification neural network $h_{\theta_c}$, where $\theta_c$ denotes the parameters of the classification network. For positive burst amplitudes, the mean parameter $\lambda(c)$ of the zero-truncated distribution is predicted by the regression neural network $g_{\theta_r}$, where $\theta_r$ denotes the parameters of the regression network. The model is trained by minimizing the following overall loss function:

[0091]

[0092] The first two terms constitute the binary cross-entropy loss, which optimizes the parameters of the classification network $h_{\theta_c}$ to predict the probability $\pi_t$ that a burst does not occur. The third term is the zero-truncated negative log-likelihood loss, which optimizes the regression network $g_{\theta_r}$ to predict the magnitude of positive burst traffic.

[0093] Step 4: The final network traffic data is obtained by adding the outputs of the CDDPM model and the HM model. The complete generation process consists of two stages. First, Gaussian noise and the guiding conditions are fed into the trained denoising network with parameters $\theta_d$ to generate the periodic traffic. Then, the trained classification and regression networks, with parameters $\theta_c$ and $\theta_r$, synthesize the burst traffic from the conditional information. The final traffic output is defined as the sum of the periodic traffic component and the burst traffic component.
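The two generation stages and the final summation can be sketched as below. The periodic series is a hypothetical stand-in for the output of the trained denoising network, the arrays `pi0` and `lam` stand in for the outputs of the classification and regression networks, and the zero-truncated Poisson sampler is an assumed choice of truncated count distribution.

```python
import numpy as np

def sample_zero_truncated_poisson(lam, rng, max_tries=1000):
    """Rejection sampling: draw Poisson(lam) until the result is positive."""
    for _ in range(max_tries):
        k = rng.poisson(lam)
        if k > 0:
            return int(k)
    return 1  # fallback for extremely small lam

def sample_bursts(pi0, lam, rng):
    """Stage 2: decide whether a burst occurs, then draw its magnitude."""
    pi0 = np.asarray(pi0, dtype=float)
    lam = np.asarray(lam, dtype=float)
    occurs = rng.random(pi0.shape) >= pi0      # burst with probability 1 - pi0
    bursts = np.zeros(pi0.shape)
    for i in np.flatnonzero(occurs):
        bursts[i] = sample_zero_truncated_poisson(lam[i], rng)
    return bursts

rng = np.random.default_rng(0)
t = np.arange(168)
periodic = 10.0 + 5.0 * np.sin(2 * np.pi * t / 24)  # stand-in for CDDPM output
pi0 = np.full(168, 0.99)                            # stand-in for classifier output
lam = np.full(168, 20.0)                            # stand-in for regressor output
bursts = sample_bursts(pi0, lam, rng)
traffic = periodic + bursts                         # final synthetic traffic
```

Keeping the two stages separate means the periodic generator never has to model rare spikes, and the burst generator never has to model the daily cycle.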

[0094] The performance of the proposed network traffic generation method based on a conditional diffusion model and a hurdle model has been verified experimentally. Figures 4 and 5 compare the network traffic generated by the proposed method with real traffic data in the time domain and the frequency domain, respectively. The results show that the network traffic data generated by the proposed model has high fidelity.

[0095] The contents not described in detail in this application are existing technologies known to those skilled in the art.

Claims

1. A network traffic generation method based on a conditional diffusion model and a hurdle model, comprising the following steps:

Step 1: Divide the raw network traffic data into periodic traffic and burst traffic; traffic values above the 99th percentile are defined as burst points, and the burst traffic is quantified as the positive residual between the raw traffic and a linear-interpolation baseline built from the remaining non-burst data points; subtracting the periodic traffic from the raw network traffic data yields the burst traffic data; let $x = \{x_1, x_2, \ldots, x_N\}$ denote the original network traffic data, and let $T_r$ be the 99th percentile of the traffic data; define the set of non-burst points $S = \{x_i \mid x_i < T_r\}$; the periodic traffic component $x_p$ is obtained by linear interpolation over the points in $S$: $x_p = L(S)$, (1) where $L(\cdot)$ denotes the linear interpolation function; subtracting the periodic component from the original traffic data gives the burst traffic data $x_b$: $x_b = x - x_p$; (2)

Step 2: A Conditional Denoising Diffusion Probabilistic Model (CDDPM) is trained on, and then used to generate, periodic network traffic data; the CDDPM guides the generation process by introducing external conditional information, thereby achieving controllable data synthesis; its forward process is mathematically identical to that of the standard denoising diffusion model; its reverse process, however, uses the conditional data in each iteration step, so that the model's output is not only similar to the training data but also conforms to the specified conditions;

(1) Forward process: Given periodic traffic data $x_0 \sim q(x)$, the forward diffusion process of the CDDPM can be described as a Markov chain that progressively adds Gaussian noise to $x_0$; the forward process can be expressed as: where $T$ is the total number of diffusion steps, $\beta_t \in (0,1)$ is the noise variance introduced at step $t$, whose value increases with $t$, i.e., $\beta_t < \beta_{t+1}$, $\mathcal{N}$ denotes the normal distribution, and $I$ is the identity matrix; to obtain the data at any step $t$ directly from $x_0$ without iterative computation, $x_t$ can be expressed as: where $\alpha_t = 1 - \beta_t$,

(2) Reverse process: The reverse denoising process of the CDDPM forms a reverse Markov chain that, guided by the conditional information, gradually recovers the original data $x_0$ from Gaussian noise; a parameterized denoising network is trained to approximate the true reverse process by learning to predict the noise component at each step; the reverse process of the CDDPM can therefore be expressed as: where $\theta_d$ denotes the parameters of the denoising network; the predicted mean and variance are obtained by maximizing the lower bound of the log-likelihood function; the denoising network is trained using the following objective function: where $\epsilon_{\theta_d}$ denotes the noise estimated by the denoising network;

The CDDPM uses multiple residual layers as its denoising network; to reduce the risk of overfitting and enhance the controllability of generation, classifier-free guidance is used: during training, the conditional information is replaced with a zero vector with a certain probability $p$; in the reverse process, the final noise prediction is computed as a linear combination of the conditional and unconditional outputs of the denoising network, controlled by an adjustable guidance scale $w$, as follows:

In addition, a fusion module based on a cross-attention mechanism is employed to extract the implicit correlation between the guidance information and the network traffic; the latent vector $x_t$ is projected to the query $Q_t$, and the conditional information $c_i$ is projected to both the key $K_t$ and the value $V_t$: $Q_t = W_Q x_t,\quad K_t = W_K c_i,\quad V_t = W_V c_i$, (11) where $W_Q$, $W_K$, and $W_V$ are weight matrices; the output of the cross-attention mechanism is calculated as follows: the denoising network first calculates the cross-attention score between the noisy traffic features and the conditional embedding to achieve feature fusion; these scores are then fed into a feedforward network to output the modulated features; this mechanism enables the denoising network to selectively focus on the most relevant parts of the conditional signal when extracting the noisy traffic features;

Step 3: A hurdle model is trained on, and then used to generate, bursty network traffic data; the hurdle model consists of two independent decision-making phases; first, a binary model is used to distinguish between zero and positive values; then, for positive results, a count distribution truncated at zero is used for modeling; given a set of covariates $c$, the hurdle model can be represented as: where $y_t$ is the observed value at time $t$, and the probability mass function $f(y_t; \lambda(c))$ describes the distribution of burst traffic events; $\lambda(c)$ is the parameter of the zero-truncated distribution and is defined as a function of the covariates $c$; an indicator function is used to indicate whether a burst occurs at time $t$: based on this, the probability distribution of $y_t$ given the covariates $c$ is: the non-occurrence probability $\pi_t(c) = P(y_t = 0 \mid c)$ is predicted by the classification neural network $h_{\theta_c}$, where $\theta_c$ denotes the parameters of the classification network; for positive burst amplitudes, the mean parameter $\lambda(c)$ of the zero-truncated distribution is predicted by the regression neural network $g_{\theta_r}$, where $\theta_r$ denotes the parameters of the regression network; the model is trained by minimizing the following overall loss function: the first two terms constitute the binary cross-entropy loss, which optimizes the parameters of the classification network $h_{\theta_c}$ to predict the probability $\pi_t$ that a burst does not occur; the third term is the zero-truncated negative log-likelihood loss, which optimizes the regression network $g_{\theta_r}$ to predict the magnitude of positive burst traffic;

Step 4: The final network traffic data is obtained by adding the outputs of the conditional denoising diffusion probabilistic model and the hurdle model; the complete generation process consists of two stages; first, Gaussian noise and the guiding conditions are fed into the trained denoising network with parameters $\theta_d$ to generate the periodic traffic; then, the trained classification and regression networks, with parameters $\theta_c$ and $\theta_r$, synthesize the burst traffic from the conditional information; the final traffic output is defined as the sum of the periodic traffic component and the burst traffic component.