Situational awareness-based network vulnerability defense methods and systems

By collecting multimodal traffic characteristics, using a fusion twin network model to identify abnormal traffic behavior, and combining resource concealment and path collapse rules for dynamic strategy adjustment, the problem of rigid network defense strategies in existing technologies is solved, and effective responses to unknown vulnerabilities and zero-day attacks are achieved, along with an adaptive improvement in protection strategies.

CN121664565B · Active · Publication Date: 2026-06-30 · GUANGZHOU ELECTRIC POWER COMM NETWORK LTD

Patent Information

Authority / Receiving Office
CN · China
Patent Type
Patents(China)
Current Assignee / Owner
GUANGZHOU ELECTRIC POWER COMM NETWORK LTD
Filing Date
2026-01-28
Publication Date
2026-06-30

AI Technical Summary

Technical Problem

Existing network vulnerability defense technologies lack a deep understanding of the overall network operating status and the ability to perform multi-dimensional correlation analysis, making it difficult to identify slow attacks and advanced persistent threats. Furthermore, defense strategies cannot be dynamically adjusted, resulting in a decrease in protection effectiveness over time.

Method used

The method collects multimodal traffic features, uses a fusion twin network model to identify abnormal traffic behavior patterns, constructs an access control policy matrix, optimizes and adjusts it through resource concealment rules and path collapse rules, and applies a game theory evaluation model for dynamic policy adjustment.

Benefits of technology

It effectively addresses unknown vulnerabilities and zero-day attacks, improves the adaptability of network protection and the accuracy of defense strategies, and avoids the lag and limitations of traditional defense strategies.

✦ Generated by Eureka AI based on patent content.


Abstract

This application relates to a situational awareness-based network vulnerability defense method and system. The method comprises the following steps: extracting multimodal traffic features and inputting them into a fusion twin network model, which outputs security event information; constructing a preliminary control policy matrix based on the security event information, and optimizing and adjusting it to generate an access control policy matrix; converting the access control policy matrix into a network control instruction set and sending it to each execution device; and monitoring the network situation of the execution devices, evaluating them using a game theory evaluation model, and dynamically adjusting the optimization parameters based on the evaluation results. In summary, this application identifies abnormal traffic behavior patterns through the fusion twin network model and, by combining multi-objective optimization with dynamic policy adjustment mechanisms, addresses the static policy rigidity and low resource scheduling efficiency of traditional defense technologies, thereby achieving accurate identification of abnormal traffic behavior patterns, optimizing resource scheduling configuration, and improving the adaptability of network protection.

Description

Technical Field

[0001] This application relates to the technical field of network security, and in particular to a situational awareness-based network vulnerability defense method and system.

Background Technology

[0002] Currently, mainstream network vulnerability defense technologies primarily rely on signature-based matching and static policy rules. These technologies utilize intrusion detection systems, firewalls, and other security facilities to identify and intercept security threats by comparing network traffic based on signatures and filtering it according to predefined rules. However, this defense model, which relies on prior knowledge, exhibits significant limitations when facing unknown vulnerabilities, zero-day attacks, and encrypted traffic, with its protective effectiveness lagging far behind the evolution of new threats.

[0003] In practical applications, existing technologies often lack a deep understanding of the overall network operating status and the ability to perform multi-dimensional correlation analysis. Security devices often operate independently, and the alarm information they generate is difficult to form an effective threat chain, making it difficult to accurately identify low-speed attacks and advanced persistent threats. In addition, the response mechanisms in existing technologies mostly adopt statically configured blocking strategies, which are difficult to dynamically adjust protection measures according to the real-time threat situation. This not only results in insufficient accuracy in handling the situation, but may also cause unnecessary interference to normal business operations.

[0004] More importantly, existing technologies generally lack adaptive optimization capabilities in their response mechanisms. Once deployed, their defense strategies are difficult to adjust dynamically according to changes in attack methods and protection effectiveness, so the defense system cannot continuously evolve and its protection effectiveness decreases over time.

Summary of the Invention

[0005] To address the aforementioned shortcomings, this application provides a situational awareness-based network vulnerability defense method and system.

[0006] The above-mentioned objective of this application is achieved through the following technical solution:

[0007] A situational awareness-based network vulnerability defense method includes the following steps:

[0008] Raw data from the communication link is collected, and multimodal traffic features are extracted. These multimodal traffic features include protocol behavior sequence features, payload entropy distribution features, communication periodicity features, and connection topology features.

[0009] The extracted multimodal traffic features are input into a pre-trained fusion twin network model, enabling the fusion twin network model to identify abnormal traffic behavior patterns and output security event information containing quintuple information and threat levels.

[0010] A preliminary control policy matrix is constructed based on security event information, and the preliminary control policy matrix is optimized and adjusted by pre-set resource concealment rules and path collapse rules to generate an access control policy matrix.

[0011] The generated access control policy matrix is converted into a network control instruction set and sent to each execution device;

[0012] Network situation monitoring is performed on the execution devices that receive network control command sets. The execution effect of the execution devices is quantitatively evaluated by a pre-trained game theory evaluation model, and the parameters of resource concealment rules and path collapse rules are dynamically adjusted based on the quantitative evaluation results.

[0013] The second objective of this invention is achieved through the following technical solution:

[0014] A situational awareness-based network vulnerability defense system, comprising:

[0015] The feature extraction module is used to collect raw data in the communication link and extract multimodal traffic features, including protocol behavior sequence features, payload entropy distribution features, communication periodicity features, and connection topology features.

[0016] The feature input module is used to input the extracted multimodal traffic features into the pre-trained fusion twin network model, enabling the fusion twin network model to identify abnormal traffic behavior patterns and output security event information containing quintuple information and threat level.

[0017] The matrix generation module is used to construct a preliminary control policy matrix based on security event information, and to optimize and adjust the preliminary control policy matrix through pre-set resource concealment rules and path collapse rules to generate an access control policy matrix.

[0018] The instruction generation module is used to convert the generated access control policy matrix into a network control instruction set and send it to each execution device;

[0019] The parameter adjustment module is used to monitor the network situation of the execution device that receives the network control command set. It uses a pre-trained game theory evaluation model to quantitatively evaluate the execution effect of the execution device and dynamically adjusts the parameters of the resource concealment rule and path collapse rule based on the quantitative evaluation results.

[0020] In summary, the situational awareness-based network vulnerability defense method and system provided in this application identifies abnormal traffic behavior patterns through a fusion twin network model. Combined with multi-objective optimization and dynamic policy adjustment mechanisms, it can effectively solve the problems of static policy rigidity and low resource scheduling efficiency in traditional defense technologies. It enables dynamic adjustment of defense strategies, accurate identification of abnormal traffic behavior patterns, and optimized resource scheduling configuration, thereby improving the adaptability of network protection.

Attached Figure Description

[0021] Figure 1 is a flowchart of an embodiment of the situational awareness-based network vulnerability defense method of this application;

[0022] Figure 2 is a flowchart of step S10 in an embodiment of the situational awareness-based network vulnerability defense method of this application;

[0023] Figure 3 is a flowchart of step S20 in an embodiment of the situational awareness-based network vulnerability defense method of this application.

Detailed Implementation

[0024] This application is described in further detail below with reference to Figures 1-3.

[0025] In one embodiment, as shown in Figure 1, this application discloses a network vulnerability defense method based on situational awareness, which specifically includes the following steps:

[0026] S10: Collect raw data from the communication link and extract multimodal traffic features, including protocol behavior sequence features, payload entropy distribution features, communication periodicity features, and connection topology features;

[0027] In this embodiment, situational awareness refers to monitoring, analyzing, and understanding various elements in the network environment to form an overall understanding of the current network security status and predict potential future security events. The situational awareness process may involve data collection, processing, analysis, visualization, and decision support, providing a macro-level perspective and decision-making basis for network defense. Multimodal traffic features refer to a set of traffic attributes of various types and dimensions extracted from the raw data of network communication links, used to characterize network traffic behavior patterns from different perspectives. They include: protocol behavior sequence features, which describe the interaction order and state transitions of the communicating parties according to protocol specifications, such as whether the TCP three-way handshake is complete or whether there is an abnormal reset; payload entropy distribution features, which measure the randomness of packet payload content, where high entropy values may indicate that the data is encrypted or compressed, while low entropy values may indicate plaintext or executable code; communication periodicity features, which reflect the regularity of traffic over time, such as heartbeat packets, periodic queries, and periodic data reporting; and connection topology features, which use graph theory algorithms to analyze the connection behavior of hosts in the network, such as how many other hosts a host is communicating with, and are used to detect scanning and lateral movement.

[0028] Specifically, the process begins by collecting raw data from the communication link and extracting multimodal traffic features. This raw data can be raw data packets captured by a network interface card or flow records from network devices. Multimodal traffic feature extraction can be achieved through different dimensions of analysis of the raw data. For example, parsing the protocol fields of data packets can identify their protocol types and state transitions, thus forming protocol behavior sequence features. Alternatively, statistical analysis of the payload content of data packets can be performed to calculate their information entropy value, characterizing the payload entropy distribution. Furthermore, analyzing time-series traffic data can identify periodic patterns, forming communication periodic features. Alternatively, analyzing the connection relationships and performance indicators of network devices can construct a network topology map and calculate the connection attributes between nodes, forming connection topology features.

[0029] S20: Input the extracted multimodal traffic features into the pre-trained fusion twin network model, enabling the fusion twin network model to identify abnormal traffic behavior patterns and output security event information containing quintuple information and threat level;

[0030] In this embodiment, the fusion twin network model refers to a pre-trained deep learning model architecture. Specifically, its core is to learn the similarities or differences between input data by utilizing the structure of the twin network. By fusing features from different modalities, the fusion twin network model can more comprehensively understand the data and perform pattern recognition, such as the identification of abnormal traffic behavior patterns. The quintuple information is the basic information used to uniquely identify network connections, including source IP address, destination IP address, source port, destination port, and protocol type. Security event information refers to the detailed data about a network security event generated and recorded when it occurs. The security event information typically includes the type of event, time, entities involved, threat level, and other relevant contextual information such as the quintuple information.

[0031] Specifically, the extracted multimodal traffic features are input into a pre-trained fusion twin network model. This fusion twin network model is a deep learning model trained on a large amount of normal and abnormal traffic data. Its working principle is to identify abnormal behavior by learning the similarity or difference between different traffic patterns. For example, it can receive multimodal traffic features as input and compare them with known normal behavior patterns stored in the model. When the similarity between the input features and the normal behavior patterns is lower than the corresponding threshold, it is judged as abnormal traffic. After identifying the abnormal traffic behavior pattern, the fusion twin network model will output security event information containing five-tuple information and threat level. The five-tuple information usually includes source IP address, destination IP address, source port, destination port, and protocol type, which can uniquely identify a corresponding network connection. The threat level indicates the potential harm of the abnormal behavior.
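The threshold comparison described above can be illustrated with a minimal sketch. It assumes, purely for illustration, that the twin branches have already mapped a flow's multimodal features to an embedding vector and that known normal behavior is stored as prototype embeddings; the names `classify_flow` and `SecurityEvent`, the cosine similarity measure, the 0.8 threshold, and the threat-level cut-offs are all hypothetical and are not taken from the application.

```python
import math
from dataclasses import dataclass


@dataclass
class SecurityEvent:
    quintuple: tuple      # (src_ip, dst_ip, src_port, dst_port, protocol)
    threat_level: str     # hypothetical labels: "low", "medium", "high"
    similarity: float     # best similarity to any normal prototype


def cosine_similarity(a, b):
    """Cosine similarity of two equal-length vectors; 0.0 if either is zero."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def classify_flow(embedding, normal_prototypes, quintuple, threshold=0.8):
    """Return a SecurityEvent if the flow is unlike every normal prototype."""
    best = max(cosine_similarity(embedding, p) for p in normal_prototypes)
    if best >= threshold:
        return None  # close enough to known normal behavior
    # Assumed mapping: the lower the similarity, the higher the threat level.
    if best < 0.3:
        level = "high"
    elif best < 0.6:
        level = "medium"
    else:
        level = "low"
    return SecurityEvent(quintuple, level, best)
```

A flow whose embedding matches a normal prototype yields no event, while an embedding orthogonal to every prototype yields an event carrying the flow's quintuple and a high threat level.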

[0032] S30: Construct a preliminary control policy matrix based on security event information, and optimize and adjust the preliminary control policy matrix through pre-set resource concealment rules and path collapse rules to generate an access control policy matrix;

[0033] In this embodiment, the preliminary control policy matrix refers to the set of access control rules or defense measures initially generated based on the identified security event information. Since it is initially generated, the preliminary control policy matrix may contain redundancy or not be fully adapted to the current network situation. Resource concealment rules specify how and by what technical means to hide or obscure the true existence or accessibility of specific resources in the network when a potential threat is detected, thereby increasing the difficulty for attackers to discover and exploit these resources. Path collapse rules specify how to restrict or cut off specific network communication paths that attackers may exploit by adjusting network routing, firewall rules, or virtual network configurations when a potential threat is detected, thereby preventing the further spread of the attack or its reaching critical targets. The access control policy matrix, after optimization and adjustment, refers to the set of rules used to specify the access control decisions executed by network devices. The access control policy matrix defines which corresponding resources different entities can access, and under what conditions access is permitted.

[0034] Specifically, based on security event information, a preliminary control policy matrix is constructed. This matrix can be a list of access control rules, such as blocking rules for specific source IP addresses or traffic restriction rules for specific ports. These access control rules are initially generated based on the nature and threat level of the security event. To improve the accuracy and effectiveness of the defense strategy, the preliminary control policy matrix is optimized and adjusted using pre-set resource concealment rules and path collapse rules to generate an access control policy matrix. The resource concealment rule specifies that when an attack is detected, the IP address or port of the attacked resource should be dynamically changed to make it invisible to the attacker. The path collapse rule can dynamically adjust network routing based on the analysis of the attack path, directing attack traffic to the honeypot or dropping it directly, thereby cutting off the attack path. Through the resource concealment rules and path collapse rules, the defense strategy can adaptively adjust according to the real-time network situation, rather than maintaining static defense measures.
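One possible way to picture the two-stage construction is sketched below. The representation is an assumption: the matrix is modeled as a dictionary keyed by (source, resource) pairs, threat levels use a hypothetical 1 to 3 scale, and the thresholds and honeypot address are illustrative parameters rather than values from the application.

```python
def build_preliminary_matrix(events):
    """Initial rule per (source, resource) pair, chosen from the threat level.

    `events` is an iterable of (src_ip, resource, threat_level) tuples, where
    `resource` is a (dst_ip, dst_port) pair. The 1-3 scale is an assumption.
    """
    matrix = {}
    for src_ip, resource, threat_level in events:
        matrix[(src_ip, resource)] = "deny" if threat_level >= 2 else "rate_limit"
    return matrix


def optimize_matrix(preliminary, threat_levels,
                    conceal_threshold=3, collapse_threshold=2,
                    honeypot="10.0.0.250"):
    """Apply simplified path collapse and resource concealment rules.

    Returns the adjusted matrix and the set of resources whose address or
    port should be changed so that the attacker can no longer see them.
    """
    adjusted = {}
    concealed = set()
    for (src_ip, resource), action in preliminary.items():
        level = threat_levels[(src_ip, resource)]
        if level >= collapse_threshold:
            # Path collapse: steer the attack path into a honeypot.
            action = f"redirect:{honeypot}"
        if level >= conceal_threshold:
            # Resource concealment: mark the target for re-addressing.
            concealed.add(resource)
        adjusted[(src_ip, resource)] = action
    return adjusted, concealed
```

Keeping the thresholds as explicit parameters mirrors the later feedback step, in which the rule parameters are tuned according to the evaluation results.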

[0035] S40: Convert the generated access control policy matrix into a network control instruction set and send it to each execution device;

[0036] In this embodiment, the network control instruction set refers to the sequence of commands, directly understandable and executable by network devices, into which the access control policy matrix is converted. Each instruction in the network control instruction set can be used to configure firewall rules, routing table entries, security group policies, etc., to control network traffic. The core elements of an instruction are a matching field, an action field, a priority, and an expiration. The matching field defines the conditions under which a rule takes effect, typically based on packet header information. The action field defines the operation that the device should perform when a packet matches the conditions. The priority allows the device to determine the execution order, or whether to execute an instruction at all, when multiple instructions match a packet simultaneously. The expiration gives an instruction a lifespan after which it automatically expires, which suits temporary security policies. The matching field includes the classic 5-tuple (source IP address, destination IP address, source port, destination port, and protocol type) and other matching items such as the input interface, output interface, TCP flags, and DSCP value. The action field includes allow, deny, redirect, rate limit, and modify: allow lets the packet pass; deny sends a deny message; redirect forwards the packet to a specified destination; rate limit lets the matched traffic pass without exceeding a set bandwidth threshold; and modify changes certain fields of the packet. The execution device refers to the physical or virtual device that receives and implements the issued network control instruction set.
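The four core elements of an instruction can be modeled with a small data structure, as in the following sketch. The class and function names are hypothetical, and two conventions are assumed rather than stated in the application: a larger number means a higher priority, and an expiration is an absolute timestamp in seconds.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Instruction:
    match: dict                         # header fields that must all be equal
    action: str                         # allow, deny, redirect, rate_limit, modify
    priority: int                       # assumed: larger value wins
    expires_at: Optional[float] = None  # None means the rule never expires


def is_live(instr, now):
    """An instruction is live until its expiration time is reached."""
    return instr.expires_at is None or now < instr.expires_at


def matches(instr, packet):
    """True if every field named in the matching field equals the packet's."""
    return all(packet.get(field) == value for field, value in instr.match.items())


def select_action(instructions, packet, now, default="allow"):
    """Pick the action of the highest-priority live instruction that matches."""
    candidates = [i for i in instructions if is_live(i, now) and matches(i, packet)]
    if not candidates:
        return default
    return max(candidates, key=lambda i: i.priority).action
```

With a temporary high-priority deny rule and a permanent low-priority allow rule for the same source, `select_action` returns `deny` before the deny rule expires and `allow` afterwards, which is the behavior the expiration element is meant to provide for temporary security policies.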

[0037] For example, the execution device may mainly include:

[0038] Network core switching equipment: such as routers and core switches, which are the core of network traffic forwarding, can execute instructions in their ACLs or flow tables to achieve network layer access control, redirection, and rate limiting; Professional network security equipment: such as next-generation firewalls, which can execute corresponding policies based on application layer content, user identity, threat intelligence, etc., such as blocking specific applications and intercepting intrusion behavior; Software-defined network switches: In the SDN architecture, they are data plane units that execute forwarding and control policies through OpenFlow or P4 flow tables issued by the controller; Load balancers: They can redirect suspected malicious traffic to security detection or scrubbing clusters according to policies; Intrusion prevention systems and unified threat management equipment: They can realize real-time detection and blocking of attack traffic; Host security agents and host firewalls: They are deployed on terminal devices to execute fine-grained access control for the local machine; Smart network interface cards (NICs): They achieve high-speed execution of network policies inside the server through hardware offloading mechanisms.

[0039] Specifically, the generated access control policy matrix is converted into a network control instruction set and sent to each execution device. The network control instruction set consists of specific configuration commands for different network devices. For example, for a firewall, the network control instruction set may contain commands to add or modify access control lists; for a router, the network control instruction set may contain commands to modify routing table entries. The network control instruction set is sent to the corresponding execution devices through standard protocols to achieve rapid deployment and immediate effect of the policies.

[0040] S50: Monitor the network situation of the execution devices that receive the network control instruction set, quantitatively evaluate the execution effect of the execution devices through a pre-trained game theory evaluation model, and dynamically adjust the parameters of the resource concealment rules and path collapse rules based on the quantitative evaluation results.

[0041] In this embodiment, the game theory evaluation model is a pre-trained mathematical model based on game theory principles, used to analyze and evaluate the potential gains and risks of different strategy choices in an environment with adversarial behavior. The game theory evaluation model can quantitatively evaluate the effectiveness of defense strategies and provide a theoretical basis for the optimization and adjustment of strategies.

[0042] Specifically, network situation monitoring is performed on the execution devices receiving network control command sets to verify the actual effectiveness of deployed strategies. A pre-trained game theory evaluation model is used to quantitatively evaluate the execution performance of the devices. This model simulates the adversarial process between attackers and defenders and quantitatively evaluates the probability of an attacker successfully exploiting a vulnerability and the benefit of a defender successfully intercepting the attack under the current strategy. Specifically, it comprehensively considers multiple factors, such as attack interception success rate, service false interception rate, and real-time performance indicators, to calculate the corresponding comprehensive benefit value. Based on the quantitative evaluation results, the parameters of resource concealment rules and path collapse rules are dynamically adjusted. For example, if the evaluation results show a low attack interception success rate, the trigger threshold or intensity of the resource concealment rules can be adjusted to more actively hide resources; if the service false interception rate is high, the probability or scope of the path collapse rules can be adjusted to more accurately restrict traffic.
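The feedback step can be sketched as follows. This is not a game-theoretic model: a simple weighted payoff is swapped in as a stand-in for the pre-trained evaluation model, and the weights, target rates, step size, and parameter names (`conceal_trigger_threshold`, `collapse_probability`) are assumptions chosen only to show the direction of each adjustment.

```python
def comprehensive_benefit(intercept_rate, false_intercept_rate, latency_ms,
                          weights=(0.6, 0.3, 0.1), latency_budget_ms=50.0):
    """Weighted payoff: reward interceptions, penalize false hits and delay."""
    w_hit, w_false, w_latency = weights
    latency_penalty = min(latency_ms / latency_budget_ms, 1.0)
    return (w_hit * intercept_rate
            - w_false * false_intercept_rate
            - w_latency * latency_penalty)


def adjust_parameters(params, intercept_rate, false_intercept_rate,
                      min_intercept=0.9, max_false=0.05, step=0.05):
    """Return a new parameter dict nudged according to the evaluation."""
    updated = dict(params)
    if intercept_rate < min_intercept:
        # Too many attacks get through: conceal resources more readily.
        updated["conceal_trigger_threshold"] = max(
            0.0, params["conceal_trigger_threshold"] - step)
    if false_intercept_rate > max_false:
        # Too much legitimate traffic is hit: collapse paths less often.
        updated["collapse_probability"] = max(
            0.0, params["collapse_probability"] - step)
    return updated
```

Lowering the concealment trigger threshold corresponds to hiding resources more actively when the interception success rate is low, and lowering the collapse probability corresponds to restricting traffic more selectively when the false interception rate is high.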

[0043] Through the above technical solution, this embodiment combines multimodal traffic feature extraction with abnormal behavior identification based on a fused twin network model to achieve early detection and identification of network threats. Compared with traditional defense methods that rely on single features or static signatures, this method can characterize network traffic behavior from multiple dimensions, thereby effectively responding to unknown vulnerabilities and zero-day attacks. Furthermore, this method constructs a preliminary control policy matrix based on security event information and optimizes it through resource concealment rules and path collapse rules to generate an access control policy matrix. This allows the defense strategy to adaptively adjust according to the real-time network situation, avoiding the lag and limitations of traditional static strategies in the face of complex and ever-changing attacks. In addition, this method monitors the network situation of the execution device and uses a game theory evaluation model to quantitatively evaluate the execution effect, thereby achieving continuous optimization of the defense strategy. At the same time, by dynamically adjusting the parameters of resource concealment rules and path collapse rules based on the quantitative evaluation results, a closed-loop adaptive defense system is formed, overcoming the limitations of existing static defense strategies in adapting to attack changes and optimizing protection effects.

[0044] In one embodiment, the raw data includes data packets, network flow records, network node connection information, and network device performance metrics, and, as shown in Figure 2, step S10 includes:

[0045] S11: Collect data packets in the communication link, perform protocol parsing on the data packets, and extract the protocol state transition sequence and message interaction timing as protocol behavior sequence features;

[0046] In this embodiment, raw data refers to unprocessed initial information obtained from the network. This raw data includes data packets, network flow records, network node connection information, and network device performance metrics. Data packets are the basic unit of network communication and are typically the direct source for protocol behavior analysis and payload content analysis. Also known as network packets or frames, a data packet consists of a header and a payload. The header contains control information such as source IP address, destination IP address, port number, protocol type, sequence number, and flags. The payload contains the actual application layer data to be transmitted. Network flow records, such as NetFlow or IPFIX data, record metadata about network communication, such as source IP address, destination IP address, port number, protocol type, number of bytes transmitted, and number of data packets. Network node connection information describes the physical or logical connections between devices in the network and is fundamental data for constructing the network topology. Network device performance metrics reflect the operating status and resource utilization of network devices, such as CPU utilization, memory utilization, interface bandwidth utilization, packet loss rate, and latency. These metrics help assess network health and identify potential performance bottlenecks or anomalies.

[0047] In this embodiment, a communication link refers to a physical or logical channel for data transmission between two nodes in a network. It represents the path data follows from its source to its destination. A physical link refers to a real, existing hardware connection, such as fiber optic cables, network cables, or radio waves, focusing on the physical and data link layers. A logical link, on the other hand, refers to a virtual communication channel established over a physical medium through protocols. For example, a network cable acting as a physical link can simultaneously carry traffic from multiple VLANs, i.e., multiple logical links. A TCP connection acting as a logical link is established over an IP network acting as a logical network. Protocol parsing refers to the process of performing structured analysis on raw data packets according to known network protocol specifications, identifying and extracting protocol fields at each layer. Protocol state transition sequence refers to information describing the sequence of state changes experienced by both communicating parties during the establishment, maintenance, and termination of connections, according to protocol specifications. Message interaction timing refers to analyzing the pattern characteristics of the time intervals and frequencies of data packet arrivals. Protocol behavior sequence characteristics refer to the state transition sequence formed by parsing the protocol interaction process in data packets. Specifically, this can be implemented using a TCP three-way handshake state machine or an HTTP request-response sequence to identify abnormal behaviors that do not conform to standard protocol specifications.
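The TCP three-way handshake state machine mentioned above can be sketched from the point of view of a passive observer. The state names, the flag labels, and the function name are simplifications assumed for illustration; a real TCP implementation tracks more states and both directions of the connection.

```python
# Transitions seen by a passive observer of one connection attempt.
HANDSHAKE_TRANSITIONS = {
    ("CLOSED", "SYN"): "SYN_SENT",
    ("SYN_SENT", "SYN-ACK"): "SYN_RECEIVED",
    ("SYN_RECEIVED", "ACK"): "ESTABLISHED",
}


def handshake_states(flag_sequence):
    """Replay observed TCP flags and return (state sequence, completed?).

    The state sequence is the protocol behavior sequence feature; any flag
    that has no transition from the current state ends it with "INVALID".
    """
    state = "CLOSED"
    states = [state]
    for flag in flag_sequence:
        next_state = HANDSHAKE_TRANSITIONS.get((state, flag))
        if next_state is None:
            states.append("INVALID")
            return states, False
        state = next_state
        states.append(state)
    return states, state == "ESTABLISHED"
```

A normal `SYN`, `SYN-ACK`, `ACK` exchange ends in `ESTABLISHED`, whereas a `SYN` followed directly by `RST` is flagged as an invalid transition, which is the kind of abnormal reset the protocol behavior sequence features are intended to expose.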

[0048] S12: Perform deep packet inspection on the data packet and extract the application layer payload content. Use the sliding window sampling method to segment the payload content and calculate the Shannon entropy value to form the payload entropy value distribution characteristics.

[0049] In this embodiment, deep packet inspection refers to a technique for analyzing the payload portion of data packets. Through deep packet inspection, the application layer protocol type can be identified, and the application layer payload content can be extracted. The application layer payload content refers to the actual data portion of the data packet, which typically includes user data or application protocol data. Sliding window sampling is a data processing technique that defines a fixed-size window and slides it across the data stream with a certain step size to process the data within the window. Sliding window sampling can capture local features of the data stream and is suitable for segmented analysis of continuous data. Shannon entropy is an indicator in information theory used to measure the uncertainty or randomness of information. The higher the value, the more random or complex the data. In the payload content analysis of this embodiment, Shannon entropy can reflect the degree of disorder, encryption, or whether the payload data contains a specific pattern. The payload entropy distribution characteristic refers to the distribution characteristic of the information entropy value calculated by segmenting the application layer data, which is used to detect encrypted traffic or data obfuscation behavior.
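A minimal sketch of sliding-window Shannon entropy over a payload follows. The window size of 256 bytes and the step of 128 bytes are illustrative assumptions; the application does not state specific values.

```python
import math
from collections import Counter


def shannon_entropy(data: bytes) -> float:
    """Shannon entropy in bits per byte (0.0 for empty input, at most 8.0)."""
    if not data:
        return 0.0
    total = len(data)
    return -sum((count / total) * math.log2(count / total)
                for count in Counter(data).values())


def entropy_profile(payload: bytes, window: int = 256, step: int = 128):
    """Entropy of each sliding window; one value if the payload is short."""
    if len(payload) <= window:
        return [shannon_entropy(payload)]
    return [shannon_entropy(payload[start:start + window])
            for start in range(0, len(payload) - window + 1, step)]
```

A payload of identical bytes gives windows with entropy near 0, while a window in which all 256 byte values occur equally often reaches the 8-bit maximum, the range within which encrypted or compressed content tends to sit near the top.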

[0050] S13: Construct a traffic time series based on network flow records, and analyze the traffic time series using fast Fourier transform to extract communication periodic characteristics;

[0051] In this embodiment, network flow records refer to a coarser-grained data format than data packets. Instead of recording the entire contents of each data packet, network flow records summarize and statistically analyze data packet communications with the same five-tuple within a specific time window. The recorded information typically includes start time, end time, total number of bytes transmitted, total number of data packets, TCP flags, etc. Traffic time series refers to a set of traffic data points arranged in chronological order, such as the number of bytes per second, minute, or hour, the number of data packets, or the number of connections. By aggregating and statistically analyzing network flow records, a traffic time series reflecting the dynamic changes in network traffic can be constructed. Fast Fourier Transform (FFT) is an algorithm used to convert time-domain signals into frequency-domain signals. By analyzing the spectrum of the frequency-domain signal, periodic components in the traffic data can be identified; in this embodiment, this refers to communication periodic characteristics, such as daily, weekly, or monthly traffic peaks and troughs.
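The periodicity analysis can be illustrated with a small, dependency-free sketch. It assumes an evenly sampled traffic time series whose length is a power of two, uses a textbook recursive radix-2 FFT, and reports only the single strongest period; these choices, and the function names, are assumptions made for illustration.

```python
import cmath
import math


def fft(values):
    """Recursive radix-2 Cooley-Tukey FFT; len(values) must be a power of two."""
    n = len(values)
    if n == 1:
        return list(values)
    even = fft(values[0::2])
    odd = fft(values[1::2])
    out = [0j] * n
    for k in range(n // 2):
        twiddle = cmath.exp(-2j * cmath.pi * k / n) * odd[k]
        out[k] = even[k] + twiddle
        out[k + n // 2] = even[k] - twiddle
    return out


def dominant_period(series, sample_interval=1.0):
    """Period (in time units) of the strongest non-constant frequency."""
    n = len(series)
    if n < 4 or n & (n - 1):
        raise ValueError("series length must be a power of two and at least 4")
    mean = sum(series) / n
    spectrum = fft([v - mean for v in series])  # remove the constant component
    k = max(range(1, n // 2), key=lambda i: abs(spectrum[i]))
    return n * sample_interval / k


# Example: traffic that repeats every 8 samples, such as a heartbeat.
series = [math.sin(2 * math.pi * i / 8) for i in range(64)]
period = dominant_period(series)
```

Subtracting the mean before the transform keeps the constant traffic level from masking the periodic component; in practice the series would be built by aggregating flow records into fixed time bins.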

[0052] S14: Construct a topology graph based on network node connection information and network device performance indicators, and calculate the degree centrality and betweenness centrality of each node in the topology graph as connection topology features.

[0053] In this embodiment, network node connection information refers to topology data describing how devices in the network are interconnected. It defines the physical or logical link relationships between nodes and typically comes from network management protocols, LLDP protocols, or network administrator configuration information. Network device performance indicators refer to quantitative data reflecting the operating status of network devices, mainly including CPU utilization, memory utilization, port bandwidth utilization, and number of connections. A topology graph is an abstract representation of the network structure, where nodes represent devices or hosts in the network, and edges represent the connection relationships between these devices. Degree centrality is an indicator that measures the number of other nodes directly connected to a node in the topology graph, reflecting the node's activity and direct influence. Betweenness centrality is an indicator that measures the number of times a node acts as a mediator on the shortest path between other nodes in the topology graph. It calculates the proportion of all shortest paths in the network structure that pass through the node, reflecting the node's importance in information flow or control flow. Connection topology features are indicators reflecting the importance of network node connection relationships and are used to discover central nodes with abnormal connections.
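The two centrality measures can be sketched as follows for an unweighted, undirected topology graph given as an adjacency mapping. This is an illustrative stand-in only: it uses Brandes' algorithm for betweenness, ignores device performance indicators, and assumes every neighbor also appears as a key of the mapping.

```python
from collections import deque


def degree_centrality(adj):
    """Fraction of the other nodes that each node is directly connected to."""
    n = len(adj)
    return {v: len(nbrs) / (n - 1) for v, nbrs in adj.items()}


def betweenness_centrality(adj):
    """Unnormalized betweenness via Brandes' algorithm (unweighted, undirected)."""
    bc = dict.fromkeys(adj, 0.0)
    for s in adj:
        stack = []
        preds = {v: [] for v in adj}
        sigma = dict.fromkeys(adj, 0)   # number of shortest paths from s
        sigma[s] = 1
        dist = dict.fromkeys(adj, -1)
        dist[s] = 0
        queue = deque([s])
        while queue:                    # breadth-first search from s
            v = queue.popleft()
            stack.append(v)
            for w in adj[v]:
                if dist[w] < 0:
                    dist[w] = dist[v] + 1
                    queue.append(w)
                if dist[w] == dist[v] + 1:
                    sigma[w] += sigma[v]
                    preds[w].append(v)
        delta = dict.fromkeys(adj, 0.0)
        while stack:                    # accumulate dependencies in reverse BFS order
            w = stack.pop()
            for v in preds[w]:
                delta[v] += sigma[v] / sigma[w] * (1 + delta[w])
            if w != s:
                bc[w] += delta[w]
    # Each unordered pair was counted from both endpoints, so halve the totals.
    return {v: b / 2 for v, b in bc.items()}
```

In a star topology the hub lies on every shortest path between leaves, so it receives the highest betweenness while the leaves receive zero.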

[0054] Specifically, raw data such as data packets, network flow records, network node connection information, and network device performance indicators are collected. For different types of raw data, corresponding extraction methods are employed: protocol parsing is performed on data packets to obtain protocol state transition sequences and packet interaction timelines, thereby characterizing protocol behavior; deep packet inspection is performed on data packets, combined with sliding window sampling and Shannon entropy calculation, to form payload entropy distribution characteristics, thus indicating anomalies in payload content; a traffic time series is constructed based on network flow records and a Fast Fourier Transform is applied to extract communication periodic characteristics, thereby identifying deviations in traffic patterns; finally, a topology graph is constructed by combining network node connection information and network device performance indicators, and degree centrality and betweenness centrality are calculated to obtain connection topology characteristics, thereby reflecting changes in network structure and node importance.

[0055] Through the above technical solution, this application can extract multimodal traffic features from multi-source heterogeneous raw data. These features not only cover the dynamic changes in protocol behavior, the attributes of payload content, and the periodic patterns of communication modes, but also reflect the importance of network topology and nodes. Using the extracted multimodal traffic features as subsequent input can improve the accuracy and robustness of anomaly detection in subsequent steps, effectively reducing false alarm rate and false negative rate. At the same time, by performing targeted processing on different types of raw data, the limitations of a single data source are avoided, and multi-level perception of network situation is achieved.

[0056] In one embodiment, the fusion Siamese network model includes a feature encoding layer, an attention fusion layer, a similarity calculation layer, an intent recognition layer, and an output generation layer. As shown in Figure 3, step S20 includes:

[0057] S21: The feature encoding layer encodes the received multimodal traffic features to generate a multimodal representation vector;

[0058] In this embodiment, the fusion Siamese network model refers to a deep learning model used to learn the similarity or difference between input data through two or more sub-networks sharing weights. In this embodiment, the fusion Siamese network model is used to learn normal traffic behavior patterns and identify anomalies by comparing the similarity between the traffic to be detected and known normal patterns. Its role is to map complex network traffic features to a low-dimensional space and perform anomaly detection in this space. The fusion Siamese network model includes a feature encoding layer, an attention fusion layer, a similarity calculation layer, an intent recognition layer, and an output generation layer. The feature encoding layer is used to convert the received multimodal traffic features into a unified numerical representation, i.e., a multimodal representation vector, to eliminate the heterogeneity between different modal features. For example, the feature encoding layer can use a convolutional neural network to encode sequence features, or a multilayer perceptron to encode numerical features. In addition, the feature encoding layer can also use models such as recurrent neural networks or Transformers to process time-series or sequential data to capture their inherent temporal dependencies. The multimodal representation vector is a numerical representation of different modal traffic features after processing by the feature encoding layer.

[0059] S22: The attention fusion layer performs cross-modal correlation fusion on the multimodal representation vectors to generate feature representation vectors;

[0060] In this embodiment, the attention fusion layer is used to perform cross-modal correlation fusion of multimodal representation vectors to generate more representative feature representation vectors. The attention fusion layer introduces an attention mechanism to dynamically evaluate the importance of different modal features for abnormal behavior identification and assigns them corresponding weights, thereby highlighting key information and suppressing redundant or noisy information. For example, the attention fusion layer can use a self-attention mechanism to capture the importance of different parts within the same modality, or a cross-attention mechanism to model the interdependencies between different modalities. The feature representation vector is a comprehensive vector that integrates multimodal information after processing by the attention fusion layer, containing a multifaceted description of the current traffic behavior.
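The weighting idea behind the attention fusion layer can be illustrated with a deliberately simplified sketch: each modality's representation vector is scored against a query vector, the scores are normalized with softmax, and the vectors are combined as a weighted sum. This is attention pooling rather than full multi-head self-attention, and the query vector (which would be a learned parameter in practice) and all names here are assumptions.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of scores."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention_fuse(modal_vectors, query):
    """Fuse same-length modality vectors with scaled dot-product attention weights.

    modal_vectors: {modality_name: vector}; query: hypothetical learned vector.
    Returns (fused_vector, {modality_name: weight}).
    """
    d = len(query)
    names = list(modal_vectors)
    scores = [sum(q * x for q, x in zip(query, modal_vectors[n])) / math.sqrt(d)
              for n in names]
    weights = softmax(scores)
    fused = [sum(w * modal_vectors[n][i] for w, n in zip(weights, names))
             for i in range(d)]
    return fused, dict(zip(names, weights))
```

Modalities whose vectors align with the query receive larger weights, which is the mechanism by which key information is highlighted and noisy modalities are suppressed.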

[0061] S23: The similarity calculation layer calculates the multi-dimensional similarity distance between the feature representation vector and the baseline vector in the preset behavioral feature library;

[0062] In this embodiment, the similarity calculation layer is used to calculate the multi-dimensional similarity distance between the feature representation vector and the baseline vector in the preset behavioral feature library. By comparing the feature representation vector of the traffic to be detected with the baseline vector of known normal behavior patterns, the normality or abnormality of the current traffic can be quantified. For example, the similarity calculation layer can use Euclidean distance, cosine similarity, or Mahalanobis distance to calculate the distance between vectors. In addition, the similarity calculation layer can also use kernel function-based methods, such as radial basis function kernels, to map vectors to a high-dimensional space for similarity calculation. The preset behavioral feature library refers to a database that stores a large number of baseline vectors of known normal network behavior patterns. The baseline vectors in the library are obtained by training and learning from historical normal traffic data and represent various behavioral features of the network under normal operating conditions. The baseline vector refers to one or a group of vectors in the preset behavioral feature library, which represents a specific normal behavior pattern. The multi-dimensional similarity distance is the value output by the similarity calculation layer, which quantifies the degree of deviation between the traffic to be detected and the normal behavior pattern. The smaller the multi-dimensional similarity distance, the closer it is to normal behavior; the larger the multi-dimensional similarity distance, the more likely there is an anomaly.
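A minimal sketch of the distance calculation follows, assuming the preset behavioral feature library is a simple mapping from a pattern name to its baseline vector. The function names and the choice of returning the nearest baseline are illustrative assumptions.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def cosine_similarity(a, b):
    """Cosine of the angle between two vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def nearest_baseline(vector, library):
    """Return (name, distance) of the baseline vector closest to `vector`."""
    name = min(library, key=lambda k: euclidean(vector, library[k]))
    return name, euclidean(vector, library[name])
```

Taking the distance to the nearest baseline is one way to express how far the observed behavior lies from any known normal pattern.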

[0063] S24: The intent recognition layer performs attack behavior evolution analysis based on feature representation vectors to identify attack types and attack stages;

[0064] In this embodiment, the intent recognition layer is used to perform attack behavior evolution analysis based on feature representation vectors to identify attack types and attack stages. Its core lies in understanding the nature of abnormal behavior, rather than simply judging whether it is abnormal. By analyzing the patterns contained in the feature representation vectors, the attack methods, targets, and current lifecycle stages of the attack that the attacker may adopt can be inferred. For example, the intent recognition layer can use a classifier to identify attack types and combine it with sequence models to analyze the evolution of attack stages. Attack behavior evolution analysis refers to understanding the progress of an attack from different stages such as initial probing, vulnerability exploitation, privilege escalation, and data theft by mining abnormal traffic characteristics. Attack type refers to identifying specific attack methods, such as DDoS attacks, port scanning, SQL injection, and malware propagation. Attack stage refers to determining the current lifecycle stage of the attack, such as the reconnaissance stage, intrusion stage, control stage, or impact stage.

[0065] S25: When the multi-dimensional similarity distance is lower than the preset similarity threshold, the output generation layer determines the feature representation vector to be an abnormal vector and generates security event information containing five-tuple information and threat level based on the attack type, attack stage, and multi-dimensional similarity distance.

[0066] In this embodiment, the output generation layer is used to determine the feature representation vector as an abnormal vector when the multi-dimensional similarity distance is lower than a preset similarity threshold. Based on the attack type and attack stage output by the intent recognition layer and the multi-dimensional similarity distance output by the similarity calculation layer, it generates security event information containing five-tuple information and threat level. An abnormal vector refers to a feature representation vector that is determined to deviate from the normal behavior pattern. The five-tuple information usually includes the source IP address, destination IP address, source port number, destination port number, and protocol type, which are used to uniquely identify the corresponding network connection. The threat level refers to the quantitative assessment of the severity of the security event, which can usually be divided into different levels such as high, medium, and low, so as to facilitate the management personnel to prioritize and respond. The security event information is the final output structured data, which contains a comprehensive description of the abnormal behavior, including the source, nature, severity, and possible impact of the abnormal behavior.
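The structure of the security event information can be sketched as a pair of small data classes. The sketch assumes the anomaly decision has already been made upstream and only shows how the fields could be assembled; the stage-to-level mapping is a hypothetical placeholder, since this application does not fix how the threat level is derived.

```python
from dataclasses import dataclass, asdict


@dataclass(frozen=True)
class FiveTuple:
    src_ip: str
    dst_ip: str
    src_port: int
    dst_port: int
    protocol: str


@dataclass(frozen=True)
class SecurityEvent:
    five_tuple: FiveTuple
    attack_type: str
    attack_stage: str
    similarity_distance: float
    threat_level: str


# Hypothetical mapping from attack stage to threat level, for illustration only.
STAGE_LEVEL = {"reconnaissance": "low", "intrusion": "medium",
               "control": "high", "impact": "high"}


def build_security_event(five_tuple, attack_type, attack_stage, distance):
    """Assemble a structured event for traffic already judged abnormal."""
    level = STAGE_LEVEL.get(attack_stage, "medium")  # unknown stages default to medium
    return SecurityEvent(five_tuple, attack_type, attack_stage, distance, level)
```

Calling `asdict()` on the resulting event yields a nested dictionary that can be serialized and passed to the policy generation step.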

[0067] Specifically, when the extracted multimodal traffic features are input into the fusion Siamese network model, the feature encoding layer first standardizes and abstracts them, transforming them into multimodal representation vectors in a unified format. The attention fusion layer receives the multimodal representation vectors and uses an attention mechanism to perform cross-modal correlation fusion, thereby generating more discriminative feature representation vectors. This avoids the information loss and noise interference that simple concatenation or average fusion may cause. The similarity calculation layer compares the generated feature representation vectors with the baseline vectors stored in the preset behavioral feature library to calculate the multi-dimensional similarity distance, which directly reflects the degree of deviation between the current traffic behavior and the known normal behavior patterns. At the same time, the intent recognition layer performs attack behavior evolution analysis based on the feature representation vectors to identify the specific attack type and attack stage. When the multi-dimensional similarity distance obtained by the similarity calculation layer is lower than the preset similarity threshold, the output generation layer determines the current traffic to be abnormal. It then combines the attack type and attack stage provided by the intent recognition layer with the similarity distance to generate and output detailed security event information containing five-tuple information and threat level.

[0068] As a specific implementation, the fusion Siamese network model can be implemented as follows. The feature encoding layer can consist of multiple independent encoders: for protocol behavior sequence features, a Transformer-based encoder can be used to capture their temporal dependencies; for payload entropy distribution features and communication periodicity features, a one-dimensional convolutional neural network can be used for feature extraction; and for connection topology features, a graph neural network can be used for encoding. The output vectors of these encoders constitute the multimodal representation vector. The attention fusion layer can employ a multi-head self-attention mechanism, which allows the model to learn different cross-modal association patterns in different attention heads and thereby generate a comprehensive feature representation vector. The similarity calculation layer can employ metric learning-based methods, such as training with a triplet loss function so that normal samples are close to each other in the feature space while abnormal samples are far from normal samples, with Euclidean distance used to compute the multi-dimensional similarity distance. The intent recognition layer can be a multi-class classifier, such as a classifier based on a long short-term memory network, which receives feature representation vectors and outputs the attack type and attack stage. The output generation layer can be a decision logic unit that triggers anomaly detection when the calculated Euclidean distance is less than a preset threshold, and dynamically generates security event information including source IP, destination IP, source port, destination port, protocol, and threat level based on the output of the multi-class classifier and the magnitude of the Euclidean distance.

[0069] Furthermore, the parameters of each layer in the aforementioned fusion Siamese network model can be set by those skilled in the art according to actual needs, such as the kernel size and number of filters in a one-dimensional convolutional neural network, or the number and dimension of attention heads in a multi-head self-attention mechanism.

[0070] Meanwhile, to enable those skilled in the art to reproduce the training of the fusion Siamese network model, the following exemplary training steps are provided:

[0071] Training data preparation: Input data: Training uses public datasets containing normal traffic and various known attack traffic, such as CIC-IDS2017 and UNSW-NB15, which are widely used benchmark datasets for network intrusion detection. The raw data is preprocessed and multimodal traffic features are extracted. Data annotation: Each traffic sample is labeled as normal or abnormal, and abnormal samples are further labeled with their attack type and attack stage.

[0072] Loss function design: The training process adopts a multi-task learning framework and optimizes the following objectives simultaneously. Contrastive loss: used in the similarity calculation layer to pull normal samples closer to the baseline vector and push abnormal samples away. Binary cross-entropy loss: used for the binary classification of abnormal versus normal samples. Multi-label cross-entropy loss: used for attack type and attack stage classification in the intent recognition layer. The total loss function is a weighted sum of the above three losses, with the weights set according to the actual situation.
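The three objectives and their weighted sum can be sketched per sample as follows. The margin-based form of the contrastive loss, the margin value, and the default weights are common choices assumed here for illustration; this application does not specify them.

```python
import math


def contrastive_loss(distance, is_normal, margin=1.0):
    """Pull normal samples toward the baseline; push abnormal ones beyond the margin."""
    if is_normal:
        return distance ** 2
    return max(0.0, margin - distance) ** 2


def binary_cross_entropy(p_abnormal, is_abnormal, eps=1e-12):
    """Loss for the abnormal-versus-normal prediction."""
    p = min(max(p_abnormal, eps), 1 - eps)  # clamp to avoid log(0)
    return -math.log(p) if is_abnormal else -math.log(1 - p)


def cross_entropy(probs, true_index, eps=1e-12):
    """Loss for a class prediction such as attack type or attack stage."""
    return -math.log(max(probs[true_index], eps))


def total_loss(l_contrastive, l_binary, l_class, weights=(1.0, 1.0, 1.0)):
    """Weighted sum of the three task losses; the weights are tunable."""
    w1, w2, w3 = weights
    return w1 * l_contrastive + w2 * l_binary + w3 * l_class
```

In a real training loop these per-sample values would be averaged over a batch and computed on tensors by a deep learning framework so that gradients can be propagated.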

[0073] Training process: An optimizer such as Adam is used with a chosen initial learning rate and batch size. After a set number of epochs, the learning rate is reduced to a preset proportion of its original value. When the validation loss no longer decreases significantly for N consecutive epochs, training is considered converged and is terminated.
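The learning rate schedule and the convergence criterion can be sketched without reference to any particular framework. The parameter names, the step-decay form, and the `min_delta` tolerance used to define "significantly" are assumptions made for illustration.

```python
def step_decay(initial_lr, epoch, decay_every, decay_ratio):
    """Learning rate after multiplying by decay_ratio once per decay_every epochs."""
    return initial_lr * decay_ratio ** (epoch // decay_every)


def should_stop(val_losses, patience, min_delta=1e-4):
    """True when the last `patience` epochs did not improve on the earlier best
    validation loss by at least min_delta."""
    if len(val_losses) <= patience:
        return False  # not enough history yet
    best_before = min(val_losses[:-patience])
    best_recent = min(val_losses[-patience:])
    return best_before - best_recent < min_delta
```

Here `patience` plays the role of N in the text, and `should_stop` would be called once per epoch with the validation losses recorded so far.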

[0074] Through the above technical solution, the fusion Siamese network model of this application, by introducing a feature encoding layer, an attention fusion layer, a similarity calculation layer, an intent recognition layer, and an output generation layer, realizes the processing and analysis of multimodal traffic features. Specifically, the feature encoding layer processes heterogeneous multimodal features and transforms them into a unified multimodal representation vector; the attention fusion layer improves the discriminative power of the feature representation through weighted fusion, thereby reducing information redundancy and the loss of key information; the similarity calculation layer provides a means to quantify the degree of anomaly; the intent recognition layer further explores the nature of the abnormal behavior and identifies specific attack types and attack stages; finally, the output generation layer outputs comprehensive security event information containing five-tuple information and threat level based on all the aforementioned analysis results. Through this layered processing architecture, the fusion Siamese network model can identify complex abnormal traffic behavior patterns more accurately and comprehensively, thereby improving the accuracy of anomaly detection and the depth of threat understanding, and making network vulnerability defense more intelligent and proactive.

[0075] In one embodiment, step S30 includes:

[0076] S31: Based on the security event information, match and generate several preliminary control policy items from the predefined policy atom library, and construct a preliminary control policy matrix based on the generated preliminary control policy items;

[0077] In this embodiment, the policy atom library refers to a predefined set containing all available basic defense actions. Each policy atom is an indivisible minimum response unit, which can be a simple rule targeting a specific attack type, protocol, port, source IP address, or destination IP address. Each element in the policy atom library is typically a tuple pairing one condition with one action. A preliminary control policy item refers to a specific suggested defense action, generated for a specific security event by matching and combining entries from the policy atom library. It is usually a complete policy rule that has not yet undergone global optimization. The preliminary control policy matrix is a policy set composed of multiple preliminary control policy items, generated by combining predefined policy templates from the policy atom library, and is used to cover basic defense needs under different attack scenarios. Typically, its rows correspond to different defense targets, such as different attack paths or assets, and its columns correspond to different policy actions, such as allow, block, redirect, and conceal. The element values in the matrix represent the weight, probability, or priority of the policy action.

[0078] In this embodiment, step S31 aims to select and combine preliminary control policies for the current threat from a predefined policy atom library based on the detected security event information. The matching and generation process of the preliminary control policy items can be implemented by a rule engine or expert system, using the security event information as input to match the rules in the policy atom library. For example, if a DDoS attack is detected, policy atoms such as "restrict traffic from a specific source IP" or "enable traffic scrubbing" may be matched. In addition, a machine learning classifier can be used to train on historical security events and corresponding effective preliminary policies. When a new security event occurs, the machine learning classifier predicts and recommends the corresponding preliminary control policy items based on the characteristics of the security event.
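A rule-engine style match against the policy atom library can be sketched as below. The atom fields, the three library entries, and the uniform initial weight of 1.0 are hypothetical and serve only to illustrate the condition-to-action pairing and the row/column layout of the preliminary control policy matrix.

```python
from dataclasses import dataclass

LEVELS = {"low": 0, "medium": 1, "high": 2}
ACTIONS = ("allow", "block", "redirect", "conceal")


@dataclass(frozen=True)
class PolicyAtom:
    name: str
    attack_types: frozenset  # condition: attack types the atom applies to
    min_threat: str          # condition: lowest threat level that triggers it
    action: str              # action: one of ACTIONS


# Hypothetical library entries, for illustration only.
ATOM_LIBRARY = [
    PolicyAtom("drop_source_ip", frozenset({"ddos", "port_scan"}), "medium", "block"),
    PolicyAtom("send_to_scrubbing", frozenset({"ddos"}), "high", "redirect"),
    PolicyAtom("hide_db_service", frozenset({"sql_injection"}), "low", "conceal"),
]


def match_atoms(event, library=ATOM_LIBRARY):
    """Return the atoms whose conditions are satisfied by the security event."""
    level = LEVELS[event["threat_level"]]
    return [a for a in library
            if event["attack_type"] in a.attack_types and level >= LEVELS[a.min_threat]]


def preliminary_matrix(targets, atoms):
    """Rows are defense targets, columns are actions, entries are initial weights."""
    proposed = {a.action for a in atoms}
    return {t: {act: (1.0 if act in proposed else 0.0) for act in ACTIONS}
            for t in targets}
```

The initial weights produced here are placeholders that a later optimization step would be expected to adjust.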

[0079] S32: Construct a directed attack graph based on security event information, wherein the nodes of the directed attack graph are network assets and the edges are potential vulnerability exploitation paths associated with risky flow values;

[0080] In this embodiment, a directed attack graph refers to a topological structure reflecting potential attack paths between network assets. It can be constructed from network topology connections and a vulnerability database. Each node in the directed attack graph represents a network asset, together with its state, such as the asset being compromised or a privilege being obtained on it. Each edge represents an attack action, such as exploiting a vulnerability, and the weight on the edge quantifies the threat level of that attack path as a risk flow value. Network assets refer to the hardware devices, software services, data resources, and virtual entities that constitute a network information system, each having a logical address and being independently identifiable and manageable. Network assets are typically the targets of network attacks and the core objects of network security protection. Analyzing the connections and dependencies between assets is key to understanding how an attack spreads from one point to another. For example, after compromising asset A, an attacker can use it as a springboard to attack asset B, which is connected to it. The risk flow value quantifies the potential risk an attacker can create by exploiting an edge in the directed attack graph. It is usually calculated from factors such as vulnerability exploitability, asset value, and the probability of attack success. The higher the risk flow value, the greater the threat of that path. A potential vulnerability exploitation path refers to the sequence of attack steps an attacker might take, by exploiting security vulnerabilities and configuration weaknesses in the network, to penetrate from an initial access point and eventually reach the target asset.

[0081] In this embodiment, step S32 aims to gain a deep understanding of the attack paths that attackers may take. The directed attack graph is a tool for visualizing and analyzing network attack paths. It treats network devices, services, data, etc. as network asset nodes and vulnerabilities or configuration errors that attackers may exploit as edges. By combining vulnerability databases and attack graph generation algorithms, it automatically constructs potential attack paths from the attack source to the attack target and calculates the risk flow value of each path. Alternatively, graph database technology can be used to store information such as network assets, vulnerabilities, and attack methods as a graph structure. The directed attack graph can be dynamically generated and updated through graph traversal algorithms and risk assessment models. The risk flow value can be calculated comprehensively based on factors such as the vulnerability's CVSS score, asset value, and attack complexity.
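One way to represent the directed attack graph and its risk flow values is a nested dictionary keyed by asset. The multiplicative risk formula below is an assumption chosen for illustration; this application only states that factors such as the CVSS score, asset value, and attack complexity or success probability are combined.

```python
def risk_flow(cvss, asset_value, success_prob):
    """Hypothetical risk flow value: normalized CVSS x asset value x success probability."""
    return (cvss / 10.0) * asset_value * success_prob


def build_attack_graph(exploits):
    """Build {src_asset: {dst_asset: risk_flow_value}} from exploit records.

    exploits: iterable of (src_asset, dst_asset, cvss, dst_asset_value, success_prob).
    """
    graph = {}
    for src, dst, cvss, value, prob in exploits:
        graph.setdefault(src, {})[dst] = risk_flow(cvss, value, prob)
        graph.setdefault(dst, {})  # make sure sink assets also appear as nodes
    return graph


def path_risk(graph, path):
    """Sum of risk flow values along a path given as a list of assets."""
    return sum(graph[a][b] for a, b in zip(path, path[1:]))
```

A graph database, as mentioned above, would store the same nodes and weighted edges but allow them to be updated and traversed incrementally.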

[0082] S33: Construct a multi-objective constrained optimization problem in which the preliminary control policy matrix is the input decision variable, the solution space is defined by the directed attack graph, the optimization objective is set based on the risk flow value, and the preset resource concealment rules and path collapse rules serve as constraints;

[0083] In this embodiment, resource concealment rules refer to constraints that dynamically hide critical network resources. Specifically, this can be achieved by dynamically adjusting firewall rules or routing tables to reduce the exposure of the attack surface. Path collapse rules refer to constraints that block or redirect potential attack paths. Specifically, this can be achieved by adjusting the access control lists or traffic scheduling policies of network devices to break the connectivity of attack links. The multi-objective constrained optimization problem refers to a mathematical model that aims to minimize risk flow values and maximize defense coverage while satisfying the resource concealment and path collapse constraints. It can be solved using genetic algorithms or particle swarm optimization algorithms to balance defense effectiveness and resource consumption. In this embodiment, the decision variables are the elements of the preliminary control policy matrix, i.e., the weight of each policy item or whether it is adopted, and the optimization process adjusts these decision variables. There are usually multiple objective functions, and they may conflict, for example: Objective 1: minimize the overall risk of the entire attack graph; Objective 2: maximize the availability of critical services. The constraints are the limitations derived from the resource concealment rules and path collapse rules.

[0084] In this embodiment, step S33 formalizes the optimization of the preliminary control policy matrix as a mathematical optimization problem. The activation or deactivation of each policy item in the preliminary control policy matrix, or its parameter configuration, can be regarded as a decision variable. The directed attack graph provides all possible attack paths and their risks, constituting the solution space for policy optimization. The optimization objective is to reduce the overall network risk or improve the defense effect. At the same time, the resource concealment rules and path collapse rules serve as constraints to ensure that the optimization result conforms to the expected defense strategy. The decision variables can be binary variables, representing whether each policy item in the preliminary control policy matrix is activated. The optimization objective can be set to minimize the sum of the risk flow values of all potential vulnerability exploitation paths in the directed attack graph, or to maximize the number of blocked high-risk paths.

[0085] S34: Solve the constructed multi-objective constrained optimization problem to obtain the optimization policy weight configuration;

[0086] In this embodiment, the optimization policy weight configuration refers to the solution of the multi-objective constrained optimization problem. It indicates how the weight or probability of each policy item in the preliminary control policy matrix should be adjusted so that the multiple defense objectives are achieved as well as possible while all constraints are satisfied.

[0087] In this embodiment, step S34 aims to obtain the optimal policy configuration by solving the problem, so that the optimization objective reaches its best state while all constraints are satisfied. The optimization policy weight configuration in this step refers to the best combination of the priority, strength, or activation state of each policy item in the preliminary control policy matrix. The multi-objective optimization problem can be solved with heuristic algorithms, such as genetic algorithms, particle swarm optimization algorithms, or simulated annealing algorithms. Exact algorithms, such as linear programming, integer programming, or mixed integer programming, can also be used.
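For a small number of policy items, the binary formulation can be illustrated by exhaustive search, which stands in here for the heuristic or integer programming solvers named above. The sketch makes several simplifying assumptions: each policy item blocks a known set of attack graph edges at a known cost, a cost budget stands in for the service availability objective, the path collapse rules are expressed as edges that must be blocked, and the two objectives are ranked lexicographically (residual risk first, then cost).

```python
from itertools import product


def solve_policy_selection(edge_risk, policies, budget, must_block=frozenset()):
    """Pick the 0/1 activation of each policy item by exhaustive search.

    edge_risk:  {edge: risk flow value}
    policies:   {name: (set_of_blocked_edges, cost)}
    budget:     maximum total cost allowed
    must_block: edges that the path collapse rules require to be cut
    Returns ((residual_risk, cost), {name: 0 or 1}), or None if infeasible.
    """
    names = sorted(policies)
    best = None
    for bits in product((0, 1), repeat=len(names)):
        chosen = [n for n, b in zip(names, bits) if b]
        blocked = set().union(*(policies[n][0] for n in chosen))
        cost = sum(policies[n][1] for n in chosen)
        if cost > budget or not must_block <= blocked:
            continue  # violates a constraint
        residual = sum(r for e, r in edge_risk.items() if e not in blocked)
        key = (residual, cost)
        if best is None or key < best[0]:
            best = (key, dict(zip(names, bits)))
    return best
```

Exhaustive search grows as 2^n in the number of policy items, which is why larger instances would call for the heuristic or mixed integer methods mentioned in the text.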

[0088] S35: Optimize and adjust the preliminary control policy matrix according to the optimization policy weight configuration to generate an access control policy matrix.

[0089] In this embodiment, the access control policy matrix refers to the set of executable policies generated after optimizing and adjusting the preliminary control policy matrix. The weights in the access control policy matrix have been adjusted by the optimization algorithm and can be directly converted into specific network device configuration instructions, such as ACL rules.

[0090] In this embodiment, step S35 aims to transform the optimization result into a specific security policy. If the optimization policy weight configuration is binary, the corresponding policy items in the preliminary control policy matrix are enabled or disabled directly according to the configuration. If the optimization policy weight configuration consists of weights or priorities, the policy items can be sorted by weight, or the specific parameters of the policy items can be adjusted, to finally generate the access control policy matrix.
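The two cases described above (binary configuration versus weights or priorities) can be sketched in a single helper. The function name and the flat list representation of the policy items are assumptions; a real implementation would operate on the matrix structure and emit device-specific rules.

```python
def apply_weight_config(preliminary_items, config):
    """Turn a weight configuration into an ordered list of (policy_item, weight).

    preliminary_items: policy item names from the preliminary control policy matrix
    config:            {policy_item: 0/1} or {policy_item: weight}
    """
    if all(w in (0, 1) for w in config.values()):
        # Binary configuration: keep enabled items in their original order.
        return [(item, 1.0) for item in preliminary_items if config.get(item, 0) == 1]
    # Weighted configuration: drop zero-weight items and sort by descending weight.
    ranked = [(item, float(config.get(item, 0.0))) for item in preliminary_items]
    ranked = [pair for pair in ranked if pair[1] > 0]
    return sorted(ranked, key=lambda pair: pair[1], reverse=True)
```

The ordered output maps naturally onto rule lists such as ACLs, where earlier entries take precedence.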

[0091] Specifically, after security event information is generated, policy items matching the threat level and attack type, such as traffic filtering, port blocking, or protocol restriction, are selected from the policy atom library to form a preliminary control policy matrix. Based on the source IP and destination IP in the five-tuple information, combined with the network topology and a vulnerability database, a directed attack graph is constructed. Nodes in the graph represent network assets such as servers and terminals, and edges represent potential vulnerability exploitation paths. Risk flow values are calculated using vulnerability scores and asset values. The preliminary control policy matrix is used as the decision variable. Within the solution space of the directed attack graph, the optimization objective is to reduce the overall risk flow value, and the resource concealment rules and path collapse rules are used as constraints to construct a multi-objective optimization model. Solving the multi-objective optimization problem with an optimization algorithm yields the weight configuration of each policy item. For example, specific blocking policies in the path collapse rules are prioritized, or the trigger threshold of the resource concealment rules is adjusted. Finally, the preliminary policies are dynamically adjusted according to the optimization policy weight configuration to generate an optimized access control policy matrix.

[0092] Through the above technical solutions, this application constructs a directed attack graph to model and quantify the potential attack paths in the network and their associated risks, thereby shifting from passive response to proactive prediction. Furthermore, by introducing resource concealment rules and path collapse rules, it improves the level of security protection while taking into account the availability of network resources and the continuity of business services. At the same time, the access control policy matrix generated by the multi-objective constrained optimization mechanism can not only accurately block identified attack behaviors, but also implement preventive defense against potential attack paths that have not yet been exploited. In summary, the solution of this application improves the accuracy and adaptability of the network vulnerability defense system in dynamic and complex environments, and achieves coordinated optimization of defense effectiveness and resource efficiency.

[0093] In one embodiment, step S32 includes:

[0094] S321: Extract the source IP address and destination IP address based on the 5-tuple information, and take the network asset corresponding to the source IP address as the attack starting point and the network asset corresponding to the destination IP address as the attack target;

[0095] In this embodiment, the 5-tuple information refers to the source IP address, destination IP address, source port, destination port, and protocol type, which are the basic identifiers of a network communication. The attack origin refers to the network asset node in the directed attack graph that corresponds to the source of the attack. In this embodiment, it is derived directly from the source IP address in the 5-tuple information of the security event. This asset has usually been compromised or used by the attacker and serves as the starting point for further attacks. The attack target refers to the network asset node in the directed attack graph that the attack is ultimately intended to reach. In this embodiment, it is derived directly from the destination IP address in the 5-tuple information of the security event. This asset is usually the object from which the attacker wants to steal data, whose services the attacker wants to disrupt, or on which the attacker wants to obtain higher privileges.

[0096] S322: Obtain network topology connections and firewall rule sets, identify all possible paths from the attack origin to the attack target based on the network topology connections, and perform policy checks and path filtering on all possible paths through the firewall rule sets to generate a set of reachable paths;

[0097] In this embodiment, network topology refers to a logical or physical graph describing how devices in a network are interconnected. Typically, it defines all possible physical and logical paths for data packets to travel from one device to another, such as which routers, switches, VLANs, etc., are used. This can be obtained from a network management system, SNMP protocol, LLDP protocol, or network configuration file. Firewall rule set refers to the collection of all access control policies configured on firewall devices deployed at network boundaries or critical areas. The rules in the firewall rule set explicitly specify which traffic will be allowed or denied. Reachable path set refers to the set of actually reachable paths obtained after filtering by network topology and firewall rules, excluding paths explicitly denied or impassable by firewall rules. For example, topology information and firewall configuration can be obtained through a network device configuration export tool. Then, a graph algorithm can be used to traverse the topology graph to identify paths and simulate the transmission of data packets on the identified paths, determining their reachability based on firewall rules.
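As an illustration of the graph traversal mentioned above, the following minimal sketch enumerates all loop-free paths from an attack origin to an attack target with a depth-first search. The adjacency-list representation, the node names, and the function name `all_simple_paths` are assumptions made for this example and are not taken from the application; firewall filtering is deliberately left out of this sketch.

```python
from typing import Dict, List


def all_simple_paths(topology: Dict[str, List[str]], origin: str, target: str) -> List[List[str]]:
    """Enumerate loop-free paths from origin to target with an iterative DFS."""
    paths: List[List[str]] = []
    stack = [(origin, [origin])]
    while stack:
        node, path = stack.pop()
        if node == target:
            paths.append(path)
            continue
        for neighbor in topology.get(node, []):
            if neighbor not in path:  # never revisit a node on the current path
                stack.append((neighbor, path + [neighbor]))
    return sorted(paths)  # sorted only to make the output order deterministic


# Hypothetical topology: keys are nodes, values are directly connected next hops.
topology = {
    "workstation": ["switch"],
    "switch": ["web_server", "jump_host"],
    "jump_host": ["db_server"],
    "web_server": ["db_server"],
}
candidate_paths = all_simple_paths(topology, "workstation", "db_server")
```

Exhaustive enumeration grows quickly with network size, so a practical system would likely bound the path length or prune early; that choice is outside what the text specifies.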

[0098] S323: By querying the pre-set vulnerability database, vulnerabilities are identified in all potential vulnerability exploitation paths and their corresponding attack targets in the reachable path set, a list of all associated known security vulnerabilities is obtained, and vulnerability information for each vulnerability is extracted from the list of known security vulnerabilities. The vulnerability information includes a general basic score, exploit code maturity level, and vulnerability remediation status parameters.

[0099] In this embodiment, the vulnerability database refers to a knowledge base that stores information on known software and hardware vulnerabilities, such as a CVE database. The vulnerability database is typically indexed by Common Vulnerabilities and Exposures (CVE) identifiers, and each entry includes a vulnerability description, affected products, versions, severity rating, and remediation suggestions. The known security vulnerability list refers to a structured list of publicly disclosed security vulnerabilities related to a specific target asset, returned after querying the vulnerability database for the potential vulnerability exploitation paths and their corresponding attack targets. Vulnerability information refers to the set of technical metadata extracted from the vulnerability database for each vulnerability in the known security vulnerability list. The general base score refers to the base score associated with the vulnerability in the vulnerability information, which gives a quantitative score for the severity of the corresponding vulnerability. The general base score is typically calculated based on factors such as the complexity of exploitation, required privileges, and impact on confidentiality, integrity, and availability. The exploit code maturity level is an indicator of the development sophistication and public availability of the exploit code. A higher maturity level indicates a lower barrier to entry for attackers. Maturity levels are generally categorized as unverified, proof-of-concept, weaponized, and self-propagating. The vulnerability remediation status parameter quantifies the progress of vulnerability patching in the current target network environment. Typically, it is calculated as the patch deployment ratio, which is the ratio of the number of assets with the patch installed to the total number of affected assets.

[0100] S324: Calculate the dynamic exploitability score for each vulnerability in the security vulnerability list based on vulnerability information;

[0101] In this embodiment, the dynamic exploitability score refers to a quantitative index of exploitability that combines the inherent attributes of a vulnerability with its real-time status and is adjusted dynamically. Specifically, it can be calculated from the general base score, the exploit code maturity weight coefficient, and the remediation status ratio, so that it reflects how easy or difficult the vulnerability currently is to exploit.

[0102] S325: Obtain the asset business level and data sensitivity level of the attack target, and calculate the asset value coefficient through a pre-set asset value calculation function;

[0103] In this embodiment, the asset business level refers to a classification index based on the importance of the business functions carried by the asset. Generally, the higher the asset business level, the greater the loss caused by the interruption of the business. The data sensitivity level refers to a classification setting based on the sensitivity of the data stored, processed, or transmitted by the asset. For example, personal identification information and financial data are usually top secret, internal emails are confidential, and public information is public. The higher the data sensitivity level, the more serious the consequences of data leakage. The asset value coefficient is an asset importance index quantified based on the asset business level and data sensitivity level. Specifically, it can use a weighted function to normalize the asset attributes to measure the value impact of the attack target.
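The weighted, normalized function described above can be sketched as follows. This is only one possible form: the five-level scale, the 0.6/0.4 weighting, and the function name `asset_value_coefficient` are assumptions chosen for illustration, since the application does not fix the level scale or the weights.

```python
def asset_value_coefficient(
    business_level: int,
    sensitivity_level: int,
    max_level: int = 5,            # assumed scale: levels run from 1 to max_level
    business_weight: float = 0.6,  # assumed weight; the remainder goes to sensitivity
) -> float:
    """Return a coefficient in (0, 1] from two ordinal asset levels."""
    for level in (business_level, sensitivity_level):
        if not 1 <= level <= max_level:
            raise ValueError(f"level {level} is outside 1..{max_level}")
    normalized_business = business_level / max_level
    normalized_sensitivity = sensitivity_level / max_level
    return (business_weight * normalized_business
            + (1.0 - business_weight) * normalized_sensitivity)


# A core business system holding top-secret data versus a low-impact public host.
critical = asset_value_coefficient(business_level=5, sensitivity_level=5)
public = asset_value_coefficient(business_level=1, sensitivity_level=1)
```

Normalizing both levels to the same range before weighting keeps the coefficient comparable across assets, which is what allows it to scale the risk flow value later.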

[0104] S326: Calculate the risk flow value of each potential vulnerability exploitation path based on the dynamic exploitability score and the asset value coefficient.

[0105] In this embodiment, the risk flow value refers to the quantified threat that an attack path poses to the target asset. Specifically, it can be calculated by multiplying the dynamic exploitability score by the asset value coefficient, and it is used to rank the priority of different paths.

[0106] Specifically, the system extracts five-tuple information from security incident data to identify the source and target network assets of the attack. By acquiring network topology connections and firewall rule sets, it identifies all theoretically possible paths from the attack origin to the target, and uses firewall rules for policy checks and filtering to generate a set of truly reachable paths. This ensures that subsequent risk assessments are based on actually feasible attack paths, avoiding ineffective analysis of unreachable paths. Furthermore, by querying a pre-configured vulnerability database, it identifies vulnerabilities in attack targets and path components on reachable paths, obtaining detailed vulnerability information including general basic scores, exploit code maturity levels, and vulnerability remediation status parameters. This allows for the calculation of a dynamic exploitability score for each vulnerability, assessing the likelihood of actual exploitation. To comprehensively measure the potential impact of the attack, it also acquires the asset business level and data sensitivity level of the target, quantifying them into asset value coefficients using a pre-defined asset value calculation function. Finally, by combining the comprehensive dynamic exploitability score and asset value coefficients, it calculates the risk flow value of each potential vulnerability exploitation path.

[0107] Through the above technical solutions, this application can conduct a comprehensive risk assessment of potential vulnerability exploitation paths in the network; by clearly defining the attack starting point and target, and combining network topology and firewall rules to filter out truly reachable attack paths, it can avoid wasting resources on invalid paths; furthermore, by comprehensively considering the general basic score of the vulnerability, the maturity level and remediation status of the exploit code, as well as the asset business level and data sensitivity level of the attack target, it can calculate a more practically meaningful dynamic exploitability score and asset value coefficient, and finally quantify it into a risk flow value; through this specific calculation of the risk flow value, it can use more accurate risk information as optimization objectives and constraints when constructing multi-objective constrained optimization problems, thereby generating a more targeted and effective access control policy matrix, which has the effect of improving the accuracy and response efficiency of network vulnerability defense strategies, and further reducing the probability of successful network attacks and the losses they may cause.

[0108] Furthermore, the dynamic exploitability score is calculated as follows:

[0109] In the formula, the terms denote, respectively: the general base score; the exploit code maturity level; the maturity weight coefficient obtained from the exploit code maturity level; the vulnerability remediation status parameter; and the remediation status ratio obtained from the vulnerability remediation status parameter.

[0110] In the aforementioned dynamic exploitability score calculation formula, the general base score refers to a quantitative indicator for standardized assessment of vulnerability severity, specifically implemented using the baseline score value of the Common Vulnerability Scoring System (CVSS), reflecting the basic threat level of the vulnerability; the exploit code maturity level refers to a classification indicator of the completeness of vulnerability exploit code development, specifically implemented using the grading standards in the vulnerability exploit code maturity evaluation framework, measuring the technical difficulty for an attacker to actually exploit the vulnerability; the maturity weight coefficient refers to a weighting parameter set according to the exploit code maturity level, specifically implemented using a preset level-weight mapping table for coefficient matching, used to dynamically adjust the vulnerability exploitability assessment results; the vulnerability remediation status parameter refers to a quantitative description of the vulnerability remediation progress, specifically implemented using the patch installation status or temporary mitigation implementation status recorded in the vulnerability management system, reflecting the degree to which the vulnerability has been remediated; the remediation status ratio refers to the remediation impact factor obtained from the vulnerability remediation status parameter, specifically implemented using linear interpolation or piecewise function calculation, used to characterize the weakening effect of remediation measures on vulnerability exploitability.
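A minimal sketch of how these quantities might be combined is given below. The multiplicative form, the use of (1 - remediation ratio) as the weakening factor, the weight values in the mapping table, and all identifiers are assumptions made for illustration; they are consistent with the definitions above but are not stated in the application. The remediation ratio is taken here as the linear patch deployment ratio.

```python
# Hypothetical level-to-weight mapping table; the values are illustrative only.
MATURITY_WEIGHTS = {
    "unverified": 0.4,
    "proof_of_concept": 0.6,
    "weaponized": 0.9,
    "self_propagating": 1.0,
}


def dynamic_exploitability(
    base_score: float,      # general base score, e.g. a CVSS base score in 0..10
    maturity_level: str,    # exploit code maturity level
    patched_assets: int,    # assets with the patch installed
    affected_assets: int,   # all assets affected by the vulnerability
) -> float:
    """Assumed form: base score x maturity weight x (1 - remediation ratio)."""
    if affected_assets <= 0 or not 0 <= patched_assets <= affected_assets:
        raise ValueError("asset counts are inconsistent")
    weight = MATURITY_WEIGHTS[maturity_level]
    remediation_ratio = patched_assets / affected_assets
    return base_score * weight * (1.0 - remediation_ratio)


fully_patched = dynamic_exploitability(9.8, "weaponized", 10, 10)
half_patched = dynamic_exploitability(10.0, "proof_of_concept", 5, 10)
```

Under this assumed form a fully remediated vulnerability scores zero regardless of its base score, while an unpatched vulnerability with mature exploit code keeps most of its base score.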

[0111] Furthermore, the risk flow value is calculated as follows:

[0112] $R = E_{\max} \times V$, where $E_{\max}$ is the highest dynamic exploitability score among all vulnerabilities associated with the attack target, and $V$ is the asset value coefficient of the attack target.

[0113] In the aforementioned risk flow value calculation formula, the dynamic exploitability score refers to a dynamically adjusted quantitative score that reflects the true threat level of a specific vulnerability in the current environment. The dynamic exploitability score makes the risk assessment more realistic. For example, if a high-scoring vulnerability has been fully patched, its dynamic exploitability score will approach 0; if a low-scoring vulnerability has weaponized exploit code, its dynamic exploitability score will increase significantly. The asset value coefficient is a normalized index that comprehensively quantifies the importance of the target asset, reflecting the business criticality of the asset and the sensitivity of the data it processes.
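The risk flow calculation described above, the highest dynamic exploitability score on the attack target multiplied by the asset value coefficient, can be sketched as follows. The function name `risk_flow` and the convention of returning 0 for a target with no known vulnerabilities are assumptions of this example.

```python
from typing import Iterable


def risk_flow(dynamic_scores: Iterable[float], asset_value: float) -> float:
    """Highest dynamic exploitability score on the target times its asset value coefficient."""
    scores = list(dynamic_scores)
    if not scores:
        return 0.0  # assumed convention: no known vulnerability means no risk flow
    return max(scores) * asset_value


# Three vulnerabilities on one target whose asset value coefficient is 0.8.
path_risk = risk_flow([2.0, 7.5, 4.0], asset_value=0.8)
```

Taking the maximum rather than the sum means a path is ranked by its single most exploitable vulnerability, so many minor findings do not outweigh one severe one.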

[0114] In one embodiment, step S322 includes the following steps:

[0115] S3221: Construct a network environment model containing fine-grained access control policies based on network topology connections and firewall rule sets;

[0116] In this embodiment, step S3221 refers to establishing a model that can comprehensively reflect the actual network security strategy and topology. This network environment model can be established by collecting network device configuration information and security policy configurations, integrating them into a unified data structure, such as a graph database or relational database, and defining the attributes of nodes and edges. Alternatively, network topology discovery tools can be used to automatically detect the network structure, and combined with the fine-grained access control policy files exported from the corresponding policy management platform, the model can be converted into a computable logical model through parsing and modeling languages.

[0117] S3222: Based on a network environment model, simulated data packets are initiated from the attack origin and virtually propagated to the attack target along all possible paths;

[0118] In this embodiment, step S3222 refers to dynamically testing the connectivity of all potential paths in the network by simulating the propagation of real traffic, and triggering security policies along the path. The simulated data packets are virtual data units used to test network connectivity and policy effectiveness; they typically carry characteristics similar to real attack traffic. Virtual propagation refers to simulating the forwarding process of data packets in a constructed network environment model without affecting the actual network operation. This can be achieved by using a graph traversal algorithm to explore all possible paths from the attack origin to the attack target in the network environment model. Specifically, for each path, a corresponding simulated data packet is generated and assigned characteristics related to potential attack behavior, and then forwarded along that path. Alternatively, network simulation tools or sandbox environments can be used to inject simulated data packets into a virtualized network topology and observe their forwarding behavior.

[0119] S3223: During virtual propagation, the firewall rule set and fine-grained access control policies are loaded in real time, hop-by-hop policy verification is performed on simulated data packets, and the nodes and links where simulated data packets are rejected by policies during virtual propagation are recorded.

[0120] In this embodiment, step S3223 aims to ensure the accuracy and real-time performance of policy verification, reflecting the dynamic effectiveness of policies in the actual network, thereby identifying paths blocked by security policies. Real-time loading of firewall rule sets and fine-grained access control policies refers to acquiring and applying all relevant security rules and policies configured on each node when a simulated data packet arrives at that node in the network environment model. Hop-by-hop policy verification means checking the simulated data packet at each node it passes through according to the policy rules associated with that node to determine whether passage is allowed. Furthermore, hop-by-hop policy verification can query the policy database of the current node when the simulated data packet is forwarded from one node to the next, acquiring all applicable firewall rules and fine-grained access control policies, and then matching the characteristics of the simulated data packet with the queried rules and policies to determine whether passage is allowed. If denied, the current node and the inbound link that caused the denial are recorded.

[0121] S3224: Mark the path containing a node or link that has been rejected by the policy as unreachable and exclude it from all possible paths;

[0122] In this embodiment, step S3224 aims to ensure that the final set of reachable paths contains only paths that attackers can actually use, so that invalid paths do not enter subsequent processing, thereby improving efficiency and accuracy. Here, marking paths containing nodes or links rejected by the policy as unreachable means removing from the set of potential attack paths those paths that are explicitly blocked by security policies during virtual propagation. Specifically, during hop-by-hop policy verification, once a simulated data packet is rejected on a certain node or link, the complete path traversed by the simulated data packet is marked as unreachable, and after all simulations are completed, all paths marked as unreachable are removed from the set of all possible paths initially identified.

[0123] S3225: Among all possible paths, exclude the paths marked as unreachable and output the set of reachable paths.

[0124] In this embodiment, the reachable path set refers to the attack paths that an attacker can actually exploit under the current network security policy after policy verification. Specifically, it can be achieved by collecting all the remaining paths after the above exclusion process and storing them in a storable data structure, such as a list or set, as the reachable path set; or, all reachable paths can be directly used to update the edges of the directed attack graph, that is, only the potential vulnerability exploitation paths that have been confirmed as reachable are retained in the attack graph.
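Steps S3222 to S3225 can be sketched as a simple filter over candidate paths. In this sketch the hop-by-hop policy check is abstracted into a caller-supplied predicate `is_allowed(previous_hop, node)`; that predicate, the names `filter_reachable` and `denied`, and the sample paths are assumptions made for illustration.

```python
from typing import Callable, List, Set, Tuple

Path = List[str]
Link = Tuple[str, str]


def filter_reachable(
    paths: List[Path],
    is_allowed: Callable[[str, str], bool],
) -> Tuple[List[Path], Set[Link]]:
    """Walk each path hop by hop; drop it at the first policy rejection."""
    reachable: List[Path] = []
    rejected_links: Set[Link] = set()
    for path in paths:
        blocked = False
        for previous_hop, node in zip(path, path[1:]):
            if not is_allowed(previous_hop, node):
                # Record the rejecting node together with its inbound link.
                rejected_links.add((previous_hop, node))
                blocked = True
                break
        if not blocked:
            reachable.append(path)
    return reachable, rejected_links


# Hypothetical policy: traffic from the switch to the jump host is denied.
denied = {("switch", "jump_host")}
candidate_paths = [
    ["workstation", "switch", "jump_host", "db_server"],
    ["workstation", "switch", "web_server", "db_server"],
]
reachable, rejected = filter_reachable(candidate_paths, lambda a, b: (a, b) not in denied)
```

Only the second path survives in `reachable`, and `rejected` holds the link on which the first path was stopped.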

[0125] Specifically, a network environment model is constructed that includes network topology connections and fine-grained access control policies. This model not only reflects physical or logical connections, but more importantly, it also incorporates firewall rule sets and access control policies. Based on this, simulated data packets are initiated from the attack origin and virtually propagated to the attack target along all possible paths. This simulates the actual flow of attack traffic in the network, rather than inferring from a static topology. During the virtual propagation of the simulated data packets, firewall rule sets and fine-grained access control policies along the path are loaded and applied in real time to perform hop-by-hop policy verification on the simulated data packets. That is, at each network node and link the data packet passes through, the actual security policy on that node or link is used to determine which paths will be blocked by the security policy. Once a simulated data packet is rejected by a policy at a node or link, the information of that node and link is recorded. Finally, any path containing a node or link that has been rejected by the policy is marked as unreachable and excluded from all possible paths.

[0126] Through the above technical solutions, this application constructs a network environment model containing fine-grained access control policies and a virtual propagation mechanism that uses simulated data packets for hop-by-hop policy verification. This enables accurate identification of attack paths that attackers can actually utilize under the current network security policies, allowing the generated directed attack graph to more accurately reflect the true security situation of the network and avoid misjudging unreachable paths as reachable paths. This improves the overall effectiveness and response efficiency of network vulnerability defense and effectively reduces false positive and false negative rates.

[0127] In one embodiment, step S3223 includes the following steps:

[0128] S32231: When a simulated data packet arrives at a node in the network environment model, the local security policy, security group inheritance policy, and node state derivation policy of that node are obtained to form an immediately effective policy set.

[0129] In this embodiment, local security policy refers to security rules directly configured on a specific network node, such as access control lists and port security settings on a router or firewall. These rules are typically designed for specific functions or services carried by the node. Security group inheritance policy refers to security rules automatically inherited from a security group when a network node is a member of that group. For example, in a cloud computing environment, a virtual machine instance inherits the inbound and outbound rules of its security group to achieve unified security protection for members within the group. Node state-derived policy refers to security policies dynamically generated or adjusted based on the real-time operating status of the network node. For example, when a node is overloaded, traffic limiting policies can be dynamically enabled or low-priority connections can be rejected to ensure the stability of core services. The immediate-effective policy set refers to a complete set of rules formed by integrating the various policies described above, used for policy verification of simulated data packets.

[0130] S32232: Identify the message characteristics of simulated data packets and match the message characteristics with the set of policies that take effect immediately. If a clear rule is matched, record the matching result. If no clear rule is matched, make a judgment based on the pre-configured policy derivation rules and record the judgment result.

[0131] In this embodiment, the message characteristics of a simulated data packet refer to the information features contained in the packet header and payload, such as source IP address, destination IP address, source port, destination port, protocol type, and possible application layer identifiers. Matching the message characteristics with the immediately effective policy set means comparing each characteristic of the simulated data packet with each policy rule in the immediately effective policy set to determine whether the simulated data packet meets the conditions of a certain rule. This matching process usually involves logical judgments on fields such as IP address, port range, and protocol type. An explicit rule refers to one or more policy rules in the immediately effective policy set that completely match the message characteristics of the simulated data packet, for example a rule that explicitly allows or denies access to a specific destination port from a specific source IP address. The matching result records the corresponding decision, namely whether the simulated data packet is allowed or denied. The policy derivation rules handle the situation where the message characteristics of the simulated data packet fail to match any explicit rule in the immediately effective policy set, so that a security decision can still be made in the absence of explicit rules. For example, the least privilege principle of the security group can be used for derivation, that is, all traffic that is not explicitly allowed is denied by default; or the default isolation policy of the network area can be used for derivation, that is, all communication between different security areas is denied by default unless there is an explicit mutual access rule.
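The matching and derivation logic of step S32232 can be sketched as below. The `Packet` and `Rule` structures, the `match_packet` function, and the choice of default deny as the derivation rule are assumptions of this example; only Python's standard `ipaddress` module is used for address matching.

```python
from dataclasses import dataclass
from ipaddress import ip_address, ip_network
from typing import List, Optional


@dataclass(frozen=True)
class Packet:
    src_ip: str
    dst_ip: str
    dst_port: int
    protocol: str


@dataclass(frozen=True)
class Rule:
    src_net: str             # CIDR block, e.g. "10.0.0.0/24"
    dst_net: str
    dst_port: Optional[int]  # None matches any destination port
    protocol: str
    action: str              # "allow" or "deny"

    def matches(self, packet: Packet) -> bool:
        return (
            ip_address(packet.src_ip) in ip_network(self.src_net)
            and ip_address(packet.dst_ip) in ip_network(self.dst_net)
            and self.dst_port in (None, packet.dst_port)
            and self.protocol == packet.protocol
        )


def match_packet(packet: Packet, rules: List[Rule]) -> List[str]:
    """Return the actions of all explicit rules that match; derive 'deny' if none do."""
    actions = [rule.action for rule in rules if rule.matches(packet)]
    return actions if actions else ["deny"]  # assumed derivation rule: default deny


rules = [Rule("10.0.0.0/24", "10.0.1.5/32", 443, "tcp", "allow")]
explicit = match_packet(Packet("10.0.0.7", "10.0.1.5", 443, "tcp"), rules)
derived = match_packet(Packet("192.168.1.1", "10.0.1.5", 22, "tcp"), rules)
```

The function returns every matching action rather than the first one so that contradictory matches remain visible to a later conflict resolution step.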

[0132] S32233: When a conflict is detected in the judgment result, a pre-configured conflict resolution strategy is triggered to generate a corrected judgment result;

[0133] In this embodiment, a conflict in the judgment results refers to contradictory decisions made for the same simulated data packet during policy matching or derivation. For example, one policy allows the simulated data packet to pass, while another policy rejects it. The conflict resolution policy is a pre-set rule used to resolve conflicts in the judgment results. For example, a rejection priority principle can be set, meaning that as long as there is a rejection rule, the final judgment is rejection; or a highest priority principle can be set, meaning the final result is determined according to the preset priority of the policy; or a latest policy priority principle can be set, meaning that the most recently updated or loaded policy is used.
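The three conflict resolution principles listed above can be sketched as one function with a selectable strategy. The `Decision` record, its `priority` and `loaded_at` fields, and the strategy names are hypothetical and serve only to make the example concrete.

```python
from typing import List, NamedTuple


class Decision(NamedTuple):
    action: str     # "allow" or "deny"
    priority: int   # larger value = higher preset priority
    loaded_at: int  # time at which the policy was loaded or last updated


def resolve_conflict(decisions: List[Decision], strategy: str = "deny_first") -> str:
    """Collapse possibly contradictory decisions into one corrected judgment."""
    if not decisions:
        raise ValueError("no decisions to resolve")
    if strategy == "deny_first":
        # Rejection priority: any deny wins.
        return "deny" if any(d.action == "deny" for d in decisions) else "allow"
    if strategy == "highest_priority":
        return max(decisions, key=lambda d: d.priority).action
    if strategy == "latest":
        return max(decisions, key=lambda d: d.loaded_at).action
    raise ValueError(f"unknown strategy: {strategy}")


conflicting = [Decision("allow", priority=10, loaded_at=100),
               Decision("deny", priority=5, loaded_at=200)]
```

With the sample input, rejection priority and latest policy priority both yield a denial, while highest priority yields permission, which shows why the strategy must be fixed in advance.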

[0134] S32234: Based on the matching result or the correction judgment result, perform policy verification on the simulated data packet at this node. If the verification result is rejection, record the node and the corresponding inbound link. If the verification result is permission, allow the simulated data packet to continue virtual propagation to the next hop node along the current path.

[0135] In this embodiment, policy verification refers to determining the outcome of the simulated data packet on the network node based on the determined matching result or the correction judgment result. If the verification result is rejection, it indicates that there is a security policy restriction on the node or its inbound link, and the simulated data packet cannot pass through. At this time, the node and the inbound link that caused the rejection are recorded. If the verification result is permission, it means that the simulated data packet can safely pass through the current node and is allowed to propagate virtually along the current path to the next node in the network to continue hop-by-hop policy verification.

[0136] Specifically, when a simulated data packet arrives at any node in the network environment model, the local security policy, security group inheritance policy, and node state derived policy of that node are first obtained and integrated to form an immediately effective policy set. This ensures that when performing policy verification, all relevant security rules of that node that may affect the behavior of data packets at the current moment are fully considered. Key message characteristics of the simulated data packet are identified and matched against the constructed immediately effective policy set to determine whether the data packet conforms to any explicit allow or deny rules. Furthermore, to address situations where policy rules are difficult to fully cover in complex network environments, a policy derivation mechanism is introduced. When the message characteristics of a packet fail to match any explicit rule, a judgment is made based on pre-configured policy inference rules to avoid decision ambiguity caused by missing rules. Furthermore, once a contradiction or conflict is identified in the judgment results for the same simulated data packet, a pre-configured conflict resolution policy is triggered to generate a clear and unambiguous corrected judgment result. Finally, based on the matching result or the corrected judgment result after conflict resolution, a policy verification is performed on the simulated data packet at this node. If the verification result is rejection, the node and its corresponding inbound link are recorded. If the verification result is permission, the simulated data packet is allowed to continue virtual propagation to the next hop node along the current path.

[0137] Through the above technical solutions, this application integrates multiple security policies existing on the network nodes traversed by simulated data packets during hop-by-hop policy verification, thereby constructing a more accurate and real-time policy set that takes effect immediately. Simultaneously, by introducing policy derivation rules, it addresses the problem of ambiguity in decision-making when clear policy rules are lacking, ensuring the integrity of policy verification. Furthermore, through pre-configured conflict resolution strategies, it effectively handles contradictions between policy rules in complex network environments, generating clear correction judgment results and avoiding path judgment uncertainty caused by policy conflicts. In summary, this application improves the accuracy and reliability of policy verification for simulated data packets during virtual propagation, enabling the generated set of network reachable paths to more realistically reflect the network security situation and enhance the accuracy and responsiveness of network vulnerability defense.

[0138] In one embodiment, step S50 includes the following steps:

[0139] S51: Collect real-time operational data after the execution device executes the network control command set. The real-time operational data includes attack interception success rate, service false interception rate, and real-time performance indicators.

[0140] In this embodiment, real-time operational data refers to quantitative indicators collected after the network control command set is issued and executed, which can reflect the policy execution effect and service operation status in real time. Real-time operational data includes attack interception success rate, service false interception rate, and real-time performance indicators. Attack interception success rate is the ratio of the number of successfully blocked attack requests to the total number of attack requests after the network control command set is executed, used to measure the effectiveness of the policy. Service false interception rate is the ratio of the number of incorrectly blocked normal service requests to the total number of normal requests after the network control command set is executed, used to measure the accuracy of the policy. Real-time performance indicators refer to the performance loss of the network and system after the network control command set is executed, such as the increase in network latency, device CPU utilization, memory utilization, and the percentage decrease in throughput, used to measure the economic efficiency of the policy.

[0141] S52: Input real-time running data into a pre-trained game theory evaluation model, so that it can calculate the comprehensive payoff value based on the current network control instruction set through a multi-objective payoff function;

[0142] In this embodiment, the game theory evaluation model refers to a mathematical model based on the game theory framework. Specifically, it can be implemented using a multi-agent game algorithm that simulates the strategic interaction between the attacker and the defender and quantitatively evaluates the defense effect, that is, the net benefit brought by the network control command set adopted by the defender under the current adversarial situation. The game theory evaluation model can be constructed as a model trained with reinforcement learning. Alternatively, it can be constructed as a game tree model based on expert knowledge and historical data, where each node represents an attack or defense decision point and the leaf nodes represent the different attack and defense outcomes. The multi-objective payoff function is a composite evaluation function that comprehensively considers the attack interception success rate, the service false interception rate (service continuity), and the real-time performance indicators (resource consumption). Specifically, it can be implemented as a weighted sum or a multi-attribute utility function that balances protection effectiveness across these dimensions. The comprehensive payoff value is a scalar value calculated by substituting the currently collected real-time operational data into the multi-objective payoff function, and it is used to comprehensively and quantitatively evaluate the effect of the current defense strategy.
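A weighted linear combination of the three indicators can be sketched as follows. The weights, the sign convention (interception is rewarded, false interception and performance loss are penalized), the assumption that every input is already normalized to the range 0 to 1, and the function name `comprehensive_payoff` are illustrative choices not specified in the application.

```python
from typing import Tuple


def comprehensive_payoff(
    interception_rate: float,       # blocked attacks / total attacks
    false_interception_rate: float, # wrongly blocked normal requests / total normal requests
    performance_loss: float,        # assumed pre-normalized, e.g. fractional throughput drop
    weights: Tuple[float, float, float] = (0.6, 0.25, 0.15),  # illustrative weights
) -> float:
    """Weighted sum: reward interception, penalize false blocks and overhead."""
    for value in (interception_rate, false_interception_rate, performance_loss):
        if not 0.0 <= value <= 1.0:
            raise ValueError("all indicators must lie in [0, 1]")
    w_intercept, w_false, w_perf = weights
    return (w_intercept * interception_rate
            - w_false * false_interception_rate
            - w_perf * performance_loss)


payoff = comprehensive_payoff(0.9, 0.1, 0.2)
```

A trained reinforcement learning model or a game tree would replace this closed-form function in the other variants the paragraph mentions; the weighted sum is simply the easiest variant to inspect.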

[0143] S53: Obtain the benchmark return value, compare the comprehensive return value with the benchmark return value, and select the optimization direction of the strategy parameters based on the comparison results;

[0144] In this embodiment, the benchmark return value refers to the return value used as a comparison reference, which typically represents an acceptable or expected performance level; the strategy parameter optimization direction refers to the conclusion drawn by comparing the comprehensive return value with the benchmark return value, which is used to indicate how to adjust the strategy parameters.

[0145] For example, if the comprehensive return value is greater than or equal to the benchmark return value, the current strategy parameters are considered effective and only need fine-tuning to further improve performance; if the comprehensive return value is less than the benchmark return value, the current strategy parameters are considered ineffective, and a wide-ranging search of the parameter space is needed to try new parameter combinations and escape the current ineffective local optimum.
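
The comparison in this example can be sketched as follows. The enumeration, the function names, the fine-tuning step size, and the assumption that every strategy parameter is normalized to the range 0 to 1 are hypothetical; random perturbation and uniform resampling merely stand in for whatever fine-tuning and search procedures are actually used.

```python
import random
from enum import Enum


class Direction(Enum):
    FINE_TUNE = "fine_tune"        # parameters are effective: adjust locally
    BROAD_SEARCH = "broad_search"  # parameters are ineffective: search widely


def select_direction(comprehensive_return: float, benchmark_return: float) -> Direction:
    # Greater than or equal to the benchmark means the current parameters are effective.
    if comprehensive_return >= benchmark_return:
        return Direction.FINE_TUNE
    return Direction.BROAD_SEARCH


def propose_parameters(current: dict, direction: Direction,
                       rng: random.Random, fine_step: float = 0.02) -> dict:
    """Propose new parameters, assumed to be normalized to [0, 1]."""
    proposal = {}
    for name, value in current.items():
        if direction is Direction.FINE_TUNE:
            # Small perturbation around the current value.
            candidate = value + rng.uniform(-fine_step, fine_step)
        else:
            # Resample anywhere in the parameter range to leave a poor local optimum.
            candidate = rng.uniform(0.0, 1.0)
        proposal[name] = min(1.0, max(0.0, candidate))
    return proposal
```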

[0146] S54: Based on the optimization direction of strategy parameters, the trigger threshold and effect strength parameters of resource concealment rules are adjusted in real time, and the effect probability and effect range parameters of path collapse rules are updated synchronously.

[0147] In this embodiment, the resource concealment rule refers to a proactive defense strategy that breaks an attacker's attack chain by hiding or disguising the real assets and services in the network, making it difficult for the attacker to discover or locate the target. The trigger threshold controls how sensitively the concealment strategy is activated; the effect strength controls the degree of concealment applied. The path collapse rule is a graph-based proactive defense strategy that identifies and dynamically interrupts key relay nodes in the attack path, thereby blocking the vulnerability exploitation chain and preventing the attacker's lateral movement. The effect probability refers to the probability of performing collapse on a suspected attack path; a probability that is too high may lead to false positives, while one that is too low may result in an insufficient response. The effect range refers to the extent of influence of the path collapse rule, both in space (for example, blocking a single IP address versus a whole network segment) and in time (for example, blocking temporarily for 1 minute versus 10 minutes).

[0148] Specifically, during the defense strategy execution phase, probes deployed on network nodes collect attack interception success rates, service false interception rates, and device performance indicators in real time, forming a real-time operational data set. This real-time operational data is then input into the pre-trained game theory evaluation model, which calculates the comprehensive payoff value of the current strategy based on the multi-objective payoff function. When the comprehensive payoff value is lower than the preset benchmark, the existing parameter configuration has failed to effectively balance security protection against business operation requirements. In this case, the game theory evaluation model generates targeted parameter adjustments. For example, when the service false interception rate is detected to rise above a threshold, the effect strength parameter of the resource concealment rule is reduced to lower the probability that compliant services are misjudged. When an attack path is found to bypass existing defenses, the effect range parameter of the path collapse rule is increased to cover more of the potential attack surface.
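
The two example adjustments in this paragraph can be sketched as follows. The parameter container, the false interception limit, the step size, and the normalization of every parameter to the range 0 to 1 are assumptions made for illustration only.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class RuleParameters:
    """Tunable rule parameters, assumed normalized to [0, 1] (hypothetical names)."""
    concealment_trigger_threshold: float  # resource concealment rule: trigger threshold
    concealment_strength: float           # resource concealment rule: effect strength
    collapse_probability: float           # path collapse rule: effect probability
    collapse_range: float                 # path collapse rule: effect range


def clamp(value: float, low: float = 0.0, high: float = 1.0) -> float:
    return max(low, min(high, value))


def adjust_parameters(params: RuleParameters,
                      false_interception_rate: float,
                      bypass_detected: bool,
                      false_rate_limit: float = 0.02,
                      step: float = 0.1) -> RuleParameters:
    updated = params
    if false_interception_rate > false_rate_limit:
        # Too many compliant services are blocked: weaken concealment.
        updated = replace(updated,
                          concealment_strength=clamp(updated.concealment_strength - step))
    if bypass_detected:
        # An attack path bypassed the defenses: widen the collapse range.
        updated = replace(updated,
                          collapse_range=clamp(updated.collapse_range + step))
    return updated
```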

[0149] As a specific implementation method, the game theory evaluation model is constructed using a multi-agent reinforcement learning framework, and its architecture includes the following components:

[0150] The multi-agent architecture includes at least one defender agent and at least one attacker agent. Specifically, the action space of the defender agent is the selectable defense strategy, i.e., different sets of network control instructions, and its state space is the current network situation, including real-time running data and historical state sequences. The action space of the attacker agent is the simulated attack behavior, such as port scanning, vulnerability exploitation, lateral movement, etc., and its state space is the network visibility and acquired permissions from the attacker's perspective.

[0151] In this embodiment, the real-time operational data described in the foregoing embodiment is preprocessed and combined with the historical operational data values of the past T time steps to form a temporal state vector, which is used as the input of the game theory evaluation model. The multi-objective payoff function can be designed as a weighted linear combination, and the corresponding weight coefficients can be adjusted according to business needs.
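
One way to assemble such a state vector over the past T time steps is sketched below. The class name, the zero-padding used before T samples are available, and the flattening order (oldest sample first) are assumptions, since this application does not fix a particular layout.

```python
from collections import deque


class StateWindow:
    """Keeps the latest `steps` samples and flattens them into one state vector."""

    def __init__(self, steps: int, features: int):
        self.steps = steps        # T: number of time steps kept
        self.features = features  # number of indicators per sample
        self.history = deque(maxlen=steps)  # oldest sample is dropped automatically

    def push(self, sample) -> None:
        if len(sample) != self.features:
            raise ValueError("unexpected number of features in sample")
        self.history.append(tuple(float(x) for x in sample))

    def vector(self) -> list:
        # Pad with zeros at the front until T samples have been collected.
        missing = self.steps - len(self.history)
        rows = [(0.0,) * self.features] * missing + list(self.history)
        return [x for row in rows for x in row]
```

For example, with T = 3 and two indicators per sample, the resulting vector always has six entries, with the most recent sample at the end.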

[0152] Furthermore, it also includes policy networks and value networks. The policy network can typically use a deep neural network, whose input is a temporal state vector and whose output is the probability distribution of choosing different defensive actions in the current state. Similarly, the value network can also use a deep neural network, whose input is a temporal state vector and whose output is an estimate of the long-term expected return of the current state.
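
The input and output relationship of the two networks can be illustrated with the forward-pass sketch below. It uses a single hidden layer written in plain Python purely for readability; the layer sizes, the initialization scheme, and all names are hypothetical, training is omitted, and a practical implementation would normally rely on a deep learning framework.

```python
import math
import random


def init_layer(n_in: int, n_out: int, rng: random.Random) -> dict:
    # Random weights and zero biases, matching the random initialization of the defender agent.
    scale = 1.0 / math.sqrt(n_in)
    return {"w": [[rng.uniform(-scale, scale) for _ in range(n_in)] for _ in range(n_out)],
            "b": [0.0] * n_out}


def dense(layer: dict, x: list) -> list:
    return [sum(w * v for w, v in zip(row, x)) + b
            for row, b in zip(layer["w"], layer["b"])]


def relu(x: list) -> list:
    return [max(0.0, v) for v in x]


def softmax(logits: list) -> list:
    m = max(logits)  # subtract the maximum for numerical stability
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


class PolicyNetwork:
    """Maps a temporal state vector to a probability distribution over defensive actions."""

    def __init__(self, state_dim: int, hidden_dim: int, n_actions: int, rng: random.Random):
        self.hidden = init_layer(state_dim, hidden_dim, rng)
        self.out = init_layer(hidden_dim, n_actions, rng)

    def action_probabilities(self, state: list) -> list:
        return softmax(dense(self.out, relu(dense(self.hidden, state))))


class ValueNetwork:
    """Maps a temporal state vector to an estimate of the long-term expected return."""

    def __init__(self, state_dim: int, hidden_dim: int, rng: random.Random):
        self.hidden = init_layer(state_dim, hidden_dim, rng)
        self.out = init_layer(hidden_dim, 1, rng)

    def value(self, state: list) -> float:
        return dense(self.out, relu(dense(self.hidden, state)))[0]
```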

[0153] Meanwhile, to enable those skilled in the art to reproduce the training of the game theory evaluation model, the following are exemplary training steps:

[0154] The training environment setup and initialization specifically includes:

[0155] The topology, asset configuration, vulnerability distribution, and normal business traffic patterns of the target network for which situational awareness and vulnerability defense are required are obtained. Based on the obtained information, a programmable attack and defense simulation environment is constructed. This attack and defense simulation environment must be able to simulate the execution effect of attack behavior and the impact of defense strategies on network status and business traffic.

[0156] The defender and attacker agents are initialized separately. The policy network and value network parameters of the defender agent are randomly initialized. The attacker agent can load a predefined attack script library, or it can be trained using a reinforcement learning algorithm.

[0157] Initialize the number of training rounds, the number of interaction steps per round, the size of the experience replay buffer, the batch sampling size, the discount factor, the generalized advantage estimation parameters, etc. The specific parameter settings can be made by the technical personnel in the corresponding technical field according to the actual situation.

[0158] Multi-round interaction and training, specifically including:

[0159] At the start of each training round, the simulation environment is reset to its initial state;

[0160] For each time step in each training round, the following steps are performed: the defender agent obtains quantified data on the current network situation from the simulation environment and constructs a temporal state vector; the defender agent samples a defensive action, i.e., a set of network control command parameters, from the probability distribution over defensive actions output by its current policy network; at the same time, the attacker agent selects and executes an attack action according to its own policy; the simulation environment then executes the actions of both parties, computes the new temporal state vector, and calculates the immediate reward obtained by the defender from the multi-objective payoff function; finally, the interaction data generated in this step (the temporal state vector, the defensive action, the immediate reward, and the new temporal state vector) is stored in a preset experience replay buffer.
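
The per-step interaction described above can be sketched as the following loop. The toy environment is a hypothetical stand-in for the attack and defense simulation environment: its two-element state, its rule that an attack is blocked when the defensive action index equals the attack action index, and its reward weights are invented solely to make the example runnable.

```python
import random
from collections import deque, namedtuple

Transition = namedtuple("Transition", "state action reward next_state")


class ToyAttackDefenseEnv:
    """Hypothetical stand-in for the attack and defense simulation environment."""

    def reset(self) -> list:
        self.state = [0.0, 0.0]  # [attack blocked, false interception rate]
        return list(self.state)

    def step(self, defense_action: int, attack_action: int):
        # Toy rule: the attack is blocked when the defense matches it.
        blocked = 1.0 if defense_action == attack_action else 0.0
        # Toy rule: stricter defenses (higher index) block more normal traffic.
        false_rate = 0.1 * defense_action
        self.state = [blocked, false_rate]
        reward = 0.7 * blocked - 0.3 * false_rate  # weighted linear payoff
        return list(self.state), reward


def run_episode(env, defender_policy, attacker_policy, buffer, steps: int,
                rng: random.Random) -> None:
    state = env.reset()  # each round starts from the initial state
    for _ in range(steps):
        probs = defender_policy(state)
        # Sample a defensive action from the policy's probability distribution.
        defense_action = rng.choices(range(len(probs)), weights=probs, k=1)[0]
        attack_action = attacker_policy(state)
        next_state, reward = env.step(defense_action, attack_action)
        # Store the interaction data in the experience replay buffer.
        buffer.append(Transition(state, defense_action, reward, next_state))
        state = next_state
```

A bounded `deque` can serve as the replay buffer, for example `deque(maxlen=10000)`, so that the oldest interactions are discarded once the preset size is reached.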

[0161] Network parameter updates, specifically including:

[0162] Once the data in the experience replay buffer reaches a preset size, a batch of data is randomly sampled from it. Based on the sampled data, the loss function (such as mean squared error) between the value network's predicted value and the target value is calculated, and its parameters are updated using the gradient descent algorithm. Using the updated value network and the generalized advantage estimation operator, the advantage function value of each state-action pair in the sampled data is calculated. Here, the state-action pair is the combination of the state vector generated by historical interactions and the defensive action contained in the batch of data sampled from the experience replay buffer. Based on the sampled data and the calculated advantage function value, the parameters of the policy network are updated using algorithms such as proximal policy optimization to optimize the selection strategy of defensive actions and maximize long-term expected returns.
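
The quantities used in this update can be sketched as follows: the generalized advantage estimate, the mean squared error between the value network's predictions and the target values, and the clipped surrogate term used by proximal policy optimization. The sketch assumes that the sampled data forms a contiguous trajectory segment in time order, which generalized advantage estimation requires; the default discount factor, the estimation parameter, and the clipping range are illustrative values only.

```python
def generalized_advantages(rewards, values, last_value, gamma=0.99, lam=0.95):
    """Generalized advantage estimates for one trajectory segment in time order.

    rewards[t] and values[t] belong to step t; last_value is the value
    estimate of the state that follows the final step.
    """
    advantages = [0.0] * len(rewards)
    running = 0.0
    next_value = last_value
    for t in reversed(range(len(rewards))):
        # One-step temporal-difference error.
        delta = rewards[t] + gamma * next_value - values[t]
        # Exponentially weighted sum of future temporal-difference errors.
        running = delta + gamma * lam * running
        advantages[t] = running
        next_value = values[t]
    return advantages


def value_targets(advantages, values):
    # Target values for the value network: advantage plus the current estimate.
    return [a + v for a, v in zip(advantages, values)]


def mse_loss(predictions, targets):
    return sum((p - t) ** 2 for p, t in zip(predictions, targets)) / len(targets)


def ppo_clipped_term(ratio, advantage, epsilon=0.2):
    """Clipped surrogate objective of proximal policy optimization for one sample.

    ratio is the new policy's action probability divided by the old policy's.
    """
    clipped_ratio = max(1.0 - epsilon, min(1.0 + epsilon, ratio))
    return min(ratio * advantage, clipped_ratio * advantage)
```

The policy network is then updated by gradient ascent on the average clipped term, and the value network by gradient descent on the mean squared error.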

[0163] The evolution of adversarial strategies includes:

[0164] To prevent the defender agent from overfitting to a specific attack pattern, the attacker agent's strategy is updated periodically (e.g., after every K rounds of training). Update methods include loading a new attack script library, or training the attacker agent with a reinforcement learning algorithm so that it adaptively adjusts its attack behavior in response to the defender's current strategy. In addition, the average return of the defender agent on a validation set (preset attack and defense scenarios) is monitored continuously; when the average return no longer improves significantly over several consecutive training rounds, the model training is considered to have converged, and the training process is terminated.
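
The periodic attacker update and the convergence criterion can be sketched as follows. The function names, the patience window, and the minimum improvement that counts as significant are hypothetical parameters chosen for illustration.

```python
def should_update_attacker(round_index: int, k: int) -> bool:
    # Update the attacker agent's strategy after every K completed training rounds.
    return round_index > 0 and round_index % k == 0


def has_converged(validation_returns, patience: int = 5,
                  min_improvement: float = 1e-3) -> bool:
    """Return True when the average validation return has stopped improving.

    validation_returns -- average return on the validation set, one value per round
    patience           -- number of most recent rounds to inspect
    min_improvement    -- smallest gain that counts as a significant improvement
    """
    if len(validation_returns) <= patience:
        return False  # not enough history to judge
    best_before = max(validation_returns[:-patience])
    best_recent = max(validation_returns[-patience:])
    return best_recent - best_before < min_improvement
```

Training would stop at the first round for which `has_converged` returns `True`.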

[0165] Through the above technical solution, this application comprehensively collects real-time operational data such as the attack interception success rate, the service false interception rate, and real-time performance indicators, and inputs them into a pre-trained game theory evaluation model for multi-objective payoff calculation, thereby quantifying the comprehensive payoff value of the current defense strategy. The comprehensive payoff value is compared with the benchmark return value to determine the direction of strategy parameter optimization. Based on this optimization direction, the trigger threshold and effect strength parameters of the resource concealment rules and the effect probability and effect range parameters of the path collapse rules are dynamically adjusted. This improves the adaptability and robustness of the defense strategy, so that defense performance is maintained as the attack and defense confrontation continues to change, while the impact on normal business is kept as small as possible.

[0166] It should be understood that the sequence number of each step in the above embodiments does not imply the order of execution. The execution order of each process should be determined by its function and internal logic, and should not constitute any limitation on the implementation process of the embodiments of this application.

[0167] In one embodiment, a situation-aware network vulnerability defense system is provided, which corresponds one-to-one with the situation-aware network vulnerability defense method described in the above embodiments. The situation-aware network vulnerability defense system includes:

[0168] The feature extraction module is used to collect raw data in the communication link and extract multimodal traffic features, including protocol behavior sequence features, payload entropy distribution features, communication periodicity features, and connection topology features.

[0169] The feature input module is used to input the extracted multimodal traffic features into the pre-trained fusion twin network model, enabling the fusion twin network model to identify abnormal traffic behavior patterns and output security event information containing five-tuple information and a threat level.

[0170] The matrix generation module is used to construct a preliminary control policy matrix based on security event information, and to optimize and adjust the preliminary control policy matrix through pre-set resource concealment rules and path collapse rules to generate an access control policy matrix.

[0171] The instruction generation module is used to convert the generated access control policy matrix into a network control instruction set and send it to each execution device;

[0172] The parameter adjustment module is used to monitor the network situation of the execution device that receives the network control command set. It uses a pre-trained game theory evaluation model to quantitatively evaluate the execution effect of the execution device and dynamically adjusts the parameters of the resource concealment rule and path collapse rule based on the quantitative evaluation results.

[0173] The modules in the aforementioned situational awareness-based network vulnerability defense system can be implemented entirely or partially through software, hardware, or a combination thereof. These modules can be embedded in or independent of the processor in a computer device, or stored in the computer device's memory as software, so that the processor can invoke and execute the corresponding operations of each module.

[0174] The above-described embodiments are only used to illustrate the technical solutions of this application, and are not intended to limit them. Although this application has been described in detail with reference to the foregoing embodiments, those skilled in the art should understand that modifications can still be made to the technical solutions described in the foregoing embodiments, or equivalent substitutions can be made to some of the technical features. Such modifications or substitutions do not cause the essence of the corresponding technical solutions to deviate from the spirit and scope of the technical solutions of the embodiments of this application, and should all be included within the protection scope of this application.

Claims

1. A situational awareness-based network vulnerability defense method, characterized in that it includes the following steps: Raw data from the communication link is collected, and multimodal traffic features are extracted. These multimodal traffic features include protocol behavior sequence features, payload entropy distribution features, communication periodicity features, and connection topology features. The extracted multimodal traffic features are input into a pre-trained fusion twin network model, enabling the fusion twin network model to identify abnormal traffic behavior patterns and output security event information containing five-tuple information and threat levels. A preliminary control policy matrix is constructed based on security event information, and the preliminary control policy matrix is optimized and adjusted by pre-set resource concealment rules and path collapse rules to generate an access control policy matrix. The generated access control policy matrix is converted into a network control instruction set and sent to each execution device; Network situation monitoring is performed on the execution devices that receive the network control instruction set. The execution effect of the execution devices is quantitatively evaluated through a pre-trained game theory evaluation model. Based on the quantitative evaluation results, the parameters of resource concealment rules and path collapse rules are dynamically adjusted. The raw data includes data packets, network flow records, network node connection information, and network device performance indicators. 
The steps of collecting raw data from the communication link and extracting multimodal traffic features, including protocol behavior sequence features, payload entropy distribution features, communication periodicity features, and connection topology features, include the following steps: Collect data packets in the communication link, perform protocol parsing on the data packets, and extract protocol state transition sequences and packet interaction timing as protocol behavior sequence features; Deep packet inspection is performed on the data packets to extract the application layer payload content. The payload content is segmented using a sliding window sampling method, and the Shannon entropy value is calculated to form the payload entropy value distribution characteristics. Traffic time series are constructed based on network flow records, and the traffic time series are analyzed by fast Fourier transform to extract communication periodicity features. A topology graph is constructed based on network node connection information and network device performance indicators, and the degree centrality and betweenness centrality of each node in the topology graph are calculated as connection topology features.

2. The situational awareness-based network vulnerability defense method according to claim 1, wherein: The fusion twin network model includes a feature encoding layer, an attention fusion layer, a similarity calculation layer, an intent recognition layer, and an output generation layer. The step of inputting the extracted multimodal traffic features into the pre-trained fusion twin network model, enabling the fusion twin network model to identify abnormal traffic behavior patterns and output security event information containing five-tuple information and threat levels, includes the following steps: The feature encoding layer encodes the received multimodal traffic features to generate multimodal representation vectors; The attention fusion layer performs cross-modal correlation fusion on multimodal representation vectors to generate feature representation vectors; The similarity calculation layer calculates the multi-dimensional similarity distance between the feature representation vector and the baseline vector in the preset behavioral feature library; The intent recognition layer performs attack behavior evolution analysis based on feature representation vectors to identify attack types and attack stages; When the multi-dimensional similarity distance is lower than the preset similarity threshold, the output generation layer determines the feature representation vector to be an abnormal vector and generates security event information containing five-tuple information and threat level based on the attack type, attack stage, and multi-dimensional similarity distance.

3. The situational awareness-based network vulnerability defense method according to claim 1, wherein: The step of constructing a preliminary control policy matrix based on security event information, and optimizing and adjusting the preliminary control policy matrix through pre-set resource concealment rules and path collapse rules to generate an access control policy matrix includes the following steps: Based on security event information, several preliminary control policy items are matched and generated from a predefined policy atom library, and a preliminary control policy matrix is constructed based on the generated preliminary control policy items. A directed attack graph is constructed based on security event information. The nodes of the directed attack graph are network assets, and the edges are potential vulnerability exploitation paths associated with risk flow values. The preliminary control policy matrix is used as the input decision variable, the solution space is defined by the directed attack graph, the optimization objective is set based on the risk flow value, and the pre-set resource concealment rules and path collapse rules are used as constraints to construct a multi-objective constrained optimization problem. Solve the constructed multi-objective constrained optimization problem to obtain the optimized policy weight configuration; The preliminary control policy matrix is optimized and adjusted based on the optimized policy weight configuration to generate an access control policy matrix.

4. The situational awareness-based network vulnerability defense method according to claim 3, characterized in that: The step of constructing a directed attack graph based on security event information, where the nodes of the directed attack graph are network assets and the edges are potential vulnerability exploitation paths associated with risk flow values, includes the following steps: The source IP address and destination IP address are extracted based on the five-tuple information. The network assets corresponding to the source IP address are used as the starting point of the attack, and the network assets corresponding to the destination IP address are used as the target of the attack. Obtain network topology connections and firewall rule sets; identify all possible paths from the attack origin to the attack target based on network topology connections; and perform policy checks and path filtering on all possible paths through the firewall rule sets to generate a set of reachable paths. By querying a pre-set vulnerability database, vulnerability identification is performed on all potential vulnerability exploitation paths and their corresponding attack targets in the reachable path set, obtaining a list of all associated known security vulnerabilities, and extracting vulnerability information for each vulnerability from the list of known security vulnerabilities. The vulnerability information includes a general basic score, exploit code maturity level, and vulnerability remediation status parameters. Calculate the dynamic exploitability score for each vulnerability in the list of known security vulnerabilities based on vulnerability information; Obtain the asset business level and data sensitivity level of the attack target, and calculate the asset value coefficient through a pre-set asset value calculation function; The risk flow value of each potential vulnerability exploitation path is calculated based on the dynamic exploitability score and asset value coefficient.

5. A network vulnerability defense method based on situational awareness according to claim 4, characterized in that: The steps of obtaining network topology connections and firewall rule sets, identifying all possible paths from the attack origin to the attack target based on the network topology connections, and performing policy checks and path filtering on all possible paths through the firewall rule set to generate a set of reachable paths include the following steps: A network environment model containing fine-grained access control policies is constructed based on network topology connections and firewall rule sets. Based on the network environment model, simulated data packets are initiated from the attack origin and virtually propagated to the attack target along all possible paths. During the virtual propagation process, firewall rule sets and fine-grained access control policies are loaded in real time, hop-by-hop policy verification is performed on simulated data packets, and nodes and links where simulated data packets are rejected by policies during the virtual propagation process are recorded. Mark paths containing nodes or links rejected by the policy as unreachable and exclude them from all possible paths; Among all possible paths, exclude those marked as unreachable, and output the set of reachable paths.

6. The situational awareness-based network vulnerability defense method according to claim 5, characterized in that: The step of loading firewall rule sets and fine-grained access control policies in real time during virtual propagation, performing hop-by-hop policy verification on simulated data packets, and recording the nodes and links where simulated data packets are rejected by policies during virtual propagation includes the following steps: When a simulated data packet arrives at a node in the network environment model, the local security policy, security group inheritance policy, and node state derivation policy of that node are obtained to form an immediately effective policy set. Identify the message characteristics of simulated data packets and match the message characteristics against the immediately effective policy set. If an explicit rule is matched, record the matching result. If no explicit rule is matched, make a judgment based on the pre-configured policy deduction rules and record the judgment result. When a conflict is detected in the judgment result, a pre-configured conflict resolution strategy is triggered to generate a corrected judgment result; Based on the matching result or the corrected judgment result, the policy verification of the simulated data packet at this node is performed. If the verification result is rejection, the node and the corresponding inbound link are recorded. If the verification result is permission, the simulated data packet is allowed to continue virtual propagation to the next hop node along the current path.

7. A network vulnerability defense method based on situational awareness according to claim 1, characterized in that: The steps of monitoring the network situation of the execution device receiving the network control command set, quantitatively evaluating the execution effect of the execution device through a pre-trained game theory evaluation model, and dynamically adjusting the parameters of the resource concealment rule and path collapse rule based on the quantitative evaluation results include the following steps: The system collects real-time operational data after the execution device executes the network control command set. The real-time operational data includes attack interception success rate, service false interception rate, and real-time performance indicators. Real-time running data is input into a pre-trained game theory evaluation model, which calculates the comprehensive payoff value based on the current network control instruction set through a multi-objective payoff function. Obtain the benchmark return value, compare the comprehensive return value with the benchmark return value, and select the direction for strategy parameter optimization based on the comparison results; Based on the optimization direction of strategy parameters, the trigger threshold and effect strength parameters of resource concealment rules are adjusted in real time, and the effect probability and effect range parameters of path collapse rules are updated synchronously.

8. A situational awareness-based network vulnerability defense system, characterized in that it includes: The feature extraction module is used to collect raw data in the communication link and extract multimodal traffic features, including protocol behavior sequence features, payload entropy distribution features, communication periodicity features, and connection topology features. The feature input module is used to input the extracted multimodal traffic features into the pre-trained fusion twin network model, enabling the fusion twin network model to identify abnormal traffic behavior patterns and output security event information containing five-tuple information and threat level. The matrix generation module is used to construct a preliminary control policy matrix based on security event information, and to optimize and adjust the preliminary control policy matrix through pre-set resource concealment rules and path collapse rules to generate an access control policy matrix. The instruction generation module is used to convert the generated access control policy matrix into a network control instruction set and send it to each execution device; The parameter adjustment module is used to monitor the network situation of the execution device that receives the network control instruction set. It uses a pre-trained game theory evaluation model to quantitatively evaluate the execution effect of the execution device and dynamically adjusts the parameters of the resource concealment rule and path collapse rule based on the quantitative evaluation results. 
The feature extraction module includes: The sequence feature extraction submodule is used to collect data packets in the communication link, perform protocol parsing on the data packets, and extract the protocol state transition sequence and message interaction timing as protocol behavior sequence features; The distribution feature calculation submodule is used to perform deep packet inspection on data packets and extract application layer payload content. It uses a sliding window sampling method to segment the payload content and calculate the Shannon entropy value to form the payload entropy value distribution feature. The periodic feature extraction submodule is used to construct a traffic time series based on network flow records and analyze the traffic time series through fast Fourier transform to extract communication periodic features. The topology feature calculation submodule is used to construct a topology graph based on network node connection information and network device performance indicators, and to calculate the degree centrality and betweenness centrality of each node in the topology graph as connection topology features.