Attack-defense use case generation method and electronic device

CN122247660A · Pending · Publication Date: 2026-06-19 · HANGZHOU NETEASE CLOUD MUSIC TECH CO LTD

Patent Information

Authority / Receiving Office: CN · China
Patent Type: Applications(China)
Current Assignee / Owner: HANGZHOU NETEASE CLOUD MUSIC TECH CO LTD
Filing Date: 2026-03-03
Publication Date: 2026-06-19

Application Information

IPC: H04L9/40

AI Tagging

Application Domain

Securing communication

Abstract

This disclosure presents a method and electronic device for generating attack and defense test cases, relating to the field of network security technology. The method includes: acquiring exercise requirement information of a target network system; generating a set of attack and defense test cases for attack and defense exercises against the target network system based on the exercise requirement information using a generative model; conducting attack and defense exercises on the target network system based on the set of attack and defense test cases using an exercise model; acquiring the attack and defense exercise performance indicators of the target network system; and generating optimization guidance information based on the attack and defense exercise performance indicators to optimize the set of attack and defense test cases generated by the generative model. This disclosure can automatically generate and optimize attack and defense test cases, effectively improving the efficiency and coverage of attack and defense test case generation, and can adaptively and dynamically optimize attack and defense test cases during attack and defense exercises.

Description

Technical Field

[0001] This disclosure relates to the field of network security technology, and specifically to a method for generating attack and defense test cases and to an electronic device.

Background Technology

[0002] With the development of internet technology and the increasing informatization of enterprises, businesses face a growing number of cybersecurity threats. To protect enterprise information security, attack and defense drills have become an indispensable part of enterprise security operations. Attackers use a variety of methods to compromise enterprise security systems and data, so enterprises need to think like attackers, identifying and strengthening their weakest points to build a more complete and reliable security system. By conducting attack and defense drills, enterprises can discover security vulnerabilities, formulate effective security strategies, and strengthen emergency response mechanisms, thereby better ensuring safe and stable operation.

[0003] In attack and defense drills, attack and defense use cases refer to pre-designed, standardized attack and defense scenario implementation plans designed to verify the security protection capabilities of target systems, networks, or services. They clarify the attacker's penetration path and technical means, as well as the defender's detection, response, and handling procedures, and serve as an important basis for the orderly conduct of attack and defense drills.

[0004] In related technologies, attack and defense test cases are usually written manually or generated from fixed rules. As a result, test case generation is inefficient, the coverage of attack and defense exercises is limited, and the test cases lack dynamic adaptability.

Summary of the Invention

[0005] In view of this, this disclosure provides a method and electronic device for generating attack and defense test cases to solve the problem of low efficiency in generating attack and defense test cases in related technologies.

[0006] In a first aspect, this disclosure provides a method for generating attack and defense test cases, the method comprising:

[0007] obtaining exercise requirement information of a target network system; generating, through a generative model and based on the exercise requirement information, a set of attack and defense test cases for conducting attack and defense exercises against the target network system; conducting, through an exercise model and based on the set of attack and defense test cases, attack and defense exercises on the target network system; obtaining attack and defense exercise effectiveness indicators of the target network system; and generating, based on the attack and defense exercise effectiveness indicators, optimization guidance information to optimize the set of attack and defense test cases generated by the generative model.

[0008] In one embodiment of this disclosure, the exercise requirement information includes use case generation requirements and contextual information related to the target network system used to guide the generation of attack and defense use cases.

[0009] In one embodiment of this disclosure, the step of conducting attack and defense drills on the target network system using a drill model based on the attack and defense use case set includes: generating and executing attack behaviors and defense strategies against the target network system based on the attack and defense use case set and the system state of the target network system using a drill model.

[0010] In one embodiment of this disclosure, the attack and defense use case set includes at least one attack and defense use case, and the exercise model includes a reinforcement-learning-based attack model and a reinforcement-learning-based defense model. Generating and executing attack behaviors and defense strategies against the target network system through the exercise model, based on the attack and defense use case set and the system state of the target network system, includes: generating, through the attack model, an attack behavior against the target network system based on the attack and defense use cases in the set, the system state of the target network system, and historical attack and defense records; injecting the attack behavior into the target network system to carry out a network attack on it; generating, through the defense model, a defense strategy for the target network system based on the attack and defense use cases in the set, the system state of the target network system, and historical attack and defense records; and implementing defensive measures on the target network system according to the defense strategy to counter the network attack.

[0011] In one embodiment of this disclosure, the attack and defense exercise effectiveness indicators include attack effectiveness indicators and defense effectiveness indicators. Generating optimization guidance information based on the attack and defense exercise effectiveness indicators to optimize the attack and defense test case set generated by the generative model includes: obtaining an attack reward for the attack model from the attack effectiveness indicators through the attack reward function of the attack model; obtaining a defense reward for the defense model from the defense effectiveness indicators through the defense reward function of the defense model, where the optimization guidance information includes the attack reward and the defense reward; iteratively optimizing the attack behavior output by the attack model based on the attack reward; iteratively optimizing the defense strategy output by the defense model based on the defense reward; and generating an optimized set of attack and defense test cases based on the iteratively optimized attack behaviors and defense strategies.

[0012] In one embodiment of this disclosure, generating optimization guidance information based on the attack and defense exercise effectiveness indicators to optimize the attack and defense test case set generated by the generative model includes: determining optimization guidance information for the generative model based on the attack and defense exercise effectiveness indicators; and adjusting the generative model according to the optimization guidance information to generate an optimized set of attack and defense test cases.

[0013] In one embodiment of this disclosure, determining optimization guidance information for the generative model based on the attack and defense exercise effectiveness indicators includes: obtaining use case quality indicators through statistical analysis of the attack and defense exercise effectiveness indicators; and, in response to a use case quality indicator meeting an optimization trigger condition, generating optimization guidance information corresponding to that use case quality indicator, where the optimization guidance information includes use case generation constraints for the generative model.

[0014] In one embodiment of this disclosure, adjusting the generative model according to the optimization guidance information to generate an optimized set of attack and defense test cases includes: inputting a use case generation optimization instruction into the generative model, where the use case generation optimization instruction includes the optimization guidance information; and generating, by the generative model, a new set of attack and defense test cases based on the use case generation optimization instruction.
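The flow of [0013] and [0014] (statistics, trigger check, guidance, instruction) can be sketched as follows. This is a minimal illustration under stated assumptions, not the disclosed implementation: the indicator name `attack_success_rate`, the 0.2 threshold, and the wording of the constraint are all hypothetical.

```python
from statistics import mean

# Hypothetical trigger threshold; the disclosure does not fix a value.
LOW_SUCCESS_THRESHOLD = 0.2


def use_case_quality(runs):
    """Aggregate per-run effectiveness indicators into one quality indicator."""
    return mean(r["attack_success_rate"] for r in runs)


def optimization_guidance(attack_type, runs):
    """Return generation constraints if the trigger condition is met, else None."""
    quality = use_case_quality(runs)
    if quality >= LOW_SUCCESS_THRESHOLD:
        return None  # trigger condition not met, so no guidance is produced
    return {
        "attack_type": attack_type,
        "quality": quality,
        "constraints": [f"Vary the payloads and entry points used for {attack_type}."],
    }


def build_instruction(guidance):
    """Wrap the guidance in a use case generation optimization instruction."""
    lines = [f"Regenerate test cases for attack type: {guidance['attack_type']}."]
    lines += [f"Constraint: {c}" for c in guidance["constraints"]]
    return "\n".join(lines)


runs = [{"attack_success_rate": 0.1}, {"attack_success_rate": 0.15}]
guidance = optimization_guidance("sql_injection", runs)
instruction = build_instruction(guidance)
```

The resulting `instruction` string is what would be passed to the generative model as part of its next prompt.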

[0015] In one embodiment of this disclosure, generating optimization guidance information corresponding to the use case quality indicator in response to the use case quality indicator meeting the optimization trigger condition includes: if the quality indicator of any type of attack and defense use case meets an attack optimization trigger condition, determining the attack type of that type of use case as a target attack mode; and obtaining the use case generation constraints corresponding to the target attack mode, where the optimization guidance information includes those constraints.

[0016] In one embodiment of this disclosure, generating optimization guidance information corresponding to the use case quality indicator in response to the use case quality indicator meeting the optimization trigger condition includes: if the quality indicator of any type of attack and defense use case meets an effectiveness optimization trigger condition, identifying that type of use case as an inefficient attack and defense use case; determining the root cause of the inefficiency by analyzing the execution logs of the inefficient use case; and obtaining, based on the root cause, the use case generation constraints corresponding to the inefficient use case, where the optimization guidance information includes those constraints.

[0017] In one embodiment of this disclosure, generating optimization guidance information corresponding to the use case quality indicator in response to the use case quality indicator meeting the optimization trigger condition includes: if the quality indicator of any type of attack and defense use case meets a defense optimization trigger condition, identifying the attack type of that type of use case as a target defense blind spot; and obtaining the use case generation constraints corresponding to the target defense blind spot, where the optimization guidance information includes those constraints.
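The three trigger conditions of [0015] to [0017] can be pictured as independent checks on per-type quality indicators. The sketch below is only one possible reading: the disclosure does not say which indicators or thresholds define each trigger, so the field names, the comparison directions, and the default thresholds are assumptions.

```python
from dataclasses import dataclass


@dataclass
class TypeQuality:
    """Quality indicators for one attack type (field names are illustrative)."""
    attack_type: str
    attack_success_rate: float  # share of runs in which the attack succeeded
    detection_rate: float       # share of attacks the defense detected
    steps_per_success: float    # execution cost of a successful run


def classify(q, min_success=0.2, max_steps=50.0, min_detection=0.5):
    """Return the optimization triggers that one attack type meets."""
    labels = []
    if q.attack_success_rate < min_success:   # assumed attack optimization trigger
        labels.append("target_attack_mode")
    if q.steps_per_success > max_steps:       # assumed effectiveness trigger
        labels.append("inefficient_use_case")
    if q.detection_rate < min_detection:      # assumed defense optimization trigger
        labels.append("defense_blind_spot")
    return labels


stats = [
    TypeQuality("sql_injection", 0.05, 0.95, 12.0),
    TypeQuality("lateral_movement", 0.60, 0.10, 80.0),
]
result = {q.attack_type: classify(q) for q in stats}
```

Each label would then be mapped to its own set of use case generation constraints before being sent to the generative model.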

[0018] In one embodiment of this disclosure, generating optimization guidance information based on the attack and defense exercise effectiveness indicators to optimize the attack and defense test case set generated by the generative model includes: evaluating an indicator score for each dimension of the attack and defense exercise effectiveness indicators; for each dimension, if its indicator score is lower than a preset score, obtaining the use case generation constraints corresponding to that dimension, where the optimization guidance information includes those constraints; inputting a use case generation optimization instruction into the generative model, where the instruction includes the optimization guidance information; and generating, by the generative model, a new set of attack and defense test cases based on the use case generation optimization instruction.
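The per-dimension check in [0018] amounts to a threshold comparison followed by a lookup. The sketch below assumes a 0 to 100 scoring scale, a preset score of 60, and a small hypothetical constraint table; none of these values come from the disclosure.

```python
# Hypothetical preset score and constraint table; neither is given in the source.
PRESET_SCORE = 60.0
CONSTRAINTS_BY_DIMENSION = {
    "coverage": "Include at least one test case per attack stage.",
    "stealth": "Prefer low-noise techniques that avoid signature-based detection.",
    "efficiency": "Limit each test case to the minimum steps needed.",
}


def guidance_from_scores(scores, preset=PRESET_SCORE):
    """Collect the constraints of every dimension scoring below the preset."""
    return [
        CONSTRAINTS_BY_DIMENSION[dim]
        for dim, score in scores.items()
        if score < preset and dim in CONSTRAINTS_BY_DIMENSION
    ]


def optimization_instruction(scores):
    """Build the instruction text, or an empty string if nothing needs rework."""
    constraints = guidance_from_scores(scores)
    if not constraints:
        return ""
    body = "\n".join(f"- {c}" for c in constraints)
    return "Generate a new set of attack and defense test cases.\nConstraints:\n" + body


scores = {"coverage": 45.0, "stealth": 72.0, "efficiency": 58.5}
instruction = optimization_instruction(scores)
```

Here only the `coverage` and `efficiency` dimensions fall below the preset score, so only their constraints appear in the instruction.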

[0019] In a second aspect, this disclosure provides an electronic device including a memory and a processor that are communicatively connected to each other. The memory stores computer instructions, and the processor executes the computer instructions to perform the attack and defense use case generation method of the first aspect or of any of its embodiments described above.

[0020] One embodiment of this disclosure provides a method for generating attack and defense test cases. A generative model generates an initial set of attack and defense test cases from the exercise requirement information. An exercise model then conducts attack and defense exercises on the target network system based on the set of test cases output by the generative model. The effectiveness indicators of the exercises are collected as feedback and used to generate optimization guidance information, which optimizes the set of test cases generated by the generative model. Using the generative model automates the generation of attack and defense test cases for exercises on the target network system, which improves the efficiency and coverage of test case generation. Feeding the exercise effectiveness indicators back into generation improves the quality of the generated test cases and achieves adaptive, dynamic optimization of the test cases during the exercises.

Description of the Drawings

[0021] To more clearly illustrate the technical solutions in the specific embodiments or related technologies of this disclosure, the accompanying drawings used in the description of the specific embodiments or related technologies will be briefly introduced below. Obviously, the accompanying drawings described below are some embodiments of this disclosure. For those skilled in the art, other drawings can be obtained based on these drawings without creative effort.

[0022] Figure 1 is a flowchart of an attack and defense test case generation method provided in one embodiment of this disclosure;
Figure 2 is a schematic diagram of a process for conducting attack and defense drills using a drill model in one embodiment of this disclosure;
Figure 3 is a flowchart of an optimization method for attack and defense use cases in one embodiment of this disclosure;
Figure 4 is a flowchart of a method for determining optimization guidance information in one embodiment of this disclosure;
Figure 5 is a flowchart of another optimization method for attack and defense use cases in one embodiment of this disclosure;
Figure 6 is a schematic diagram of a scenario for attack and defense drills and optimization based on attack and defense use cases in one embodiment of this disclosure;
Figure 7 is a structural block diagram of an attack and defense test case generation device provided in one embodiment of this disclosure;
Figure 8 is a schematic diagram of the structure of an electronic device provided in one embodiment of this disclosure.

Detailed Implementation

[0023] To make the objectives, technical solutions, and advantages of one embodiment of this disclosure clearer, the technical solutions of this disclosure will be clearly and completely described below with reference to the accompanying drawings. Obviously, the described embodiments are only some, not all, of the embodiments of this disclosure. All other embodiments obtained by those skilled in the art based on the embodiments of this disclosure without creative effort are within the scope of protection of this disclosure.

[0024] Red Team / Blue Team Exercises simulate real-world cyberattack and defense scenarios. By conducting attack tests and security drills on networks, systems, and applications, they assess an organization's information security defense capabilities, identify potential security risks and vulnerabilities, and improve the organization's ability to respond to security threats.

[0025] In attack and defense drills, attack and defense use cases refer to pre-designed, standardized attack and defense scenario implementation plans designed to verify the security protection capabilities of target systems, networks, or services. They clarify the attacker's penetration path and technical means, as well as the defender's detection, response, and handling procedures, and serve as an important basis for the orderly conduct of attack and defense drills.

[0026] In related technologies, attack and defense test cases are usually written manually or generated from fixed rules. As a result, test case generation is inefficient, the coverage of attack and defense exercises is limited, and the test cases lack dynamic adaptability.

[0027] Therefore, one embodiment of this disclosure provides a method and electronic device for generating attack and defense test cases, which can automatically generate and optimize attack and defense test cases, effectively improving the efficiency and coverage of attack and defense test case generation, and can adaptively and dynamically optimize attack and defense test cases during attack and defense exercises.

[0028] According to one embodiment of this disclosure, an embodiment of an attack and defense use case generation method is provided. It should be noted that the steps shown in the flowcharts of the accompanying drawings can be executed in a computer system, for example as a set of computer-executable instructions. Furthermore, although a logical order is shown in the flowcharts, in some cases the steps shown or described may be executed in a different order.

[0029] One embodiment of this disclosure provides a method for generating attack and defense test cases, which can be applied to a server used for attack and defense drills. Figure 1 is a flowchart of the attack and defense test case generation method provided in one embodiment of this disclosure. As shown in Figure 1, the method includes the following steps S101 to S105.

[0030] Step S101: Obtain the exercise requirements information of the target network system.

[0031] The target network system refers to the network system, host devices, and business applications to be subjected to security testing and attack / defense drills. The target network system is the object of the attack / defense drills, including but not limited to physical servers, virtual hosts, containers, web applications, application services, databases, middleware, network devices, and the operating systems and business systems deployed on these objects. All attack behaviors generated during the attack / defense drills are applied to this target network system. For example, the target network system could be a distributed database cluster.

[0032] Exercise requirements information refers to the relevant constraints and reference context information used to guide generative models in generating attack and defense test cases adapted to the target network system.

[0033] Step S102: Generate a set of attack and defense test cases for attack and defense exercises against the target network system based on the exercise requirements information using a generative model.

[0034] In one embodiment of this disclosure, a generative model is a type of machine learning model that can learn the distribution patterns of real data and automatically generate new, similar, and meaningful data samples. For example, a generative model is an AIGC (Artificial Intelligence Generated Content) model. The AIGC model is an automatic content generation technology based on artificial intelligence, deep learning, and large language models. It is used to automatically generate text, scripts, configuration items, or executable use case content that conforms to scenario constraints based on input prompts and reference data.

[0035] In one embodiment of this disclosure, the generative model can understand, reason, summarize and generate based on user-input prompts, scenario constraints, structured reference information and expert experience data, and output standardized, structured and executable attack and defense test cases that can be used for network attack and defense exercises.

[0036] In step S102, the exercise requirement information of the target network system is input into the generative model, and the attack and defense test case set of the target network system is output. The attack and defense test case set is a collection of attack and defense test cases generated by the generative model based on the exercise requirement information, including at least one attack and defense test case.

[0037] Step S103: Using the exercise model, conduct attack and defense exercises on the target network system based on the attack and defense use case set.

[0038] In one embodiment of this disclosure, the exercise model (also referred to below as the attack and defense model) is a simulation model for network security attack and defense confrontation (red-blue confrontation), built on reinforcement learning, adversarial simulation, or a rule engine. It consists of an attack model and a defense model. It receives the attack and defense test cases output by the generative model and carries out attack and defense confrontation exercises on the target network system in a simulation or test environment. The attack model is responsible for outputting attack behaviors, and the defense model is responsible for detecting attack behaviors in the target network system and generating the corresponding defense strategies.

[0039] As a specific implementation, the exercise model includes an attack model and a defense model, both based on reinforcement learning. The attack model is implemented with the Proximal Policy Optimization (PPO) algorithm. Its state space S includes the current performance metric vector of the target network system (for example, CPU and memory utilization), the network connection state vector, and the feature encoding vector of the current attack and defense test case. Its action space A is a predefined set of attack actions, such as {'Initiate port scan', 'Inject SQL statement', 'Send malicious payload'}. The reward function R1 is designed as: R1 = 0.7 × attack success rate + 0.2 × attack chain completion - 0.5 × rate at which the attack is detected by the defense model.
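The attack reward R1 of [0039] can be written directly as a function. The weights come from the text; the assumption that all three inputs are already normalized to the range 0 to 1 is made here for illustration and is not stated in the source.

```python
# Example action set taken from the text; a real action space would be larger.
ATTACK_ACTIONS = ["Initiate port scan", "Inject SQL statement", "Send malicious payload"]


def attack_reward(success_rate, chain_completion, detection_rate):
    """R1 = 0.7 * success rate + 0.2 * chain completion - 0.5 * detection rate.

    All inputs are assumed to be normalized to [0, 1].
    """
    return 0.7 * success_rate + 0.2 * chain_completion - 0.5 * detection_rate


# Same attack outcome, but one run is detected far more often by the defense.
stealthy = attack_reward(success_rate=0.8, chain_completion=0.5, detection_rate=0.1)
noisy = attack_reward(success_rate=0.8, chain_completion=0.5, detection_rate=0.9)
```

With these weights a frequently detected attack earns a much lower reward than an equally successful but stealthy one, which is what pushes the attack policy toward evasion.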

[0040] The defense model also uses the PPO algorithm. It shares the state space of the attack model and adds defense log features. Its action space is a set of defense actions, such as {'Enable WAF rule X', 'Add IP to blacklist', 'Adjust anomaly detection threshold to Y'}. The reward function R2 is designed as: R2 = 0.6 × attack interception success rate - 0.3 × average response time - 0.3 × false positive rate. The attack model and the defense model are trained adversarially in a simulated environment until their policies converge.
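The defense reward R2 of [0040] mixes two rates with a response time, so the time has to be put on a comparable scale. The sketch below uses the weights from the text and assumes, purely for illustration, that the average response time is divided by a hypothetical ceiling `max_response_s` and clipped to the range 0 to 1.

```python
def defense_reward(interception_rate, avg_response_s, false_positive_rate,
                   max_response_s=10.0):
    """R2 = 0.6 * interception - 0.3 * response time - 0.3 * false positive rate.

    The response time is scaled to [0, 1] by max_response_s, an assumption
    made here so that all three terms are comparable.
    """
    response_norm = min(max(avg_response_s / max_response_s, 0.0), 1.0)
    return 0.6 * interception_rate - 0.3 * response_norm - 0.3 * false_positive_rate


# Same interception and false positive rates, different response times.
fast = defense_reward(0.9, avg_response_s=1.0, false_positive_rate=0.05)
slow = defense_reward(0.9, avg_response_s=8.0, false_positive_rate=0.05)
```

A slower defense scores lower even when it intercepts the same share of attacks, reflecting the penalty on response time.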

[0041] Based on the exercise model and the set of attack and defense test cases generated by the generative model, an attack and defense confrontation exercise is carried out on the target network system. The test cases in the set provide initial attack operation sequences and suggested defense measures. Within the exercise model, the attack model generates attack behaviors according to the test cases in the set and injects them into the target network system to carry out network security attacks, which triggers the defense mechanism of the target network system. The defense model generates defense strategies according to the test cases in the set, and the target network system implements the corresponding defense measures according to those strategies, thereby completing one attack and defense exercise.
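One exercise round as described in [0041] can be sketched with toy stand-ins. Everything below is hypothetical scaffolding: `TargetSystem` is an in-memory substitute for a test environment, and the two model functions are simple rules standing in for trained reinforcement learning policies.

```python
from dataclasses import dataclass, field


@dataclass
class TargetSystem:
    """Toy stand-in for the target network system in a test environment."""
    blocked: set = field(default_factory=set)
    log: list = field(default_factory=list)

    def inject(self, action):
        """Apply an attack action; it succeeds unless a defense blocks it."""
        success = action not in self.blocked
        self.log.append((action, success))
        return success


def attack_model(use_case, history):
    """Pick the first suggested step not yet tried (placeholder for an RL policy)."""
    tried = {action for action, _ in history}
    for step in use_case["attack_steps"]:
        if step not in tried:
            return step
    return use_case["attack_steps"][-1]


def defense_model(use_case, history):
    """Block every action already seen in the log (placeholder for an RL policy)."""
    return {action for action, _ in history}


def run_round(system, use_case):
    """Attack, observe the result, then apply the defense strategy."""
    action = attack_model(use_case, system.log)
    success = system.inject(action)
    system.blocked |= defense_model(use_case, system.log)
    return action, success


system = TargetSystem()
case = {"attack_steps": ["port_scan", "sql_injection"]}
rounds = [run_round(system, case) for _ in range(3)]
```

In this toy run the first attempt at each step succeeds, and the repeated step in the third round is blocked because the defense has already reacted to it.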

[0042] Step S104: Obtain the attack and defense exercise effect indicators of the target network system.

[0043] During the attack and defense drills on the target network system, the effectiveness indicators of the attack and defense drills are observed and obtained from the target network system. These effectiveness indicators are multi-dimensional quantitative evaluation data obtained by collecting, statistically analyzing, and normalizing the execution process and attack results of one or more attack and defense test cases, as well as the defense detection, alarm, interception, and defense behaviors of the target network system, after the attack and defense model has completed the attack and defense drills on the target network system based on the attack and defense test case set generated by the generative model. These effectiveness indicators can objectively reflect the breakthrough capability of the attack and defense test cases on the target network system, the defense capability of the target network system's defense system, and the quality, effectiveness, and efficiency of the attack and defense test cases.
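The collection and normalization step in [0043] can be illustrated by turning raw per-attack events into rates between 0 and 1. The event fields (`succeeded`, `detected`, `blocked`) and the three indicator names are assumptions chosen for this sketch; the disclosure does not enumerate the indicators.

```python
def effectiveness_indicators(events):
    """Aggregate raw exercise events into normalized indicators in [0, 1]."""
    total = len(events)
    if total == 0:
        return {"attack_success_rate": 0.0, "detection_rate": 0.0,
                "interception_rate": 0.0}
    return {
        "attack_success_rate": sum(e["succeeded"] for e in events) / total,
        "detection_rate": sum(e["detected"] for e in events) / total,
        "interception_rate": sum(e["blocked"] for e in events) / total,
    }


# Illustrative events from one exercise, one dict per executed attack.
events = [
    {"succeeded": True, "detected": False, "blocked": False},
    {"succeeded": False, "detected": True, "blocked": True},
    {"succeeded": False, "detected": True, "blocked": True},
    {"succeeded": True, "detected": True, "blocked": False},
]
indicators = effectiveness_indicators(events)
```

The gap between `detection_rate` and `interception_rate` is itself informative: it counts attacks that were noticed but not stopped.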

[0044] Step S105: Based on the attack and defense exercise effect indicators, generate optimization guidance information to optimize the attack and defense test case set generated by the generative model.

[0045] By evaluating the effectiveness of attack and defense drills, the quality of attack and defense test cases generated by the generative model can be determined. This allows for the construction of feedback optimization guidance information, which is then fed back to the generative model through input instructions. This enables dynamic optimization of the generative model's prompts, generation constraints, generated test cases, and preferences, thereby optimizing the generative model's attack and defense test case generation strategy. Ultimately, this allows the generative model to automatically output a set of attack and defense test cases that are of higher quality, more closely aligned with the target network system, and more suitable for attack and defense drills.

[0046] According to one embodiment of the attack and defense test case generation method provided in this disclosure, an initial attack and defense test case set is generated based on the exercise requirements information through a generative model. An exercise model is then used to conduct attack and defense exercises on the target network system based on the attack and defense test case set output by the generative model. Optimization guidance information is generated by collecting feedback on the attack and defense exercise effect indicators to optimize the attack and defense test case set generated by the generative model. The generative model enables automated generation of attack and defense test cases for attack and defense exercises on the target network system, effectively improving the efficiency and coverage of attack and defense test case generation. By collecting attack and defense exercise effect indicators to optimize the attack and defense test case set generated by the generative model, the quality of the attack and defense test cases generated by the generative model is improved, and adaptive dynamic optimization of attack and defense test cases is achieved during the attack and defense exercise process.
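Steps S101 to S105 form a closed loop, which the following sketch makes explicit. The three functions are hypothetical stand-ins: a real system would call a generative model, run the exercise model against a test environment, and compute richer guidance. The fixed detection table and the 0.5 threshold are assumptions.

```python
def generate_cases(requirements, guidance=None):
    """Stand-in for the generative model: returns a list of test case dicts."""
    attack_types = list(requirements["attack_types"])
    if guidance:
        # Guidance narrows generation to the attack types that need rework.
        attack_types = [t for t in attack_types if t in guidance["focus"]]
    return [{"id": f"{t}-{i}", "attack_type": t} for i, t in enumerate(attack_types)]


def run_exercise(cases, detection_by_type):
    """Stand-in for the exercise model: one detection rate per attack type."""
    return {c["attack_type"]: detection_by_type[c["attack_type"]] for c in cases}


def make_guidance(indicators, min_detection=0.5):
    """Focus the next round on attack types the defense rarely detects."""
    focus = [t for t, rate in indicators.items() if rate < min_detection]
    return {"focus": focus} if focus else None


def closed_loop(requirements, detection_by_type, max_rounds=3):
    """Generate, exercise, measure, and regenerate until no guidance remains."""
    guidance, history = None, []
    for _ in range(max_rounds):
        cases = generate_cases(requirements, guidance)        # S102
        indicators = run_exercise(cases, detection_by_type)   # S103, S104
        history.append(cases)
        guidance = make_guidance(indicators)                  # S105
        if guidance is None:
            break
    return history


requirements = {"attack_types": ["port_scan", "sql_injection", "lateral_movement"]}
detection = {"port_scan": 0.9, "sql_injection": 0.3, "lateral_movement": 0.8}
history = closed_loop(requirements, detection)
```

Because the detection table never changes in this toy setup, the loop keeps regenerating the weakly detected attack type until `max_rounds` is reached; in a real exercise the defense would adapt between rounds.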

[0047] In some embodiments, in step S101, the exercise requirement information includes use case generation requirements and contextual information related to the target network system used to guide the generation of attack and defense use cases.

[0048] As a specific implementation, the context information may include at least one of the following: historical network fault information, system architecture information, system code analysis information, and expert exercise test cases configured for the target network system.

[0049] The use case generation requirement is the prompt input to the generative model. The prompt steers generation by specifying the type, structure, strength, language conventions, output format, and generation constraints and rules of the attack and defense use cases to be generated, giving the generative model direct execution guidance and rule constraints.

[0050] Use case generation requirements can be pre-configured prompts that instruct the generative model to generate the required attack and defense use cases.

[0051] For example, one prompt is: "You are a cybersecurity attack and defense test case generation expert. Based on the information of the target network system, please generate executable attack and defense test cases that conform to the ATT&CK framework.

[0052] Requirements:
1. Each test case is in JSON format and includes the following fields: Test Case ID, Attack Type, Attack Target, Preconditions, Execution Steps, Payload, Protocol Type, Expected Result, and Risk Level;
2. The test cases are feasible and reproducible, and the payload conforms to HTTP / MySQL / Linux command specifications;
3. Coverage: information gathering, vulnerability detection, vulnerability exploitation, privilege escalation, lateral movement, and persistence;
Target network system environment: {OS type, middleware, open ports, business systems, defense device type};
Please generate N attack and defense test cases."

In some embodiments, historical network fault information is provided as reference context information input to the generative model to guide the model in extracting relevant historical security events and fault modes of the target network system, ensuring the practical relevance and timeliness of the generated attack and defense use cases. Historical network fault information can be obtained from historical operational data.
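Because the prompt in [0052] asks for JSON test cases with nine named fields, the model output can be checked before it reaches the exercise model. The sketch below assumes the output is a JSON array and that the fields are rendered as snake-case keys; both are assumptions, and the sample values are invented for illustration.

```python
import json

# Snake-case keys are an assumed rendering of the nine fields named in the prompt.
REQUIRED_FIELDS = [
    "test_case_id", "attack_type", "attack_target", "preconditions",
    "execution_steps", "payload", "protocol_type", "expected_result", "risk_level",
]


def parse_test_cases(model_output):
    """Parse a JSON array of test cases and report missing fields per case."""
    cases = json.loads(model_output)
    valid, errors = [], []
    for index, case in enumerate(cases):
        missing = [f for f in REQUIRED_FIELDS if f not in case]
        if missing:
            errors.append((index, missing))
        else:
            valid.append(case)
    return valid, errors


# Illustrative model output: one complete test case and one incomplete one.
sample_output = json.dumps([
    {
        "test_case_id": "TC-001", "attack_type": "vulnerability detection",
        "attack_target": "web application", "preconditions": "port 80 reachable",
        "execution_steps": ["send probe request"], "payload": "GET /login HTTP/1.1",
        "protocol_type": "HTTP", "expected_result": "login page fingerprinted",
        "risk_level": "low",
    },
    {"test_case_id": "TC-002", "attack_type": "privilege escalation"},
])
valid, errors = parse_test_cases(sample_output)
```

Rejected cases and their missing fields could be fed back to the generative model as additional generation constraints.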

[0053] In some embodiments, expert practice test cases configured for the target network system integrate expert experience and best practices. By providing expert practice test cases configured for the target network system as reference context information input to the generative model, high-quality seeds and knowledge guidance are provided to the generative model. These expert practice test cases configured for the target network system can be obtained from a library of manually written test cases.

[0054] In some embodiments, by providing system architecture information as reference context information input to the generative model, the generative model is guided to understand the components, dependencies, and communication protocols of the target network system, and to generate more targeted attack and defense use cases that cover critical paths and component-level fault scenarios.

[0055] In some embodiments, system code analysis information is provided as reference context information input to the generative model to guide the generative model in generating attack and defense test cases that can accurately trigger specific code defects or logical vulnerabilities. The system code analysis information includes static code analysis results and dynamic code analysis results. Static code analysis results are obtained by scanning with tools such as SonarQube, Checkmarx, or CodeQL through code repository access, while dynamic code analysis results are obtained through runtime monitoring and instrumentation techniques.

[0056] In some embodiments, by providing various information such as historical network fault information, system architecture information, system code analysis information, and expert exercise test cases configured for the target network system as input reference context information for the generative model, multi-source heterogeneous information data is fused to guide the generative model in generating comprehensive and complex attack and defense test cases.

[0057] In some embodiments, after obtaining information such as historical network fault information, system architecture information, system code analysis information, and expert exercise test cases configured for the target network system, the text data of this information is converted into vector representations using an embedding model (such as text-embedding-ada-002), and the vector representations of the different kinds of information are stored in a vector database to support similarity retrieval.

[0058] For example, one item of historical network fault information is "Database master-slave switch failure", which can be converted into a 1536-dimensional vector representation: [0.23, -0.45, 0.67, ..., 0.12].
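
The store-then-retrieve flow described above can be sketched as follows. This is only an illustration under stated assumptions: `toy_embed` is a stand-in bag-of-words embedding, not text-embedding-ada-002, and `InMemoryVectorStore` is a hypothetical in-memory substitute for a real vector database. The second stored text is an invented example.

```python
import math
from collections import Counter


def toy_embed(text: str) -> dict:
    """Stand-in embedding: an L2-normalized sparse bag-of-words vector."""
    counts = Counter(text.lower().split())
    norm = math.sqrt(sum(c * c for c in counts.values())) or 1.0
    return {token: c / norm for token, c in counts.items()}


def cosine(a: dict, b: dict) -> float:
    """Cosine similarity of two normalized sparse vectors (a plain dot product)."""
    return sum(weight * b.get(token, 0.0) for token, weight in a.items())


class InMemoryVectorStore:
    """Hypothetical minimal vector store holding (text, source, vector) entries."""

    def __init__(self):
        self._items = []

    def add(self, text: str, source: str) -> None:
        self._items.append((text, source, toy_embed(text)))

    def search(self, query: str, top_k: int = 1) -> list:
        """Return the top_k entries as (score, text, source), best first."""
        q = toy_embed(query)
        scored = [(cosine(q, vec), text, source) for text, source, vec in self._items]
        scored.sort(key=lambda item: item[0], reverse=True)
        return scored[:top_k]


store = InMemoryVectorStore()
store.add("Database master-slave switch failure", source="historical_fault")
store.add("Order service depends on a Redis cache", source="architecture")  # invented entry
results = store.search("master-slave switch failure", top_k=1)
```

Keeping the source label next to each vector lets retrieved context be attributed to its information type when it is placed into the prompt.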

[0059] In some embodiments, attack and defense test cases are output in JSON format, including but not limited to the following fields: Test case title: clearly describes the attack scenario and attack type; Test case fault injection steps: a detailed and executable sequence of attack operations (simulating vulnerability exploitation, configuration errors, malicious behavior, etc.); Test case observation information: clearly indicates the key indicators, logs, and traffic characteristics to be observed during the exercise in order to evaluate the exercise effect; Test case contingency plan: provides suggested defense measures or remediation solutions (which can be generated based on historical data or best practices); Test case result: preset success / failure judgment criteria.

[0060] For example, an attack and defense use case can be represented as follows: Use case title: "Data consistency failure during distributed database master-slave switchover"; Use case fault injection steps: 1. Simulate AZ1 failure: "aws dynamodb force-failover --table ProductionTable"; 2. Inject a network partition: "iptables block traffic from AZ1 to AZ2 on port 9042"; 3. Trigger a write conflict: "Concurrent write to the old master node of AZ1 (expected rejection)"; Use case observation information: "Distributed database_ReplicationLag>5s", "Active Node SwitchCount"; Use case contingency plan: 1. Automatic: "Enable Quorum write protocol"; 2. Manual: "Start data difference verification script".
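
The example above could be serialized roughly as shown below. The JSON key names (`title`, `fault_injection_steps`, `observation`, `contingency_plan`, `result`) are hypothetical renderings of the listed fields, and the `result` field is left empty because the example does not state its judgment criteria. The command strings are copied from the example as written and are not checked against any real command-line tool.

```python
import json

# Hypothetical key names for the fields listed in the text.
REQUIRED_FIELDS = (
    "title", "fault_injection_steps", "observation", "contingency_plan", "result",
)

example_case = {
    "title": "Data consistency failure during distributed database master-slave switchover",
    "fault_injection_steps": [
        {"step": 1, "action": "Simulate AZ1 failure",
         "command": "aws dynamodb force-failover --table ProductionTable"},
        {"step": 2, "action": "Inject a network partition",
         "command": "iptables block traffic from AZ1 to AZ2 on port 9042"},
        {"step": 3, "action": "Trigger a write conflict",
         "command": "Concurrent write to the old master node of AZ1 (expected rejection)"},
    ],
    "observation": ["Distributed database_ReplicationLag>5s", "Active Node SwitchCount"],
    "contingency_plan": {
        "automatic": "Enable Quorum write protocol",
        "manual": "Start data difference verification script",
    },
    "result": None,  # success / failure criteria are not given in the example
}


def missing_fields(case: dict) -> list:
    """Return the required fields that are absent from a test case."""
    return [name for name in REQUIRED_FIELDS if name not in case]


serialized = json.dumps(example_case, indent=2)
```

A check such as `missing_fields` can be run on each generated test case before it is passed to the exercise model, so that malformed model output is rejected early.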

[0061] In some embodiments, in step S103 above, the target network system is subjected to attack and defense exercises based on the attack and defense use case set through the exercise model, including: generating and executing attack behaviors and defense strategies against the target network system based on the attack and defense use case set and the system state of the target network system through the exercise model.

[0062] In some embodiments, the exercise model includes a reinforcement learning-based attack model and a reinforcement learning-based defense model.

[0063] The reinforcement learning-based attack model is a reinforcement learning attack agent constructed for attack and defense exercise scenarios of the target network system. As the core unit for attack-side decision-making and execution in the exercise model, it receives the attack and defense test cases generated by the generative model as its basic attack strategy input. In real-time confrontation with the target network system and its defense mechanisms, it autonomously learns and optimizes its attack behavior (including attack path selection, payload construction, encoding method, attack rhythm, and behavior sequence) through system state perception, attack behavior selection, attack reward evaluation, and policy iteration, with the goal of maximizing the attack reward and achieving the optimal attack effect.

[0064] The input to a reinforcement learning-based attack model includes attack and defense test cases output by a generative model, the system state of the target network system, and historical attack and defense records. The output is the attack behavior against the target network system. The system state includes, but is not limited to, the component states, network state, and load conditions of the target network system. Historical attack and defense records include historical attack behaviors, attack effects, and historical defense strategies and their effects. Attack behaviors include specific attack command sequences; for example, attack commands against a distributed database cluster include iptables blocking and force-failover.

[0065] In the process of optimizing attack behavior through reinforcement learning, the attack reward for each step of the attack behavior is accumulated through a predefined attack reward function, and this reward guides the attack model to optimize its attack strategy and improve attack effectiveness. The attack reward function is the quantitative scoring function that the reinforcement learning-based attack model uses to evaluate the value of an attack behavior and output a reward signal. It scores each attack behavior initiated by the attack model against the attack and defense exercise effect indicators and outputs a positive or negative reward signal. Positive rewards reinforce effective attack behaviors (such as undetected penetration and attacks that raise no alarms), while negative rewards suppress ineffective attack behaviors (such as being intercepted or triggering high-frequency alarms), thereby driving the attack model to iteratively optimize the attack behavior it outputs.

[0066] For example, in a scenario where the target network system is a distributed database cluster, the attack reward function can be designed as: R1 = w1 × trigger rate + w2 × data disorder - w3 × detection rate. Here, R1 is the attack reward for the currently executed attack behavior; the trigger rate is the probability that the attack successfully triggers a failure; the data disorder is the degree of data inconsistency caused by the attack; the detection rate is the probability that the attack is detected by the defense model (a penalty term); and w1, w2, and w3 are the weights of the trigger rate, the data disorder, and the detection rate, respectively, with w1 + w2 + w3 = 1. For example, w1 is 0.6, w2 is 0.2, and w3 is 0.2.
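
The attack reward formula can be evaluated directly. The sketch below uses the example weights from the text; the function name `attack_reward`, the assumption that all three inputs are normalized to [0, 1], and the input values in the usage line are illustrative only.

```python
def attack_reward(trigger_rate: float, data_disorder: float, detection_rate: float,
                  w1: float = 0.6, w2: float = 0.2, w3: float = 0.2) -> float:
    """R1 = w1 * trigger rate + w2 * data disorder - w3 * detection rate.

    Inputs are assumed to be normalized to [0, 1]; the weights must sum to 1.
    """
    if abs(w1 + w2 + w3 - 1.0) > 1e-9:
        raise ValueError("w1 + w2 + w3 must equal 1")
    return w1 * trigger_rate + w2 * data_disorder - w3 * detection_rate


# Hypothetical measurements: 0.6*0.8 + 0.2*0.5 - 0.2*0.3 = 0.52
r1 = attack_reward(trigger_rate=0.8, data_disorder=0.5, detection_rate=0.3)
```

Because the detection rate enters with a negative sign, an attack that always triggers the fault but is always detected scores lower than one that triggers it equally often without being detected.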

[0067] The reinforcement learning-based defense model is a reinforcement learning defense agent built for attack and defense exercise scenarios of the target network system. As the core unit for defense-side decision-making and response in the exercise model, it receives the attack and defense test cases generated by the generative model as its basic defense strategy input. In real-time confrontation with the reinforcement learning-based attack model, it autonomously iterates and optimizes its defense strategy (including attack feature identification, alarm trigger threshold adjustment, interception timing selection, and dynamic updating of defense rules) through attack behavior perception, defense status evaluation, defense strategy decision-making, and defense reward feedback, so as to achieve adaptive protection against diverse attack behaviors, maximize the defense reward, and achieve the optimal defense effect.

[0068] The input to a reinforcement learning-based defense model includes attack and defense test cases output by a generative model, the system state of the target network system, and historical attack and defense records. The output is a defense strategy for the target network system. The system state includes, but is not limited to, the component states, network state, and load conditions of the target network system. Historical attack and defense records include historical attack behaviors, attack effects, and historical defense strategies and their effects. The defense strategy includes defensive measures implemented in response to detected attack behaviors; for example, defensive measures for a distributed database cluster include enabling Quorum writes and adjusting master-slave switchover thresholds.

[0069] In the process of optimizing defense strategies through reinforcement learning, the defense reward for each step of the defense strategy is accumulated through a predefined defense reward function, and this reward guides the defense model to optimize its defense strategy and improve its effectiveness. The defense reward function is the quantitative scoring function that the reinforcement learning-based defense model uses to evaluate the value of a defense strategy and output a reward signal. It scores each defense strategy initiated by the defense model against the attack and defense exercise effect indicators and outputs a positive or negative reward signal. Positive rewards reinforce effective defense strategies (such as accurate attack interception and detection with few false positives), while negative rewards suppress ineffective defense strategies (such as missed attacks and high-frequency false positives), thereby driving the defense model to iteratively optimize the defense strategies it outputs (for example, by optimizing detection rules, alarm thresholds, and interception strategies).

[0070] For example, in a scenario where the target network system is a distributed database cluster, the defense reward function can be designed as: R2 = s1 × handover success rate - s2 × data latency - s3 × false alarm rate. Here, R2 is the defense reward of the currently executed defense strategy; the handover success rate is the probability of a successful master-slave handover; the data latency is the data delay during the handover process (a penalty term); the false alarm rate is the proportion of detections that are false alarms (a penalty term); and s1, s2, and s3 are the weights of the handover success rate, data latency, and false alarm rate, respectively, with s1 + s2 + s3 = 1. For example, s1 is 0.7, s2 is 0.2, and s3 is 0.1.
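
A corresponding sketch for the defense reward follows. The text does not say how data latency is scaled, so the helper `normalize_latency` and its 5-second ceiling are assumptions introduced here to keep all three terms on a [0, 1] scale; the input values are illustrative.

```python
def normalize_latency(latency_s: float, max_latency_s: float = 5.0) -> float:
    """Map a latency in seconds to [0, 1]; the 5 s ceiling is an assumed scale."""
    return min(max(latency_s, 0.0) / max_latency_s, 1.0)


def defense_reward(handover_success_rate: float, data_latency: float,
                   false_alarm_rate: float,
                   s1: float = 0.7, s2: float = 0.2, s3: float = 0.1) -> float:
    """R2 = s1 * handover success rate - s2 * data latency - s3 * false alarm rate."""
    if abs(s1 + s2 + s3 - 1.0) > 1e-9:
        raise ValueError("s1 + s2 + s3 must equal 1")
    return s1 * handover_success_rate - s2 * data_latency - s3 * false_alarm_rate


# Hypothetical measurements: 0.7*0.9 - 0.2*0.25 - 0.1*0.1 = 0.57
r2 = defense_reward(
    handover_success_rate=0.9,
    data_latency=normalize_latency(1.25),  # 1.25 s / 5 s = 0.25
    false_alarm_rate=0.1,
)
```

Without some normalization, a latency measured in milliseconds would dominate the other two terms regardless of the chosen weights.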

[0071] For example, the attack model can use a GPT (Generative Pre-trained Transformer) model that integrates a fault injector, while the defense model can use an anomaly-detection Long Short-Term Memory network model (LSTM-AnomalyDetect).

[0072] Figure 2 is a schematic diagram of a process for conducting attack and defense exercises using the exercise model in one embodiment of this disclosure. In some embodiments, as shown in Figure 2, in step S103 above, conducting attack and defense exercises on the target network system based on the attack and defense use case set through the exercise model may further include the following steps S201 to S204.

[0073] Step S201: Using the attack model, attack behaviors against the target network system are generated based on the attack and defense use cases in the attack and defense use case set, the system state of the target network system, and historical attack and defense records.

[0074] The attack and defense test cases output by the generative model, the system state of the target network system, and historical attack and defense records are input into the reinforcement learning-based attack model. The attack action steps of the attack and defense test cases are used as the basic strategy, the system state of the target network system is used as the environmental awareness information, and the historical attack and defense records are used as the experience reference. Through the reinforcement learning policy network, state encoding, feature fusion, and action decision-making are performed to infer and generate attack behaviors that are oriented towards the target network system and adapted to the current attack and defense scenario and system state. The attack behaviors are executable attack command sequences.

[0075] Step S202: Inject the attack behavior into the target network system and carry out a network attack on the target network system.

[0076] The attack behavior output by the reinforcement learning-based attack model is injected into the target network system, and a simulated network attack is carried out on the target network system according to the attack command sequence, reproducing the attack actions of a real attack scenario. During the attack, the state of the target network system is monitored in real time and the attack rhythm is dynamically adapted to it. At the same time, the attack triggers the responses of the target network system's own defense configuration and of the reinforcement learning-based defense model, providing complete attack and defense interaction data for the subsequent calculation of the attack and defense exercise effect indicators.

[0077] Step S203: Using the defense model, generate a defense strategy for the target network system based on the attack and defense use cases in the attack and defense use case set, the system state of the target network system, and historical attack and defense records.

[0078] The attack and defense test cases output by the generative model, the system state of the target network system, and historical attack and defense records are input into the reinforcement learning-based defense model. The suggested defense measures in the attack and defense test cases are used as the basic strategy, the system state of the target network system is used as the environmental awareness information, and the historical attack and defense records are used as the experience reference. Through the reinforcement learning policy network, state encoding, feature fusion, and policy decision-making are performed to infer and generate a defense strategy for the target network system that is adapted to the current attack and defense scenario and system state. The defense strategy is used to implement protective responses on the target network system. It consists of structured defense measures, which can cover attack feature identification rules, alarm trigger threshold adjustment schemes, interception timing and method selection, defense resource scheduling logic, and so on. It can dynamically adapt to the attack behavior output by the attack model and provides the basis for the subsequent execution of the defense response.

[0079] Step S204: Implement defensive measures against the attack behavior on the target network system according to the defense strategy.

[0080] By analyzing the defense strategies output by the reinforcement learning-based defense model, these strategies are transformed into executable defense actions, enabling targeted defense measures against the target network system. These actions include, but are not limited to, attack detection, alarm triggering, traffic interception, and resource scheduling.

[0081] In response to the attack command sequences injected by the attack model, layered defense measures can be implemented on the target network system: attack traffic and commands are detected in real time, precise alarms are triggered according to the adjusted thresholds, attack behaviors are intercepted in a targeted manner, and defense resources are allocated to focus on high-risk areas. At the same time, the defense dynamically adapts to changes in attack behavior and synchronously collects data on the execution effect of the defense measures, forming a real-time adversarial closed loop with the attack behavior and providing complete defense effect data for the calculation of the attack and defense exercise effect indicators.
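
The control flow of steps S201 to S204 can be sketched as a simple loop. Everything in this sketch is a placeholder: `StubAttackModel`, `StubDefenseModel`, and `StubTargetSystem` only record what they are given and contain no reinforcement learning, and the test case keys are hypothetical. The sketch shows the ordering of the four steps and the accumulation of historical records, nothing more.

```python
from dataclasses import dataclass


@dataclass
class ExerciseRecord:
    """One round of attack and defense, kept as a historical record."""
    attack_commands: list
    defense_measures: list


class StubAttackModel:
    def generate_attack(self, test_case, system_state, history):
        # S201: use the test case's fault injection steps as the basic strategy.
        return list(test_case["fault_injection_steps"])


class StubDefenseModel:
    def generate_defense(self, test_case, system_state, history):
        # S203: use the test case's contingency plan as the basic strategy.
        return list(test_case["contingency_plan"])


class StubTargetSystem:
    """Records injected commands and applied measures instead of executing them."""

    def __init__(self):
        self.state = {"injected": [], "defenses": []}

    def inject(self, commands):          # S202
        self.state["injected"].extend(commands)

    def apply_defense(self, measures):   # S204
        self.state["defenses"].extend(measures)


def run_exercise(test_cases, attack_model, defense_model, target):
    history = []
    for case in test_cases:
        attack = attack_model.generate_attack(case, target.state, history)     # S201
        target.inject(attack)                                                  # S202
        defense = defense_model.generate_defense(case, target.state, history)  # S203
        target.apply_defense(defense)                                          # S204
        history.append(ExerciseRecord(attack, defense))
    return history


# Hypothetical test case using strings from the distributed database example.
cases = [{
    "fault_injection_steps": ["force-failover", "iptables blocking"],
    "contingency_plan": ["Enable Quorum write protocol"],
}]
target = StubTargetSystem()
history = run_exercise(cases, StubAttackModel(), StubDefenseModel(), target)
```

Passing `history` into both models on every round is what would let a learned policy take earlier attack and defense outcomes into account.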

[0082] In some embodiments, after an attack and defense exercise against the target network system is completed using the exercise model, the exercise effect is comprehensively and quantitatively evaluated from multiple dimensions to obtain multi-dimensional attack and defense exercise effect indicators. In step S104 above, the obtained attack and defense exercise effect indicators include, but are not limited to: attack effect indicators, defense effect indicators, system impact indicators, behavior realism indicators, and automation degree indicators.

[0083] In terms of attack effectiveness, attack effectiveness metrics include, but are not limited to: attack success rate, attack chain completion, fault triggering depth, bypass detection rate, and attack time.

[0084] In terms of defense effectiveness, defense effectiveness metrics include, but are not limited to: attack detection rate, attack interception rate, mean time to respond (MTTR), false alarm rate, alarm accuracy, and defense strategy coverage.

[0085] In terms of system impact, system impact indicators include, but are not limited to: the impact of attack and defense drills on business availability, the impact on performance (SLO / SLI deviation), and resource consumption.

[0086] In terms of behavioral realism, behavioral realism indicators include, but are not limited to: the similarity between attack behavior and real threats (which can be compared through threat intelligence).

[0087] In terms of automation level, automation indicators include, but are not limited to, the automation ratio of use case generation, execution, monitoring, and evaluation.

[0088] For example, in terms of attack effectiveness, the metrics can be obtained statistically in the following ways: After the attack is executed, the detection system (metric monitoring service, Prometheus, etc.) checks whether the attack has achieved the expected failure effect, thereby calculating the attack success rate. Attack success rate = (number of successfully triggered attacks / total number of attacks) × 100%.

[0089] After the attack is executed, the execution status of the attack steps is tracked through the attack execution log and system status, thereby calculating the attack chain completion rate. Attack chain completion rate = (number of steps successfully executed / total number of steps in the attack chain) × 100%.

[0090] After the attack is executed, the impact level of the attack on the system (application layer, data layer, network layer, etc.) is assessed based on system status detection indicators and log analysis. The fault trigger depth = number of affected layers / total number of layers.

[0091] After an attack is executed, the number of attacks that were not detected by the defense model is counted from the defense model's detection logs and the alarm system, thereby calculating the bypass detection rate of the attack. Bypass detection rate = (number of attacks that were not detected / total number of attacks) × 100%.

[0092] After the attack is executed, the time from the start of the attack to the achievement of the expected effect is recorded based on the attack execution timestamp and the system state change timestamp, thereby calculating the attack time. Attack time = time to achieve the expected effect - attack start time.
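
The attack-side formulas can be computed from per-attack records, as in the sketch below. The `AttackRecord` structure, the three-layer total, and the sample records are assumptions made for illustration; the attack chain completion rate is read here as completed steps over the total steps in the chain.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class AttackRecord:
    triggered: bool               # attack achieved the expected failure effect
    detected: bool                # attack was detected by the defense model
    steps_completed: int
    steps_total: int
    layers_affected: int
    start_time: float             # seconds
    effect_time: Optional[float]  # None if the expected effect was never reached


def percent(numerator: float, denominator: float) -> float:
    return 100.0 * numerator / denominator if denominator else 0.0


def attack_success_rate(records) -> float:
    return percent(sum(r.triggered for r in records), len(records))


def bypass_detection_rate(records) -> float:
    return percent(sum(not r.detected for r in records), len(records))


def chain_completion_rate(record) -> float:
    return percent(record.steps_completed, record.steps_total)


def fault_trigger_depth(record, total_layers: int = 3) -> float:
    # total_layers = 3 assumes application, data, and network layers.
    return record.layers_affected / total_layers


def attack_time(record) -> Optional[float]:
    if record.effect_time is None:
        return None
    return record.effect_time - record.start_time


# Hypothetical records from one exercise.
records = [
    AttackRecord(True, False, 3, 3, 2, 0.0, 12.0),
    AttackRecord(True, True, 3, 3, 1, 0.0, 30.0),
    AttackRecord(False, True, 1, 3, 0, 0.0, None),
    AttackRecord(True, False, 4, 4, 3, 5.0, 20.0),
]
```

For these records, three of four attacks trigger the expected fault and two of four go undetected, so the success rate is 75% and the bypass detection rate is 50%.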

[0093] For example, in terms of defense effectiveness, the metrics can be obtained statistically in the following ways: After implementing the defense strategy, the number of attacks successfully detected by the defense model is counted based on the detection logs and alarm records of the defense model, thereby obtaining the attack detection rate. Attack detection rate = (number of detected attacks / total number of attacks) × 100%.

[0094] After the defense strategy is implemented, the number of attacks successfully intercepted by the defense strategy is counted based on the defense strategy execution log and attack result record, thereby obtaining the attack interception rate. Attack interception rate = (number of successfully intercepted attacks / number of detected attacks) × 100%.

[0095] After implementing the defense strategy, the average time from detecting an attack to completing the defense is calculated from the detection timestamps and the defense completion timestamps, thereby obtaining the mean time to respond (MTTR). MTTR = Σ(defense completion time - detection time) / number of detected attacks.

[0096] After implementing the defense strategy, the number of false alarms (events that were not actual attacks but were flagged as attacks) is counted based on manual verification records and false alarm feedback. False alarm rate = (number of false alarms / total number of detections) × 100%.

[0097] After implementing the defense strategy, the proportion of real attacks in the alarms is calculated based on the alarm records and attack verification results, thereby obtaining the alarm accuracy rate. Alarm accuracy rate = (number of real attack alarms / total number of alarms) × 100%.

[0098] After implementing the defense strategy, the proportion of attack types covered by the deployed defense strategy is calculated based on the defense strategy configuration and attack type classification. The defense strategy coverage is calculated as follows: Defense strategy coverage = (Number of attack types with defense strategy / Total number of attack types) × 100%.
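
The defense-side formulas can be collected in one function, sketched below. The function name, parameter names, and sample counts are hypothetical. MTTR is averaged over the attacks for which both a detection timestamp and a defense completion timestamp exist, which is an interpretation of the formula rather than something the text spells out.

```python
def percent(numerator: float, denominator: float) -> float:
    return 100.0 * numerator / denominator if denominator else 0.0


def defense_metrics(total_attacks, detected, intercepted, false_alarms,
                    total_detections, real_alarms, total_alarms,
                    response_windows, defended_types, all_types):
    """Compute the defense effect indicators from raw counts.

    response_windows: list of (detection_time, defense_completion_time) pairs.
    defended_types / all_types: sets of attack type names.
    """
    durations = [done - seen for seen, done in response_windows]
    return {
        "attack_detection_rate": percent(detected, total_attacks),
        "attack_interception_rate": percent(intercepted, detected),
        "mttr": sum(durations) / len(durations) if durations else None,
        "false_alarm_rate": percent(false_alarms, total_detections),
        "alarm_accuracy": percent(real_alarms, total_alarms),
        "defense_strategy_coverage": percent(len(defended_types & all_types),
                                             len(all_types)),
    }


# Hypothetical counts from one exercise.
metrics = defense_metrics(
    total_attacks=20, detected=16, intercepted=12,
    false_alarms=4, total_detections=20,
    real_alarms=16, total_alarms=20,
    response_windows=[(10.0, 12.0), (20.0, 24.0), (30.0, 36.0)],
    defended_types={"sql_injection", "network_partition", "privilege_escalation"},
    all_types={"sql_injection", "network_partition", "privilege_escalation",
               "lateral_movement"},
)
```

Note that the interception rate is relative to detected attacks, not all attacks, so a defense that detects little but blocks everything it detects can still score 100% on that indicator.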

[0099] For example, in the system impact dimension, the indicators can be obtained statistically in the following ways: During the exercise, service level objectives (SLOs) or service level indicators (SLIs) before and after the exercise are monitored using a business monitoring system (such as Prometheus or Datadog). The resulting SLO or SLI deviations are statistically analyzed and represent the impact of the exercise on service availability. SLO deviation = (SLO during exercise - Normal SLO) / Normal SLO × 100%, and SLI deviation = (SLI during exercise - Normal SLI) / Normal SLI × 100%. For example, if the normal SLO is 99.9% and the SLO during the exercise is 99.5%, then the SLO deviation is (99.5% - 99.9%) / 99.9% ≈ -0.4%.

[0100] During the exercise, APM tools (such as New Relic and AppDynamics) are used to monitor performance metrics (such as response time and throughput) before and after the exercise, thereby calculating the performance degradation rate. The performance degradation rate characterizes the impact of the exercise on performance, and is calculated as follows: Performance degradation rate = (Normal performance - Exercise performance) / Normal performance × 100%. For example, if the normal response time is 100ms and the response time during the exercise is 150ms, then the performance degradation rate = (100 - 150) / 100 = -50%; for a latency-type metric such as response time, this negative value means that the response time became 50% longer during the exercise.

[0101] During the exercise, the system's CPU, memory, network, and other resource usage during the exercise period is statistically analyzed through infrastructure monitoring (such as indicator monitoring services and Grafana). This allows us to calculate the resource consumption during the exercise period, which is calculated as: resource consumption = resource usage during the exercise period - normal resource usage.
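
The three system impact formulas are simple arithmetic and can be checked against the worked examples in the text. The function names are hypothetical, and the CPU figures in the last line are invented for illustration.

```python
def slo_deviation(exercise_slo: float, normal_slo: float) -> float:
    """(SLO during exercise - normal SLO) / normal SLO * 100, in percent."""
    return (exercise_slo - normal_slo) / normal_slo * 100.0


def performance_degradation_rate(normal: float, exercise: float) -> float:
    """(normal performance - exercise performance) / normal performance * 100."""
    return (normal - exercise) / normal * 100.0


def resource_consumption(exercise_usage: float, normal_usage: float) -> float:
    """Resource usage during the exercise minus normal resource usage."""
    return exercise_usage - normal_usage


# Worked examples from the text.
slo_dev = slo_deviation(99.5, 99.9)                    # about -0.4 (percent)
perf_deg = performance_degradation_rate(100.0, 150.0)  # response time in ms

# Invented figures: CPU utilisation of 65% during the exercise vs 40% normally.
cpu_extra = resource_consumption(65.0, 40.0)
```

The sign convention differs by metric type: for response time a negative degradation rate means worse performance, while for throughput a positive value does.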

[0102] For example, in the dimension of behavioral realism, the various metrics can be obtained statistically in the following ways: The attack behavior is matched with the threat intelligence in the threat intelligence database (such as MITRE ATT&CK and CVE database) to obtain the similarity between the attack behavior and the threat intelligence. The similarity represents the realism of the attack behavior. Similarity = number of matched ATT&CK techniques / total number of ATT&CK techniques.
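
One way to compute this similarity is with set intersection over technique identifiers, as sketched below. The text does not say which set the "total number of ATT&CK techniques" refers to; this sketch assumes it is the reference set drawn from threat intelligence for the scenario being compared. The function name and the sample sets are illustrative.

```python
def behavior_similarity(observed_techniques: set, reference_techniques: set) -> float:
    """Matched ATT&CK techniques divided by the size of the reference set.

    The choice of denominator is an assumption; see the lead-in above.
    """
    if not reference_techniques:
        return 0.0
    matched = observed_techniques & reference_techniques
    return len(matched) / len(reference_techniques)


# Technique IDs mapped from the exercise's attack behavior (illustrative).
observed = {"T1190", "T1059", "T1021"}
# Technique IDs attributed to a real threat in threat intelligence (illustrative).
reference = {"T1190", "T1078", "T1021", "T1053"}

similarity = behavior_similarity(observed, reference)  # 2 matched / 4 reference
```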

[0103] For example, in terms of automation level, the metrics can be obtained statistically in the following ways: During the exercise, the number of automated executions at each stage is recorded and counted through execution logs and automation tools to obtain the automation ratio of the attack and defense test cases. Automation ratio = (number of automated executions / total number of executions) × 100%.

[0104] Figure 3 is a flowchart illustrating a method for optimizing attack and defense test cases in one embodiment of this disclosure. In some embodiments, the attack and defense exercise effect indicators include attack effect indicators and defense effect indicators. As shown in Figure 3, in step S105 above, generating optimization guidance information based on the attack and defense exercise effect indicators to optimize the attack and defense test case set generated by the generative model may further include the following steps S301 to S305.

[0105] Step S301: Based on the attack effect index, obtain the attack reward of the attack model through the attack reward function of the attack model.

[0106] For example, attack effectiveness metrics include: attack success rate for evaluating whether the attack has achieved the expected effect, bypass detection rate for evaluating the stealth of the attack, attack chain completion rate for evaluating the integrity of the attack, data disorder for evaluating the depth of the attack's impact, and attack time for evaluating the efficiency of the attack. The attack reward function can be designed as a function that describes the weighted summation of each attack effectiveness metric.

[0107] In some embodiments, the attack reward function may also include a penalty term. The penalty term for the attack behavior can be set according to the defense effect indicators, for example according to the attack detection rate. The attack reward function of the attack model can be designed according to actual needs, and the embodiments of this disclosure do not impose any specific limitation on it.

[0108] After the attack and defense exercise, the effectiveness, stealth, efficiency, and other aspects of the attack behavior are comprehensively quantified through the attack reward function of the attack model, based on the attack and defense exercise effect indicators obtained from the exercise, to obtain the attack reward earned by the attack model in this exercise. The attack reward serves as the feedback signal for optimizing the attack model's policy and is used to update that policy.

[0109] Step S302: Based on the defense effect index, obtain the defense reward of the defense model through the defense reward function of the defense model. The optimization guidance information includes attack reward and defense reward.

[0110] For example, defense effectiveness metrics include: attack detection rate for evaluating the detection capability of the defense model, attack interception rate for evaluating the effectiveness of the defense strategy, mean time to respond (MTTR) for evaluating the response speed of the defense, false alarm rate for evaluating the accuracy of the defense model, and data latency for evaluating the impact of the defense on business. The defense reward function can be designed as a weighted sum of the defense effectiveness metrics.

[0111] In some embodiments, the defense reward function may also include a penalty term. For example, the false alarm rate and data latency in the defense effect indicators may be set as penalty terms for the defense strategy executed by the defense model. The defense reward function of the defense model can be designed according to actual needs, and the embodiments of this disclosure do not impose any specific limitation on it.

[0112] After the attack and defense exercise, the detection accuracy, interception effectiveness, response efficiency, and other aspects of the defense strategy are comprehensively quantified through the defense reward function of the defense model, based on the attack and defense exercise effect indicators obtained from the exercise, to obtain the defense reward earned by the defense model in this exercise. The defense reward serves as the feedback signal for optimizing the defense model's policy and is used to update that policy.

[0113] Step S303: Based on the attack reward, iteratively optimize the attack behavior output by the attack model.

[0114] Based on the obtained attack rewards, the policy of the reinforcement learning-based attack model is updated, and the attack behavior it outputs is iteratively optimized to improve the effectiveness and stealth of the attack.

[0115] Specifically, the attack reward is used as a reinforcement learning feedback signal and input into the policy network of the attack model. By adjusting the model parameters, optimizing the attack path selection and payload construction logic, the attack behavior output by the attack model is iteratively optimized, making subsequent attacks easier to break through defenses and evade detection.

[0116] Step S304: Based on the defense reward, iteratively optimize the defense strategy output by the defense model.

[0117] Based on the obtained defense rewards, the policy of the reinforcement learning-based defense model is updated, and the defense strategy it outputs is iteratively optimized to improve the accuracy and adaptability of protection against attacks.

[0118] Specifically, defense rewards are used as reinforcement learning feedback signals and input into the policy network of the defense model. By adjusting model parameters, optimizing detection rules and interception logic, the defense strategy is iteratively optimized, making subsequent system defense more accurate and response more efficient.

[0119] Step S305: Based on the iteratively optimized attack behaviors and defense strategies, generate an optimized set of attack and defense test cases.

[0120] Based on the iteratively optimized attack behaviors and defense strategies, the attack and defense test cases that are input to the attack model and the defense model can be iteratively optimized and updated, thereby optimizing and updating the attack and defense test case set generated by the generative model.
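
To make the reward-driven iteration in steps S301 to S305 concrete, the sketch below substitutes a much simpler technique for the policy network described above: a tabular, bandit-style value estimate per attack action, updated by an incremental step towards each observed reward. The class `ActionValuePolicy`, the learning rate, and the reward values are assumptions for illustration; the action names are taken from the distributed database example.

```python
import random


class ActionValuePolicy:
    """Tabular stand-in for a policy network: one value estimate per action."""

    def __init__(self, actions, epsilon=0.0, learning_rate=0.5, seed=0):
        self.values = {action: 0.0 for action in actions}
        self.epsilon = epsilon              # probability of exploring at random
        self.learning_rate = learning_rate
        self._rng = random.Random(seed)

    def select(self):
        """Pick a random action with probability epsilon, else the best one."""
        if self._rng.random() < self.epsilon:
            return self._rng.choice(list(self.values))
        return max(self.values, key=self.values.get)

    def update(self, action, reward):
        """Move the action's value estimate towards the observed reward."""
        self.values[action] += self.learning_rate * (reward - self.values[action])


attack_policy = ActionValuePolicy(["iptables blocking", "force-failover"])

# S301 + S303: feed hypothetical attack rewards back into the attack policy.
attack_policy.update("force-failover", 0.52)
attack_policy.update("iptables blocking", 0.10)
attack_policy.update("force-failover", 0.60)

# S305: the currently preferred attack action is carried into the updated test case.
optimized_case = {"fault_injection_steps": [attack_policy.select()]}
```

A defense policy can be iterated in the same way with defense rewards (S302 and S304); a real implementation would replace the table with a learned policy conditioned on the system state.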

[0121] In some embodiments, in step S105 above, generating optimization guidance information based on the attack and defense exercise effect indicators to optimize the attack and defense test case set generated by the generative model may further include: determining optimization guidance information for the generative model based on the attack and defense exercise effect indicators; adjusting the generative model according to the optimization guidance information to generate an optimized attack and defense test case set.

[0122] Figure 4 is a flowchart illustrating a method for determining optimization guidance information in one embodiment of the present disclosure. In some embodiments, as shown in Figure 4, in step S105 above, determining the optimization guidance information for the generative model based on the attack and defense exercise effect indicators may further include the following steps S401 to S403.

[0123] Step S401: Obtain the test case quality index based on the attack and defense exercise effect index statistics.

[0124] The test case quality metrics may include, but are not limited to, the following: test case success rate: the success rate of generated test cases in actual drills; test case execution rate: the percentage of generated test cases that are actually executed; test case effectiveness: the percentage of test cases that achieve the expected results; test case coverage: the range of attack scenarios covered by the test cases; fault trigger depth: the level of system impact triggered by a test case; bypass detection rate: the percentage of attacks generated from test cases that bypass defense detection; business impact: the degree to which test cases affect business SLOs / SLIs; behavioral realism: the similarity between the attack behavior of a test case and real threats; test case generation time: the time required to generate test cases; test case execution time: the time required to execute a test case.
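As an illustration of how some of these metrics might be computed per test case type, the following sketch derives the execution rate, success rate, and bypass detection rate from a list of exercise records. The record fields and the choice of denominators (executed cases for the success and bypass rates) are assumptions made for the example; the disclosure does not fix them.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class CaseRecord:
    """Hypothetical outcome record for one generated test case."""
    case_type: str
    executed: bool
    succeeded: bool
    bypassed_detection: bool


def _ratio(numerator, denominator):
    return numerator / denominator if denominator else 0.0


def quality_metrics_by_type(records):
    """Group records by test case type and compute three quality metrics."""
    groups = defaultdict(list)
    for record in records:
        groups[record.case_type].append(record)

    result = {}
    for case_type, group in groups.items():
        executed = [r for r in group if r.executed]
        result[case_type] = {
            # Share of generated cases that were actually executed.
            "execution_rate": _ratio(len(executed), len(group)),
            # Share of executed cases that succeeded (assumed denominator).
            "success_rate": _ratio(sum(r.succeeded for r in executed), len(executed)),
            # Share of executed cases that evaded detection (assumed denominator).
            "bypass_detection_rate": _ratio(
                sum(r.bypassed_detection for r in executed), len(executed)
            ),
        }
    return result
```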

[0125] Step S402: In response to the use case quality index meeting the optimization trigger condition, obtain the use case generation constraints corresponding to the use case quality index. The optimization guidance information includes the use case generation constraints corresponding to the use case quality index.

[0126] When one or more use case quality metrics meet the pre-set optimization trigger conditions, obtain the use case generation constraints used to optimize the quality of use cases generated by the generative model, and generate optimization guidance information corresponding to the use case quality metrics.

[0127] In some implementations, in step S105 above, the generative model is adjusted according to the optimization guidance information to generate an optimized set of attack and defense test cases, including: inputting test case generation optimization instructions into the generative model, the test case generation optimization instructions including optimization guidance information; and generating a new set of attack and defense test cases through the generative model based on the test case generation optimization instructions.

[0128] When one or more use case quality metrics meet the pre-set optimization trigger conditions, the use case generation constraints used to optimize the quality of use cases generated by the generative model are obtained as optimization guidance information. The corresponding use case generation optimization instructions are then input into the generative model. The use case generation optimization instructions include optimization guidance information, which further limits the scope and rules of use case generation by the generative model. This guides the generative model to avoid invalid or unsuitable use case features, improves the effectiveness and adaptability of generated use cases, and enhances the quality of generated use cases.

[0129] In one specific implementation, the optimization process is as follows: The system predefines a test case generation constraint template library. When the test case quality index meets the optimization trigger condition (e.g., the success rate of a certain type of attack test case is lower than a threshold), the optimization module retrieves the corresponding test case generation constraint template from the template library based on the trigger condition type (e.g., "low attack success rate") and the specific attack / defense test case type (e.g., "SQL injection"). This template is a text string, for example: "Please focus on generating SQL injection attack / defense test cases targeting [target system], including time-based blind injection and Boolean blind injection variants, requiring bypassing common WAF rules." Subsequently, the system concatenates this test case generation constraint string with the original exercise requirement information to form a new enhanced prompt, which is then input into the generative model. The generative model generates the next attack / defense test case based on this new prompt, so that the optimization direction is reflected in the output.
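The template lookup and prompt concatenation described above can be sketched as follows. The dictionary layout, the function name, and the newline used as a separator are assumptions made for illustration; only the example template text comes from the description.

```python
# Hypothetical template library keyed by (trigger condition type, test case type).
TEMPLATE_LIBRARY = {
    ("low attack success rate", "SQL injection"): (
        "Please focus on generating SQL injection attack / defense test cases "
        "targeting [target system], including time-based blind injection and "
        "Boolean blind injection variants, requiring bypassing common WAF rules."
    ),
}


def build_enhanced_prompt(requirement_info, trigger_type, case_type, target_system):
    """Concatenate the matching constraint template with the original requirements."""
    template = TEMPLATE_LIBRARY.get((trigger_type, case_type))
    if template is None:
        # No template configured for this trigger: keep the original prompt.
        return requirement_info
    constraint = template.replace("[target system]", target_system)
    return requirement_info + "\n" + constraint
```

The returned string would then be passed to the generative model in place of the original exercise requirement information.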

[0130] In some embodiments, different use case generation constraints can be preset based on the different optimization triggering conditions that different use case quality indicators meet. When any one or more use case quality indicators meet the corresponding optimization triggering conditions, the corresponding use case generation constraints are obtained, use case generation optimization instructions are generated, and the use case generation optimization instructions are fed back to the generative model to instruct the generative model to further optimize the generated attack and defense use cases according to the use case generation constraints, thereby optimizing and updating the attack and defense use case set generated by the generative model.

[0131] In some embodiments, in step S402 above, generating optimization guidance information corresponding to the use case quality indicators in response to the use case quality indicators meeting the optimization triggering conditions may further include: in response to the use case quality indicators of any type of attack and defense use case meeting the attack optimization triggering conditions, determining the attack type of any type of attack and defense use case as the target attack mode; obtaining the use case generation constraints corresponding to the target attack mode, and the optimization guidance information including the use case generation constraints corresponding to the target attack mode.

[0132] For example, the test case quality metrics include test case success rate and bypass detection rate, and the attack optimization triggering conditions include a test case success rate greater than a first threshold (e.g., the first threshold is 0.8) and a bypass detection rate greater than a second threshold (e.g., the second threshold is 0.6).

[0133] When the success rate of any type of attack and defense test case is greater than the first threshold and the bypass detection rate is greater than the second threshold, it indicates that the attack success rate of this type of attack and defense test case is high and the concealment is high, making it difficult to be detected. Therefore, the attack type of this type of attack and defense test case is regarded as a high-value attack mode, i.e., the target attack mode.

[0134] In some embodiments, test case generation constraints corresponding to the target attack mode can be pre-configured. If the test case quality index of the attack and defense test case of this type meets the attack optimization triggering condition of the target attack mode, the test case generation constraints of the target attack mode can be directly obtained.

[0135] In some embodiments, the use case generation constraints for the target attack pattern received from user input are used as the use case generation constraints for the target attack pattern.

[0136] In some embodiments, targeted use case generation constraints can be generated based on the attack characteristics of the target attack pattern, such as attack path, payload construction, execution rhythm, concealment method, triggering conditions, etc.

[0137] The use case generation constraints for this target attack pattern describe the scope and rules for generating attack and defense use cases against this target attack pattern. Inputting these use case generation constraints into the generative model guides the generative model to adjust its generation strategy and improve the output of attack and defense use cases for this target attack pattern.

[0138] For example, the use case generation constraints include constraints that increase the generation weight of attack and defense use cases corresponding to the target attack mode and constraints that generate variant use cases of attack and defense use cases corresponding to the target attack mode, so as to instruct the generative model to increase the generation weight of attack and defense use cases corresponding to the target attack mode and generate variant use cases of attack and defense use cases corresponding to the target attack mode when generating attack and defense use cases.
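A minimal sketch of this embodiment, assuming per-type metrics are available as a dictionary: a test case type becomes a target attack mode when both example thresholds are exceeded, and two constraint strings (raise the generation weight, generate variants) are produced for it. The function name, the data layout, and the wording of the constraint strings are hypothetical.

```python
def target_attack_constraints(metrics_by_type, first_threshold=0.8, second_threshold=0.6):
    """Return constraint strings for each type that qualifies as a target attack mode."""
    constraints = {}
    for case_type, metrics in metrics_by_type.items():
        high_success = metrics["success_rate"] > first_threshold
        high_bypass = metrics["bypass_detection_rate"] > second_threshold
        if high_success and high_bypass:
            constraints[case_type] = [
                f"Increase the generation weight of {case_type} attack and defense test cases.",
                f"Generate variant test cases of existing {case_type} attack and defense test cases.",
            ]
    return constraints
```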

[0139] In some embodiments, in step S402 above, generating optimization guidance information corresponding to the use case quality indicators in response to the use case quality indicators meeting the optimization triggering conditions may further include: in response to the use case quality indicators of any type of attack and defense use case meeting the effectiveness optimization triggering conditions, identifying any type of attack and defense use case as an inefficient attack and defense use case; determining the root cause of inefficiency of the inefficient attack and defense use case based on the execution log analysis; and obtaining the use case generation constraints corresponding to the inefficient attack and defense use case based on the root cause of inefficiency of the inefficient attack and defense use case, wherein the optimization guidance information includes the use case generation constraints corresponding to the inefficient attack and defense use case.

[0140] For example, the test case quality metrics include test case success rate and test case execution rate. The effectiveness optimization trigger conditions include test case success rate being less than a third threshold (e.g., the third threshold is 0.3) or test case execution rate being less than a fourth threshold (e.g., the fourth threshold is 0.5).

[0141] If the success rate of any type of attack and defense test case is less than the third threshold or the execution rate is less than the fourth threshold, it indicates that the success rate of this type of attack test case is low and the number of test cases that can be executed normally is small. Therefore, this type of attack and defense test case is identified as an inefficient attack and defense test case, and the root cause of the inefficiency of the inefficient attack and defense test case is determined by analyzing the execution log of the inefficient attack and defense test case.

[0142] Specifically, by analyzing attack and defense exercise logs, attack behavior records, and defense response data in the execution logs, the root causes of inefficiency in inefficient attack and defense test cases can be analyzed. For example, the root causes of inefficiency may be defects in the test cases themselves, incompatibility with the target network system components, full coverage by the defense strategy, execution logic conflicts, or execution timing errors.

[0143] In some embodiments, test case generation constraints corresponding to different inefficiency root causes can be pre-configured. After determining the inefficiency root cause of the inefficient attack and defense test case, the test case generation constraints corresponding to the inefficiency root cause can be directly obtained.

[0144] In some embodiments, the use case generation constraints for the inefficient root cause, which are received from user input, are used as the use case generation constraints corresponding to the inefficient attack and defense use case.

[0145] The test case generation constraints corresponding to the inefficient attack and defense test cases are used to describe the scope and rules for generating attack and defense test cases for the root cause of inefficiency. The test case generation constraints corresponding to the inefficient attack and defense test cases are input into the generative model to guide the generative model to adjust the generation strategy, avoid inefficient features, optimize test case adaptability, reduce the output of inefficient attack and defense test cases, and improve the overall effectiveness and executability of subsequent attack and defense test case sets.

[0146] For example, if the analysis determines that the root cause of inefficient attack and defense test cases is target component mismatch, then the test case generation constraints can include constraints for optimizing component selection logic; if the analysis determines that the root cause of inefficient attack and defense test cases is attack execution timing error, then the test case generation constraints can include constraints for optimizing attack step timing; furthermore, the test case generation constraints can also include constraints for reducing the generation weight of this type of inefficient attack and defense test cases.
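The effectiveness trigger and the root-cause lookup can be illustrated as below. The log analysis that determines the root cause is not sketched here and is taken as an input; the mapping keys, constraint wording, and function names are assumptions for the example.

```python
# Hypothetical pre-configured mapping from inefficiency root cause to a constraint.
ROOT_CAUSE_CONSTRAINTS = {
    "target component mismatch": "Optimize the component selection logic.",
    "attack execution timing error": "Optimize the timing of attack steps.",
}


def is_inefficient(success_rate, execution_rate, third_threshold=0.3, fourth_threshold=0.5):
    """Effectiveness trigger: low success rate OR low execution rate."""
    return success_rate < third_threshold or execution_rate < fourth_threshold


def inefficiency_constraints(case_type, root_cause):
    """Build constraints for an inefficient test case type given its root cause."""
    constraints = []
    specific = ROOT_CAUSE_CONSTRAINTS.get(root_cause)
    if specific is not None:
        constraints.append(specific)
    # Down-weighting the inefficient type applies regardless of the root cause.
    constraints.append(
        f"Reduce the generation weight of {case_type} attack and defense test cases."
    )
    return constraints
```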

[0147] In some embodiments, in step S402 above, generating optimization guidance information corresponding to the use case quality indicators in response to the use case quality indicators meeting the optimization triggering conditions may further include: in response to the use case quality indicators of any type of attack and defense use case meeting the defense optimization triggering conditions, determining the attack type of any type of attack and defense use case as the target defense blind spot; obtaining the use case generation constraints corresponding to the target defense blind spot, and the optimization guidance information including the use case generation constraints corresponding to the target defense blind spot.

[0148] For example, the use case quality metric includes the attack detection rate, and the defense optimization trigger condition includes the attack detection rate being less than a fifth threshold (e.g., the fifth threshold is 0.5).

[0149] When the attack detection rate of any type of attack and defense use case is less than the fifth threshold, it indicates that the attack use case of this type is highly concealed and difficult to detect. The defense model and the defense mechanism of the target network system lack effective detection capability for this type of attack. Therefore, the attack type of this type of attack and defense use case is identified as the target defense blind spot of the target network system, and the use case generation constraints corresponding to the target defense blind spot are obtained.

[0150] In some embodiments, use case generation constraints for target defense blind spots can be pre-configured. If the use case quality index of this type of attack and defense use case meets the defense optimization triggering condition of the target defense blind spot, the use case generation constraints of the target defense blind spot can be directly obtained.

[0151] In some embodiments, the use case generation constraints for the target defense blind spot, which are received from user input, are used as the use case generation constraints for the target defense blind spot.

[0152] The use case generation constraints corresponding to the target defense blind spot are used to describe the scope and rules for generating attack and defense use cases targeting the target defense blind spot. The use case generation constraints corresponding to the target defense blind spot are input into the generative model to guide the generative model to adjust the generation strategy and generate attack and defense use cases that are more covert, can effectively circumvent existing defense detection mechanisms, and whose attack targets are accurately aimed at the weak links in the system's defense. These use cases are used in subsequent attack and defense drills to continuously test, expose, and ultimately fix the system's defense blind spots.

[0153] For example, the constraints for generating use cases targeting the blind spots in the defense should include, but are not limited to: 1. The attack steps should be more covert to avoid triggering existing detection rules; 2. The attack target should be aimed at the weak points in the defense; 3. The attack sequence should avoid the defense detection window.
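As an illustrative sketch, the blind-spot trigger and the three example constraints above can be combined into a single instruction string for the generative model. The input layout (attack detection rate per test case type), the function name, and the instruction format are assumptions, not part of the disclosure.

```python
BLIND_SPOT_CONSTRAINTS = [
    "The attack steps should be more covert to avoid triggering existing detection rules.",
    "The attack target should be aimed at the weak points in the defense.",
    "The attack sequence should avoid the defense detection window.",
]


def blind_spot_instruction(detection_rate_by_type, fifth_threshold=0.5):
    """Return an instruction text for types whose detection rate is below the threshold."""
    blind_spots = sorted(
        case_type
        for case_type, rate in detection_rate_by_type.items()
        if rate < fifth_threshold
    )
    if not blind_spots:
        return None
    lines = ["Target defense blind spots: " + ", ".join(blind_spots)]
    lines += [f"{i}. {text}" for i, text in enumerate(BLIND_SPOT_CONSTRAINTS, start=1)]
    return "\n".join(lines)
```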

[0154] Figure 5 is a flowchart illustrating another optimization method for attack and defense use cases in one embodiment of the present disclosure. In some embodiments, as shown in Figure 5, generating optimization guidance information based on the attack and defense exercise effect indicators in step S105 above to optimize the attack and defense use case set generated by the generative model may further include the following steps S501 to S504.

[0155] Step S501: Evaluate the index scores of each dimension of the attack and defense exercise effectiveness index.

[0156] For example, the effectiveness indicators of attack and defense exercises include multiple dimensions of effectiveness indicators, such as attack effectiveness indicators under the attack effectiveness dimension, defense effectiveness indicators under the defense effectiveness dimension, and system impact indicators under the system impact dimension.

[0157] In some embodiments, for each dimension's effectiveness metric, the metric score for that dimension's effectiveness metric is obtained by calculating the average of the statistically analyzed metrics under that dimension. For example, under the attack effectiveness dimension, attack effectiveness metrics include attack success rate, attack chain completion, fault triggering depth, bypass detection rate, etc. The metric score for that attack effectiveness dimension is obtained by calculating the average of the attack success rate, attack chain completion, fault triggering depth, and bypass detection rate.
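The averaging described above amounts to an arithmetic mean over the metrics of one dimension. The sketch below assumes every metric has already been normalized to the range 0 to 1 (fault triggering depth, for instance, is not inherently a ratio); that normalization and the names used are assumptions.

```python
from statistics import mean


def dimension_score(metrics):
    """Average the (normalized) metrics of one effectiveness dimension."""
    if not metrics:
        raise ValueError("a dimension needs at least one metric")
    return mean(metrics.values())


# Example values chosen for illustration only.
attack_effectiveness = {
    "attack_success_rate": 0.5,
    "attack_chain_completion": 0.75,
    "fault_trigger_depth": 0.25,
    "bypass_detection_rate": 1.0,
}
score = dimension_score(attack_effectiveness)  # (0.5 + 0.75 + 0.25 + 1.0) / 4
```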

[0158] Step S502: For each dimension of performance indicator, if the indicator score of that dimension is lower than the preset score, obtain the test case generation constraints corresponding to that dimension of performance indicator. The optimization guidance information includes the test case generation constraints corresponding to that dimension of performance indicator.

[0159] Each dimension's performance metric has a preset score. When the score of a dimension's performance metric is lower than the preset score, it is determined that the attack and defense test cases are not performing well in that dimension and there is room for optimization. Corresponding test case generation constraints are then generated to address the shortcomings and deficiencies of that dimension.

[0160] In some embodiments, different use case generation constraints are configured for cases where the scores of different dimension performance indicators are lower than preset scores; for each dimension performance indicator, in response to the fact that the score of that dimension performance indicator is lower than the preset score, the use case generation constraints corresponding to that dimension performance indicator are obtained from the mapping relationship between the dimension performance indicator and the use case generation constraints.

[0161] In some embodiments, user-inputted use case generation constraints for the performance metrics of that dimension are received and used as use case generation constraints for the performance metrics of that dimension.

[0162] For example, regarding the attack effectiveness dimension, if the score of this dimension's effectiveness indicator is lower than a preset score (e.g., the preset score is 0.6), it indicates that the generated attack and defense test cases are not effective enough, and a more complex attack chain needs to be generated. Therefore, for the case where the score of this dimension's effectiveness indicator is lower than the preset score, corresponding test case generation constraints can be configured, including but not limited to: 1. Increasing the complexity of the attack chain (multiple steps, multiple components); 2. Improving the attack success rate based on historical successful cases; 3. Enhancing the completeness of the attack chain to ensure that all steps are executable; 4. Adding reference context information, referencing the network topology, and selecting critical paths; 5. Considering firewall rules and designing bypass schemes.

[0163] For example, regarding the behavior realism dimension, if the score of this dimension's effectiveness indicator is lower than a preset score (e.g., the preset score is 0.6), it indicates that the generated attack and defense test cases lack sufficient behavioral realism. Threat intelligence can be referenced to generate more realistic attack behaviors that are closer to real threats. Therefore, for cases where the score of this dimension's effectiveness indicator is lower than the preset score, corresponding test case generation constraints can be configured, including but not limited to: 1. Referencing attack techniques from the MITRE ATT&CK framework; 2. Simulating real APT attack methods; 3. Increasing the stealth of attack behaviors.

[0164] Step S503: Input use case generation optimization instructions into the generative model. The use case generation optimization instructions include the use case generation constraints corresponding to the performance indicators of this dimension.

[0165] Step S504: Generate a new set of attack and defense test cases based on the use case generation optimization instructions using a generative model.

[0166] The test case generation constraints for this dimension's performance metrics describe the range and rules for generating attack and defense test cases when the performance metrics for this dimension are below a preset score. The test case generation constraints for this dimension's performance metrics below a preset score are input into the generative model, guiding the generative model to adjust the test case generation strategy according to the constraints, optimize the characteristics and logic of attack and defense test cases in this dimension, thereby improving the execution effect of subsequently generated attack and defense test cases in this dimension, and achieving comprehensive and balanced optimization of test case quality.
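Steps S502 to S504 can be sketched as one function that takes the dimension scores from step S501, collects the constraints for every dimension below its preset score, and passes the resulting instruction to the generative model. The generative model is represented by a caller-supplied `generate` callable; that callable, the mapping, and the default preset of 0.6 are assumptions for the example.

```python
# Hypothetical mapping from dimension to pre-configured constraints,
# using the examples given in the description.
DIMENSION_CONSTRAINTS = {
    "attack effectiveness": [
        "Increase the complexity of the attack chain (multiple steps, multiple components).",
        "Improve the attack success rate based on historical successful cases.",
        "Enhance the completeness of the attack chain so that all steps are executable.",
    ],
    "behavioral realism": [
        "Reference attack techniques from the MITRE ATT&CK framework.",
        "Simulate real APT attack methods.",
        "Increase the stealth of attack behaviors.",
    ],
}


def optimize_by_dimension(scores, preset_scores, requirement_info, generate):
    """Run steps S502 to S504 for already-computed dimension scores."""
    # S502: collect constraints for every under-performing dimension.
    constraints = []
    for dimension, score in scores.items():
        if score < preset_scores.get(dimension, 0.6):
            constraints.extend(DIMENSION_CONSTRAINTS.get(dimension, []))
    if not constraints:
        return None  # nothing to optimize in this round

    # S503: build the test case generation optimization instruction.
    instruction = requirement_info + "\n" + "\n".join(constraints)

    # S504: let the generative model produce a new test case set.
    return generate(instruction)
```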

[0167] Figure 6 is a schematic diagram of a scenario for attack and defense drills and optimization based on attack and defense use cases in one embodiment of this disclosure. As shown in Figure 6, in some application scenarios, the target network system is a distributed database cluster. First, the AIGC model is used to generate initial attack and defense test cases. These initial test cases are then input into a reinforcement learning-based attack model and defense model. The attack model outputs attack behaviors based on the test cases, such as an AZ (Availability Zone) crash attack, and injects these behaviors into the target network system. The target network system's defense mechanism triggers a defense alarm. The defense model outputs defense strategies based on the test cases and writes these strategies into the target network system to implement defense measures, such as activating Quorum writes. During the attack and defense exercise, the system reports indicator data (such as master-slave switchover time, data differences, etc.) for indicator evaluation. An indicator evaluation module collects the attack and defense exercise performance metrics and, based on them, obtains an attack reward R1 for the attack model and a defense reward R2 for the defense model. It then feeds the attack reward R1 back to the attack model and the defense reward R2 back to the defense model. The attack model iteratively optimizes its output attack behavior based on the attack reward, and the defense model iteratively optimizes its output defense strategy based on the defense reward. Simultaneously, the indicator evaluation module calculates test case quality metrics based on the attack and defense exercise performance metrics. Based on these metrics, it further generates test case generation constraints and feeds back test case generation optimization instructions to the AIGC model to guide it in further optimizing the generated attack and defense test cases.

[0168] In some embodiments, in response to detecting that the attack detection rate of any type of attack and defense use case is too high, such as the attack detection rate being higher than the detection threshold, the indicator evaluation module requests the AIGC model to generate attack and defense use cases with more covert attack behaviors through use case generation optimization instructions. The more covert attack behaviors are, for example, low-rate, long-cycle network attack behaviors (or slow attacks).

[0169] In some application scenarios, such as actual distributed database clusters, the metrics and indicators required for attack and defense drills can be evaluated from the following three dimensions: real-time comparison of the number of master-slave difference records through the data stream processing service as an indicator of data consistency; tracking the time from master node change to recovery write through the link tracing service as an indicator of switching reliability; and obtaining the increase in 503 error rate through the indicator monitoring service as an indicator of business impact.

[0170] Based on the metrics obtained from the evaluation of each dimension, the test case generation strategy of the AIGC model is dynamically adjusted. For example, in the data consistency dimension, when the number of master-slave difference records is greater than the metric threshold (e.g., 100), a test case generation optimization instruction can provide the AIGC model with a test case generation constraint that increases the weight of split-brain protection tests, further constraining the attack and defense test cases generated by the AIGC model and improving the data consistency metric in the attack and defense exercise of the distributed database cluster. In the switching reliability dimension, when the master node switchover time is greater than the metric threshold (e.g., 10 seconds), the result can be fed back to the defense model to tune the fault detection threshold of the distributed database and thereby optimize the defense strategy output by the defense model, improving the switching reliability metric in the attack and defense exercise of the distributed database cluster.
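The two example rules for the distributed database cluster can be sketched as a small routing function: one rule produces a constraint for the AIGC model, the other produces feedback for the defense model. The function name, the return structure, and the wording of the feedback strings are assumptions made for illustration.

```python
def db_cluster_feedback(diff_records, switchover_seconds,
                        diff_threshold=100, switch_threshold_s=10.0):
    """Route threshold violations to the AIGC model or to the defense model."""
    feedback = {"aigc_constraints": [], "defense_model_feedback": []}

    # Data consistency dimension: too many master-slave difference records.
    if diff_records > diff_threshold:
        feedback["aigc_constraints"].append(
            "Increase the weight of split-brain protection tests."
        )

    # Switching reliability dimension: master node switchover took too long.
    if switchover_seconds > switch_threshold_s:
        feedback["defense_model_feedback"].append(
            "Tune the fault detection threshold of the distributed database."
        )
    return feedback
```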

[0171] In some embodiments, the performance indicators obtained from the evaluation can be fed back to the system or platform (e.g., security operations platform) used for security management and response. The optimized defense model can output actionable defense strategy recommendations, vulnerability remediation priority lists, detection rule optimization recommendations, security configuration hardening schemes, etc., to directly improve the actual security protection level of the target network system.

[0172] In one embodiment of this disclosure, an attack and defense test case generation apparatus is also provided. This apparatus is used to implement the above embodiments and preferred embodiments, and details already described will not be repeated. As used below, the term "module" can be a combination of software and / or hardware that implements a predetermined function. Although the apparatus described in the following embodiments is preferably implemented in software, implementation in hardware, or in a combination of software and hardware, is also possible and contemplated.

[0173] One embodiment of this disclosure provides an attack and defense test case generation device. Figure 7 is a structural block diagram of an attack and defense test case generation device provided in one embodiment of the present disclosure. As shown in Figure 7, the attack and defense test case generation device includes: a requirement acquisition module 701, which acquires the exercise requirement information of the target network system; a test case generation module 702, which generates an attack and defense test case set for attack and defense exercises against the target network system based on the exercise requirement information using a generative model; an attack and defense exercise module 703, which conducts attack and defense exercises against the target network system based on the attack and defense test case set using an exercise model; an indicator evaluation module 704, which acquires the attack and defense exercise effect indicators of the target network system; and a test case optimization module 705, which generates optimization guidance information based on the attack and defense exercise effect indicators to optimize the attack and defense test case set generated by the generative model.
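One way to picture how modules 701 to 705 cooperate is a small pipeline in which each module is a caller-supplied callable. This is only a structural sketch under that assumption; the class name, the callables' signatures, and the loop over a fixed number of rounds are hypothetical and not prescribed by the disclosure.

```python
from dataclasses import dataclass
from typing import Any, Callable


@dataclass
class AttackDefenseCaseGenerator:
    acquire_requirements: Callable[[], str]   # requirement acquisition module 701
    generate_cases: Callable[[str], list]     # test case generation module 702
    run_exercise: Callable[[list], Any]       # attack and defense exercise module 703
    evaluate: Callable[[Any], dict]           # indicator evaluation module 704
    build_guidance: Callable[[dict], str]     # test case optimization module 705

    def run(self, rounds=1):
        """Generate, exercise, evaluate, and regenerate for up to `rounds` rounds."""
        requirements = self.acquire_requirements()
        prompt = requirements
        cases = []
        for _ in range(rounds):
            cases = self.generate_cases(prompt)
            outcome = self.run_exercise(cases)
            indicators = self.evaluate(outcome)
            guidance = self.build_guidance(indicators)
            if not guidance:
                break  # no optimization guidance: keep the current test case set
            # Guidance is appended to the original requirements for the next round.
            prompt = requirements + "\n" + guidance
        return cases
```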

[0174] The attack and defense test case generation apparatus provided in one embodiment of this disclosure can execute the attack and defense test case generation method provided in any embodiment of this disclosure, and has the corresponding functional modules and beneficial effects of the execution method. Further functional descriptions of the above modules and units are the same as those in the corresponding embodiments described above, and will not be repeated here.

[0175] Figure 8 is a schematic diagram of the structure of an electronic device provided in one embodiment of the present disclosure.

[0176] Reference is now made to Figure 8, which shows a structural schematic of an electronic device suitable for implementing one embodiment of the present disclosure. The electronic device may include a processor (e.g., a central processing unit, graphics processor, etc.) 801, which can perform various appropriate actions and processes according to a program stored in read-only memory (ROM) 802 or a program loaded from memory 808 into random access memory (RAM) 803. RAM 803 also stores various programs and data required for the operation of the electronic device. The processor 801, ROM 802, and RAM 803 are interconnected via bus 804. Input / output (I / O) interface 805 is also connected to bus 804.

[0177] Typically, the following devices can be connected to I / O interface 805: input devices 806 including, for example, touchscreens, touchpads, keyboards, mice, cameras, microphones, accelerometers, gyroscopes, etc.; output devices 807 including, for example, liquid crystal displays (LCDs), speakers, vibrators, etc.; memory devices 808 including, for example, magnetic tapes, hard disks, etc.; and communication devices 809. Communication device 809 allows the electronic device to communicate with other devices, wirelessly or by wire, to exchange data. Although Figure 8 shows an electronic device with various devices, it should be understood that it is not required to implement or include all of the devices shown; more or fewer devices may alternatively be implemented or included.

[0178] In particular, according to embodiments of this disclosure, the processes described above with reference to the flowcharts can be implemented as computer software programs. For example, embodiments of this disclosure include a computer program product comprising a computer program carried on a non-transitory computer-readable medium, the computer program containing program code for performing the methods shown in the flowcharts. In such embodiments, the computer program can be downloaded and installed from a network via a communication device 809, or installed from a memory 808, or installed from a ROM 802. When the computer program is executed by the processor 801, it performs the functions defined in the attack and defense use case generation method of one embodiment of this disclosure.

[0179] The electronic device shown in Figure 8 is merely an example and should not be construed as limiting the functionality and scope of any embodiment of this disclosure.

[0180] One embodiment of this disclosure also provides a computer-readable storage medium. The method described in one embodiment of this disclosure can be implemented in hardware or firmware, or implemented as computer code that can be recorded on a storage medium, or implemented as computer code that is originally stored on a remote storage medium or a non-transitory machine-readable storage medium, downloaded over a network, and then stored on a local storage medium. Thus, the method described herein can be processed by software stored on a storage medium using a general-purpose computer, a dedicated processor, or programmable or dedicated hardware. The storage medium can be a magnetic disk, optical disk, read-only memory, random access memory, flash memory, hard disk, or solid-state drive, etc.; further, the storage medium can also include combinations of the above types of memory. It is understood that a computer, processor, microprocessor controller, or programmable hardware includes storage components capable of storing or receiving software or computer code, which, when accessed and executed by the computer, processor, or hardware, implements the attack and defense use case generation method shown in the above embodiments.

[0181] A portion of this disclosure can be applied to computer program products, such as computer program instructions, which, when executed by a computer, can invoke or provide methods and / or technical solutions according to this disclosure through the operation of the computer. Those skilled in the art will understand that the forms in which computer program instructions exist in a computer-readable medium include, but are not limited to, source files, executable files, and installation package files. Accordingly, the ways in which computer program instructions are executed by a computer include, but are not limited to: the computer directly executing the instructions; the computer compiling the instructions and then executing the corresponding compiled program; the computer reading and executing the instructions; or the computer reading and installing the instructions and then executing the corresponding installed program. Here, the computer-readable medium can be any available computer-readable storage medium or communication medium accessible to a computer.

[0182] Although embodiments of the present disclosure have been described in conjunction with the accompanying drawings, those skilled in the art can make various modifications and variations without departing from the spirit and scope of the present disclosure, and such modifications and variations all fall within the scope defined by the appended claims.

Claims

1. A method for generating attack and defense test cases, characterized in that the method comprises: obtaining exercise requirement information of a target network system; generating, based on the exercise requirement information and using a generative model, a set of attack and defense test cases for conducting attack and defense exercises against the target network system; conducting attack and defense exercises on the target network system using an exercise model and the set of attack and defense test cases; obtaining attack and defense exercise effectiveness indicators of the target network system; and generating optimization guidance information based on the attack and defense exercise effectiveness indicators, so as to optimize the set of attack and defense test cases generated by the generative model.

2. The method according to claim 1, characterized in that the step of conducting attack and defense exercises on the target network system using the exercise model and the set of attack and defense test cases includes: generating and executing, by the exercise model, attack behaviors and defense strategies against the target network system based on the set of attack and defense test cases and the system state of the target network system.

3. The method according to claim 2, characterized in that the set of attack and defense test cases includes at least one attack and defense test case, and the exercise model includes an attack model based on reinforcement learning and a defense model based on reinforcement learning; the step of generating and executing, by the exercise model, attack behaviors and defense strategies against the target network system based on the set of attack and defense test cases and the system state of the target network system includes: generating, by the attack model, an attack behavior against the target network system based on the attack and defense test cases in the set of attack and defense test cases, the system state of the target network system, and historical attack and defense records; injecting the attack behavior into the target network system to carry out a network attack on the target network system; generating, by the defense model, a defense strategy for the target network system based on the attack and defense test cases in the set of attack and defense test cases, the system state of the target network system, and historical attack and defense records; and implementing defensive measures on the target network system according to the defense strategy to counter the network attack.

4. The method according to claim 3, characterized in that the attack and defense exercise effectiveness indicators include an attack effectiveness indicator and a defense effectiveness indicator; the step of generating optimization guidance information based on the attack and defense exercise effectiveness indicators to optimize the set of attack and defense test cases generated by the generative model includes: obtaining an attack reward of the attack model from the attack effectiveness indicator through an attack reward function of the attack model; obtaining a defense reward of the defense model from the defense effectiveness indicator through a defense reward function of the defense model, wherein the optimization guidance information includes the attack reward and the defense reward; iteratively optimizing the attack behavior output by the attack model based on the attack reward; iteratively optimizing the defense strategy output by the defense model based on the defense reward; and iteratively generating an optimized set of attack and defense test cases based on the iteratively optimized attack behaviors and defense strategies.

5. The method according to claim 1, characterized in that the step of generating optimization guidance information based on the attack and defense exercise effectiveness indicators to optimize the set of attack and defense test cases generated by the generative model includes: determining optimization guidance information for the generative model based on the attack and defense exercise effectiveness indicators; and adjusting the generative model based on the optimization guidance information to generate an optimized set of attack and defense test cases.

6. The method according to claim 5, characterized in that the step of determining optimization guidance information for the generative model based on the attack and defense exercise effectiveness indicators includes: obtaining a test case quality indicator through statistical analysis of the attack and defense exercise effectiveness indicators; and in response to the test case quality indicator meeting an optimization trigger condition, generating optimization guidance information corresponding to the test case quality indicator, wherein the optimization guidance information includes test case generation constraints for the generative model.

7. The method according to claim 5, characterized in that the step of generating optimization guidance information corresponding to the test case quality indicator in response to the test case quality indicator meeting the optimization trigger condition includes: if the test case quality indicator of any type of attack and defense test case meets an attack optimization trigger condition, determining the attack type of the attack and defense test cases of that type as a target attack mode; and obtaining test case generation constraints corresponding to the target attack mode, wherein the optimization guidance information includes the test case generation constraints corresponding to the target attack mode.

8. The method according to claim 5, characterized in that the step of generating optimization guidance information corresponding to the test case quality indicator in response to the test case quality indicator meeting the optimization trigger condition includes: if the test case quality indicator of any type of attack and defense test case meets an effectiveness optimization trigger condition, identifying the attack and defense test cases of that type as inefficient attack and defense test cases; determining the root cause of the inefficiency by analyzing the execution logs of the inefficient attack and defense test cases; and obtaining, based on the root cause of the inefficiency, test case generation constraints corresponding to the inefficient attack and defense test cases, wherein the optimization guidance information includes the test case generation constraints corresponding to the inefficient attack and defense test cases.

9. The method according to claim 5, characterized in that the step of generating optimization guidance information corresponding to the test case quality indicator in response to the test case quality indicator meeting the optimization trigger condition includes: if the test case quality indicator of any type of attack and defense test case meets a defense optimization trigger condition, identifying the attack type of the attack and defense test cases of that type as a target defense blind spot; and obtaining test case generation constraints corresponding to the target defense blind spot, wherein the optimization guidance information includes the test case generation constraints corresponding to the target defense blind spot.

10. An electronic device, characterized by comprising: a memory and a processor that are interconnected, wherein the memory stores computer instructions, and the processor performs the method for generating attack and defense test cases according to any one of claims 1 to 9 by executing the computer instructions.
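
Illustrative note (not part of the claims): the closed loop recited in claims 1 and 5 to 9 can be sketched roughly as follows. This is a minimal sketch under stated assumptions, not the applicant's implementation. The generative model and the exercise model are replaced by trivial stand-ins, and every name (`AttackDefenseCase`, `ExerciseResult`, `generate_cases`, `optimization_guidance`, and so on), the two rate-based quality indicators, and the threshold values are hypothetical choices made only for illustration; the claims do not specify them.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class AttackDefenseCase:
    attack_type: str  # hypothetical field: category of the test case


@dataclass(frozen=True)
class ExerciseResult:
    attack_type: str
    attack_succeeded: bool  # assumed attack effectiveness signal
    detected: bool          # assumed defense effectiveness signal


def generate_cases(requirements, constraints):
    """Stand-in for the generative model: one case per required attack
    type, plus one extra case per generation constraint."""
    cases = [AttackDefenseCase(t) for t in requirements["attack_types"]]
    cases += [AttackDefenseCase(c["attack_type"]) for c in constraints]
    return cases


def run_exercise(cases, outcome_fn):
    """Stand-in for the exercise model: outcome_fn decides each result."""
    return [outcome_fn(case) for case in cases]


def quality_indicators(results):
    """Statistical analysis: per-type attack success and detection rates."""
    grouped = defaultdict(list)
    for r in results:
        grouped[r.attack_type].append(r)
    return {
        t: {
            "success_rate": sum(r.attack_succeeded for r in rs) / len(rs),
            "detection_rate": sum(r.detected for r in rs) / len(rs),
        }
        for t, rs in grouped.items()
    }


def optimization_guidance(indicators, success_floor=0.2, detection_floor=0.5):
    """Turn quality indicators into generation constraints.
    Both thresholds are assumed trigger conditions."""
    constraints = []
    for attack_type, q in indicators.items():
        if q["detection_rate"] < detection_floor:
            # Defense rarely notices this type: treat as a blind spot.
            constraints.append(
                {"attack_type": attack_type, "reason": "defense_blind_spot"})
        elif q["success_rate"] < success_floor:
            # Attacks of this type rarely succeed: treat as inefficient.
            constraints.append(
                {"attack_type": attack_type, "reason": "inefficient_case"})
    return constraints


def closed_loop(requirements, outcome_fn, rounds=2):
    """Generate cases, run the exercise, derive guidance, and repeat."""
    constraints = []
    cases = []
    for _ in range(rounds):
        cases = generate_cases(requirements, constraints)
        results = run_exercise(cases, outcome_fn)
        constraints = optimization_guidance(quality_indicators(results))
    return cases, constraints
```

In this sketch the guidance from one round is fed back as constraints on the next round of generation, which mirrors the feedback from effectiveness indicators to the generative model. A real system would replace `outcome_fn` with measurements taken from the target network system and would derive the constraints from its own trigger conditions and log analysis.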