A hypergraph-based attack detection and tracing method and system

By replacing the origin graph with a hypergraph structure and combining kernel log compression with the ATT&CK attack and defense matrix, the problems of massive data volume and time consumption in APT network attack detection are solved, achieving efficient attack detection and tracing.

CN116846594B · Active · Publication Date: 2026-06-23 · ZHEJIANG UNIV OF TECH

Patent Information

Authority / Receiving Office
CN · China
Patent Type
Patents(China)
Current Assignee / Owner
ZHEJIANG UNIV OF TECH
Filing Date
2023-05-26
Publication Date
2026-06-23


Abstract

The application discloses a hypergraph-based attack detection and tracing method and system. The method comprises the following steps: collecting kernel logs and compressing them to obtain a first origin graph; performing path matching according to the ATT&CK attack and defense matrix to match hyperedges in the first origin graph, and constructing a hypergraph from the hyperedges; matching each hyperedge in the hypergraph against attack behaviors derived from expert experience, marking the successfully matched hyperedges as malicious behavior, and performing a tracing operation based on those hyperedges. The application transforms the origin graph using the hypergraph structure and matches attack behaviors through hyperedges, which improves detection efficiency and makes it possible to trace the initial attack entry node.

Description

Technical Field

[0001] This invention belongs to the field of network security technology and specifically relates to a method and system for constructing a behavioral hypergraph from Windows kernel logs and performing network attack detection and tracing.

Background Technology

[0002] APT (Advanced Persistent Threat) attacks are planned and persistent cyberattacks targeting governments, core infrastructure, and critical industries. Compared to traditional cyberattacks, APT attacks are characterized by high stealth, long incubation periods, and diverse attack methods, which makes traditional detection methods based on network traffic ineffective. Comprehensive monitoring of system behavior to detect the actual destructive behavior of an APT attack is therefore one of the effective means of countering its stealth and diversity. However, existing analysis methods rely on the Windows ETW (Event Tracing for Windows) tool to collect kernel logs and construct origin graphs for analysis and detection. Because kernel logs fully reflect upper-layer user behavior, the amount of log data generated is enormous, which makes constructing the origin graph difficult and time-consuming.

[0003] Hypergraphs are a natural extension of graphs, allowing an edge to connect any number of vertices. This enables the representation of higher-order relationships involving multiple entities and provides a more accurate framework for handling diverse relationships. Compared to general graph structures, hypergraphs offer advantages in clustering processes, are more flexible in handling multimodal and heterogeneous data, and facilitate the fusion and expansion of multimodal data.

[0004] Hypergraph structures express a more general type of relationship than traditional pairwise graphs. Over the past few decades, hypergraph theory has proven effective in solving numerous real-world problems. The expressive power of hypergraphs enables efficient modeling of various networks, data structures, process scheduling, and systems involving object relationships. Theoretically, hypergraphs can generalize certain theorems from ordinary graphs, and a single hypergraph theorem can sometimes replace several theorems on traditional graphs. From a practical perspective, hypergraph structures are becoming increasingly more popular than traditional graph structures. Using a hypergraph structure in place of the origin graph can therefore effectively address the problem of massive data volumes, achieving an indirect compression effect. This invention proposes a method and system for attack detection and tracing based on hypergraphs.

Summary of the Invention

[0005] One of the objectives of this invention is to provide an attack detection and tracing method based on a hypergraph. This method utilizes the structure of a hypergraph to transform the origin graph, matches attack behaviors through hyperedges, improves detection efficiency, and can trace the initial entry node of the attack.

[0006] To achieve the above objectives, the technical solution adopted by the present invention is as follows:

[0007] A hypergraph-based attack detection and tracing method, comprising:

[0008] Step 1: Collect kernel logs and compress them to obtain the first origin graph;

[0009] Step 2: Perform path matching based on the ATT&CK attack and defense matrix, match the hyperedges in the first origin graph, and construct the hypergraph based on the hyperedges;

[0010] Step 2-1: Extract entities and behaviors from each TTP in the ATT&CK attack and defense matrix to form a second origin graph, and traverse the second origin graph to extract behavior paths;

[0011] Step 2-2: Traverse the first origin graph to extract the first path, match each first path with each behavior path, take the successfully matched first path as a hyperedge, and construct a hypergraph based on the hyperedge;

[0012] Step 3: Using attack behaviors based on expert experience, match each hyperedge in the hypergraph, mark the successfully matched hyperedges as malicious behaviors, and perform source tracing operations based on the hyperedges.

[0013] Several alternative methods are provided below, but they are not intended as additional limitations on the overall solution above. They are merely further additions or optimizations. Provided there are no technical or logical contradictions, each alternative method can be combined individually with respect to the overall solution above, or multiple alternative methods can be combined with each other.

[0014] As a preferred method, the kernel log collection is performed using the Windows-based kernel log tracing framework ETW.

[0015] Preferably, the compression process to obtain the first origin graph includes:

[0016] Each kernel log entry is transformed into a triple <subject, operation, object>, where the subject and object are the entities in the kernel log, and the operation is the behavior in the kernel log;

[0017] The collected kernel logs are transformed into a first origin graph based on the triples, where the vertices in the first origin graph are entities and the edges in the first origin graph are operations;

[0018] Edges with the same operation between two vertices are merged using the Causality Preserved Reduction method;

[0019] In the first origin graph obtained after merging with the Causality Preserved Reduction method, multiple edges between two vertices that differ only in timestamp are merged, only the edge with the earliest timestamp is retained, and a frequency attribute is added to the retained edge. This frequency attribute represents the number of edges between the two vertices that differed only in timestamp before merging. The result is the final first origin graph.

[0020] Preferably, matching each first path with each behavior path includes:

[0021] Calculate the similarity between the current first path and the i-th behavior path:

[0022] $N_i = \frac{l_N}{\mathrm{total}_N}$

[0023] $P_i = \frac{l_P}{\mathrm{total}_P}$

[0024] $S_i = \alpha N_i + \beta P_i$

[0025] In the formulas, $N_i$ is the node similarity between the current first path and the i-th behavior path; $l_N$ is the number of nodes that are the same in the current first path and the i-th behavior path; $\mathrm{total}_N$ is the total number of nodes on the current first path; $P_i$ is the operation similarity between the current first path and the i-th behavior path; $l_P$ is the number of operations that are the same in the current first path and the i-th behavior path; $\mathrm{total}_P$ is the total number of operations on the current first path; $S_i$ is the similarity between the current first path and the i-th behavior path; $\alpha$ is the node weight; $\beta$ is the operation weight; and $\alpha + \beta = 1$.

[0026] Take the maximum similarity between the current first path and all behavioral paths. If the maximum similarity is greater than the matching threshold, the current first path is marked as a successful match; otherwise, the current first path is marked as a failed match.
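As a worked illustration of the similarity calculation and threshold test, the following uses purely hypothetical values that are not taken from the patent: a first path with 4 nodes and 3 operations, of which 3 nodes and 3 operations also occur in the i-th behavior path, weights $\alpha = 0.6$ and $\beta = 0.4$, and an assumed matching threshold of 0.8.

```latex
N_i = \frac{l_N}{\mathrm{total}_N} = \frac{3}{4} = 0.75, \qquad
P_i = \frac{l_P}{\mathrm{total}_P} = \frac{3}{3} = 1

S_i = \alpha N_i + \beta P_i = 0.6 \times 0.75 + 0.4 \times 1 = 0.85 > 0.8
```

Under these assumed values, and provided no other behavior path scores higher, the first path would be marked as a successful match to the i-th behavior path.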

[0027] Preferably, a behavior summary is assigned to each successfully matched first path. The behavior summary includes the attributes Node, TID, TimeStamp, and Path. Node represents the entry node, which is the process name in the kernel log corresponding to the first path. TID represents the technique ID of the TTP that successfully matches the first path. TimeStamp represents the timestamp, which is the latest timestamp among all vertices in the first path. Path represents the behavior path of the TTP that successfully matches the first path.

[0028] Preferably, an attack behavior based on expert experience is used to match each hyperedge in the hypergraph, including:

[0029] Construct a hyperedge time sequence table based on the TID and TimeStamp attributes of all hyperedges in the hypergraph;

[0030] Obtain the attack behaviors derived from expert experience and the order of techniques within each attack behavior;

[0031] If the order of techniques in the attack behavior is the same as the order of the corresponding techniques in the hyperedge time sequence table, the attack behavior is successfully matched. The techniques in the hyperedge time sequence table that are the same as those in the attack behavior are then taken, and the hyperedges corresponding to those techniques are marked as malicious behavior.

[0032] Preferably, the source tracing operation based on the hyperedge includes:

[0033] Take the multiple hyperedges that successfully match a given attack behavior;

[0034] Obtain the entry node of each hyperedge from the Node attribute of the hyperedge;

[0035] Determine the data transmission direction between the multiple entry nodes based on the Path attribute of the hyperedges, and take the entry node located at the source of the data transmission direction as the initial entry node to complete the source tracing operation.

[0036] The second objective of this invention is to provide a hypergraph-based attack detection and tracing system, including a processor and a memory storing a number of computer instructions, wherein the computer instructions, when executed by the processor, implement the steps of the hypergraph-based attack detection and tracing method.

[0037] The present invention provides a method and system for attack detection and tracing based on hypergraphs, which has the following advantages compared with the prior art:

[0038] 1) Log merging and compression can significantly reduce data volume and improve storage efficiency. 2) Constructing a hypergraph using the attribute structure can effectively compress log data. 3) Using the hypergraph structure for hyperedge-based attack detection can greatly improve detection efficiency and also trace the initial entry point of the attack.

Attached Figure Description

[0039] Figure 1 is a flowchart of the hypergraph-based attack detection and tracing method of the present invention;

[0040] Figure 2 is a schematic diagram of an embodiment of the kernel log data merging process of the present invention;

[0041] Figure 3 is a flowchart of the hypergraph construction process of the present invention;

[0042] Figure 4 is a schematic diagram of an embodiment in which a phishing attack forms a behavior path according to the present invention;

[0043] Figure 5 is a schematic diagram of the path matching of an attack behavior according to the present invention;

[0044] Figure 6 is a schematic diagram of an embodiment of source tracing through behavior summaries according to the present invention.

Detailed Implementation

[0045] The technical solutions of the embodiments of the present invention will be clearly and completely described below with reference to the accompanying drawings. Obviously, the described embodiments are only some embodiments of the present invention, and not all embodiments. Based on the embodiments of the present invention, all other embodiments obtained by those skilled in the art without creative effort are within the scope of protection of the present invention.

[0046] Unless otherwise defined, all technical and scientific terms used herein have the same meaning as commonly understood by one of ordinary skill in the art to which this invention pertains. The terminology used herein in the description of the invention is for the purpose of describing particular embodiments only and is not intended to limit the invention.

[0047] To overcome the problem that traditional network traffic-based detection methods are unable to effectively detect APT network attacks, this embodiment proposes a hypergraph-based attack detection and tracing method.

[0048] As shown in Figure 1, the hypergraph-based attack detection and tracing method of this embodiment includes the following steps:

[0049] (1) Based on the Windows system kernel log tracing framework ETW, kernel logs are collected and compressed to filter out kernel logs with extremely high similarity or no key information, forming the first origin graph.

[0050] (1-1) Use the Windows-based kernel log tracing framework ETW to collect kernel logs and store them in local files.

[0051] (1-2) Since the kernel log is a complete reflection of upper-layer user behavior, the collected data includes information such as event type, process ID, process name, parent process ID, parent process name, thread ID, timestamp, path, host identifier, private attribute pairs, etc., and there is a large amount of data with high similarity (i.e., an operation will generate multiple logs in the kernel log that are the same but have different timestamps). Therefore, the collected kernel log needs to be further compressed.

[0052] First, each kernel log entry is converted into a triple of the form <subject, operation, object>. The subject and object are entities such as processes, files, and network sockets, while the operation is a behavior such as reading, writing, creating, or deleting. The kernel logs are then transformed into an origin graph using these triples, where vertices represent entities and edges represent operations. Constructing the origin graph in this way mitigates the problem of massive kernel log data and reduces the difficulty and time required for construction.
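The conversion from log entries to triples and then to an origin graph can be sketched as follows. This is a minimal illustration rather than the patented implementation: the record field names (`process_name`, `event_type`, `path`, `timestamp`) and the adjacency-map representation are assumptions made for the example.

```python
from collections import defaultdict
from typing import NamedTuple


class Triple(NamedTuple):
    subject: str    # entity performing the operation, e.g. a process
    operation: str  # behavior, e.g. read, write, create, delete
    obj: str        # entity acted upon, e.g. a file or socket
    timestamp: int


def to_triple(entry: dict) -> Triple:
    """Map one log record to <subject, operation, object>.

    The dictionary keys are hypothetical field names chosen for this sketch.
    """
    return Triple(entry["process_name"], entry["event_type"],
                  entry["path"], entry["timestamp"])


def build_origin_graph(entries):
    """Return an adjacency map: subject -> list of (operation, object, timestamp).

    Vertices are entities and each list item is one operation edge.
    """
    graph = defaultdict(list)
    for entry in entries:
        t = to_triple(entry)
        graph[t.subject].append((t.operation, t.obj, t.timestamp))
    return dict(graph)


logs = [
    {"process_name": "explorer.exe", "event_type": "read",
     "path": "a.txt", "timestamp": 10},
    {"process_name": "explorer.exe", "event_type": "write",
     "path": "b.txt", "timestamp": 12},
]
graph = build_origin_graph(logs)
```

An adjacency map keeps the sketch short; a graph library could be substituted without changing the idea.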

[0053] To further simplify the origin graph, this embodiment utilizes the Causality Preserved Reduction method to merge edges between two vertices that have the same operation. This merging operation aggregates events with the same degree of impact; that is, when multiple different events affect the same entity and the result of this impact is the same, only the event with the earliest timestamp (i.e., one edge) is retained. Compression using this method does not lose the causal semantics of the kernel log.

[0054] (1-3) Finally, to ensure the integrity of frequency semantics while compressing, when multiple data entries have the same event type, process ID, process name, and path but different timestamps, from the perspective of causal semantic analysis, these data entries represent the same behavior and can be merged. That is, this embodiment merges multiple edges between two vertices in the first origin graph that differ only in timestamps. After merging, only the edge with the earliest timestamp is retained, and a frequency attribute is added to the retained edges. This frequency attribute represents the number of edges between two vertices that differ only in timestamps before merging.

[0055] As shown in Figure 2, when three data entries have the same event type, process ID, process name, and path and differ only in timestamp, a `times` attribute is appended to the data entry to represent the frequency of the operation in the kernel; it records how many times the same action occurred and prevents semantic loss. The merged entry keeps the smallest of the three timestamps, and its `times` attribute is set to 3. This preserves relative semantic integrity while compressing the data.
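The timestamp-only merge described in step (1-3) can be sketched as follows. The dictionary keys are hypothetical names for the attributes listed in the text, and the small integer timestamps are made up for illustration.

```python
def merge_duplicate_edges(entries):
    """Merge entries that differ only in timestamp.

    Entries are grouped by (event type, process ID, process name, path).
    Each group keeps one entry with the earliest timestamp and a `times`
    attribute counting how many entries were merged into it.
    """
    merged = {}
    for e in entries:
        key = (e["event_type"], e["process_id"], e["process_name"], e["path"])
        if key not in merged:
            merged[key] = {**e, "times": 1}
        else:
            kept = merged[key]
            kept["times"] += 1
            kept["timestamp"] = min(kept["timestamp"], e["timestamp"])
    return list(merged.values())


records = [
    {"event_type": "read", "process_id": 8188, "process_name": "explorer.exe",
     "path": "a.txt", "timestamp": 300},
    {"event_type": "read", "process_id": 8188, "process_name": "explorer.exe",
     "path": "a.txt", "timestamp": 100},
    {"event_type": "read", "process_id": 8188, "process_name": "explorer.exe",
     "path": "a.txt", "timestamp": 200},
]
compressed = merge_duplicate_edges(records)
```

Here the three records collapse into one entry with the earliest timestamp and `times` equal to 3, mirroring the Figure 2 description.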

[0056] (2) Path matching is performed based on the ATT&CK attack and defense matrix to match hyperedges in the first origin graph, and a hypergraph is constructed from these hyperedges. As shown in Figure 3, this includes the following steps.

[0057] (2-1) Because the types of user operations are different, the attributes of each kernel log collected are different. For example, operations on files include file path, file handle, etc.; operations on processes include memory address, parent process information, etc. Therefore, it is necessary to reduce the attributes and retain only some attributes. The attributes retained in this embodiment are as follows: event type, process ID, process name, timestamp, path, and frequency.

[0058] For example:

[0059] <<eventname:fileioread>, <processid:8188>, <ProcessName:explorer.exe>, <timestamp:133162612228607427>, <times:3>>
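The attribute reduction in step (2-1) amounts to projecting each log entry onto the retained attributes. A minimal sketch, assuming entries are dictionaries and using hypothetical key names for the six retained attributes:

```python
# Hypothetical key names for the retained attributes:
# event type, process ID, process name, timestamp, path, and frequency.
RETAINED = ("event_type", "process_id", "process_name",
            "timestamp", "path", "times")


def reduce_attributes(entry: dict) -> dict:
    """Keep only the retained attributes; drop everything else."""
    return {key: entry[key] for key in RETAINED if key in entry}


raw = {
    "event_type": "read",
    "process_id": 8188,
    "process_name": "explorer.exe",
    "timestamp": 133162612228607427,
    "path": "a.txt",
    "times": 3,
    "file_handle": "0x1f4",       # operation-specific attribute, dropped
    "parent_process_id": 4,       # operation-specific attribute, dropped
}
reduced = reduce_attributes(raw)
```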

[0060] (2-2) Generate behavior paths based on the TTPs (Tactics, Techniques, and Procedures) in ATT&CK, and use them as the criteria for hyperedge construction.

[0061] Based on the text descriptions of each TTP in the ATT&CK attack and defense matrix, the entities and behaviors are manually extracted from these descriptions to form a second origin graph. Alternatively, the entities and behaviors in the text descriptions of each TTP can be extracted using NLP or other methods to form a second origin graph. The paths formed by traversing the second origin graph are then extracted, which are the behavior paths.

[0062] Figure 4 illustrates the process of forming a behavior path for technique T1566.001. Process P1 creates a child process P2, represented by "xxx_P1->xxx_P2". Process P2 performs send/receive operations on a socket S and write operations on a file F, represented by "xxx_P2<->xxx_S,->xxx_F". The "->" and "<-" in the path indicate the direction of information flow, and the comma (,) connects multiple branches of an entity. For example, "P1->F1,->F2" indicates that P1 has information flow to both F1 and F2.
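The path notation above can be read mechanically. The sketch below parses one path string into directed information flows; it assumes that entity names contain neither commas nor arrow sequences, and the function names are hypothetical.

```python
import re

# "<->" is listed first so it is preferred over "->" and "<-".
_FIRST = re.compile(r"^(.*?)(<->|->|<-)(.*)$")
_BRANCH = re.compile(r"^(<->|->|<-)(.*)$")


def _expand(source, arrow, target):
    """Turn one arrow into a list of (from, to) information flows."""
    if arrow == "->":
        return [(source, target)]
    if arrow == "<-":
        return [(target, source)]
    return [(source, target), (target, source)]  # "<->" flows both ways


def parse_behavior_path(path: str):
    """Parse e.g. "P1->F1,->F2" into [("P1", "F1"), ("P1", "F2")]."""
    head, *branches = path.split(",")
    match = _FIRST.match(head)
    if match is None:
        raise ValueError(f"no arrow found in {head!r}")
    source, arrow, target = match.groups()
    flows = _expand(source, arrow, target)
    for branch in branches:
        branch_match = _BRANCH.match(branch)
        if branch_match is None:
            raise ValueError(f"branch must start with an arrow: {branch!r}")
        arrow, target = branch_match.groups()
        flows += _expand(source, arrow, target)
    return flows


flows = parse_behavior_path("xxx_P2<->xxx_S,->xxx_F")
```

Every branch after a comma shares the source entity of the first segment, which is how the text describes "P1->F1,->F2".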

[0063] (2-3) Obtain the path information of entities and edges in the first origin graph, match it with the behavioral paths of different TTPs, aggregate the hyperedges, and further construct the hypergraph.

[0064] The first paths are extracted by traversing the first origin graph. In this embodiment, a random walk is performed on the first origin graph, and the path information obtained is taken as a first path. Each first path is then matched against the behavior path of each TTP by similarity. Paths are compared in the triple form introduced in step (1-2), i.e., <subject, operation, object>, where the subject and object are nodes and the operation is an operation. The node similarity and operation similarity with each TTP behavior path are calculated using formulas (1) and (2), and the overall similarity with each TTP is then calculated using formula (3). The behavior path with the highest similarity, provided that similarity exceeds 80% (the matching threshold, which can be adjusted), is considered a successful match.

[0065] $N_i = \frac{l_N}{\mathrm{total}_N}$ (1)

[0066] $P_i = \frac{l_P}{\mathrm{total}_P}$ (2)

[0067] $S_i = \alpha N_i + \beta P_i$ (3)

[0068] In the formulas, $N_i$ is the node similarity between the current first path and the i-th behavior path; $l_N$ is the number of nodes that are the same in the current first path and the i-th behavior path; $\mathrm{total}_N$ is the total number of nodes on the current first path; $P_i$ is the operation similarity between the current first path and the i-th behavior path; $l_P$ is the number of operations that are the same in the current first path and the i-th behavior path; $\mathrm{total}_P$ is the total number of operations on the current first path; $S_i$ is the similarity between the current first path and the i-th behavior path; $\alpha$ is the node weight; $\beta$ is the operation weight; and $\alpha + \beta = 1$.

[0069] Take the maximum similarity between the current first path and all behavioral paths. If the maximum similarity is greater than the matching threshold, the current first path is marked as a successful match; otherwise, the current first path is marked as a failed match.
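The similarity calculation and threshold test can be sketched as follows. How "identical" nodes and operations are counted is not spelled out in the text, so this sketch assumes each node and operation is reduced to a type label and counts the elements of the first path whose label also occurs in the behavior path. The behavior-path contents, default weights, and function names are hypothetical.

```python
def path_similarity(first_nodes, first_ops, behavior_nodes, behavior_ops,
                    alpha=0.5, beta=0.5):
    """Compute S_i = alpha * N_i + beta * P_i for one behavior path."""
    if abs(alpha + beta - 1.0) > 1e-9:
        raise ValueError("alpha + beta must equal 1")
    if not first_nodes or not first_ops:
        return 0.0  # avoid division by zero on an empty path
    node_set, op_set = set(behavior_nodes), set(behavior_ops)
    l_n = sum(1 for n in first_nodes if n in node_set)  # identical nodes
    l_p = sum(1 for o in first_ops if o in op_set)      # identical operations
    n_i = l_n / len(first_nodes)
    p_i = l_p / len(first_ops)
    return alpha * n_i + beta * p_i


def best_match(first_nodes, first_ops, behavior_paths,
               threshold=0.8, alpha=0.5, beta=0.5):
    """Return (tid, similarity) for the best behavior path, or None."""
    scores = {
        tid: path_similarity(first_nodes, first_ops, nodes, ops, alpha, beta)
        for tid, (nodes, ops) in behavior_paths.items()
    }
    if not scores:
        return None
    tid = max(scores, key=scores.get)
    return (tid, scores[tid]) if scores[tid] > threshold else None


# Hypothetical behavior paths: technique ID -> (node labels, operation labels).
behavior_paths = {
    "T1566.001": (["process", "socket", "file"],
                  ["create", "send", "receive", "write"]),
    "T1547.001": (["process", "registry"], ["write"]),
}
result = best_match(["process", "process", "socket", "file"],
                    ["create", "send", "write"], behavior_paths)
```

Dividing by the totals of the first path, not the behavior path, follows the definitions of total_N and total_P in the text.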

[0070] If a match is successful, a behavior summary is assigned to the first path. This behavior summary contains the attributes Node, TID, TimeStamp, and Path. Node represents the entry node of the first path, taken from the process name in the kernel log corresponding to the first path. TID represents the technique ID of the TTP that successfully matches the first path. TimeStamp represents the timestamp, which is the latest timestamp among all vertices in the path. Path represents the behavior path of the TTP that successfully matches the first path and contains the entire course of this behavior.

[0071] The successfully matched behavior summaries are aggregated according to time, and nodes with the same behavior summary form a hyperedge. Each successfully matched first path is used as a hyperedge, and the hyperedges together form a hypergraph (or attribute hypergraph).
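One way to picture a hyperedge and its behavior summary is as a small record attached to a set of vertices. The sketch below is an assumption-laden illustration: it stores timestamps on edges instead of vertices, takes the subject of the first edge as the entry node, and uses hypothetical class and function names.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BehaviorSummary:
    node: str       # entry node (process name)
    tid: str        # technique ID of the matched TTP
    timestamp: int  # latest timestamp on the first path
    path: str       # behavior path of the matched TTP


@dataclass(frozen=True)
class Hyperedge:
    summary: BehaviorSummary
    vertices: frozenset  # all entities connected by this hyperedge


def make_hyperedge(first_path, tid, behavior_path):
    """Build a hyperedge from a matched first path.

    first_path is a list of (subject, operation, object, timestamp) edges.
    """
    vertices = frozenset(v for s, _, o, _ in first_path for v in (s, o))
    latest = max(t for *_, t in first_path)
    entry = first_path[0][0]  # assumption: subject of the first edge
    return Hyperedge(BehaviorSummary(entry, tid, latest, behavior_path),
                     vertices)


edge = make_hyperedge(
    [("P1", "create", "P2", 5), ("P2", "write", "F", 9)],
    tid="T1566.001",
    behavior_path="xxx_P1->xxx_P2",
)
hypergraph = sorted([edge], key=lambda h: h.summary.timestamp)
```

A single hyperedge here connects three vertices, which is what lets the hypergraph stand in for several pairwise edges of the origin graph.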

[0072] (3) Using attack behaviors based on expert experience, match each hyperedge in the hypergraph constructed in (2), and mark the attack chains that match the ATT&CK attack order as malicious behaviors, and perform source tracing operations.

[0073] (3-1) The ATT&CK framework contains 14 tactics, each of which consists of many techniques. Furthermore, executing a complete attack may not necessarily utilize all tactics, so attacks cannot be detected in tactical order. This embodiment utilizes the TID and TimeStamp attributes of the behavior summary in step (2) to construct a hyperedge-based time sequence table. For example, Table 1 shows a hyperedge-based time sequence table constructed from a hypergraph.

[0074] Table 1. Time Sequence Table of Hyperedges

[0075]
TimeStamp      12      23      31      55      …   873     1223    …
Technique ID   T1566   T1059   T1547   T1195   …   T1555   T1559   …
Tactic         TA0001  TA0002  TA0003  TA0001  …   TA0006  TA0002  …

[0076] (3-2) The time sequence table obtained in (3-1) is used to perform rule matching on the hyperedges against the attack behaviors derived from expert experience. If the TTP order in the time sequence table matches the order in the attack behavior, the match is successful.

[0077] In this embodiment, the order of techniques in the attack behavior is taken from expert experience. If the order of techniques in the attack behavior is the same as the order of the corresponding techniques in the hyperedge time sequence table, the attack behavior is successfully matched. The techniques in the hyperedge time sequence table that are the same as those in the attack behavior are then taken, and the hyperedges corresponding to those techniques are marked as malicious behavior.

[0078] Note that "the same order" here does not require the techniques to be adjacent, only that they appear in the same relative order. For example, according to expert experience, an attack that uses a phishing email to take control of a computer conforms to the ATT&CK model and consists of four techniques: T1566, T1059, T1547, and T1529. Figure 5 shows this attack path derived from expert experience (each column in the figure represents the techniques under one tactic). The attacker uses technique T1566 to send phishing messages that lure users into clicking, then uses technique T1059 to execute a script via the command line, followed by technique T1547 to add the program to the startup folder or reference it through a registry key for persistence, and finally technique T1529 to shut down or restart the system, completing a full attack. If the techniques T1566, T1059, T1547, and T1529 also appear in this order in the hyperedge time sequence table, the attack behavior is successfully matched.
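The in-order, not necessarily adjacent, matching described above is an ordered-subsequence test over the time sequence table. A minimal sketch, assuming the table is a list of (timestamp, technique ID) pairs; the final entry (1500, "T1529") is a hypothetical addition, since Table 1 elides its later columns.

```python
def build_time_sequence(pairs):
    """Sort (timestamp, technique ID) pairs by timestamp."""
    return sorted(pairs)


def match_attack(sequence, attack_order):
    """Return indices of the hyperedges matching attack_order, or None.

    The techniques must occur in the same relative order as in
    attack_order, but other techniques may appear between them.
    """
    matched = []
    pos = 0
    for tid in attack_order:
        while pos < len(sequence) and sequence[pos][1] != tid:
            pos += 1
        if pos == len(sequence):
            return None  # technique missing or out of order
        matched.append(pos)
        pos += 1
    return matched


sequence = build_time_sequence([
    (55, "T1195"), (12, "T1566"), (23, "T1059"), (31, "T1547"),
    (873, "T1555"), (1223, "T1559"),
    (1500, "T1529"),  # hypothetical entry, not in Table 1
])
malicious = match_attack(sequence, ["T1566", "T1059", "T1547", "T1529"])
```

The returned indices identify which rows of the table, and therefore which hyperedges, would be marked as malicious behavior.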

[0079] (3-3) When a complete attack chain is matched, the source can be traced using the behavior summary in the hyperedge label. The behavior summary can reconstruct the approximate behavior of each vertex entity within each hyperedge. In this embodiment, when performing the source tracing operation, multiple hyperedges that successfully match an attack behavior are selected; the entry node of each hyperedge is obtained according to the Node attribute in the hyperedge; the data transmission direction between multiple entry nodes is determined according to the Path attribute in the hyperedge, and the entry node located at the source of the data transmission direction is taken as the initial entry node to complete the source tracing operation.

[0080] Taking the example Path given in step (2-3) for illustration, the origin graph is reconstructed and traced back to its source, as shown in Figure 6. In the Path, "malware_P->Invoke-WebRequest_P" indicates that the malicious process creates a new Invoke-WebRequest process, and "Invoke-WebRequest_P<->https://github.com/redcanaryco/atomic-redteam/raw/master/atomics/T1566.001/bin/PhishingAttachment.xlsm_S,->$env:TEMP\PhishingAttachment.xlsm_F" indicates that the Invoke-WebRequest process performs send/receive operations on a socket and write operations on a file.

[0081] In Figure 6, the entry point of the operation is malware_P. By restoring multiple behavior summaries, the initial entry point of the malicious behavior can be inferred, thereby achieving the purpose of source tracing.
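The tracing step can be sketched as finding the entry node that receives no information flow from any other entry node. This is one possible reading of "the entry node located at the source of the data transmission direction", not the patent's stated algorithm. The sketch assumes each hyperedge's Path has already been parsed into (from, to) flows, and it breaks ties by earliest timestamp, which is an added assumption; the shortened entity names are illustrative.

```python
def initial_entry_node(hyperedges):
    """Pick the entry node at the source of the data flow.

    Each hyperedge is a dict with "node" (entry node), "timestamp",
    and "flows" (list of (from, to) pairs parsed from its Path).
    """
    entry_nodes = {h["node"] for h in hyperedges}
    has_incoming = set()
    for h in hyperedges:
        for src, dst in h["flows"]:
            # Only flows between two distinct entry nodes decide the order.
            if src in entry_nodes and dst in entry_nodes and src != dst:
                has_incoming.add(dst)
    candidates = [h for h in hyperedges if h["node"] not in has_incoming]
    if not candidates:
        return None  # cyclic flows: no unique source
    # Assumption: break ties by the earliest timestamp.
    return min(candidates, key=lambda h: h["timestamp"])["node"]


matched = [
    {"node": "Invoke-WebRequest_P", "timestamp": 23,
     "flows": [("Invoke-WebRequest_P", "PhishingAttachment.xlsm_F")]},
    {"node": "malware_P", "timestamp": 12,
     "flows": [("malware_P", "Invoke-WebRequest_P")]},
]
origin = initial_entry_node(matched)
```

With these inputs the flow from malware_P into Invoke-WebRequest_P rules the latter out, leaving malware_P as the initial entry node, which agrees with the Figure 6 description.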

[0082] In another embodiment, this application also provides a hypergraph-based attack detection and tracing system, including a processor and a memory storing a plurality of computer instructions, wherein the computer instructions, when executed by the processor, implement the steps of the hypergraph-based attack detection and tracing method.

[0083] For specific limitations on the hypergraph-based attack detection and tracing system, please refer to the limitations on the hypergraph-based attack detection and tracing method described above, which will not be repeated here.

[0084] The memory and processor are electrically connected directly or indirectly to enable data transmission or interaction. For example, these components can be electrically connected to each other via one or more communication buses or signal lines. The memory stores a computer program that can run on the processor, which implements the method in the embodiments of the present invention by running the computer program stored in the memory.

[0085] The memory may be, but is not limited to, Random Access Memory (RAM), Read Only Memory (ROM), Programmable Read-Only Memory (PROM), Erasable Programmable Read-Only Memory (EPROM), Electrically Erasable Programmable Read-Only Memory (EEPROM), etc. The memory stores the program, and the processor executes the program upon receiving an execution instruction.

[0086] The processor may be an integrated circuit chip with data processing capabilities. The aforementioned processor can be a general-purpose processor, including a Central Processing Unit (CPU), a Network Processor (NP), etc. It can implement or execute the methods, steps, and logic block diagrams disclosed in the embodiments of this invention. The general-purpose processor can be a microprocessor or any conventional processor.

[0087] The technical features of the above embodiments can be combined in any way. For the sake of brevity, not all possible combinations of the technical features in the above embodiments are described. However, as long as there is no contradiction in the combination of these technical features, they should be considered to be within the scope of this specification.

[0088] The embodiments described above are merely illustrative of several implementations of the present invention, and while the descriptions are specific and detailed, they should not be construed as limiting the scope of the invention. It should be noted that those skilled in the art can make various modifications and improvements without departing from the concept of the present invention, and these modifications and improvements all fall within the scope of protection of the present invention. Therefore, the scope of protection of the present invention should be determined by the appended claims.

Claims

1. A hypergraph-based attack detection and tracing method, characterized in that the method includes:
Step 1: Collect kernel logs and compress them to obtain the first origin graph, including: each kernel log entry is transformed into a triple <subject, operation, object>, where the subject and object are the entities in the kernel log and the operation is the behavior in the kernel log; the collected kernel logs are transformed into a first origin graph based on the triples, where the vertices in the first origin graph are entities and the edges in the first origin graph are operations; edges with the same operation between two vertices are merged using the Causality Preserved Reduction method; in the first origin graph obtained after merging with the Causality Preserved Reduction method, multiple edges between two vertices that differ only in timestamp are merged, only the edge with the earliest timestamp is retained, and a frequency attribute is added to the retained edge, the frequency attribute representing the number of edges between the two vertices that differed only in timestamp before merging, thus obtaining the final first origin graph;
Step 2: Perform path matching based on the ATT&CK attack and defense matrix, match the hyperedges in the first origin graph, and construct the hypergraph based on the hyperedges;
Step 2-1: Extract entities and behaviors from each TTP in the ATT&CK attack and defense matrix to form a second origin graph, and traverse the second origin graph to extract behavior paths;
Step 2-2: Traverse the first origin graph to extract the first paths, match each first path with each behavior path, take each successfully matched first path as a hyperedge, and construct a hypergraph based on the hyperedges;
Step 3: Using attack behaviors derived from expert experience, match each hyperedge in the hypergraph, mark the successfully matched hyperedges as malicious behavior, and perform source tracing operations based on the hyperedges.

2. The attack detection and tracing method based on hypergraphs as described in claim 1, characterized in that, Kernel logs are collected using the Windows-based kernel log tracing framework ETW.

3. The attack detection and tracing method based on hypergraphs as described in claim 1, characterized in that the matching of each first path with each behavior path includes: calculating the similarity between the current first path and the i-th behavior path: $N_i = \frac{l_N}{\mathrm{total}_N}$, $P_i = \frac{l_P}{\mathrm{total}_P}$, $S_i = \alpha N_i + \beta P_i$. In the formulas, $N_i$ is the node similarity between the current first path and the i-th behavior path; $l_N$ is the number of identical nodes in the current first path and the i-th behavior path; $\mathrm{total}_N$ is the total number of nodes in the current first path; $P_i$ is the operation similarity between the current first path and the i-th behavior path; $l_P$ is the number of identical operations in the current first path and the i-th behavior path; $\mathrm{total}_P$ is the total number of operations on the current first path; $S_i$ is the similarity between the current first path and the i-th behavior path; $\alpha$ is the node weight, $\beta$ is the operation weight, and $\alpha + \beta = 1$; Take the maximum similarity between the current first path and all behavior paths. If the maximum similarity is greater than the matching threshold, the current first path is marked as a successful match; otherwise, the current first path is marked as a failed match.

4. The attack detection and tracing method based on hypergraphs as described in claim 1, characterized in that a behavior summary is assigned to each successfully matched first path. The behavior summary includes the attributes Node, TID, TimeStamp, and Path. Node represents the entry node, which is the process name in the kernel log corresponding to the first path. TID represents the technique ID of the TTP that successfully matches the first path. TimeStamp represents the timestamp, which is the latest timestamp among all vertices in the first path. Path represents the behavior path of the TTP that successfully matches the first path.

5. The attack detection and tracing method based on hypergraphs as described in claim 4, characterized in that using attack behaviors derived from expert experience to match each hyperedge in the hypergraph includes: constructing a hyperedge time sequence table based on the TID and TimeStamp attributes of all hyperedges in the hypergraph; obtaining the attack behaviors derived from expert experience and the order of techniques within each attack behavior; if the order of techniques in the attack behavior is the same as the order of the corresponding techniques in the hyperedge time sequence table, the attack behavior is successfully matched, the techniques in the hyperedge time sequence table that are the same as those in the attack behavior are then taken, and the hyperedges corresponding to those techniques are marked as malicious behavior.

6. The attack detection and tracing method based on hypergraphs as described in claim 4, characterized in that the source tracing operation based on the hyperedges includes: taking the multiple hyperedges that successfully match a given attack behavior; obtaining the entry node of each hyperedge from the Node attribute of the hyperedge; determining the data transmission direction between the multiple entry nodes based on the Path attribute of the hyperedges, and taking the entry node located at the source of the data transmission direction as the initial entry node to complete the source tracing operation.

7. A hypergraph-based attack detection and tracing system, comprising a processor and a memory storing a plurality of computer instructions, characterized in that, When the computer instructions are executed by the processor, they implement the steps of the attack detection and tracing method based on hypergraphs as described in any one of claims 1 to 6.