A feature rule-based malicious traffic detection method
Patent Information
- Authority / Receiving Office
- CN · China
- Patent Type
- Patents(China)
- Current Assignee / Owner
- HAOHAN DATA
- Filing Date
- 2024-12-03
- Publication Date
- 2026-06-23
AI Technical Summary
The existing feature-based malicious traffic detection engine has an overly simplistic pre-filtering mode that fails to maximize performance advantages and cannot be adjusted in real time, resulting in poor detection results.
The method constructs a rule divide-and-conquer chain, a multimodal condensed graph, and a precise detection chain. By extracting and sorting the features of rule options, it optimizes the detection process and achieves multimodal filtering and dynamic adjustment.
It improves the detection performance of the front-end malicious traffic detection engine, enabling timely detection of suspicious traffic and rapid removal of normal traffic, thereby enhancing the detection effectiveness of network security products.
Figure CN119561765B_ABST
Abstract
Description
Technical Field
[0001] This invention relates to the field of traffic detection technology, and more specifically to a method for detecting malicious traffic based on feature rules.
Background Technology
[0002] The continuous development and application of network technology has promoted social progress, but network attacks have also caused substantial economic losses. To detect attacks quickly, issue timely warnings, block the attacks, and avoid such losses, network security detection products typically use a front-end malicious traffic detection engine based on feature rules to pick out a small amount of suspicious attack traffic, and then pass that traffic to a back-end detection engine, such as a machine learning engine, to accurately detect attack behavior in the network traffic.
[0003] Existing feature-based malicious traffic detection engines typically employ the following two implementation methods:
[0004] The first method generates regular expressions or feature string rules from the fingerprint characteristics of malicious traffic, and uses Hyperscan to match every rule against the traffic payload.
[0005] The second method, used by open-source engines such as Suricata, adds a pre-filter based on a fixed port and a single sub-feature to exclude some detection rules, thereby reducing the number of rules actually evaluated and improving the performance of the detection engine.
[0006] These two methods typically have the following drawbacks:
[0007] The pre-filtering mode is too simplistic: only one option feature of a rule can be selected as the filtering condition, so other option features with high detection performance cannot take part in pre-filtering, which significantly reduces its effectiveness. The filtering mode is also fixed: once the rules have been compiled and detection has started, the rule filtering module is not adjusted in real time no matter how it performs, so the advantages of the pre-filtering module are not fully exploited.
Summary of the Invention
[0008] In view of the shortcomings of existing technologies, the purpose of this invention is to provide a malicious traffic detection method based on feature rules.
[0009] To achieve the above objectives, the present invention provides the following technical solution:
[0010] A malicious traffic detection method based on feature rules, specifically including the following steps:
[0011] The malicious traffic detection feature rules are extracted one by one, their format is validated, and relevant rule option information is extracted.
[0012] Construct a rule divide-and-conquer chain based on the relevant rule option information, and combine rules with the same divide-and-conquer chain path into rule groups;
[0013] Based on the rule groups, select some rule options of each rule to construct a multimodal condensed graph for the corresponding group;
[0014] Based on the multimodal condensed graph, a precise detection chain for each rule is constructed;
[0015] The data to be detected extracted after deep decoding of network traffic is input into the rule divide-and-conquer chain to find the corresponding multimodal condensed graph. Then, based on the detection result of the multimodal condensed graph, it is input into the corresponding precise detection chain to obtain the detection result.
[0016] Select traffic flows that match the feature rules from the detection results and input them into the post-detection engine for further identification and detection.
[0017] In this invention, preferably, the option features are fingerprint features contained in malicious traffic, and the same rule contains multiple option features.
[0018] In this invention, preferably, the rule divide-and-conquer chain is constructed in the order of destination port divide-and-conquer group, protocol divide-and-conquer group, and detection direction divide-and-conquer group.
[0019] In this invention, preferably, the destination port divide-and-conquer group expands all the values of the port into an array to establish several port arrays.
[0020] In this invention, preferably, the protocol divide-and-conquer group expands all the values of the protocol into arrays to establish several protocol arrays.
[0021] In this invention, preferably, constructing the rule groups specifically includes:
[0022] According to the requirements of each divide-and-conquer group, the three data sources of each rule (destination port, protocol, and detection direction) are extracted in sequence as divide-and-conquer information. The position of the rule in each divide-and-conquer group is then found based on the extracted information, forming the divide-and-conquer chain of the corresponding rule.
[0023] Rules along the same divide-and-conquer chain path are combined to form corresponding rule groups.
[0024] In this invention, preferably, constructing a multimodal condensed graph specifically includes:
[0025] Within the same rule group, several rule options are selected based on the confidence levels of different types of rule options, serving as the initial source of rule options for constructing the multimodal condensed graph;
[0026] Based on the rule options selected by each rule in the same rule group, the frequency of use of the rule options is statistically analyzed, and the two types of rule options with the highest frequency of use are selected as the data source for constructing the multimodal condensed graph;
[0027] Based on the two selected rule options, each rule selects its corresponding option and puts it into a multimodal condensed graph. The multimodal condensed graph extracts option information based on each rule and generates a corresponding rule bitmap set.
[0028] In this invention, preferably, the confidence level of the rule options is determined by setting different confidence levels for various rule options based on the frequency of fingerprint usage by actual malicious traffic, thus forming a rule option confidence table.
[0029] In this invention, preferably, the specific steps of constructing the precise detection chain for each rule are to sort each rule option of each rule according to its confidence level and then put them into the precise detection chain list in sequence to form a precise detection chain.
[0030] In this invention, preferably, the specific steps for obtaining the detection result include:
[0031] Perform deep decoding of network traffic to extract the data to be detected;
[0032] The data to be detected is sequentially processed through the divide-and-conquer chain, the multimodal condensed graph, and the precise detection chain to obtain the final rule-based detection result.
[0033] Compared with the prior art, the beneficial effects of the present invention are:
[0034] The method of this invention is applied in systems such as network security monitoring that require attack detection, early warning and blocking of network data. It improves the detection performance of the front-end feature rule-based malicious traffic detection engine, solves the problem of poor detection effect of network security products caused by insufficient performance of the front-end feature rule-based detection engine, and achieves the effect of timely detection of suspicious traffic and faster elimination of normal traffic.
Attached Figure Description
[0035] Figure 1 is a flowchart of the malicious traffic detection method based on feature rules described in this invention.
[0036] Figure 2 is a schematic diagram of the divide-and-conquer chain construction process described in this invention.
[0037] Figure 3 is a schematic diagram of the multimodal condensed graph construction process described in this invention.
[0038] Figure 4 is a schematic diagram of the precise detection chain construction process described in this invention.
Detailed Implementation
[0039] The technical solutions of the embodiments of the present invention will be clearly and completely described below with reference to the accompanying drawings. Obviously, the described embodiments are only some embodiments of the present invention, and not all embodiments. Based on the embodiments of the present invention, all other embodiments obtained by those skilled in the art without creative effort are within the scope of protection of the present invention.
[0040] Unless otherwise defined, all technical and scientific terms used herein have the same meaning as commonly understood by one of ordinary skill in the art to which this invention pertains. The terminology used herein in the description of the invention is for the purpose of describing particular embodiments only and is not intended to be limiting of the invention. The term "and / or" as used herein includes any and all combinations of one or more of the associated listed items.
[0041] Explanation of related terms:
[0042] Essential features: The detection rule contains seven parts of information: protocol, source IP, destination IP, source port, destination port, direction of the detected traffic, and rule ID.
[0043] Option features: Option features are the fingerprint features contained in malicious traffic. A single rule can contain multiple option features.
[0044] Rule options: Rule options are the smallest feature units of a rule, comprising the essential features plus the option features.
[0045] Option confidence: Option confidence is set from big-data statistics on how often each option feature participates in detection in real traffic. The higher the confidence, the greater the probability that the option will participate in detection.
[0046] Data processing module: In network monitoring and network security detection systems, a separate module is usually required to deeply decode the data in the traffic, extract the data to be detected, and send it to the detection engine for detection.
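The terms above can be pictured as a small data model. The following is a minimal sketch under stated assumptions, not the data structures of the invention: the class names `Rule` and `RuleOption`, the field names, the option types `content` and `uri`, and the direction strings are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass(frozen=True)
class RuleOption:
    """One option feature: a fingerprint expected in malicious traffic."""
    kind: str    # hypothetical option type, e.g. "content" or "uri"
    value: str   # the fingerprint itself


@dataclass
class Rule:
    """The seven essential features plus any number of option features."""
    rule_id: int
    protocol: int      # IP protocol number, 0..255 (6 is TCP)
    src_ip: str
    dst_ip: str
    src_port: int
    dst_port: int
    direction: str     # hypothetical encoding: "to_server" or "to_client"
    options: List[RuleOption] = field(default_factory=list)


# A single rule can carry several option features.
rule = Rule(1001, 6, "any", "any", 0, 80, "to_server",
            [RuleOption("content", "cmd.exe"), RuleOption("uri", "/admin")])
print(len(rule.options))   # 2
```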
[0047] Referring to Figures 1 to 4, this invention provides a preferred embodiment of a malicious traffic detection method based on feature rules, which improves the detection performance of the front-end feature rule-based malicious traffic detection engine. This addresses the problem of poor detection results in network security products caused by insufficient performance of that front-end engine. The method can quickly pick out small amounts of suspicious attack traffic from real-time network traffic data and hand it to back-end detection engines, such as machine learning detection, so that malicious attack traffic can be accurately identified, alerted on, and blocked. The specific steps include:
[0048] S1. Extract the malicious traffic detection feature rules one by one, perform format verification, and extract relevant rule option information;
[0049] S2. Construct a rule divide-and-conquer chain based on the relevant rule option information, and combine rules with the same divide-and-conquer chain path into rule groups;
[0050] S3. Based on the rule groups, select some rule options of each rule and construct a multimodal condensed graph for the corresponding group;
[0051] S4. Based on the multi-modal condensed graph, construct a precise detection chain for each rule;
[0052] S5. Input the data to be detected extracted after deep decoding of network traffic into the rule divide-and-conquer chain, find the corresponding multimodal condensed graph, and then input the detection result of the multimodal condensed graph into the corresponding precise detection chain to obtain the detection result.
[0053] S6. Select the traffic that matches the feature rules from the detection results and input them into the post-detection engine for further identification and detection.
[0054] In this embodiment, the option features are the fingerprint features contained in malicious traffic, and the same rule contains multiple option features.
[0055] Specifically, in step S1, decoding the original rules mainly involves loading, verifying, and decoding the essential features and option features of each feature rule according to the rule format, and extracting the rule information required to build the detection engine.
[0056] The rule option information extracted from a single rule consists of two parts: essential features and option features. The essential features comprise seven items: the protocol of the traffic to be detected, the source IP, the destination IP, the source port, the destination port, the direction of the traffic to be detected, and the rule ID. The option features are the fingerprint features contained in malicious traffic. A single rule can contain multiple option features.
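Step S1 can be sketched as a small parser. The source does not specify a rule format, so the one-line format below (seven whitespace-separated essential features, a `|`, then `kind:value` options separated by `;`) is purely an assumption for illustration, as are the function name `parse_rule` and the direction strings. The protocol numbers are the standard IANA values.

```python
from typing import Dict, List, Tuple

PROTOCOLS = {"icmp": 1, "tcp": 6, "udp": 17}   # IANA protocol numbers
DIRECTIONS = {"to_server", "to_client"}        # hypothetical encoding


def parse_rule(line: str) -> Dict[str, object]:
    """Validate one rule line and return its essential and option features."""
    head, sep, tail = line.partition("|")
    fields = head.split()
    if not sep or len(fields) != 7:
        raise ValueError(f"malformed rule: {line!r}")
    rule_id, proto, src_ip, src_port, direction, dst_ip, dst_port = fields
    if proto not in PROTOCOLS:
        raise ValueError(f"unknown protocol: {proto}")
    if direction not in DIRECTIONS:
        raise ValueError(f"unknown direction: {direction}")
    ports = (int(src_port), int(dst_port))
    if not all(0 <= p <= 65535 for p in ports):
        raise ValueError("port out of range")
    options: List[Tuple[str, str]] = []
    for item in tail.split(";"):
        item = item.strip()
        if item:
            kind, colon, value = item.partition(":")
            if not colon or not value:
                raise ValueError(f"malformed option: {item!r}")
            options.append((kind, value))
    if not options:
        raise ValueError("rule has no option features")
    return {
        "rule_id": int(rule_id), "protocol": PROTOCOLS[proto],
        "src_ip": src_ip, "dst_ip": dst_ip,
        "src_port": ports[0], "dst_port": ports[1],
        "direction": direction, "options": options,
    }


parsed = parse_rule("1001 tcp any 0 to_server any 80 | content:cmd.exe; uri:/admin")
print(parsed["options"])   # [('content', 'cmd.exe'), ('uri', '/admin')]
```

Rejecting a malformed rule at load time keeps the later construction stages free of validity checks.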
[0057] Specifically, in step S2, the rule divide-and-conquer chain is constructed in the order of destination port divide-and-conquer group, protocol divide-and-conquer group, and detection direction divide-and-conquer group. The destination port divide-and-conquer group expands all the values of the port into arrays to create several port arrays. The protocol divide-and-conquer group expands all the values of the protocol into arrays to create several protocol arrays.
[0058] According to the requirements of each divide-and-conquer group, the destination port, protocol, and detection direction of each rule are extracted sequentially as divide-and-conquer information. Based on the extracted information, the position of the rule in each divide-and-conquer group is located, forming the divide-and-conquer chain for that rule. Rules along the same divide-and-conquer chain path are combined to form the corresponding rule group. Because the divide-and-conquer chain is built from the essential features of the rules, traffic can be filtered rapidly layer by layer. Furthermore, the port divide-and-conquer group uses an array of size 65536, so a port lookup during detection is a single array index with O(1) time complexity, instead of a traversal of the traditional port group linked list.
[0059] In this embodiment, the port divide-and-conquer group expands all possible values of the port into an array of size 65536, reducing the lookup time complexity to O(1) and improving the accuracy with which rule detection hits the destination port.
[0060] In this embodiment, the protocol divide-and-conquer group expands all possible values of the protocol into an array of size 256, reducing the lookup time complexity to O(1) and improving the accuracy with which rule detection hits the protocol.
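The three-level lookup described in step S2 can be sketched as nested arrays. This is an illustrative sketch, not the implementation of the invention: the class name `DivideAndConquerChain`, the lazy allocation of the inner arrays, and the direction strings are assumptions. Only the ordering (port, protocol, direction) and the array sizes 65536 and 256 come from the text.

```python
from typing import List, Optional

PORTS, PROTOCOLS = 65536, 256
DIRECTIONS = ("to_server", "to_client")   # hypothetical direction encoding


class DivideAndConquerChain:
    """Port array -> protocol array -> direction slot -> rule group."""

    def __init__(self) -> None:
        # One slot per possible port; inner arrays are allocated lazily.
        self.ports: List[Optional[list]] = [None] * PORTS

    def add_rule(self, rule_id: int, dst_port: int, protocol: int,
                 direction: str) -> None:
        protos = self.ports[dst_port]
        if protos is None:
            protos = self.ports[dst_port] = [None] * PROTOCOLS
        dirs = protos[protocol]
        if dirs is None:
            dirs = protos[protocol] = [[] for _ in DIRECTIONS]
        # Rules that reach the same slot share a path, so they form one group.
        dirs[DIRECTIONS.index(direction)].append(rule_id)

    def lookup(self, dst_port: int, protocol: int, direction: str) -> List[int]:
        """Index by port, then protocol, then direction; no list traversal."""
        protos = self.ports[dst_port]
        if protos is None:
            return []
        dirs = protos[protocol]
        if dirs is None:
            return []
        return dirs[DIRECTIONS.index(direction)]


chain = DivideAndConquerChain()
chain.add_rule(1001, 80, 6, "to_server")
chain.add_rule(1002, 80, 6, "to_server")
chain.add_rule(1003, 53, 17, "to_server")
print(chain.lookup(80, 6, "to_server"))    # [1001, 1002]
print(chain.lookup(443, 6, "to_server"))   # []
```

Traffic whose port, protocol, or direction has no rules attached is discarded after at most three index operations, which is how the chain removes normal traffic early.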
[0061] Specifically, in step S3, constructing the multimodal condensed graph includes:
[0062] Within the same rule group, several rule options are selected based on the confidence levels of different types of rule options, serving as the initial source of rule options for constructing the multimodal condensed graph;
[0063] Based on the rule options selected by each rule in the same rule group, the frequency of use of the rule options is statistically analyzed, and the two types of rule options with the highest frequency of use are selected as the data source for constructing the multimodal condensed graph;
[0064] Based on the two selected rule options, each rule selects its corresponding option and places it into a multi-modal condensed graph. The multi-modal condensed graph then extracts option information from each rule to generate a corresponding rule bitmap set. The use of multi-modal condensed graphs avoids the limitation that the current feature-based malicious traffic detection engine can only select one option feature as a multi-modal filtering constraint. By using multiple option features to form a condensed graph, the multi-modal filtering effect is further enhanced.
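The three construction steps above can be sketched as follows. The source does not define the internal layout of the condensed graph or of the rule bitmap set, so this sketch makes several assumptions: the graph is modeled as one table per chosen option type that maps a fingerprint to an integer bitmap of rules, rules lacking an option of a chosen type are kept in a wildcard bitmap so they are never filtered out by mistake, and plain substring search stands in for a real multi-pattern matcher. All names (`build_condensed_graph`, `prefilter`, `CONFIDENCE`) and the option types are hypothetical.

```python
from collections import Counter
from typing import Dict, List, Tuple

Option = Tuple[str, str]   # (option type, fingerprint)

# Hypothetical confidence per option type.
CONFIDENCE = {"content": 0.9, "uri": 0.7, "header": 0.4}


def build_condensed_graph(group: Dict[int, List[Option]], keep: int = 2):
    """Return (bit_of, graph, wildcard) for one rule group."""
    # Step 1: each rule keeps its `keep` highest-confidence options.
    selected = {
        rid: sorted(opts, key=lambda o: CONFIDENCE.get(o[0], 0.0), reverse=True)[:keep]
        for rid, opts in group.items()
    }
    # Step 2: count option types across the group, keep the two most used.
    usage = Counter(kind for opts in selected.values() for kind, _ in opts)
    top_types = [kind for kind, _ in usage.most_common(2)]
    # Step 3: per chosen type, map each fingerprint to a bitmap of rule bits.
    bit_of = {rid: i for i, rid in enumerate(sorted(group))}
    graph: Dict[str, Dict[str, int]] = {t: {} for t in top_types}
    wildcard: Dict[str, int] = {t: 0 for t in top_types}
    for rid, opts in selected.items():
        for t in top_types:
            values = [v for kind, v in opts if kind == t]
            if values:
                graph[t][values[0]] = graph[t].get(values[0], 0) | (1 << bit_of[rid])
            else:
                # A rule without this option type cannot be filtered on it.
                wildcard[t] |= 1 << bit_of[rid]
    return bit_of, graph, wildcard


def prefilter(data: Dict[str, str], bit_of, graph, wildcard) -> List[int]:
    """Rule IDs that survive filtering on every chosen option type."""
    alive = (1 << len(bit_of)) - 1
    for t, table in graph.items():
        hits = wildcard[t]
        for fingerprint, bitmap in table.items():
            if fingerprint in data.get(t, ""):
                hits |= bitmap
        alive &= hits
    return [rid for rid, bit in bit_of.items() if alive >> bit & 1]


group = {
    1001: [("content", "cmd.exe"), ("uri", "/admin")],
    1002: [("content", "passwd"), ("header", "curl")],
    1003: [("uri", "/login"), ("content", "select")],
}
bit_of, graph, wildcard = build_condensed_graph(group)
print(prefilter({"content": "run cmd.exe", "uri": "/admin/x"},
                bit_of, graph, wildcard))   # [1001]
```

Intersecting the per-type bitmaps is what lets two option features filter more tightly than one: a rule remains a candidate only if it survives on both chosen types.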
[0065] In this implementation, the confidence level of rule options is determined by assigning different confidence values to various rule options based on the frequency of fingerprint usage by actual malicious traffic, thus forming a rule option confidence table. The application of this confidence table avoids the limitation of current feature-based malicious traffic detection engines, which can only select multi-mode option features or build precise detection chains based on the order of option features in the rules or the configuration order in the configuration file. Instead, a confidence table can be constructed based on the detection frequency of option features collected from different user environments, using big data analysis. By selecting option features and optimizing the detection order based on this confidence table, the system can more accurately identify the target option features, achieving more timely detection of suspicious traffic and faster removal of legitimate traffic.
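A confidence table of this kind can be derived from usage counts. The sketch below is one possible reading, not the formula of the invention: it assumes confidence is simply each option type's share of all observed uses, and the counts, the option type names, and the function name `build_confidence_table` are hypothetical.

```python
from typing import Dict


def build_confidence_table(hit_counts: Dict[str, int]) -> Dict[str, float]:
    """Scale per-type usage counts into confidence values in [0, 1]."""
    total = sum(hit_counts.values())
    if total == 0:
        return {kind: 0.0 for kind in hit_counts}
    return {kind: count / total for kind, count in hit_counts.items()}


# Hypothetical counts of how often each option type took part in detection.
observed = {"content": 600, "uri": 300, "header": 100}
table = build_confidence_table(observed)
print(table)   # {'content': 0.6, 'uri': 0.3, 'header': 0.1}
```

Because the table is computed from statistics rather than from the order in which options appear in a rule or configuration file, it can be rebuilt whenever new usage data is collected.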
[0066] Specifically, in step S4, the steps for constructing the precise detection chain for each rule include sorting each rule option of each rule according to its confidence level and then placing them into the precise detection chain list in sequence to form the precise detection chain.
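Step S4 amounts to a sort by confidence followed by an early-exit walk. The following is a minimal sketch in which the confidence values, option types, and function names are assumptions, and plain substring search stands in for real fingerprint matching.

```python
from typing import Dict, List, Tuple

Option = Tuple[str, str]   # (option type, fingerprint)

# Hypothetical confidence table.
CONFIDENCE = {"content": 0.9, "uri": 0.7, "header": 0.4}


def build_detection_chain(options: List[Option],
                          confidence: Dict[str, float]) -> List[Option]:
    """Order a rule's options from highest to lowest confidence."""
    # sorted() is stable, so options of equal confidence keep their rule order.
    return sorted(options, key=lambda opt: confidence.get(opt[0], 0.0), reverse=True)


def run_chain(chain: List[Option], fields: Dict[str, str]) -> bool:
    """A rule matches only if every option matches; stop at the first miss."""
    for kind, fingerprint in chain:
        if fingerprint not in fields.get(kind, ""):
            return False
    return True


chain = build_detection_chain(
    [("header", "curl"), ("uri", "/admin"), ("content", "cmd.exe")], CONFIDENCE)
print([kind for kind, _ in chain])   # ['content', 'uri', 'header']
print(run_chain(chain, {"content": "cmd.exe", "uri": "/admin", "header": "curl/8"}))  # True
print(run_chain(chain, {"content": "hello"}))                                         # False
```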
[0067] Specifically, in step S5, the steps to obtain the detection result include:
[0068] Perform deep decoding of network traffic to extract the data to be detected;
[0069] The data to be detected is sequentially processed through a divide-and-conquer chain, a multi-modal condensed graph, and a precise detection chain to obtain the final rule detection result. The precise detection chain is constructed based on the confidence of the option features, moving option features with high confidence forward to promptly detect and filter out rules that do not conform to the features. This avoids the situation where the current feature-based malicious traffic detection engine can only construct the precise detection chain based on the order of option features in the rule, regardless of the performance of the option features.
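The three detection stages of step S5 can be strung together as shown below. This is a simplified sketch under stated assumptions: a dictionary keyed by (destination port, protocol, direction) stands in for the divide-and-conquer chain, a check of each rule's first chain option stands in for the multimodal condensed graph, and substring search stands in for fingerprint matching. The names `detect`, `matches`, `groups`, and `chains` are hypothetical.

```python
from typing import Dict, List, Tuple

Option = Tuple[str, str]           # (decoded field name, fingerprint)
GroupKey = Tuple[int, int, str]    # (dst_port, protocol, direction)


def matches(option: Option, fields: Dict[str, str]) -> bool:
    kind, fingerprint = option
    return fingerprint in fields.get(kind, "")


def detect(meta: GroupKey, fields: Dict[str, str],
           groups: Dict[GroupKey, List[int]],
           chains: Dict[int, List[Option]]) -> List[int]:
    # Stage 1: the divide-and-conquer chain picks the rule group.
    rule_ids = groups.get(meta, [])
    # Stage 2: pre-filter. Stand-in for the condensed graph: a rule stays
    # a candidate only if its highest-confidence option is present.
    candidates = [rid for rid in rule_ids if matches(chains[rid][0], fields)]
    # Stage 3: precise detection chain; all() stops at the first miss.
    return [rid for rid in candidates
            if all(matches(opt, fields) for opt in chains[rid])]


groups = {(80, 6, "to_server"): [1001, 1002]}
chains = {
    1001: [("content", "cmd.exe"), ("uri", "/admin")],
    1002: [("content", "passwd"), ("uri", "/etc")],
}
fields = {"content": "GET cmd.exe", "uri": "/admin/run"}
print(detect((80, 6, "to_server"), fields, groups, chains))   # [1001]
print(detect((22, 6, "to_server"), fields, groups, chains))   # []
```

Each stage only sees what the previous stage let through, so most normal traffic never reaches the precise detection chain.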
[0070] Specifically, in step S6, the traffic hit by the feature rule engine is sent to the machine learning engine and other post-detection engines for further identification and detection, and alarm information is output based on the detection results.
[0071] In this implementation, the detection order of the option features in the multi-modal condensed graph can be adjusted continuously according to how often each option feature is used during real-time detection. This avoids the drawback that current feature rule-based malicious traffic detection engines cannot be adjusted during detection regardless of how they perform. The detection engine can thus be tuned dynamically to different traffic environments to reach its optimal detection state.
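One way to picture this dynamic adjustment is a counter that periodically re-sorts the option types. The source does not say how or when the order is recomputed, so the fixed-size window, the class name `AdaptiveOrder`, and the option type names in this sketch are assumptions.

```python
from collections import Counter
from typing import List


class AdaptiveOrder:
    """Keeps option types ordered by how often they were used recently."""

    def __init__(self, kinds: List[str], window: int = 1000) -> None:
        self.order = list(kinds)
        self.window = window    # hypothetical number of events between re-sorts
        self.counts: Counter = Counter()
        self.seen = 0

    def record(self, kind: str) -> None:
        """Note that an option type took part in detection."""
        self.counts[kind] += 1
        self.seen += 1
        if self.seen >= self.window:
            # Most-used types move to the front; ties keep their old order.
            self.order.sort(key=lambda k: self.counts[k], reverse=True)
            self.counts.clear()
            self.seen = 0


order = AdaptiveOrder(["content", "uri", "header"], window=4)
for kind in ["uri", "uri", "content", "uri"]:
    order.record(kind)
print(order.order)   # ['uri', 'content', 'header']
```

Resetting the counts after each window lets the order follow the current traffic environment instead of the whole history.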
[0072] In some other preferred embodiments of the present invention, a computer-readable storage medium is provided storing a computer program that, when executed by a processor, causes the processor to perform the steps of the method as described in the above embodiments.
[0073] If the aforementioned functions are implemented as software functional units and sold or used as independent products, they can be stored in a computer-readable storage medium. Based on this understanding, the technical solution of this invention, essentially, or the part that contributes to the prior art, or a portion of the technical solution, can be embodied in the form of a software product. This computer software product is stored in a storage medium and includes several instructions to cause a computer device (which may be a personal computer, server, or network device, etc.) to execute all or part of the steps of the methods described in the various embodiments of this invention. The aforementioned storage medium includes various media capable of storing program code, such as USB flash drives, portable hard drives, read-only memory (ROM), random access memory (RAM), magnetic disks, or optical disks.
[0074] The above description is a detailed description of the preferred embodiments of the present invention. However, the embodiments are not intended to limit the scope of the patent application of the present invention. All equivalent changes or modifications made under the technical spirit of the present invention should fall within the patent scope covered by the present invention.
Claims
1. A malicious traffic detection method based on feature rules, characterized in that the specific steps include: the malicious traffic detection feature rules are extracted one by one, their format is validated, and relevant rule option information is extracted; a rule divide-and-conquer chain is constructed based on the relevant rule option information, the rule divide-and-conquer chain being constructed in the order of destination port divide-and-conquer group, protocol divide-and-conquer group, and detection direction divide-and-conquer group to achieve rapid layer-by-layer filtering of traffic, and rules with the same divide-and-conquer chain path are combined into rule groups; based on the rule groups, some rule options of each rule are selected to construct a multimodal condensed graph of the corresponding group, the multimodal condensed graph generating a corresponding rule bitmap set based on the option information extracted from each rule, which is used to enhance the multimodal filtering effect; based on the multimodal condensed graph, the rule options of each rule are sorted according to their confidence level to construct a precise detection chain for each rule; the data to be detected, extracted after deep decoding of network traffic, is input into the rule divide-and-conquer chain to find the corresponding multimodal condensed graph, and then, based on the detection result of the multimodal condensed graph, is input into the corresponding precise detection chain to obtain the detection result; and traffic flows that match the feature rules are selected from the detection results and input into the post-detection engine for further identification and detection.
2. The malicious traffic detection method based on feature rules according to claim 1, characterized in that, The option features are the fingerprint features contained in malicious traffic, and the same rule contains multiple option features.
3. The malicious traffic detection method based on feature rules according to claim 2, characterized in that, The rule divide-and-conquer chain is constructed in the order of destination port divide-and-conquer group, protocol divide-and-conquer group, and detection direction divide-and-conquer group.
4. The malicious traffic detection method based on feature rules according to claim 3, characterized in that, The destination port divide-and-conquer group expands all the values of the port into an array, creating a port array of size 65536.
5. The malicious traffic detection method based on feature rules according to claim 3, characterized in that, The protocol divide-and-conquer group expands all the values of the protocol into arrays, creating several protocol arrays.
6. The malicious traffic detection method based on feature rules according to claim 3, characterized in that, The construction rule group specifically includes: According to the requirements of each divide-and-conquer group, the three data sources of destination port, protocol and detection direction of each rule are extracted in sequence as divide-and-conquer information. According to the requirements of each divide-and-conquer group, the position of each divide-and-conquer group is found based on the extracted divide-and-conquer information to form the divide-and-conquer chain of the corresponding rule. Rules along the same divide-and-conquer chain path are combined to form corresponding rule groups.
7. The malicious traffic detection method based on feature rules according to claim 1, characterized in that, Constructing a multimodal condensed graph specifically includes: Within the same rule group, several rule options are selected based on the confidence levels of different types of rule options, serving as the initial source of rule options for constructing the multimodal condensed graph; Based on the rule options selected by each rule in the same rule group, the frequency of use of the rule options is statistically analyzed, and the two types of rule options with the highest frequency of use are selected as the data source for constructing the multimodal condensed graph; Based on the two selected rule options, each rule selects the corresponding option, puts it into the multimodal condensed graph, and generates the corresponding rule bitmap set.
8. The malicious traffic detection method based on feature rules according to claim 7, characterized in that, The confidence level of the rule options is determined by setting different confidence levels for each type of rule option based on the frequency of fingerprint usage by actual malicious traffic, thus forming a rule option confidence table.
9. The malicious traffic detection method based on feature rules according to claim 1, characterized in that, The specific steps for constructing the precise detection chain for each rule are as follows: sort each rule option of each rule according to its confidence level, and then put them into the precise detection chain list in sequence to form the precise detection chain.
10. The malicious traffic detection method based on feature rules according to claim 1, characterized in that, The specific steps to obtain the detection result include: Perform deep decoding of network traffic to extract the data to be detected; The data to be detected is sequentially processed through a divide-and-conquer chain, a multimodal condensed graph, and a precise detection chain to obtain the final rule-based detection result.