Industrial internet vulnerability verification method
By constructing a protocol template library and an experience replay pool, and combining multi-agent division of labor with mutation strategies, test cases conforming to industrial protocol specifications are generated and causal attribution analysis is performed. This solves the problems of low efficiency, insufficient accuracy, and poor interpretability in existing vulnerability verification technologies, and achieves efficient and reliable vulnerability verification and remediation.
Patent Information
- Authority / Receiving Office
- CN · China
- Patent Type
- Applications(China)
- Current Assignee / Owner
- 北京中关村实验室
- Filing Date
- 2026-04-24
- Publication Date
- 2026-06-19
Abstract
Description
Technical Field
[0001] This invention belongs to the field of industrial internet security technology, and specifically relates to an industrial internet vulnerability verification method.
Background Technology
[0002] As the core carrier for the deep integration of industrial systems and the internet, the security and stability of the Industrial Internet's equipment clusters directly affect the continuity and reliability of industrial production. Vulnerability verification, as a core pre-emptive step in the Industrial Internet security protection system, proactively detects the vulnerabilities of target devices and protocols, providing a basis for vulnerability remediation and protection strategy formulation. It is a key means to resist cyberattacks and ensure the safety of industrial production.
[0003] Currently, vulnerability verification in the Industrial Internet primarily relies on fuzz testing. This technology generates a massive number of diverse test cases and inputs them into the target system to monitor device response characteristics and identify potential vulnerabilities. However, existing fuzz testing technologies face several technical bottlenecks when applied in Industrial Internet scenarios, specifically as follows:
[0004] 1. Lack of protocol semantic awareness in test case generation: Industrial control protocols such as Modbus/TCP, S7comm, and OPC UA have strict semantic constraints and field specifications. Existing fuzz testing applies generic mutation strategies that are not tailored to the field semantics, value ranges, and dependency relationships of industrial control protocols. As a result, a large number of test cases violate the protocol specification and are rejected outright by the receiving end. The proportion of invalid test cases is too high, which seriously reduces the efficiency of vulnerability verification.
[0005] 2. Disconnect between strategy optimization and feedback mechanism: During the verification process, data such as abnormal information and field mutation effects from device response feedback are not effectively integrated and utilized. There is a lack of adaptive learning and strategy iteration mechanisms. The test case generation strategy always relies on fixed rules and cannot dynamically adjust the mutation direction based on historical test results. It is difficult to accurately focus on high-risk fields and protocol vulnerabilities, resulting in insufficient targeting of vulnerability detection.
[0006] 3. Poor reproducibility of verification results: Industrial Internet devices are diverse in type and firmware version. The same vulnerability can manifest in significantly different ways on different devices and in different environments. Existing verification methods only record the final abnormal results and do not fully retain key verification states such as test cases, message sequences, and session contexts. This makes it impossible to trace and reproduce the vulnerability triggering process, which is not conducive to subsequent vulnerability analysis and remediation.
[0007] 4. High-dimensional state space exploration is inefficient and costly to deploy: Industrial Internet protocol fields are complex and device state space is high-dimensional. Existing methods either rely on a large number of manually labeled vulnerability samples for model training or use complex models such as deep neural networks for state exploration. Not only is the cost of acquiring labeled data high and the cycle long, but the computational overhead of complex models is also huge, making it difficult to adapt to the resource-constrained deployment environment of industrial sites.
[0008] 5. Lack of interpretability of verification results: Most fuzzing methods adopt a black-box detection mode, which can only output the qualitative result that "an anomaly exists" and cannot identify the specific cause of the anomaly or the fields involved. Such black-box results give security analysts little support for root cause analysis and give equipment manufacturers no precise basis for vulnerability remediation.
[0009] To address the shortcomings of existing technologies, there is an urgent need for an industrial internet vulnerability verification solution that balances verification efficiency, interpretability, and engineering feasibility, in order to adapt to the semantic characteristics of industrial internet protocols and improve the accuracy and feasibility of vulnerability detection.
Summary of the Invention
[0010] To address the aforementioned problems in existing technologies—namely, the low efficiency, insufficient accuracy, poor engineering feasibility, and inconvenience in vulnerability verification due to issues such as lack of protocol semantic awareness in test cases, blind and unsupported strategy selection, poor reproducibility of verification results, low efficiency and high deployment cost in exploring high-dimensional state spaces, and lack of interpretability of verification results—this invention proposes an industrial internet vulnerability verification method, comprising the following steps:
[0011] A protocol template library is constructed based on publicly available protocol specifications, and field sequences are obtained by parsing the original network packets based on the protocol template library.
[0012] The statistical features of the field sequence are calculated as the current scene feature vector. The statistical features include field value distribution entropy, field variation historical response rate, and field type identifier.
[0013] Retrieve historical scene features similar to the current scene feature vector from the experience replay pool, calculate the basic confidence upper bound value based on the historical statistical information of each mutation strategy, and weight and fuse the basic confidence upper bound value with the similar scene experience score to dynamically select the mutation strategy. The similar scene experience score is calculated based on the retrieved historical scene features.
[0014] The seed packet is mutated according to the selected mutation strategy to generate test cases, which are then sent to the target device for verification, and the response result from the target device is obtained; wherein, the seed packet is selected from the original network packet;
[0015] Update the historical statistics of each mutation strategy and the experience replay pool based on the response results of the target device;
[0016] When the target device responds abnormally, active intervention experiments are performed on the fields that have mutated in the test case in sequence. The causal effect of each field is judged by comparing the responses, and a verification record containing causal attribution results is generated.
[0017] Based on the verification records, a reproduction package and reproduction instructions are generated for manual verification and confirmation of the vulnerability.
The beneficial effects of this invention are:
[0018] 1) By coupling a protocol template library with semantic parsing of message fields, the invention eliminates the lack of protocol awareness in traditional fuzz testing and achieves accurate field representation and test case generation that conform to industrial protocol specifications. This addresses at the root the problems of test cases being discarded for violating the specification and of a high proportion of invalid tests, and significantly improves the accuracy and efficiency of vulnerability verification.
[0019] 2) With experience replay retrieval and UCB-fused strategy selection at its core, the invention turns historical scene experience into a basis for dynamic decision-making. This overcomes the blindness and lack of experience support in traditional strategy selection, avoids repeated trials, and greatly improves the efficiency of anomaly triggering and the adaptability of the strategy.
[0020] 3) The invention constructs a layered, coupled mechanism of strategy selection, attribution analysis, and reproducibility verification. Multi-agent division of labor is combined with a meta-knowledge base cold start/reset architecture to avoid the low efficiency, poor reproducibility, and high deployment cost of traditional high-dimensional space exploration. Active intervention attribution improves the interpretability of results, makes vulnerability analysis and repair more convenient, and enables efficient and reliable deployment in engineering scenarios.
Attached Figure Description
[0021] Other features, objects, and advantages of this application will become more apparent from the following detailed description of non-limiting embodiments with reference to the accompanying drawings:
[0022] Figure 1 is a flowchart illustrating the overall steps of an industrial internet vulnerability verification method according to the present invention.
[0023] Figure 2 is a flowchart illustrating the steps involved in generating a verification record containing causal attribution results for an industrial internet vulnerability verification method according to the present invention.
Detailed Implementation
[0024] The present application will now be described in further detail with reference to the accompanying drawings and embodiments. It should be understood that the specific embodiments described herein are for illustrative purposes only and are not intended to limit the invention. Furthermore, it should be noted that, for ease of description, only the parts relevant to the invention are shown in the accompanying drawings.
[0025] It should be noted that, unless otherwise specified, the embodiments and features described in this application can be combined with each other. This application will now be described in detail with reference to the accompanying drawings and embodiments.
[0026] To illustrate the industrial internet vulnerability verification method of the present invention more clearly, the steps in the embodiments of the present invention are described in detail below with reference to Figures 1 to 2.
[0027] The first embodiment of this invention is an industrial internet vulnerability verification method which, as shown in Figure 1, includes the following steps:
[0028] S1. Construct a protocol template library based on publicly available protocol specifications, and parse the original network packets to obtain a field sequence based on the protocol template library;
[0029] In this embodiment, the publicly available protocol specifications cover mainstream protocols such as Modbus/TCP, S7comm, Ethernet/IP, Profinet, and OPC UA; other protocols are handled by generic binary fuzzing. Specifically, a list of protocols used in the industrial internet field, industry application statistics, and publicly available protocol standard documents are obtained to define the scope of mainstream protocols. The five most widely used protocols (Modbus/TCP, S7comm, Ethernet/IP, Profinet, and OPC UA) are selected and defined as the core protocol set. Non-core protocols (such as DNP3 and IEC 61850) are explicitly marked and subsequently processed by generic binary fuzzing: their common basic fields (such as packet header identifiers, length fields, and checksum fields) are extracted, basic parsing is performed according to the "general field parsing rules" in the protocol template library, and binary fuzzing mutation is then applied to the data fields. The output is the list of core protocols and the marking results for non-core protocols.
[0030] Focusing on the core protocols most widely used in industrial scenarios allows targeted templates to be constructed, which improves parsing efficiency and accuracy. Handling non-core protocols as well preserves the generality of the method and avoids missing vulnerabilities because of incomplete protocol coverage.
[0031] For each core protocol $P_i$, the protocol specification is collected and the protocol message format is parsed using a combination of manual and automated methods to extract structured information, including the fixed header fields, function code fields, data fields, and checksum fields, the legal value range of each field, and the field dependencies (for example, the dependency between the length field and the data field is defined as "length field value = data field length + fixed offset 2"). This information is integrated into a protocol template $T_i = (F_i, C_i, D_i)$, where $F_i$ is the set of fields of the protocol, $C_i$ is the legal constraints of each field, and $D_i$ is the field dependencies. The templates of all core protocols are then integrated to build the protocol template library $\mathcal{T}$. The library makes the constraints and relationships of each field explicit, providing a basis for subsequent message field parsing and mutation strategy customization, ensuring the accuracy of the parsing results and the soundness of the mutation operations, and avoiding invalid mutations.
[0032] Raw network packets of the target device are collected using industrial network packet capture tools (such as Wireshark and tcpdump). The collection covers normal communication scenarios of the target device (such as device startup, data interaction, and status feedback) and potential abnormal scenarios (such as network fluctuations and command error triggers), with a collection duration of no less than 24 hours to ensure packet diversity. To remove redundancy, standardize the format, reduce the computational load of subsequent parsing, and improve process efficiency, the collected raw packets are preprocessed: deduplication (removing duplicate packets to avoid redundant calculations), filtering (retaining packets of the core protocols and discarding packets of irrelevant protocols), and format standardization (converting packets to a unified hexadecimal format and standardizing field separators). The result is the preprocessed raw message set $M = \{m_1, m_2, \dots, m_{N_M}\}$, where $N_M$ is the number of preprocessed messages;
[0033] The preprocessed raw message set $M$ and the protocol template library $\mathcal{T}$ are taken as input. Each raw message $m_i$ is matched to its corresponding protocol template $T$ (the protocol type is matched from the message header features). According to the field order, field lengths, and field constraints defined in the protocol template $T$, the message $m_i$ is parsed field by field to extract the field values, field types, and field dependencies, and the parsing results are organized into a field sequence $X = (x_1, x_2, \dots, x_L)$, where $L$ is the number of fields and $x_l$ is the parsed value of the $l$-th field. This makes the attributes and relationships of each field explicit and provides the direct objects of operation for subsequent scene feature extraction and mutation operations;
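As a non-limiting illustration of template-driven, field-by-field parsing, the following Python sketch parses one message against a single template and checks the length dependency quoted above. The `FieldSpec` class, the field names, and the Modbus/TCP-style layout are assumptions made for this example only; they are not the template format of the embodiment.

```python
import struct
from dataclasses import dataclass


@dataclass(frozen=True)
class FieldSpec:
    name: str    # field name (hypothetical)
    fmt: str     # struct format of a fixed-width field; "" means "rest of message"
    ftype: str   # one of: "enum", "numeric", "length", "computed"


# Hypothetical template for a Modbus/TCP-style frame, for illustration only.
TEMPLATE = (
    FieldSpec("transaction_id", ">H", "numeric"),
    FieldSpec("protocol_id", ">H", "enum"),
    FieldSpec("length", ">H", "length"),
    FieldSpec("unit_id", ">B", "numeric"),
    FieldSpec("function_code", ">B", "enum"),
    FieldSpec("data", "", "numeric"),
)
LENGTH_OFFSET = 2  # dependency quoted in the text: length = data length + 2


def parse_message(raw: bytes, template=TEMPLATE):
    """Parse raw bytes field by field into (name, type, value) tuples."""
    fields, pos = [], 0
    for spec in template:
        if spec.fmt:
            # Fixed-width field: decode it and advance by its size.
            (value,) = struct.unpack_from(spec.fmt, raw, pos)
            pos += struct.calcsize(spec.fmt)
        else:
            # Variable-length trailing field: take the remaining bytes.
            value = raw[pos:]
            pos = len(raw)
        fields.append((spec.name, spec.ftype, value))
    return fields


def length_dependency_holds(fields) -> bool:
    """Check the length/data dependency against the parsed values."""
    values = {name: value for name, _, value in fields}
    return values["length"] == len(values["data"]) + LENGTH_OFFSET
```

A message whose length field disagrees with its data field fails `length_dependency_holds`, which is the kind of constraint the template library is meant to make checkable before and after mutation.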
[0034] S2, calculate the statistical features of the field sequence as the current scene feature vector, the statistical features include field value distribution entropy, field variation historical response rate and field type identifier;
[0035] In this embodiment, the statistical characteristics of the field sequence are calculated as the current scene feature vector, and the method is as follows:
[0036] S201, for each field $x_i$, collect the set $V_i$ of all values it has taken historically and compute the frequency of occurrence $p(v)$ of each value $v \in V_i$ (the number of occurrences of $v$ divided by the total number of observations of the field). The field value distribution entropy is the negative of the sum, over all values, of each frequency multiplied by its logarithm: $H_i = -\sum_{v \in V_i} p(v)\log p(v)$, where $H_i \ge 0$ is the field value distribution entropy of field $x_i$. The higher the entropy, the more dispersed the field's values and the more easily a mutation triggers an anomaly. The entropy values of all fields form the vector $H = [H_1, \dots, H_L]$ (dimension $L$); the field mutation historical response rate $r$ is likewise a vector of dimension $L$; and the field type one-hot vector $t$ is the concatenation of the 4-dimensional one-hot vectors of all fields (dimension $4L$). The total dimension of the scene feature vector $f = [H, r, t]$ is therefore $L + L + 4L = 6L$.
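The entropy of S201 can be sketched as follows. The text does not state the base of the logarithm, so base 2 is assumed here, and the function name `field_entropy` is hypothetical.

```python
import math
from collections import Counter


def field_entropy(values):
    """Shannon entropy of the observed values of one field.

    Assumption: base-2 logarithm; the source does not specify the base.
    """
    if not values:
        return 0.0
    counts = Counter(values)
    total = len(values)
    # Sum p * log2(p) over the distinct values, then negate.
    return -sum((c / total) * math.log2(c / total) for c in counts.values())
```

A field that always carries the same value has zero entropy, while a field whose four observed values are all distinct has entropy 2 bits under this assumption.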
[0037] S202, retrieve the historical mutation test records and, for each field $x_i$, count its number of mutation attempts $u_i$ and its number of successful anomaly triggers $h_i$. The historical response rate of the field is the number of anomaly triggers divided by the sum of the number of mutations and a preset smoothing factor: $r_i = \frac{h_i}{u_i + \epsilon}$, where $\epsilon$ is the preset smoothing factor (0.001), which prevents a zero denominator. The historical response rates of all fields are organized into the vector $r = [r_1, \dots, r_L]$;
[0038] S203, encode the field type identifier of each field as a 4-dimensional one-hot vector. The field types are enumeration, numeric, length, and computed. The encoding provides a basis for subsequent customized mutation strategies (for example, the enumeration traversal strategy is matched to enumeration fields) and ensures that mutation operations match the field types;
[0039] S204, concatenate the field value distribution entropy, the field mutation historical response rate, and the field type one-hot vector to form the current scene feature vector $f = [H, r, t]$, where $H$ and $r$ each have dimension $L$ and $t$ has dimension $4L$ ($L$ is the number of fields). The components are min-max normalized to map their values to the interval $[0, 1]$. The total dimension of the scene feature vector $f$ is $L + L + 4L = 6L$, consistent with S201. Normalization ensures the accuracy of the subsequent Euclidean distance calculation, avoids imbalance among dimension weights, and provides a unified basis of comparison for subsequent similar scene retrieval and strategy selection.
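Steps S202 to S204 can be sketched together as below. This is a minimal illustration under stated assumptions: min-max normalization is applied across the fields of one message, the one-hot part is left unchanged because it already lies in [0, 1], and the type labels and function names are hypothetical.

```python
FIELD_TYPES = ("enum", "numeric", "length", "computed")
EPSILON = 0.001  # smoothing factor given in S202


def one_hot(ftype):
    """4-dimensional one-hot encoding of a field type (S203)."""
    return [1.0 if ftype == t else 0.0 for t in FIELD_TYPES]


def min_max(values):
    """Map values to [0, 1]; a constant vector maps to all zeros."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


def scene_features(entropies, hits, mutations, ftypes):
    """Build f = [H, r, t] with dimension L + L + 4L = 6L (S204)."""
    rates = [h / (u + EPSILON) for h, u in zip(hits, mutations)]  # S202
    type_bits = [bit for ft in ftypes for bit in one_hot(ft)]
    return min_max(entropies) + min_max(rates) + type_bits
```

For a message with three fields the resulting vector has 18 components, matching the 6L dimension stated in S201.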
[0040] S3, retrieve historical scene features similar to the current scene feature vector from the experience replay pool, calculate the basic confidence upper bound value based on the historical statistical information of each mutation strategy, and weight and fuse the basic confidence upper bound value with the similar scene experience score to dynamically select the mutation strategy. The similar scene experience score is calculated based on the retrieved historical scene features.
[0041] In this embodiment, the following shortcomings of existing approaches are addressed. Existing mutation strategies are mostly selected at random or by fixed rules, which cannot balance exploration and exploitation and tends to get trapped in local optima. They do not reuse historical testing experience, so strategy selection is largely arbitrary and anomalies are triggered inefficiently. They also cannot adapt to the needs of different testing phases and struggle to balance the comprehensiveness and efficiency of vulnerability verification. Therefore, a dynamic mutation strategy selection method combining experience replay, strategy statistics, and multi-agent division of labor is adopted. The specific method is as follows:
[0042] First, the field sequence $X$ and the scene feature vector $f$ are input, and a seed queue $Q$ is built to store the seeds to be tested (each a combination of a field sequence and its scene features). At the same time, based on the field types of the core protocols, five dedicated mutation strategies are customized to construct the mutation strategy set $A = \{a_1, a_2, a_3, a_4, a_5\}$. The applicable scenarios and operations of each strategy are as follows:
[0043] 1. $a_1$ (enumeration traversal strategy): applicable to enumeration fields; it iterates through all legal values and boundary illegal values of the field to generate mutated field values;
[0044] 2. $a_2$ (numerical boundary strategy): applicable to numeric fields (boundary vulnerabilities in numeric fields are a common type of vulnerability in industrial equipment); it selects the boundary values of the field's value range and typical error values (such as negative numbers, zero, and out-of-range values) as mutation values;
[0045] 3. $a_3$ (length tampering strategy): applicable to length fields; it modifies the length field value so that it no longer matches the length of the associated data field;
[0046] 4. $a_4$ (checksum mutation strategy): applicable to computed fields (such as checksums); it tampers with the checksum value or the data field content so that the checksum no longer matches;
[0047] 5. $a_5$ (combined mutation strategy): applicable to related fields with dependencies; it mutates multiple related fields at the same time to verify anomalies caused by field interaction, overcoming the limitations of single-field mutation and covering vulnerabilities caused by multi-field interaction.
[0048] In addition, the historical statistics of each mutation strategy are initialized, namely the number of attempts $n_j$ and the number of successes $s_j$ (both initially 0). The outputs are the initialized seed queue $Q$, the mutation strategy set $A$, and the strategy statistics $\{(n_j, s_j)\}_{j=1}^{5}$;
[0049] The method for retrieving historical scene features similar to the current scene feature vector from the experience replay pool, calculating the basic confidence upper bound from the historical statistical information of each mutation strategy, and then weighting and fusing the basic confidence upper bound with the similar scene experience score to dynamically select the mutation strategy is as follows:
[0050] Input the current scene feature vector $f$ and the experience replay pool $E$ (which stores historical scene features, the selected mutation strategies, and the anomaly trigger results), and calculate the Euclidean distance between the current scene feature vector $f$ and each historical scene feature $f_k$ in the experience replay pool, given by:
[0051] $d(f, f_k) = \sqrt{\sum_{i=1}^{D} \left(f_i - f_{k,i}\right)^2}$;
[0052] where $D$ is the dimension of the current scene feature vector $f$ and $f_k$ is the scene feature vector of the $k$-th historical record in the experience replay pool. On this basis, the similar scene experience score of each mutation strategy is calculated as follows:
[0053] Historical scene features similar to the current scene feature vector are retrieved from the experience replay pool, and the similar scene experience score of each mutation strategy is calculated as follows. Traverse the experience replay pool $E$; for each historical record, compute the squared Euclidean distance $d_k^2$ between the current scene feature $f$ and the record's scene feature $f_k$, and convert the distance into a similarity weight using a Gaussian kernel function. For each strategy, the similarity weights of the records in which that strategy was selected are summed, and the sum is divided by the sum of the similarity weights of all historical records plus a preset smoothing factor. The specific formula is as follows:
[0054] $S_j = \dfrac{\sum_{k=1}^{|E|} w_k \, \mathbb{I}\left(a^{(k)} = a_j\right)}{\sum_{k=1}^{|E|} w_k + \epsilon}$;
[0055] where $|E|$ is the total number of historical records in the experience replay pool; $w_k$ is the similarity weight of the $k$-th historical record, obtained with the Gaussian kernel function $w_k = \exp\left(-\frac{d_k^2}{2\sigma^2}\right)$, in which $\sigma$ is the Gaussian kernel parameter (1.0 in this embodiment); $\mathbb{I}(\cdot)$ is the indicator function, equal to 1 when the strategy $a^{(k)}$ selected in the $k$-th historical record is the $j$-th mutation strategy and 0 otherwise; $\epsilon$ is a preset smoothing factor (the same as in S202, with a value of 0.001) that prevents a zero denominator; and $S_j$ is the similar scene experience score of the $j$-th mutation strategy, with values in $[0, 1)$. A higher score indicates that the strategy is better suited to similar historical scenes;
[0056] Set a similarity threshold $\theta$. A historical record that meets the threshold is treated as a similar scene and is included in the calculation of the similar scene experience score; if no similar scene exists, the result is marked as "no similar historical scene". This yields the retrieval result and the similar scene experience score of each mutation strategy.
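The similar scene experience score can be sketched as follows. The kernel form exp(-d²/(2σ²)) is an assumed reading of "Gaussian kernel function", the pool is represented as a hypothetical list of (feature vector, strategy index) pairs, and the similarity threshold filter is omitted because its value and comparison rule are not recoverable from the text.

```python
import math

SIGMA = 1.0      # Gaussian kernel parameter given in the embodiment
EPSILON = 0.001  # smoothing factor given in the embodiment


def squared_distance(f, g):
    """Squared Euclidean distance between two feature vectors."""
    return sum((a - b) ** 2 for a, b in zip(f, g))


def experience_scores(current, pool, num_strategies, sigma=SIGMA, eps=EPSILON):
    """Similarity-weighted share of past selections for each strategy.

    pool: list of (feature_vector, strategy_index) pairs (hypothetical layout).
    """
    weights = [
        math.exp(-squared_distance(current, feat) / (2.0 * sigma ** 2))
        for feat, _ in pool
    ]
    denom = sum(weights) + eps
    scores = [0.0] * num_strategies
    for w, (_, strategy) in zip(weights, pool):
        scores[strategy] += w
    return [s / denom for s in scores]
```

Records close to the current scene dominate the score, so a strategy chosen often in near-identical scenes receives a score near 1, while an empty pool gives every strategy a score of 0.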
[0057] Based on the historical number of attempts and successes of each mutation strategy, the basic confidence upper bound (UCB value) of each mutation strategy is calculated using the following formula:
[0058] $\mathrm{UCB}_j = \dfrac{s_j}{n_j} + 2\sqrt{\dfrac{\ln N}{n_j}}$;
[0059] where the first term $\frac{s_j}{n_j}$ is the historical success rate of strategy $a_j$ (the exploitation term), the second term $2\sqrt{\frac{\ln N}{n_j}}$ is the exploration term that balances the strategy selection, $(n_j, s_j)$ is the historical statistical information of each mutation strategy (number of attempts and number of successes), $N$ is the current total number of attempts, and $\mathrm{UCB}_j$ is the basic confidence upper bound of each mutation strategy;
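A minimal sketch of the UCB calculation follows. Two points are assumptions of this example rather than statements of the embodiment: the exploration term is taken as 2·sqrt(ln N / n_j), mirroring the criterion described for the exploratory agent, and a strategy that has never been tried is given an infinite value so that it is tried first.

```python
import math


def ucb_values(attempts, successes):
    """Upper confidence bound per strategy from (attempts, successes) counts."""
    total = sum(attempts)
    values = []
    for n, s in zip(attempts, successes):
        if n == 0:
            # Assumption: untried strategies are prioritized.
            values.append(math.inf)
        else:
            exploit = s / n                                  # historical success rate
            explore = 2.0 * math.sqrt(math.log(total) / n)   # exploration bonus
            values.append(exploit + explore)
    return values
```

With equal success rates, the strategy with fewer attempts receives the larger value, which is what steers testing toward under-explored strategies.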
[0060] The comprehensive score of each mutation strategy is obtained as the weighted sum of the basic confidence upper bound and the similar scene experience score. The fusion formula is:
[0061] $\mathrm{Score}_j = \alpha \cdot \mathrm{UCB}_j + (1 - \alpha) \cdot S_j$;
[0062] where $\alpha$ is the fusion coefficient (0.6 in this embodiment), chosen so that the current strategy statistics play the dominant role, and $S_j$ is the similar scene experience score of each mutation strategy.
[0063] The mutation strategy with the highest comprehensive score $\mathrm{Score}_j$ is selected as the current strategy. If no similar historical scene exists, the strategy with the largest basic confidence upper bound $\mathrm{UCB}_j$ is selected directly. Combining historical experience (the similar scene score) with real-time statistics (the UCB value) allows the strategy to be selected accurately, preserving the effectiveness of the current strategy while reusing historical experience.
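The fusion and selection step can be illustrated as below, assuming the fusion coefficient weights the UCB value and its complement weights the experience score. The function name and the boolean `has_similar` flag are hypothetical.

```python
ALPHA = 0.6  # fusion coefficient given in the embodiment


def select_strategy(ucb, experience, has_similar, alpha=ALPHA):
    """Return the index of the strategy with the highest fused score.

    Falls back to the plain UCB value when no similar historical scene exists.
    """
    if has_similar:
        scores = [alpha * u + (1.0 - alpha) * e for u, e in zip(ucb, experience)]
    else:
        scores = list(ucb)
    return max(range(len(scores)), key=scores.__getitem__)
```

In the example `select_strategy([0.5, 0.6], [0.9, 0.1], True)`, the fused scores are 0.66 and 0.40, so the first strategy wins even though its UCB value is lower; without similar scenes the second strategy is chosen.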
[0064] Second, the selected mutation strategy is also obtained through a multi-agent dynamic division of labor mechanism. The multi-agent system includes an exploratory agent and an exploitative agent, wherein:
[0065] The exploratory agent aims to maximize field coverage. Its selection criterion is to choose the mutation strategy that maximizes the product of the square root of the reciprocal of the number of attempts and twice the square root of the logarithm of the total number of attempts, expressed by the formula:
[0066] $a^{\text{explore}} = \arg\max_{j} \; \sqrt{\dfrac{1}{n_j}} \cdot 2\sqrt{\ln N}$;
[0067] where $N$ is the current total number of attempts and $n_j$ is the number of attempts of strategy $a_j$;
[0068] The exploitative agent aims to maximize the anomaly trigger rate. Its selection criterion is to choose the mutation strategy that maximizes the ratio of the number of successes to the number of attempts. The formula is as follows:
[0069] $a^{\text{exploit}} = \arg\max_{j} \; \dfrac{s_j}{n_j}$;
[0070] where $s_j$ is the number of successes and $n_j$ is the number of attempts of strategy $a_j$;
[0071] The agent is switched dynamically according to the current testing phase. The switching probability is calculated as follows: the natural constant $e$ is raised to the power of the negative decay coefficient multiplied by the ratio of the current total number of attempts to the expected maximum number of tests; the result is compared with the preset minimum exploration probability, and the larger value is taken as the selection probability $P_{\text{explore}}$ of the exploratory agent. The formula is as follows:
[0072] $P_{\text{explore}} = \max\left(e^{-\lambda N / N_{\max}},\; P_{\min}\right)$;
[0073] where $\lambda$ is the decay coefficient, $N$ is the current total number of attempts, $N_{\max}$ is the expected maximum number of tests, and $P_{\min}$ is the preset minimum exploration probability;
[0074] With probability $P_{\text{explore}}$ the selection result of the exploratory agent is used, and with probability $1 - P_{\text{explore}}$ the selection result of the exploitative agent is used; the resulting mutation strategy is output.
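The two agents and the probabilistic switch between them can be sketched as follows. The values of the decay coefficient, the expected maximum number of tests, and the minimum exploration probability are not given in the text above, so they are left as parameters; treating an untried strategy as the exploratory agent's first choice is likewise an assumption of this example.

```python
import math
import random


def explore_probability(total, expected_max, decay, p_min):
    """P_explore = max(exp(-decay * total / expected_max), p_min)."""
    return max(math.exp(-decay * total / expected_max), p_min)


def explore_choice(attempts):
    """Exploratory agent: favor the least-tried strategy."""
    total = sum(attempts)

    def score(n):
        if n == 0:
            return math.inf  # assumption: untried strategies come first
        return math.sqrt(1.0 / n) * 2.0 * math.sqrt(math.log(total))

    return max(range(len(attempts)), key=lambda j: score(attempts[j]))


def exploit_choice(attempts, successes):
    """Exploitative agent: favor the highest observed success rate."""
    def rate(j):
        return successes[j] / attempts[j] if attempts[j] else 0.0

    return max(range(len(attempts)), key=rate)


def choose(attempts, successes, expected_max, decay, p_min, rng=random):
    """Pick the exploratory result with probability P_explore, else exploit."""
    p = explore_probability(sum(attempts), expected_max, decay, p_min)
    if rng.random() < p:
        return explore_choice(attempts)
    return exploit_choice(attempts, successes)
```

Early in a campaign the exponential term is close to 1 and exploration dominates; as the total number of attempts grows, the probability decays toward the preset floor and exploitation takes over.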
[0075] Furthermore, the multi-agent selection result is fused with the selection result obtained earlier from the UCB value and the similar scene experience score:
[0076] $\mathrm{Final}_j = \beta \cdot \mathrm{Score}_j + (1 - \beta)\left[P_{\text{explore}} \, G^{\text{explore}}_j + \left(1 - P_{\text{explore}}\right) G^{\text{exploit}}_j\right]$;
[0077] where $\beta$ is 0.7, following the principle that the UCB and experience score should play the dominant role; $\mathrm{Score}_j$ is the comprehensive score of the $j$-th mutation strategy after weighted fusion of the basic confidence upper bound and the similar scene experience score; $G^{\text{explore}}_j$ is the exploratory agent's rating of strategy $a_j$ (normalized to $[0, 1]$); and $G^{\text{exploit}}_j$ is the exploitative agent's rating of strategy $a_j$ (normalized to $[0, 1]$);
[0078] The strategy with the largest $\mathrm{Final}_j$ is chosen as the final selected mutation strategy;
[0079] This mechanism adapts to the needs of different testing phases (emphasizing exploration in the early stage and exploitation in the later stage), dynamically balances coverage and efficiency, avoids the limitations of a single agent, makes strategy selection more principled, and improves the overall performance of vulnerability verification.
[0080] S4, according to the selected mutation strategy, mutate the seed packet to generate test cases, send them to the target device for verification, and obtain the target device's response result; wherein, the seed packet is selected from the original network packet;
[0081] In this embodiment, if a mutation strategy cannot be converted into executable test cases, or the test cases do not conform to the protocol specification, normal interaction with the target device is impossible; and if the test case execution process is not standardized, the device response cannot be obtained accurately, leaving subsequent vulnerability determination, strategy updates, and causal attribution without data support. Therefore, test cases are generated by mutating seed packets according to the selected mutation strategy as follows. The selected mutation strategy and the field sequence are obtained, and the field sequence is mutated according to the strategy type: for the enumeration traversal strategy, all legal and illegal values of the field are traversed to generate mutated field sequences; for the numerical boundary strategy, the field is replaced with a boundary value or an error value; for the length tampering strategy, the length field and the associated data field are modified so that they no longer match; for the checksum mutation strategy, the checksum or the data field is tampered with; for the combined mutation strategy, multiple associated fields are mutated simultaneously. The mutated field sequence is then encapsulated into a test case that conforms to the communication protocol format of the target device:
[0082] $TC = \mathrm{Encapsulate}(X')$; where $X'$ is the mutated field sequence and $TC$ is the resulting test case;
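The five mutation strategies can be illustrated on a simple dictionary of parsed fields. This is a sketch only: the field names `length` and `checksum`, the choice of "one below and one above the legal range" as boundary illegal values, and the XOR used to corrupt the checksum are all assumptions of this example.

```python
def enum_traversal(legal_values):
    """Enumeration traversal: legal values plus the adjacent illegal ones."""
    legal = set(legal_values)
    return sorted(legal | {min(legal) - 1, max(legal) + 1})


def numeric_boundary(lo, hi):
    """Numerical boundary: range limits, out-of-range, zero and negative values."""
    candidates = [lo, hi, lo - 1, hi + 1, 0, -1]
    return list(dict.fromkeys(candidates))  # drop duplicates, keep order


def tamper_length(fields, delta=1):
    """Length tampering: make the length field disagree with the data."""
    mutated = dict(fields)
    mutated["length"] = fields["length"] + delta
    return mutated


def corrupt_checksum(fields):
    """Checksum mutation: flip the low byte so the checksum no longer matches."""
    mutated = dict(fields)
    mutated["checksum"] = fields["checksum"] ^ 0xFF
    return mutated


def combined(fields, delta=1):
    """Combined mutation: tamper with several related fields at once."""
    return corrupt_checksum(tamper_length(fields, delta))
```

Each function returns new values or a new dictionary and leaves the seed untouched, so the same seed packet can be reused for later mutations and for reproduction.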
[0083] For the different core protocols (Modbus/TCP, S7comm, and so on), the encapsulation format (such as the header, checksum, and data field format) is adapted to the corresponding protocol. The test case sending parameters are set as follows: a sending interval of 1 second, 2 retries, and a timeout of 5 seconds (these values are chosen to avoid device rejection and to reduce misjudgments caused by network fluctuations). The test case is transmitted to the target device via an industrial network interface (such as Ethernet or RS485), and the response of the target device is awaited for the duration of the timeout. If no response is received within the timeout period, the result is treated as no response. If a response is received, the response message is parsed to extract the response status code, response data, and device operating status, and the response result of the target device is output. Response results fall into four categories: normal response, abnormal response, no response, and abnormal behavior (device restart or connection interruption).
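The sending and classification logic can be sketched without tying it to a particular network stack by passing the transport in as a function. The `send` callable, its contract (return a reply dictionary, return `None` on timeout, raise `ConnectionError` when the link drops), and the rule that status code 0 means a normal response are all hypothetical. "2 retries" is read here as up to three transmissions in total.

```python
import time
from enum import Enum


class Result(Enum):
    NORMAL = "normal response"
    ABNORMAL = "abnormal response"
    NO_RESPONSE = "no response"
    ABNORMAL_BEHAVIOR = "abnormal behavior"


def verify(send, test_case, retries=2, timeout=5.0, interval=1.0, sleep=time.sleep):
    """Send one test case and classify the target's response.

    send(test_case, timeout) is a caller-supplied transport (hypothetical).
    """
    for attempt in range(retries + 1):
        try:
            reply = send(test_case, timeout)
        except ConnectionError:
            # Connection interruption counts as abnormal behavior.
            return Result.ABNORMAL_BEHAVIOR
        if reply is not None:
            # Assumption: status code 0 denotes a normal response.
            return Result.NORMAL if reply.get("status") == 0 else Result.ABNORMAL
        if attempt < retries:
            sleep(interval)  # wait the sending interval before retrying
    return Result.NO_RESPONSE
```

Injecting `send` and `sleep` keeps the classification logic testable offline; in a deployment the transport would wrap the actual Ethernet or RS485 interface.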
[0084] S5, Update the historical statistical information of each mutation strategy and the experience replay pool according to the response result of the target device;
[0085] In this embodiment, the historical statistical information and experience replay pool of each mutation strategy are updated based on the response result of the target device as follows:
[0086] Determine whether the response result of the target device is anomalous (abnormal response, no response, and abnormal behavior are all considered anomalous). If so, increment both the success count and the attempt count of the currently selected mutation strategy by one; if not (i.e., a normal response), increment only the attempt count of the selected mutation strategy by one. After the update, store the statistical information of all strategies synchronously and output the updated statistics to the strategy statistics database;
[0087] The current scene feature vector, the selected mutation strategy, the target device response result, and the current strategy statistics are combined into one record and stored in the experience replay pool. If the experience replay pool exceeds its preset capacity (10,000 records in this embodiment), a first-in, first-out (FIFO) mechanism deletes the earliest stored record to keep the replay pool storage-efficient, yielding the updated experience replay pool;
[0088] Each time the strategy statistics are updated, the meta-knowledge base is updated synchronously. The core storage content of the meta-knowledge base is the historical statistics of each mutation strategy. Each record contains the protocol type, target device model, and corresponding strategy statistics tuple. The strategy statistics tuple contains the number of attempts and the number of successes for each mutation strategy. That is, the format of each record is as follows:
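[0089] (protocol type, target device model, strategy statistics tuple), where the strategy statistics tuple lists the number of attempts and the number of successes for each mutation strategy;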
[0090] During the update, the newly updated statistical information of each mutation strategy is compared with the record for the corresponding protocol type and target device model in the meta-knowledge base. If the deviation is within the preset threshold (5% in this embodiment), the corresponding entry in the meta-knowledge base is updated synchronously. If the deviation exceeds the threshold, the entry is marked as a "statistical anomaly", and the deviation value and possible influencing factors (such as differences in test environment or device firmware version) are recorded and fed back to the strategy optimization module. The strategy optimization module adjusts the initial weight of the corresponding mutation strategy according to the deviation value and influencing factors, calibrates the baseline value of the corresponding entry in the meta-knowledge base K, and outputs the meta-knowledge base update result.
[0091] The method also includes a periodic reset mechanism: after each update of the historical statistics of the mutation strategies based on the response result of the target device, if no anomaly has been triggered for a preset number of consecutive tests, the historical statistics of each mutation strategy are reset to the initial values in the meta-knowledge base and the experience replay pool is cleared. Specifically, a counter records the number of consecutive tests that do not trigger an anomaly. After each update of the strategy statistics, it is determined whether the counter has reached the preset threshold (1000 in this embodiment). If so, the historical statistics of each mutation strategy are reset to the baseline values in the meta-knowledge base, the historical records in the experience replay pool are cleared, typical message field sequences are reselected from the preprocessed original message set, and processes S3 to S5 are executed again; if not, the current statistics are kept unchanged and the counter update result is output;
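The statistics update, the FIFO experience replay pool, and the periodic reset described in step S5 can be sketched together as follows. The class name, method names, and response labels are hypothetical; the default capacity (10,000) and reset threshold (1000) follow the values stated for this embodiment. The meta-knowledge base deviation check is not modeled.

```python
from collections import deque

# Response categories treated as anomalous, per the embodiment.
ABNORMAL = {"abnormal_response", "no_response", "abnormal_behavior"}


class StrategyTracker:
    def __init__(self, baseline, pool_capacity=10_000, reset_after=1_000):
        # baseline: strategy -> (attempts, successes) taken from the meta-knowledge base
        self.baseline = {m: tuple(v) for m, v in baseline.items()}
        self.stats = {m: list(v) for m, v in baseline.items()}
        self.pool = deque(maxlen=pool_capacity)   # oldest record dropped first (FIFO)
        self.reset_after = reset_after
        self.quiet_streak = 0                     # consecutive tests without an anomaly

    def record(self, features, strategy, response):
        """Update statistics and the pool; return True if a reset was triggered."""
        hit = response in ABNORMAL
        self.stats[strategy][0] += 1              # attempts always increase
        if hit:
            self.stats[strategy][1] += 1          # successes only on anomaly
        snapshot = {m: tuple(v) for m, v in self.stats.items()}
        self.pool.append((tuple(features), strategy, response, snapshot))

        self.quiet_streak = 0 if hit else self.quiet_streak + 1
        if self.quiet_streak >= self.reset_after:
            self.stats = {m: list(v) for m, v in self.baseline.items()}
            self.pool.clear()
            self.quiet_streak = 0
            return True                           # caller reselects seeds and reruns S3 to S5
        return False
```

Using `deque(maxlen=...)` gives the first-in, first-out eviction without explicit bookkeeping; a persistent store would need its own eviction logic.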
[0092] The initialization of the meta-knowledge base is achieved through the collection, organization, and adaptation calculation of historical test task statistics. The historical statistical information of each mutation strategy is stored in the meta-knowledge base. The initialization method of the meta-knowledge base is as follows:
[0093] Collect complete statistical information from multiple historical test tasks. Each historical test task must clearly include three core elements: the protocol type used in the test, the target device model, and the number of attempts and successes of each mutation strategy in the task. Ensure that the collected statistical information is complete and accurate, covering different protocol types and target devices of different models. The number of historical tasks collected should not be less than 50 to ensure the universality and representativeness of the meta-knowledge base.
[0094] All collected historical test task statistics are standardized and organized to construct the meta-knowledge base. Each record in the meta-knowledge base follows a unified format (protocol type, target device model, strategy statistics tuple), where the strategy statistics tuple corresponds to the protocol type and device model and gives the number of attempts and successes for the five types of mutation strategies, which facilitates subsequent retrieval and invocation;
[0095] When a new target device is encountered, a specified number (5 in this embodiment) of target devices with the same protocol type and the most similar models to the current target device are preferably retrieved from the meta-knowledge base. A weighted average of the mutation strategy statistics of these devices (with weights set by device model similarity; the higher the similarity, the greater the weight) gives the initial values of the historical statistics of each mutation strategy for the current target device, replacing the default initial value of 0. If the meta-knowledge base contains no record of a similar device, the default initial value of 0 is returned. This completes the initialization of the strategy statistics for the new device without exploring from scratch, shortening the cold start time.
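A minimal sketch of this cold-start initialization is given below. The embodiment does not say how device model similarity is measured, so the string-similarity function `name_similarity` (built on Python's `difflib`) is purely an assumed stand-in; the function names and the record layout are likewise hypothetical.

```python
from difflib import SequenceMatcher


def name_similarity(model_a, model_b):
    """Assumed stand-in: similarity of two model names in [0, 1]."""
    return SequenceMatcher(None, model_a, model_b).ratio()


def cold_start(meta_kb, protocol, model, similarity=name_similarity,
               k=5, n_strategies=5):
    """Similarity-weighted average of the k most similar same-protocol records.

    meta_kb: list of (protocol, device_model, [(attempts, successes), ...]).
    Returns one (attempts, successes) pair per mutation strategy.
    """
    scored = [(similarity(model, rec_model), stats)
              for rec_proto, rec_model, stats in meta_kb
              if rec_proto == protocol]
    scored = [(w, s) for w, s in scored if w > 0]
    scored.sort(key=lambda ws: ws[0], reverse=True)
    top = scored[:k]
    if not top:
        # No similar device on record: fall back to the default initial value 0.
        return [(0.0, 0.0)] * n_strategies
    total = sum(w for w, _ in top)
    return [
        (sum(w * s[i][0] for w, s in top) / total,
         sum(w * s[i][1] for w, s in top) / total)
        for i in range(n_strategies)
    ]
```

The averaged counts are generally fractional; whether they are rounded before use is a design choice the source does not address.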
[0096] S6. Traditional vulnerability verification can only confirm that an anomaly exists; it cannot identify the core cause of the anomaly (which type of mutation, or which mutated fields, caused it), resulting in poor interpretability. It also lacks standardized verification records, so subsequent vulnerability reproduction, root cause analysis, and remediation have no precise basis and require a large amount of manual analysis, and it gives no clear direction for subsequent strategy optimization. Therefore, when the response result of the target device is abnormal, active intervention experiments are performed in sequence on the mutated fields in the test case, the causal effect of each field is judged by comparing responses, and verification records containing causal attribution results are generated;
[0097] In this embodiment, when the target device responds abnormally, active intervention experiments are performed sequentially on the fields that have mutated in the test case. The causal effect of each field is determined by comparing the responses, and a verification record containing causal attribution results is generated, including the following steps:
[0098] S601, determine the test case that triggered the anomaly and its corresponding seed message; specifically, obtain the test case that triggered the anomaly and the original field sequence of its seed message, and compare the mutated field sequence with the original field sequence;
[0099] S602, identify the set of mutated fields in the test case by comparing the test case with the seed message; specifically, accurately identify the fields that were mutated, construct the set of mutated fields (whose size is the number of mutated fields), and record the original value and the mutated value of each mutated field;
[0100] S603, for each mutated field, construct a comparison test case in which only that field is restored to its original value in the seed message while all other fields remain the same as in the anomaly-triggering test case;
[0101] S604, send each comparison test case to the target device in sequence and observe whether an anomaly is triggered (verification uses the same sending parameters as S4, i.e., timeout and network interface, and the response result of each comparison test case is recorded); output the set of response results for all comparison test cases;
[0102] S605, if a comparison test case no longer triggers the anomaly after the field is restored to its original value, the mutation of that field is determined to be a critical cause of the anomaly; if the anomaly persists after the field is restored, the mutation of that field is determined not to be a critical cause. Specifically, the response result of each comparison test case is compared with the original abnormal response result: if the response is normal (consistent with the response of the target device in normal operation), the mutation of that field is determined to be the key cause of the anomaly and the corresponding causal effect value is assigned; if the response is still abnormal, the mutation of that field is determined not to be the key cause and the corresponding causal effect value is assigned; if there is no response or abnormal behavior, the field is judged "uncertain", the causal effect value is [not specified], and the influencing factors are recorded;
[0103] Organize the causal effect values of all mutated fields into a causal effect vector;
[0104] The following information is integrated to generate the verification record: target device information (model, firmware version, IP address), test timestamp, the test case that triggered the anomaly (in hexadecimal format), the set of mutated fields, the causal effect value of each mutated field, and the causal attribution conclusion (critical cause fields, non-critical cause fields, uncertain fields). For anomalies triggered by the combined mutation strategy (m5), a synergy judgment step is added during causal attribution: following the field dependency order (e.g., the length field takes precedence over the data field), multiple associated mutated fields are restored to their original values simultaneously to construct a synergy comparison test case; if no anomaly occurs after restoration, the anomaly is determined to be caused by multi-field synergy, the causal effect value is uniformly marked as CEj=0.8, and this is added to the causal attribution conclusion. Verification records are stored in a standardized form and can be retrieved by device model, test time, and protocol type; the output is a verification record containing causal attribution results.
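Steps S601 to S605 and the synergy check can be sketched as a single-field intervention loop. In this sketch `send` is a hypothetical callable that transmits a field dict and returns one of the four response categories as a string; textual verdict labels are used in place of numeric causal effect values, because the per-field values are not legible in the source text. The synergy probe here simply restores all mutated fields at once, which is one possible reading of the synergy step.

```python
def attribute(seed, case, send):
    """Restore each mutated field in turn and classify its causal role.

    seed, case: dicts mapping field name -> value.
    send: callable(fields) -> "normal" | "abnormal_response" |
          "no_response" | "abnormal_behavior"   (hypothetical interface)
    """
    # S601/S602: the mutated set is every field whose value differs from the seed.
    mutated = [name for name in case if case[name] != seed.get(name)]

    verdicts = {}
    for name in mutated:
        probe = dict(case)
        probe[name] = seed[name]          # S603: restore this one field only
        result = send(probe)              # S604: same sending parameters as S4
        if result == "normal":            # S605: anomaly disappears
            verdicts[name] = "critical"
        elif result == "abnormal_response":
            verdicts[name] = "non_critical"
        else:                             # no response or abnormal behavior
            verdicts[name] = "uncertain"

    # Synergy check for multi-field mutations: restore all mutated fields together.
    synergy = False
    if len(mutated) > 1:
        probe = dict(case)
        for name in mutated:
            probe[name] = seed[name]
        synergy = send(probe) == "normal"

    return {"mutated": mutated, "verdicts": verdicts, "synergy": synergy}
```

With n mutated fields this costs n comparison sends plus one synergy send, so attribution stays linear in the number of mutated fields.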
[0105] S7. Due to the poor reproducibility of vulnerability verification results, manual verification cannot accurately recreate the vulnerability triggering scenario, resulting in a low success rate of reproduction. Manual verification lacks standardized guidance, leading to chaotic operations and low efficiency. Vulnerability confirmation lacks clear judgment criteria, making misjudgment easy. Subsequent vulnerability repair and review lack complete test data support. Therefore, a reproduction package and reproduction guidance are generated based on the aforementioned verification records for manual verification and confirmation of vulnerabilities.
[0106] In this embodiment, generating a reproduction package and reproduction instructions based on the verification records includes the following steps:
[0107] The test case messages and session context that triggered the exception are packaged in chronological order to generate a PCAP file, which serves as the reproduction package. Specifically, the input includes verification records containing causal attribution results. The test cases that triggered the exception, the set of variant fields, the response results of the target device, and the response results of each comparison test case are extracted. This information is then organized in chronological order and written into a PCAP file (which can be opened with Wireshark). Remarks (test parameters, network environment, device information) are added to generate the reproduction package. The naming convention for the reproduction package is "device model-protocol type-test time.pcap", and the storage path is uniformly " / vulnerability verification / reproduction package / device model / ", which facilitates retrieval and management;
[0108] Generate textual instructions that include environmental requirements, reproduction steps, expected responses, and causal attribution results, serving as the reproduction guide; specifically, taking the verification records and the reproduction package as input, a text-based reproduction guide is generated comprising four core parts:
[0109] Clearly define the target device model, firmware version, network configuration (IP address, port, communication protocol, subnet mask, gateway), and required software (packet capture tool, testing tool) to ensure that the environment for manual reproduction is consistent with the original testing environment;
[0110] Describe in detail the test case sending process, mutation strategy selection process, and comparison test case execution steps, and clarify the operation parameters of each step (such as sending timeout time, field mutation method) to ensure that it can be directly reproduced step by step;
[0111] Clearly define the abnormal responses to be observed during the reproduction process (such as response codes, device behavior), and the expected responses of each comparison test case, so as to facilitate comparison and judgment during manual verification;
[0112] Annotate possible anomalies during the reproduction process (such as network timeout, unstable device status) and corresponding solutions to improve the success rate of reproduction;
[0113] Security analysts obtain the reproduction package and reproduction instructions. In a compliant test environment, they follow the reproduction instructions step by step to observe whether the target device's response matches the description in the verification log. If they match, the vulnerability is confirmed, and manual verification opinions and verification time are added and stored in the database. If they do not match, they are marked as "environmentally sensitive," and possible influencing factors (such as differences in device load, different firmware minor versions, and network interference) are recorded. The specific troubleshooting process after reproduction failure is as follows: 1) Environmental parameter verification: Verify the consistency of device model, firmware version, network configuration (including subnet mask and gateway), device load (CPU utilization ≤ 50%), and network latency (≤ 10ms) with the original test environment; 2) Operation step verification: Backtrack the operation log step by step against the reproduction instructions to confirm that the test case sending parameters and mutation methods are without deviation; 3) Device status check: Check whether the device is in normal operating condition and no other processes are occupying core resources. If reproduction still cannot be achieved after the above three steps, record the troubleshooting results and feed them back to the policy optimization module.
[0114] The following uses the Modbus / TCP protocol as an example to verify whether a PLC (Programmable Logic Controller, i.e., the target device) has vulnerabilities related to function code processing. The specific process is as follows:
[0115] Phase 1: Protocol Template Construction and Raw Message Processing
[0116] Based on the Modbus / TCP public protocol specification, a dedicated protocol template is constructed. This template includes core fields such as transaction identifier, protocol identifier, length field, unit identifier, function code field, and data field, clearly defining the legal value range and field dependencies for each field. The function code field is enumerated, with a legal value range of 1-127. The length of the data field dynamically changes depending on the function code, ensuring the accuracy and adaptability of the protocol template.
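As an illustration of such a template, the sketch below builds and parses a Modbus/TCP frame with Python's `struct` module. The header layout (2-byte transaction identifier, 2-byte protocol identifier, 2-byte length, 1-byte unit identifier, followed by the function code and data) follows the public Modbus/TCP specification, in which the length field counts the unit identifier plus the PDU. The function names and the returned dict keys are hypothetical, and the legal function code range 1-127 is the one stated in this embodiment.

```python
import struct

# MBAP header: transaction id, protocol id, length, unit id (big-endian).
MBAP = struct.Struct(">HHHB")


def build_frame(tid, unit, function, data, length=None, protocol=0):
    """Encapsulate fields into a Modbus/TCP frame.

    Passing an explicit `length` allows a deliberately inconsistent
    length field (length tampering); by default it is computed correctly.
    """
    pdu = bytes([function]) + data
    if length is None:
        length = 1 + len(pdu)             # unit id + PDU bytes
    return MBAP.pack(tid, protocol, length, unit) + pdu


def parse_frame(frame):
    """Split a frame into template fields and check two template constraints."""
    if len(frame) < MBAP.size + 1:
        raise ValueError("frame too short for MBAP header and function code")
    tid, protocol, length, unit = MBAP.unpack_from(frame)
    function = frame[MBAP.size]
    return {
        "transaction_id": tid,
        "protocol_id": protocol,
        "length": length,
        "unit_id": unit,
        "function": function,
        "data": frame[MBAP.size + 1:],
        "length_consistent": length == len(frame) - 6,   # bytes after the length field
        "function_legal": 1 <= function <= 127,
    }


# Read holding registers: function 0x03, start address 0, quantity 1.
request = build_frame(tid=1, unit=1, function=0x03, data=struct.pack(">HH", 0, 1))
```

The two boolean keys mirror the template's field dependency (length versus payload) and legal value range, which are the properties a mutation would deliberately violate.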
[0117] The Modbus network traffic of this PLC model under normal communication was captured with the Wireshark packet capture tool for no less than 24 hours, covering typical scenarios such as PLC startup, data reading, and status feedback. The captured raw messages were preprocessed by deduplication and filtering: duplicate messages were deleted and irrelevant protocol traffic was filtered out. The processed messages were then parsed field by field according to the protocol template to obtain a standardized field sequence, laying the foundation for subsequent scenario feature extraction.
[0118] Phase 2: Scene Feature Vector Extraction
[0119] For the parsed field sequence, its statistical characteristics are calculated and a current scene feature vector is constructed. The statistical characteristics include field value distribution entropy, field mutation historical response rate, and field type identifier. Among them, the function code field is a core field, and its value is relatively fixed (legal value 1-127), so the value distribution entropy of this field is low. Combined with the statistics of historical test data, the probability of triggering device abnormalities after mutation of the function code field is high, so its mutation historical response rate is relatively high. This feature provides an important basis for the selection of subsequent mutation strategies.
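The three statistical features can be computed as in the following sketch, which follows the description in claim 2: entropy as the negated sum of frequency times its logarithm, the response rate as anomalies divided by mutations plus a smoothing factor, and a one-hot field type over four types. The natural logarithm, the default smoothing value, and all function names are assumptions made for the example.

```python
import math
from collections import Counter

FIELD_TYPES = ("enumeration", "numeric", "length", "calculation")


def value_entropy(values):
    """Shannon entropy (natural log, assumed) of the observed field values."""
    counts = Counter(values)
    total = len(values)
    return -sum((c / total) * math.log(c / total) for c in counts.values())


def response_rate(anomalies, mutations, eps=1e-6):
    """Anomalies triggered per mutation, smoothed to avoid division by zero."""
    return anomalies / (mutations + eps)


def field_features(values, anomalies, mutations, field_type, eps=1e-6):
    """Entropy, historical response rate, and one-hot type for one field."""
    one_hot = [1.0 if t == field_type else 0.0 for t in FIELD_TYPES]
    return [value_entropy(values), response_rate(anomalies, mutations, eps)] + one_hot


def scene_vector(per_field):
    """Concatenate the features of every field into one scene feature vector.

    per_field: iterable of (values, anomalies, mutations, field_type).
    """
    vector = []
    for values, anomalies, mutations, field_type in per_field:
        vector.extend(field_features(values, anomalies, mutations, field_type))
    return vector
```

A function code field that almost always carries the same value yields an entropy near zero, which matches the low-entropy observation made above.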
[0120] Phase 3: Mutation Strategy Selection and Test Case Verification
[0121] 3.1 Initialize the seed queue and mutation strategy set: The seed queue initially contains the field sequence corresponding to the typical Modbus request message of this PLC model, such as read coil request (function code 0x01) and read holding register request (function code 0x03); the mutation strategy set is adapted to the field characteristics of the Modbus / TCP protocol, including 5 types of special strategies such as function code enumeration traversal, register address boundary mutation, length field tampering, checksum error mutation, etc.
[0122] 3.2 Strategy Selection and Test Case Generation: Assuming the current test seed is the field sequence corresponding to the register read request (function code 0x03, starting address 0, register quantity 1), historical test records whose features are similar to the current scene feature vector (Modbus test data from the same type of PLC) are retrieved from the experience replay pool. The basic confidence upper bound (UCB value) and the similar scenario experience score are calculated for each mutation strategy. After weighted fusion, the function code enumeration traversal strategy, which has the highest overall score, is selected.
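The selection step can be sketched as below. The similar scenario experience score follows the wording of claim 3 (Gaussian-kernel weights over squared Euclidean distance, summed for the records in which the strategy was selected, divided by the sum of all weights plus a smoothing factor). The exact form of the basic confidence upper bound is not visible in this section, so the standard UCB1 expression is used here as an assumption, as are the fusion weight `alpha`, the kernel width `sigma`, and all function names.

```python
import math


def ucb(attempts, successes, total):
    """Assumed UCB1 form: mean success rate plus an exploration bonus."""
    if attempts == 0:
        return float("inf")               # untried strategies are tried first
    return successes / attempts + math.sqrt(2 * math.log(total) / attempts)


def experience_scores(current, pool, strategies, sigma=1.0, eps=1e-6):
    """Kernel-weighted share of similar past records that chose each strategy.

    pool: iterable of (feature_vector, strategy) pairs.
    """
    weighted = dict.fromkeys(strategies, 0.0)
    total_w = 0.0
    for features, strategy in pool:
        d2 = sum((a - b) ** 2 for a, b in zip(current, features))
        w = math.exp(-d2 / (2 * sigma ** 2))      # Gaussian kernel
        total_w += w
        if strategy in weighted:
            weighted[strategy] += w
    return {m: weighted[m] / (total_w + eps) for m in strategies}


def select(current, pool, stats, alpha=0.5, sigma=1.0):
    """Pick the strategy with the highest fused score.

    stats: strategy -> (attempts, successes); alpha must be > 0.
    """
    total = sum(n for n, _ in stats.values())
    exp_score = experience_scores(current, pool, list(stats), sigma)

    def score(m):
        n, s = stats[m]
        return alpha * ucb(n, s, total) + (1 - alpha) * exp_score[m]

    return max(stats, key=score)
```

Scanning the whole pool on every selection is linear in the pool size; at the 10,000-record capacity of this embodiment that is typically acceptable, though an index could be added if needed.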
[0123] 3.3 Verification Execution and Result Acquisition: Based on the selected function code enumeration traversal strategy, the function code in the seed message is changed to the illegal value 0x7F, while the other fields remain unchanged. This is then encapsulated into a test case conforming to the Modbus / TCP protocol format and sent to the target PLC via the industrial Ethernet interface. A timeout of 5 seconds is set. After receiving the test case, the target PLC returns an exception response code 0x02 (indicating an illegal data address), which is determined to be an abnormal response. The historical statistics of the function code enumeration traversal strategy are updated synchronously (attempt count +1, success count +1), and the current scenario characteristics, selected strategy, and response results are integrated and stored in the experience replay pool.
[0124] Phase 4: Strategy Iteration and Periodic Reset
[0125] 4.1 Dynamic switching of multiple agents: As the test progresses, the total number of attempts gradually increases. The selection probability of the exploration agent gradually decreases from the initial value of 1.0 to 0.1 according to the preset decay formula. In the early stage of the test, the exploration agent takes the lead (maximizing the field coverage), and in the later stage of the test, the exploitation agent takes the lead (maximizing the anomaly trigger rate), balancing the comprehensiveness and efficiency of the test.
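The switching rule can be sketched as follows, using the exploration probability and the two selection criteria as worded in claim 5. The decay coefficient 3.0 is an assumed value, chosen only so that the probability falls from 1.0 to the 0.1 floor over the expected test budget; the function names and the `rng` parameter are also hypothetical.

```python
import math
import random


def explore_probability(total_attempts, max_tests, decay=3.0, p_min=0.1):
    """P_explore = max(p_min, exp(-decay * total_attempts / max_tests))."""
    return max(p_min, math.exp(-decay * total_attempts / max_tests))


def explorer_choice(stats):
    """Exploratory agent: favor the least-tried strategy (coverage)."""
    total = sum(n for n, _ in stats.values())

    def bonus(m):
        n = stats[m][0]
        if n == 0:
            return float("inf")
        # sqrt(1 / n) times twice sqrt(ln N), as worded in claim 5
        return math.sqrt(1 / n) * 2 * math.sqrt(math.log(total))

    return max(stats, key=bonus)


def exploiter_choice(stats):
    """Exploitative agent: favor the highest success-to-attempt ratio."""
    return max(stats, key=lambda m: stats[m][1] / stats[m][0] if stats[m][0] else 0.0)


def choose(stats, total_attempts, max_tests, rng=random):
    """Use the explorer with probability P_explore, otherwise the exploiter."""
    p = explore_probability(total_attempts, max_tests)
    if rng.random() < p:
        return explorer_choice(stats)
    return exploiter_choice(stats)
```

Passing a seeded `random.Random` instance as `rng` makes a test run repeatable, which helps when a reproduction needs the same strategy sequence.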
[0126] 4.2 Meta-knowledge base adaptation and periodic reset: The meta-knowledge base stores Modbus / TCP test statistics for multiple other PLC models (similar to the PLC model currently being tested). This is used to initialize the historical statistics of the mutation strategy for the current PLC, shortening the cold start time of new equipment. When the test runs 500 times and there are 100 consecutive instances where the exception is not triggered, the counter reaches the preset threshold, triggering the periodic reset mechanism. This resets the historical statistics of each mutation strategy to the meta-knowledge base baseline value, clears the experience replay pool, reinitializes the seed queue, and continues the test process.
[0127] Phase 5: Causal Attribution and Vulnerability Identification
[0128] 5.1 Anomaly Triggering and Active Intervention Experiment: In a certain test, a combined mutation strategy was used to change the function code to 0x05 (write single coil function) and set the data field length to 2 (the data field length corresponding to this function code should be fixed at 1). Test cases were generated and sent to the target PLC. The PLC showed no response and triggered the anomaly record.
[0129] 5.2 Causal Effect Judgment: Two sets of comparative test cases were constructed to conduct an active intervention experiment: In the first set of comparative test cases, the function code was restored to its original value of 0x03, and the other fields remained unchanged (the data field length was still 2). After sending, the PLC still did not respond, indicating that the variation of the data field length field was the key factor in triggering the anomaly. In the second set of comparative test cases, the data field length was restored to 1, and the function code remained 0x05. After sending, the PLC responded normally (returning the corresponding anomaly code). It was finally confirmed that the combination variation of the function code and the data field length was the core reason for the PLC's lack of response.
[0130] 5.3 Reproduction Verification and Vulnerability Confirmation: Based on the above causal attribution results, a PCAP reproduction package is generated, containing test cases that trigger the anomaly, comparison test cases, and response results. Simultaneously, a standardized reproduction guide is generated, clearly defining the reproduction environment (PLC model, firmware version, network configuration), operating steps, and expected response. The security analyst manually executes the test according to the reproduction guide, successfully reproducing the vulnerability and confirming that this PLC model has a vulnerability in handling the combination of function codes and data field lengths, thus completing the vulnerability verification loop.
[0131] Therefore, by constructing a preprocessing mechanism that couples the protocol template library with semantic parsing of message fields, network messages are parsed accurately and field sequences are extracted. Compared with traditional fuzz testing methods without protocol awareness, the proportion of invalid test cases is significantly reduced. A strategy selection mechanism that integrates experience replay pool retrieval with the confidence upper bound algorithm, combined with dynamic multi-agent division of labor, matches the optimal mutation strategy to each scenario, overcoming the blindness of traditional strategy selection and significantly improving the efficiency of triggering vulnerability anomalies. The coupled design of active intervention experiments and causal attribution analysis clarifies the causal relationship between mutated fields and abnormal responses, providing accurate evidence for vulnerability root cause analysis. The initialization and dynamic update mechanism of the meta-knowledge base, combined with weighted adaptation based on statistical information from similar devices, addresses the cold start problem of new target devices and overcomes the efficiency bottleneck of verifying a new device from scratch. The linkage of periodic reset and meta-knowledge base calibration prevents strategy selection from getting stuck in local optima. Standardized generation of reproduction packages and reproduction guides balances the comprehensiveness and reproducibility of verification results, avoiding the difficulty in tracing results and the low success rate of manual reproduction found in traditional verification methods.
Simultaneously, multi-dimensional statistical feature construction and scenario vector representation enable accurate characterization of test scenarios, while the collaborative optimization of the meta-knowledge base and experience replay pool ensures the scientific nature of strategy iteration, significantly reducing testing cycles and manual verification workload. Furthermore, this method is compatible with mainstream industrial internet protocols and general industrial network interfaces, adapting to different models of industrial equipment, reducing engineering deployment costs, and providing reliable technical support for accurate verification, rapid repair, and batch device security testing of industrial internet device vulnerabilities. This drives the intelligent upgrade of industrial internet security verification from random trial and error to semantic-driven and experience-empowered approaches.
[0132] Those skilled in the art will understand that, for the sake of convenience and brevity, the specific working process and related explanations of the methods described above can be found in the corresponding processes in the foregoing system embodiments, and will not be repeated here.
[0133] Those skilled in the art will recognize that the modules and method steps of the various examples described in conjunction with the embodiments disclosed herein can be implemented in electronic hardware, computer software, or a combination of both. The programs corresponding to the software modules and method steps can be placed in random access memory (RAM), main memory, read-only memory (ROM), electrically programmable ROM, electrically erasable programmable ROM, registers, hard disks, removable disks, CD-ROMs, or any other form of storage medium known in the art. To clearly illustrate the interchangeability of electronic hardware and software, the components and steps of the various examples have been generally described in terms of functionality in the foregoing description. Whether these functions are implemented in electronic hardware or software depends on the specific application and design constraints of the technical solution. Those skilled in the art can use different methods to implement the described functions for each specific application, but such implementation should not be considered beyond the scope of the invention.
[0134] The terms “first”, “second”, etc., are used to distinguish similar objects, not to describe or indicate a specific order or sequence.
[0135] The term "comprising" or any other similar term is intended to cover non-exclusive inclusion, such that a process, method, article, or target device / apparatus that comprises a list of elements includes not only those elements but also other elements not expressly listed, or elements inherent to such process, method, article, or target device / apparatus.
[0136] The technical solution of the present invention has been described above with reference to the preferred embodiments shown in the accompanying drawings. However, it will be readily understood by those skilled in the art that the scope of protection of the present invention is obviously not limited to these specific embodiments. Without departing from the principles of the present invention, those skilled in the art can make equivalent changes or substitutions to the relevant technical features, and the technical solutions after such changes or substitutions will all fall within the scope of protection of the present invention.
Claims
1. An industrial internet vulnerability verification method, characterized in that, Includes the following steps: A protocol template library is constructed based on publicly available protocol specifications, and field sequences are obtained by parsing the original network packets based on the protocol template library. The statistical features of the field sequence are calculated as the current scene feature vector. The statistical features include field value distribution entropy, field variation historical response rate, and field type identifier. Retrieve historical scene features similar to the current scene feature vector from the experience replay pool, calculate the basic confidence upper bound value based on the historical statistical information of each mutation strategy, and weight and fuse the basic confidence upper bound value with the similar scene experience score to dynamically select the mutation strategy. The similar scene experience score is calculated based on the retrieved historical scene features. The seed packet is mutated according to the selected mutation strategy to generate test cases, which are then sent to the target device for verification, and the response result from the target device is obtained; wherein, the seed packet is selected from the original network packet; Update the historical statistics of each mutation strategy and the experience replay pool based on the response results of the target device; When the target device responds abnormally, active intervention experiments are performed on the fields that have mutated in the test case in sequence. The causal effect of each field is judged by comparing the responses, and a verification record containing causal attribution results is generated. Based on the verification records, a reproduction package and reproduction instructions are generated for manual verification and confirmation of the vulnerability.
2. The industrial internet vulnerability verification method according to claim 1, characterized in that, The statistical characteristics of the field sequence are calculated as the current scene feature vector, and the method is as follows: For each field, count all its values that have appeared in history, calculate the frequency of each value, and sum the products of each frequency and its logarithm after taking the negative value to obtain the field value distribution entropy. Divide the number of times an anomaly is triggered in a field by the sum of the number of mutations in that field and the preset smoothing factor to obtain the historical response rate of the field mutation. The field type identifier is encoded as a one-hot vector, and the field type includes enumeration type, numeric type, length type and calculation type; The field value distribution entropy, the field variation historical response rate, and the field type one-hot vector are concatenated to form the current scene feature vector.
3. The industrial internet vulnerability verification method according to claim 1, characterized in that, The method for retrieving historical scene features similar to the current scene feature vector from the experience replay pool, calculating the basic confidence upper bound based on the historical statistical information of each mutation strategy, and then weighted and fused the basic confidence upper bound with the similar scene experience score to dynamically select the mutation strategy is as follows: Calculate the basic confidence upper bound for each mutation strategy based on the historical number of attempts and successes. Retrieve historical scene features that are similar to the current scene feature vector from the experience replay pool, and calculate the similar scene experience score for each mutation strategy; The method for calculating the similar scene experience score is as follows: traverse each historical record in the experience replay pool, calculate the square of the Euclidean distance between the current scene feature and the scene feature of the record, convert the distance into a similarity weight using a Gaussian kernel function, sum the number of times the strategy was selected in the historical records according to the similarity weight, and then divide by the sum of the similarity weights of all historical records and the sum of the preset smoothing factor to obtain the similar scene experience score of the strategy. The comprehensive score for each mutation strategy is obtained by weighted summation of the basic confidence upper bound and the similar scenario experience score. The mutation strategy with the highest overall score is selected as the chosen mutation strategy.
4. The industrial internet vulnerability verification method according to claim 1, characterized in that updating the historical statistical information of each mutation strategy and the experience replay pool based on the response result of the target device comprises: determining whether the response result of the target device is abnormal; if so, incrementing both the number of successes and the number of attempts of the selected mutation strategy by one; otherwise, incrementing only the number of attempts of the selected mutation strategy by one; and storing the current scene feature vector, the selected mutation strategy, the response result of the target device, and the current strategy statistics in the experience replay pool.
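A minimal sketch of the update in claim 4, assuming the statistics are kept in a dictionary and the replay pool is a plain list; the record layout and all names are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class StrategyStats:
    attempts: int = 0
    successes: int = 0


def record_result(stats, pool, features, strategy, anomalous):
    """Update counters for the selected strategy and append a pool record."""
    entry = stats.setdefault(strategy, StrategyStats())
    entry.attempts += 1          # every test counts as an attempt
    if anomalous:
        entry.successes += 1     # only anomalous responses count as successes
    pool.append({
        "features": list(features),
        "strategy": strategy,
        "anomalous": anomalous,
        # Snapshot of the statistics at the time of the record.
        "stats": {name: (s.attempts, s.successes) for name, s in stats.items()},
    })


stats, pool = {}, []
record_result(stats, pool, [0.69, 0.25], "boundary", anomalous=True)
record_result(stats, pool, [0.69, 0.25], "boundary", anomalous=False)
```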
5. The industrial internet vulnerability verification method according to claim 1, characterized in that it further includes employing a multi-agent dynamic division-of-labor mechanism to obtain the selected mutation strategy, the multiple agents including an exploratory agent and an exploitative agent, wherein: the exploratory agent aims to maximize field coverage, and its selection criterion is to select the mutation strategy that maximizes the product of the square root of the reciprocal of the strategy's number of attempts and twice the square root of the logarithm of the total number of attempts; the exploitative agent aims to maximize the anomaly trigger rate, and its selection criterion is to select the mutation strategy that maximizes the ratio of the number of successes to the number of attempts; the agent is switched dynamically according to the current testing phase, the switching probability being calculated by raising the natural constant to an exponent equal to the negative decay coefficient multiplied by the ratio of the current total number of attempts to the expected maximum number of tests, and taking the larger of this value and a preset minimum exploration probability as the selection probability Pexplore of the exploratory agent; and the selected mutation strategy adopts the selection result of the exploratory agent with probability Pexplore and the selection result of the exploitative agent with probability 1−Pexplore.
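The two selection criteria and the switching probability of claim 5 can be sketched as below. The exploration bonus follows a literal reading of the claim wording; the default decay coefficient, minimum exploration probability, handling of untried strategies, and all names are assumptions for illustration.

```python
import math
import random


def explore_choice(attempts):
    """Pick the strategy maximizing sqrt(1/n) * 2 * sqrt(ln N)."""
    total = max(sum(attempts.values()), 1)

    def bonus(n):
        if n == 0:
            return float("inf")  # assumption: untried strategies go first
        return math.sqrt(1.0 / n) * 2.0 * math.sqrt(math.log(total))

    return max(attempts, key=lambda s: bonus(attempts[s]))


def exploit_choice(attempts, successes):
    """Pick the strategy with the highest success/attempt ratio."""
    return max(
        attempts,
        key=lambda s: successes[s] / attempts[s] if attempts[s] else 0.0,
    )


def explore_probability(total_attempts, max_tests, decay=3.0, p_min=0.05):
    """Pexplore = max(p_min, exp(-decay * total_attempts / max_tests))."""
    return max(p_min, math.exp(-decay * total_attempts / max_tests))


def select(attempts, successes, max_tests, rng=random):
    """Use the exploratory agent with probability Pexplore, else exploit."""
    p = explore_probability(sum(attempts.values()), max_tests)
    if rng.random() < p:
        return explore_choice(attempts)
    return exploit_choice(attempts, successes)


attempts = {"bitflip": 100, "boundary": 4}
successes = {"bitflip": 30, "boundary": 1}
picked = select(attempts, successes, max_tests=1000, rng=random.Random(0))
```

Because the exploration probability decays from 1 toward the preset floor, early tests favor rarely tried strategies and later tests favor strategies with the best observed trigger rate.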
6. The industrial internet vulnerability verification method according to claim 1, characterized in that the historical statistical information of each mutation strategy is stored in a meta-knowledge base, and the meta-knowledge base is initialized as follows: statistical information is collected from multiple historical test tasks, each task including the protocol type, the target device model, and the number of attempts and number of successes of each mutation strategy; the collected statistical information is stored as the meta-knowledge base, each record containing the protocol type, the target device model, and the corresponding strategy statistics tuple; and when a new target device is encountered, a specified number of target devices that have the same protocol type as the current target device and the most similar models are retrieved from the meta-knowledge base, and the mutation strategy statistics of these target devices are weighted and averaged to obtain the initial values of the historical statistical information of each mutation strategy for the current target device.
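A minimal sketch of the warm start in claim 6. The claim does not define how model similarity is measured or how the averaging weights are chosen; here string similarity from the standard library's `difflib.SequenceMatcher` serves as both, purely as an assumption. The record layout and the device names are hypothetical.

```python
from difflib import SequenceMatcher


def warm_start(meta_kb, protocol, model, k=3):
    """Initial (attempts, successes) per strategy for a new target device."""
    candidates = [
        (SequenceMatcher(None, model, rec["model"]).ratio(), rec)
        for rec in meta_kb
        if rec["protocol"] == protocol
    ]
    # Keep the k records whose model strings are most similar.
    candidates.sort(key=lambda pair: pair[0], reverse=True)
    top = candidates[:k]
    weight_sum = sum(w for w, _ in top)
    if not top or weight_sum == 0:
        return {}
    init = {}
    for w, rec in top:
        for strategy, (att, succ) in rec["stats"].items():
            a, s = init.get(strategy, (0.0, 0.0))
            init[strategy] = (a + w * att / weight_sum,
                              s + w * succ / weight_sum)
    return init


meta_kb = [
    {"protocol": "Modbus/TCP", "model": "PLC-1200",
     "stats": {"bitflip": (10, 2)}},
    {"protocol": "S7comm", "model": "PLC-1200",
     "stats": {"bitflip": (50, 1)}},
]
initial = warm_start(meta_kb, "Modbus/TCP", "PLC-1200", k=1)
```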
7. The industrial internet vulnerability verification method according to claim 1, characterized in that, when the target device responds abnormally, active intervention experiments are performed sequentially on the fields that were mutated in the test case, the causal effect of each field is determined by comparing the responses, and a verification record containing the causal attribution results is generated, including the following steps: identifying the test case that triggered the anomaly and its corresponding seed message; identifying the set of fields mutated in the test case by comparing the test case with the seed message; for each mutated field, constructing a comparison test case in which that field is restored to its original value in the seed message while all other fields remain the same as in the test case; and sending each comparison test case to the target device in sequence and observing whether an anomaly is triggered, wherein if the comparison test case with the field restored to its original value triggers no anomaly, the mutation of that field is determined to be a key cause of the anomaly, and if it still triggers the anomaly, the mutation of that field is determined not to be a key cause of the anomaly.
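The intervention procedure of claim 7 can be sketched as follows, assuming messages are represented as dictionaries with the same field names in seed and test case, and that `send` is a caller-supplied function returning True when the device responds abnormally. Both the representation and the fake device used in the example are hypothetical.

```python
def attribute_causes(seed, test_case, send):
    """Return {field: is_key_cause} for every field mutated in test_case."""
    mutated = [name for name in seed if test_case.get(name) != seed[name]]
    attribution = {}
    for name in mutated:
        # Comparison case: restore only this field to its seed value.
        control = dict(test_case)
        control[name] = seed[name]
        still_anomalous = send(control)
        # If the anomaly disappears, this field's mutation is a key cause.
        attribution[name] = not still_anomalous
    return attribution


def fake_device(message):
    """Stand-in for a real target: fails on oversized length values."""
    return message["length"] > 255


seed = {"func": 3, "length": 6, "addr": 0}
test_case = {"func": 3, "length": 1000, "addr": 99}
result = attribute_causes(seed, test_case, fake_device)
```

In this example only restoring `length` removes the anomaly, so `length` is attributed as a key cause and `addr` is not.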
8. The industrial internet vulnerability verification method according to claim 1, characterized in that a reproduction package and reproduction instructions are generated based on the verification records, including the following steps: packaging the test case messages that triggered the anomaly and their session context into a PCAP file in chronological order to form the reproduction package; and generating textual instructions that include the environment requirements, reproduction steps, expected responses, and causal attribution results to serve as the reproduction instructions.
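The packaging step of claim 8 can be illustrated with a minimal sketch that writes the classic PCAP layout (24-byte global header followed by 16-byte record headers) using only the standard library. It stores raw message payloads under the user-defined link type 147 as a simplifying assumption; a real reproduction package would more likely contain the captured frames with their actual link type. The function name and sample payloads are hypothetical.

```python
import struct

LINKTYPE_USER0 = 147  # user-defined link type, assumed for raw payloads


def build_pcap(messages, linktype=LINKTYPE_USER0):
    """Serialize (timestamp_seconds, payload_bytes) pairs in time order."""
    # Global header: magic, version 2.4, thiszone, sigfigs, snaplen, linktype.
    out = bytearray(
        struct.pack("<IHHiIII", 0xA1B2C3D4, 2, 4, 0, 0, 65535, linktype)
    )
    for ts, payload in sorted(messages, key=lambda m: m[0]):
        sec = int(ts)
        usec = int((ts - sec) * 1_000_000)
        # Record header: ts_sec, ts_usec, captured length, original length.
        out += struct.pack("<IIII", sec, usec, len(payload), len(payload))
        out += payload
    return bytes(out)


# Hypothetical session: a context message followed by the triggering case,
# supplied out of order to show the chronological sort.
messages = [(2.0, b"\x00\x01\xff\xff"), (1.5, b"\x00\x01")]
pcap_bytes = build_pcap(messages)
```

The resulting bytes could be written to a `.pcap` file and stored alongside the textual reproduction instructions.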
9. The industrial internet vulnerability verification method according to claim 1, characterized in that the publicly available protocol specifications include mainstream protocols such as Modbus/TCP, S7comm, EtherNet/IP, PROFINET, and OPC UA, while other protocols are handled using a general binary fuzzing approach.
10. The industrial internet vulnerability verification method according to claim 1, characterized in that it further includes a periodic reset mechanism: after each update of the historical statistical information of each mutation strategy based on the response result of the target device, if no anomaly has been triggered for a preset number of consecutive tests, the historical statistical information of each mutation strategy is reset to the initial values obtained from the meta-knowledge base and the experience replay pool is cleared.
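The reset mechanism of claim 10 can be sketched as a counter of consecutive non-anomalous tests. The state layout, the name `patience` for the preset number, and the sample values are assumptions for illustration.

```python
import copy


def apply_reset(anomalous, state, initial_stats, patience):
    """Call after each statistics update; returns True when a reset occurs."""
    state["misses"] = 0 if anomalous else state["misses"] + 1
    if state["misses"] >= patience:
        # Restore the warm-start values and drop accumulated experience.
        state["stats"] = copy.deepcopy(initial_stats)
        state["pool"].clear()
        state["misses"] = 0
        return True
    return False


initial_stats = {"bitflip": (10.0, 2.0)}
state = {
    "stats": {"bitflip": (40.0, 2.0)},
    "pool": [{"strategy": "bitflip"}],
    "misses": 0,
}
# Three consecutive tests without an anomaly, with patience set to 3.
outcomes = [apply_reset(False, state, initial_stats, patience=3)
            for _ in range(3)]
```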