A malware automatic detection method, system, device and medium

By statically analyzing the source code of industrial internet malware and constructing a hardware resource access dependency graph, resource conflicts in concurrent scenarios are identified, overcoming the shortcomings of existing technologies in detecting resource conflicts and achieving efficient detection of concurrent hardware resource conflicts.

CN122197014A · Pending · Publication Date: 2026-06-12 · JIUQUAN VOCATIONAL & TECHNICAL UNIVERSITY

Patent Information

Authority / Receiving Office
CN · China
Patent Type
Applications(China)
Current Assignee / Owner
JIUQUAN VOCATIONAL & TECHNICAL UNIVERSITY
Filing Date
2026-03-17
Publication Date
2026-06-12

AI Technical Summary

Technical Problem

Existing technologies are insufficient to effectively detect resource conflicts caused by concurrent access to hardware resources by malicious software in the Industrial Internet, which threatens the operational stability and production safety of industrial control equipment.

Method used

By performing static analysis on the source code of industrial internet malware, an abstract syntax tree and control flow graph are constructed to determine the contextual characteristics of hardware resource access. In combination with the resource configuration characteristics of the hardware platform, an access dependency graph is constructed to identify resource conflict instances and defective call chains in concurrent scenarios, and a concurrent conflict verification report is generated.

Benefits of technology

It enables accurate detection of resource conflict defects caused by concurrent access of malicious software in the industrial internet, improves the reliability and interpretability of detection results, provides traceable defect causes, and reduces false positives and false negatives.

✦ Generated by Eureka AI based on patent content.

Figure CN122197014A_ABST

Abstract

The application provides an automatic malware detection method, system, device, and medium, and relates to the technical field of automatic detection. The method performs static program analysis on the source code of industrial internet malware to obtain an abstract syntax tree and a control flow graph, and determines the context features of the source code's hardware resource access. It then performs hardware resource flow access analysis on the source code to obtain an access dependency graph of the hardware resources. Based on the context features and the access dependency graph, it identifies the resource conflict instances and defect call links in the source code that violate concurrency security rules in a concurrent scenario. Concurrent conflict verification is performed on the source code using these resource conflict instances and defect call links to obtain a concurrent conflict verification result, from which a defect detection report on the malware's concurrent conflicts over hardware resources is generated. The application can thereby detect resource conflict defects caused by concurrent access to hardware resources by industrial internet malware.

Description

Technical Field

[0001] This application relates to the field of automatic detection technology, and more specifically, to a method, system, device, and medium for automatic detection of malicious software.

Background Technology

[0002] With the deep integration of the Industrial Internet and industrial control systems, the interaction between software and underlying hardware resources is becoming increasingly close. As a key means to ensure the security and reliability of industrial software, automatic detection technology undertakes important functions such as code defect screening, abnormal behavior identification, and security risk warning. Automatic detection technology can perform security analysis and hidden danger investigation of industrial software in an automated manner, effectively resist attacks and damage from malicious software, provide solid technical support for the stable scheduling of industrial hardware resources and the safe operation of industrial scenarios, and is an important support for maintaining the security ecosystem of the Industrial Internet.

[0003] In the current field of industrial software security testing, most detection technologies focus on general software vulnerabilities and functional logic defects, and they have significant limitations in detecting industrial internet malware. Traditional methods adapt poorly to concurrent execution scenarios such as multi-threading and multi-processing, and cannot effectively combine hardware resource characteristics to analyze software access behavior. They also lack the ability to accurately identify resource contention and illegal occupation caused by concurrent access, so related security defects are difficult to detect in a timely manner. This seriously threatens the operational stability and production safety of industrial control equipment. Therefore, how to detect resource conflict defects caused by concurrent access to hardware resources by industrial internet malware has become a difficult problem for the industry.

Summary of the Invention

[0004] This application provides a method, system, device, and medium for automatic detection of malicious software, which can detect resource conflict defects caused by concurrent access to hardware resources by malicious software in the industrial Internet.

[0005] In a first aspect, this application provides an automatic detection method for malicious software, the automatic detection method comprising the following steps: obtaining the source code of industrial internet malware; performing static analysis on the source code to obtain an abstract syntax tree and a control flow graph, and determining the contextual features of the source code's access to hardware resources based on the abstract syntax tree and the control flow graph; performing hardware resource flow access analysis on the source code using the resource configuration characteristics of the hardware platform to obtain an access dependency graph of the source code on hardware resources; identifying, based on the contextual features and the access dependency graph, resource conflict instances and defective call chains in the source code that violate concurrency security rules in concurrent scenarios; and performing concurrent conflict verification on the source code using the identified resource conflict instances and defective call chains to obtain concurrent conflict verification results, and then generating, based on the concurrent conflict verification results, a defect detection report on the concurrent conflicts of the industrial internet malware over hardware resources.

[0006] In this embodiment, performing static analysis on the source code to obtain the abstract syntax tree and control flow graph specifically includes: performing lexical and syntactic analysis on the source code to generate an abstract syntax tree for the source code; and performing control flow analysis based on the abstract syntax tree to construct the control flow graph of the source code.

[0007] In this embodiment, determining the contextual features of the source code's access to hardware resources based on the abstract syntax tree and the control flow graph specifically includes: identifying the hardware resource access points in the source code through the abstract syntax tree; determining the unique hardware resource identifier and the operation type of each hardware resource access point; analyzing, based on the control flow graph, the attributes of the function to which each hardware resource access point belongs and its position in the control flow, and thereby determining the execution context attributes of each hardware resource access point; and generating the contextual features of the source code's access to hardware resources from all of the unique hardware resource identifiers, operation types, and execution context attributes.

[0008] In this embodiment, performing hardware resource flow access analysis on the source code based on the resource configuration characteristics of the hardware platform to obtain the access dependency graph of the source code on hardware resources specifically includes: determining the resource configuration characteristics of the hardware platform; determining the access constraint rules of the source code on hardware resources based on the resource configuration characteristics of the hardware platform; performing resource flow access dependency analysis on the source code based on the access constraint rules to obtain the data dependency and control dependency relationships between the hardware resource access points in the source code; and constructing the access dependency graph of the source code on hardware resources from the data dependency and control dependency relationships between the hardware resource access points.

[0009] In this embodiment, identifying resource conflict instances and defective call chains that violate concurrency security rules in concurrent scenarios based on the context features and the access dependency graph specifically includes: loading the predefined concurrency security rule library; performing concurrency analysis on the execution units in the source code based on the context features to obtain the set of hardware resource access points that may execute concurrently; matching the hardware resource access point set against the concurrency security rule library to identify resource conflict instances in the source code that violate concurrency security rules in a concurrent scenario; and backtracking paths from the identified resource conflict instances over the access dependency graph to obtain the defective call chains of the source code that violate the concurrency security rules in a concurrent scenario.

[0010] In this embodiment, performing concurrent conflict verification on the source code using the identified resource conflict instances and defective call chains to obtain the concurrent conflict verification result of the source code specifically includes: generating concurrent test stub functions from the identified resource conflict instances and defective call chains; and performing conflict verification on the source code based on the concurrent test stub functions to obtain the concurrent conflict verification result of the source code.

[0011] In this embodiment, generating a defect detection report on the concurrent conflicts of industrial internet malware over hardware resources based on the concurrent conflict verification results specifically includes: extracting, from the concurrent conflict verification results, the defect impact level of the industrial internet malware's concurrent conflicts over hardware resources; and generating, based on the defect impact level, the defect detection report on the concurrent conflicts over hardware resources caused by the industrial internet malware.

[0012] In this embodiment, the resource conflict instance is an instance of hardware resource access conflict caused by violation of concurrency security rules.

[0013] In this embodiment, the defect call chain is the function call path from the program entry point to the resource conflict instance.

[0014] In a second aspect, this application provides an automatic malware detection system for performing the automatic malware detection method, the automatic detection system comprising: a code acquisition module, used to acquire the source code of industrial internet malware; a program analysis module, used to perform static analysis on the source code to obtain the abstract syntax tree and control flow graph of the source code, and to determine the context features of the source code's access to hardware resources based on the abstract syntax tree and the control flow graph; a resource access analysis module, used to perform hardware resource flow access analysis on the source code using the resource configuration characteristics of the hardware platform, and to obtain the access dependency graph of the source code on hardware resources; a concurrency defect identification module, used to identify, based on the context features and the access dependency graph, resource conflict instances and defective call chains in the source code that violate concurrency security rules in a concurrent scenario; and a detection report generation module, used to perform concurrent conflict verification on the source code using the identified resource conflict instances and defective call chains, to obtain the concurrent conflict verification results of the source code, and then to generate, based on the concurrent conflict verification results, a defect detection report on the concurrent conflicts of the industrial internet malware over hardware resources.

[0015] In a third aspect, this application provides a computer device including a memory and a processor, the memory storing code, and the processor being configured to acquire the code and execute the aforementioned automatic malware detection method.

[0016] In a fourth aspect, this application provides a computer-readable storage medium storing a computer program that, when executed by a processor, implements the aforementioned automatic malware detection method.

[0017] The technical solutions provided by the embodiments disclosed in this application have the following beneficial effects: The process involves: acquiring the source code of industrial internet malware; performing static analysis on the source code to obtain an abstract syntax tree and control flow graph; determining the contextual features of the source code's access to hardware resources based on the abstract syntax tree and control flow graph; performing hardware resource flow access analysis on the source code using the resource configuration features of the hardware platform to obtain an access dependency graph of the source code to hardware resources; identifying resource conflict instances and defective call chains in the source code that violate concurrency security rules under concurrent scenarios based on the contextual features and access dependency graph; verifying the source code for concurrency conflicts using the identified resource conflict instances and defective call chains to obtain the concurrency conflict verification results; and then generating a defect detection report on the concurrency conflicts of industrial internet malware to hardware resources based on the concurrency conflict verification results.

[0018] Therefore, this application can generate a defect detection report on concurrent conflicts over hardware resources caused by industrial internet malware based on the concurrent conflict verification results. First, static analysis of the malware's source code constructs an abstract syntax tree and a control flow graph, from which the contextual features of hardware resource access are extracted. This establishes a static semantic association between code structure, execution logic, and hardware access behavior, so that the system can understand the context of each resource operation and avoid missed detections caused by incomplete structural analysis. Second, hardware resource flow access analysis is performed in conjunction with the resource configuration characteristics of the hardware platform to construct a hardware resource access dependency graph. This step compensates for the detachment of traditional static analysis from the hardware environment: the detection process reflects key characteristics of the specific hardware, such as access timing, priority, and interrupt nesting, and resource dependencies are identified under actual execution semantics, which reduces false positives and false negatives. Third, based on the contextual features and the access dependency graph, the system identifies resource conflict instances that violate concurrency security rules in concurrent scenarios, together with their corresponding defective call chains. This step infers potential conflict relationships automatically from resource access behavior and exposes typical concurrency defects, such as race conditions and non-atomic accesses, in advance, addressing the difficulty that manual review has in covering complex concurrent interactions. Fourth, the system verifies concurrency conflicts in the source code using the identified resource conflict instances and defective call chains, which confirms suspected problems at the semantic level, filters out false positives, and improves the credibility of the detection results. Finally, based on the verification results, a hardware resource concurrency conflict defect report is generated, providing information security personnel with traceable and explainable defect causes and allowing the defect information to be used directly in the software security testing process.

[0019] In summary, the technical solution adopted in this application can detect resource conflict defects caused by concurrent access to hardware resources by malicious software in the industrial Internet.

Attached Figure Description

[0020] Figure 1 is an exemplary flowchart of the method of the present invention; Figure 2 is a flowchart illustrating the process of determining the access dependency graph according to the present invention; Figure 3 is a flowchart illustrating the process of determining resource conflict instances and defective call chains according to the present invention; Figure 4 is a module structure diagram of an automatic malware detection system according to the present invention; Figure 5 is a schematic diagram of the computer device for automatically detecting malware according to the present invention.

Detailed Implementation

[0021] The technical solutions of the embodiments of this application will be clearly and completely described below with reference to the accompanying drawings. Obviously, the described embodiments are only some embodiments of this application, and not all embodiments. Based on the embodiments of this application, all other embodiments obtained by those skilled in the art without creative effort are within the scope of protection of this application.

[0022] This application provides an automatic malware detection method, system, device, and medium. Its core is to obtain the source code of industrial internet malware; perform static analysis on the source code to obtain an abstract syntax tree and control flow graph; determine the contextual features of the source code's access to hardware resources based on the abstract syntax tree and control flow graph; perform hardware resource flow access analysis on the source code using the resource configuration features of the hardware platform to obtain an access dependency graph of the source code to hardware resources; identify resource conflict instances and defective call chains in the source code that violate concurrency security rules under concurrent scenarios based on the contextual features and access dependency graph; perform concurrent conflict verification on the source code using the identified resource conflict instances and defective call chains to obtain concurrent conflict verification results; and then generate a defect detection report of industrial internet malware's concurrent conflicts with hardware resources based on the concurrent conflict verification results.

[0023] Referring to Figure 1, which is an exemplary flowchart of an automatic malware detection method according to this embodiment of the present application, the automatic detection method includes the following steps: In step S1, the source code of industrial internet malware is obtained.

[0024] In practice, the source code of industrial internet malware is obtained through reverse engineering.

[0025] In step S2, static analysis is performed on the source code to obtain the abstract syntax tree and control flow graph of the source code, and the context features of the source code's access to hardware resources are determined based on the abstract syntax tree and the control flow graph.

[0026] In this embodiment, performing static analysis on the source code to obtain the abstract syntax tree and control flow graph can be achieved through the following steps: performing lexical and syntactic analysis on the source code to generate an abstract syntax tree for the source code; and performing control flow analysis based on the abstract syntax tree to construct the control flow graph of the source code.

[0027] It should be noted that the abstract syntax tree described in this application represents a tree-like representation of the source code syntax structure, used to reflect the logical organization of the program and the hierarchical relationship of code elements; the control flow graph represents a directed graph describing the control flow path during the execution of the source code, used to reveal the execution order and branching conditions between code blocks.

[0028] In specific implementation, the lexical and syntactic analysis of the source code to generate a program syntax structure representation can be achieved in the following way: The source code is used as input to an integrated Clang compiler front-end library to initialize a Clang compilation instance. In this instance, all compilation parameters, including preprocessor macros and header file search paths, are accurately configured to completely simulate the actual build environment of the target code. Subsequently, the Clang lexical analyzer is called to scan the source code character stream and generate a token sequence. Then, the syntactic analyzer, according to the syntax rules of the programming language, parses the token sequence into an initial Clang abstract syntax tree with complete type and source code location information, and uses the resulting Clang abstract syntax tree as the abstract syntax tree of the source code.

[0029] In specific implementation, performing control flow analysis based on the abstract syntax tree to construct the control flow graph of the source code can be achieved in the following way: traverse each function definition node in the abstract syntax tree. For each function, analyze all statements within its function body, and divide the continuous code sequence into basic blocks according to the statement type and control transfer keywords. Each basic block is a maximal single-entry, single-exit sequence of consecutive instructions. Subsequently, analyze the jump relationships between basic blocks: for conditional statements, create directed edges from the current block to the target blocks of the true and false branches; for loop statements, create back edges from the loop body to the condition judgment block; for unconditional jumps or function returns, create the corresponding edges. This process constructs a graph structure with basic blocks as nodes and control flow transfers as directed edges, and this graph structure serves as the control flow graph of the source code.
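As a non-limiting illustration, the block splitting and edge creation described above might be sketched as follows. The toy statement format (tuples tagged "stmt", "if", "while", "return") and the names Block and CFGBuilder are assumptions made for this sketch and do not come from the application; a real implementation would walk the Clang abstract syntax tree instead, and this simplified builder may leave empty join blocks rather than strictly maximal ones.

```python
from dataclasses import dataclass, field

@dataclass
class Block:
    id: int
    stmts: list = field(default_factory=list)
    succs: list = field(default_factory=list)   # (target block id, edge label)

class CFGBuilder:
    """Builds a control flow graph for one function body (toy statement format)."""

    def __init__(self):
        self.blocks = []
        self.exit = None

    def new_block(self):
        block = Block(len(self.blocks))
        self.blocks.append(block)
        return block

    def edge(self, src, dst, label):
        src.succs.append((dst.id, label))

    def build(self, body):
        entry = self.new_block()
        self.exit = self.new_block()
        last = self._seq(body, entry)
        if last is not None:                      # function falls off the end
            self.edge(last, self.exit, "fallthrough")
        return entry

    def _seq(self, stmts, cur):
        for s in stmts:
            if cur is None:                       # code after a return is unreachable
                break
            kind = s[0]
            if kind == "stmt":
                cur.stmts.append(s[1])
            elif kind == "return":
                cur.stmts.append("return")
                self.edge(cur, self.exit, "return")
                cur = None
            elif kind == "if":
                _, cond, then_body, else_body = s
                cur.stmts.append(f"if {cond}")
                then_b, else_b, join = self.new_block(), self.new_block(), self.new_block()
                self.edge(cur, then_b, "true")
                self.edge(cur, else_b, "false")
                for start, branch in ((then_b, then_body), (else_b, else_body)):
                    end = self._seq(branch, start)
                    if end is not None:
                        self.edge(end, join, "fallthrough")
                cur = join
            elif kind == "while":
                _, cond, loop_body = s
                head, body_b, after = self.new_block(), self.new_block(), self.new_block()
                head.stmts.append(f"while {cond}")
                self.edge(cur, head, "fallthrough")
                self.edge(head, body_b, "true")
                self.edge(head, after, "false")
                end = self._seq(loop_body, body_b)
                if end is not None:               # back edge to the condition block
                    self.edge(end, head, "back")
                cur = after
        return cur

program = [
    ("stmt", "init()"),
    ("while", "running", [
        ("if", "ready", [("stmt", "write_reg()")], [("stmt", "wait()")]),
    ]),
    ("return",),
]
builder = CFGBuilder()
entry = builder.build(program)
back_edges = [(b.id, t) for b in builder.blocks for t, label in b.succs if label == "back"]
```

In this sketch the single back edge runs from the join block after the conditional to the loop's condition block, which is the kind of edge later used to recognize natural loops.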

[0030] In this embodiment, determining the contextual features of the source code's access to hardware resources based on the abstract syntax tree and the control flow graph can be achieved through the following steps: identifying the hardware resource access points in the source code through the abstract syntax tree; determining the unique hardware resource identifier and the operation type of each hardware resource access point; analyzing, based on the control flow graph, the attributes of the function to which each hardware resource access point belongs and its position in the control flow, and thereby determining the execution context attributes of each hardware resource access point; and generating the contextual features of the source code's access to hardware resources from all of the unique hardware resource identifiers, operation types, and execution context attributes.

[0031] It should be noted that, in this application, the hardware resource access point refers to the code location in the source code that directly performs read or write operations on hardware resources; the execution context attribute refers to the characteristics of the execution environment in which the hardware resource access point is located; and the context feature refers to the identification features constituted by the execution context of the source code's concurrent access to hardware resources.

[0032] In specific implementation, identifying hardware resource access points in the source code through the abstract syntax tree can be achieved in the following way: First, a hardware resource access pattern library is preset. This library defines common hardware access syntax patterns in embedded systems, including direct pointer dereferences to memory-mapped I/O addresses, access to peripheral register structure members, and calls to specific inline assembly instructions. Second, the abstract syntax tree is traversed using a depth-first search algorithm, and each syntax node is matched against the hardware resource access pattern library. For example, for a pointer dereference node, it is checked whether the address constant it points to falls within the range defined by the pre-configured hardware resource address mapping table; for a structure member access node, it is checked whether its structure type name exists in the known list of peripheral register structure types; for an inline assembly node, its assembly template string is parsed to check whether it contains explicit references to specific hardware registers. When a node matches any pattern, it is marked as a hardware resource access point, and its position in the abstract syntax tree is recorded.
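As a non-limiting illustration, the depth-first pattern matching described above might be sketched as follows. The Node class, the address range in MMIO_RANGES, the structure type names, and the register names in the assembly pattern are all hypothetical values chosen for this sketch; they stand in for the preset pattern library and the Clang syntax nodes, which the application does not specify in detail.

```python
import re
from dataclasses import dataclass, field

@dataclass
class Node:
    kind: str                                     # "deref", "member", "asm", or other
    attrs: dict = field(default_factory=dict)
    children: list = field(default_factory=list)

# Hypothetical pattern library (assumed values, not taken from the application).
MMIO_RANGES = [(0x4000_0000, 0x5FFF_FFFF)]        # address mapping table
PERIPHERAL_STRUCTS = {"GPIO_TypeDef", "USART_TypeDef"}
ASM_REGISTER_PATTERN = re.compile(r"\b(PRIMASK|BASEPRI|FAULTMASK)\b")

def is_access_point(node):
    """Return True if the node matches any hardware access pattern."""
    if node.kind == "deref":
        addr = node.attrs.get("address")
        return addr is not None and any(lo <= addr <= hi for lo, hi in MMIO_RANGES)
    if node.kind == "member":
        return node.attrs.get("struct_type") in PERIPHERAL_STRUCTS
    if node.kind == "asm":
        return bool(ASM_REGISTER_PATTERN.search(node.attrs.get("template", "")))
    return False

def find_access_points(root):
    """Pre-order depth-first traversal; records each match with its tree position."""
    found = []
    stack = [(root, ())]
    while stack:
        node, path = stack.pop()
        if is_access_point(node):
            found.append((path, node))
        # Push children in reverse so they are visited left to right.
        for i in reversed(range(len(node.children))):
            stack.append((node.children[i], path + (i,)))
    return found

tree = Node("function", {"name": "toggle"}, [
    Node("deref", {"address": 0x4002_0014}),
    Node("deref", {"address": 0x2000_0000}),      # outside the MMIO range
    Node("block", {}, [
        Node("member", {"struct_type": "GPIO_TypeDef", "member": "ODR"}),
        Node("asm", {"template": "msr PRIMASK, r0"}),
    ]),
])
points = find_access_points(tree)
```

Here the position of a node is recorded as the tuple of child indices leading to it from the root, which is one simple way to keep the "position in the abstract syntax tree" mentioned above.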

[0033] In practical implementation, the unique hardware resource identifier and operation type of each hardware resource access point can be determined in the following ways: For hardware resource access points accessed through memory-mapped I/O addresses, the specific numerical address is parsed from the pointer constant expression and normalized into a standard format string as the unique hardware resource identifier of the hardware resource access point; for hardware resource access points accessed through peripheral structures, the structure instance variable name and member name are combined to generate a dot-separated string (e.g., "GPIOA.ODR") as the unique hardware resource identifier of the hardware resource access point. Simultaneously, the operation type is determined by the role of the hardware resource access point in the assignment expression: if the hardware resource access point is the left operand of the assignment operator, the operation type is "write"; if it is the right operand or an independent expression, the operation type is "read"; for hardware resource access points in inline assembly, the operation type is obtained by parsing the opcode of the assembly instruction. Ultimately, each hardware resource access point is associated with a unique hardware resource identifier and an operation type (read/write).
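As a non-limiting illustration, the identifier normalization and operation type rules above might be sketched as follows. The dictionary layout of an access point, the zero-padded hexadecimal format, and the ARM-style opcode table are assumptions made for this sketch only.

```python
# Hypothetical opcode table for inline assembly (ARM-style mnemonics, assumed).
ASM_OPCODES = {"ldr": "read", "mrs": "read", "str": "write", "msr": "write"}

def resource_identifier(access):
    """Normalize an access point into a unique hardware resource identifier."""
    if access["via"] == "mmio":
        # Numerical address rendered in one fixed format, e.g. 0x40020014.
        return f"0x{access['address']:08X}"
    if access["via"] == "struct":
        # Instance name and member name joined with a dot, e.g. GPIOA.ODR.
        return f"{access['instance']}.{access['member']}"
    raise ValueError(f"no identifier rule for {access['via']!r}")

def operation_type(access):
    """Classify the access as 'read' or 'write'."""
    if access["via"] == "asm":
        opcode = access["instruction"].split()[0].lower()
        return ASM_OPCODES[opcode]
    # Left operand of an assignment is a write; anything else is a read.
    return "write" if access["role"] == "lhs" else "read"

examples = [
    {"via": "mmio", "address": 0x40020014, "role": "lhs"},
    {"via": "struct", "instance": "GPIOA", "member": "ODR", "role": "rhs"},
]
tagged = [(resource_identifier(a), operation_type(a)) for a in examples]
```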

[0034] In specific implementation, analyzing the function attributes of each hardware resource access point and its position in the control flow based on the control flow graph, and thereby determining the execution context attributes of each hardware resource access point, can be achieved in the following way: First, by querying the metadata of the control flow graph, the function to which each hardware resource access point belongs is located. The function attributes are obtained by parsing the declaration node of the function in the abstract syntax tree: checking whether the function is modified by compiler-specific attributes such as __interrupt, __attribute__((isr)), etc., to determine if it is an interrupt service routine. Second, an iterative data flow analysis method is used to construct a dominator tree for the function to which each hardware resource access point belongs. Based on the dominator tree and the back edges in the control flow graph, all natural loops and their corresponding loop head nodes are identified. Whether a hardware resource access point is located within a loop body is determined as follows: in the dominator tree, all ancestor nodes of the basic block to which the hardware resource access point belongs are checked. If they include the head node of an identified loop, then the hardware resource access point is located within that loop body. Finally, a worklist algorithm is used to analyze the impact of synchronization primitives and determine critical section attributes: For each basic block in the function to which each hardware resource access point belongs, an abstract set of synchronization states (e.g., "interrupt disabled", "lock A held") is defined and maintained. Starting from the basic block at the function entry point, the state changes are calculated according to the semantics of the statements within the basic block (e.g., when encountering taskENTER_CRITICAL(), the "interrupt disabled" state is added to the set). The exit state is then propagated to the entry state of each successor basic block along the control flow edges (this can be achieved through the union operation of sets). This process is iterated until the state sets of all basic blocks no longer change. At this point, the entry state set of the basic block in which the hardware resource access point is located describes the synchronization protections active at that point, thereby determining whether the access point is in a critical section and what kind of synchronization primitive protects it. Based on the above analysis results, a structured execution context object is generated for each hardware resource access point, and the obtained execution context objects are used as the execution context attributes of the corresponding hardware resource access point.
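As a non-limiting illustration, the worklist propagation of synchronization states might be sketched as follows. The graph encoding (a dictionary from block name to successor list) and the per-block effect lists are assumptions made for this sketch. It follows the text in merging states with set union, which yields the states that may be active on some path into a block; replacing the union with an intersection would instead yield the states guaranteed on every path, which is the stricter choice if a block should only count as protected when all incoming paths hold the protection.

```python
from collections import deque

def sync_states(cfg, effects):
    """Forward data flow analysis over a control flow graph.

    cfg:     block name -> list of successor block names
    effects: block name -> list of ("add" | "remove", state) applied in order
    Returns the entry state set of every block once a fixed point is reached.
    """
    in_state = {b: frozenset() for b in cfg}
    work = deque(cfg)              # every block is processed at least once
    queued = set(cfg)
    while work:
        block = work.popleft()
        queued.discard(block)
        out = set(in_state[block])
        for action, state in effects.get(block, []):
            if action == "add":
                out.add(state)
            else:
                out.discard(state)
        for succ in cfg[block]:
            merged = in_state[succ] | out          # union merge, as in the text
            if merged != in_state[succ]:
                in_state[succ] = merged
                if succ not in queued:             # re-examine changed successor
                    work.append(succ)
                    queued.add(succ)
    return in_state

# B0 enters a critical section; only the B1 branch leaves it before B3.
cfg = {"B0": ["B1", "B2"], "B1": ["B3"], "B2": ["B3"], "B3": []}
effects = {
    "B0": [("add", "interrupt disabled")],      # e.g. taskENTER_CRITICAL()
    "B1": [("remove", "interrupt disabled")],   # e.g. taskEXIT_CRITICAL()
}
states = sync_states(cfg, effects)
```

In this example the entry state of B3 contains "interrupt disabled" because the path through B2 still holds it, even though the path through B1 does not, which shows how the choice of merge operator affects the critical section attribute.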

[0035] In specific implementation, the generation of the context features of the source code accessing hardware resources through all the unique identifiers of the hardware resources, the operation type, and the execution context attributes can be achieved in the following way: traverse all hardware resource access points, and for each hardware resource access point, merge its extracted unique identifiers of hardware resources, operation type, and execution context attributes into a complete, structured feature record. Finally, arrange and store the feature records corresponding to all hardware resource access points according to their order of appearance in the source code, and use the features obtained after arrangement and storage as the context features of the source code accessing hardware resources.
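As a non-limiting illustration, merging the three kinds of information into ordered feature records might be sketched as follows. The field names of FeatureRecord and the use of line and column numbers as the ordering key are assumptions made for this sketch.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class FeatureRecord:
    line: int                 # source location of the access point
    column: int
    resource_id: str          # unique hardware resource identifier
    operation: str            # "read" or "write"
    in_isr: bool              # execution context attributes
    in_loop: bool
    sync_states: tuple

def build_context_features(access_points):
    """Merge per-access-point data into records ordered by source position."""
    records = [FeatureRecord(**ap) for ap in access_points]
    return sorted(records, key=lambda r: (r.line, r.column))

access_points = [
    dict(line=42, column=5, resource_id="GPIOA.ODR", operation="write",
         in_isr=False, in_loop=True, sync_states=("interrupt disabled",)),
    dict(line=17, column=9, resource_id="0x40011004", operation="read",
         in_isr=True, in_loop=False, sync_states=()),
]
features = build_context_features(access_points)
```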

[0036] In step S3, the source code is analyzed for hardware resource flow access based on the resource configuration characteristics of the hardware platform to obtain the access dependency graph of the source code to hardware resources.

[0037] Preferably, in this embodiment, hardware resource flow access analysis is performed on the source code based on the resource configuration characteristics of the hardware platform to obtain the access dependency graph of the source code on hardware resources. Referring to Figure 2, which is a flowchart illustrating the process of determining the access dependency graph in some embodiments of this application, determining the access dependency graph can be achieved using the following steps: In step S31, the resource configuration characteristics of the hardware platform are determined; In step S32, the access constraint rules of the source code on hardware resources are determined according to the resource configuration characteristics of the hardware platform; In step S33, resource flow access dependency analysis is performed on the source code according to the access constraint rules to obtain the data dependency and control dependency relationships between the hardware resource access points in the source code; In step S34, the access dependency graph of the source code on hardware resources is constructed from the data dependency and control dependency relationships between the hardware resource access points in the source code.

[0038] It should be noted that, in this application, the resource configuration features refer to the structured features of the address mapping, access attributes, and configuration information of various hardware resources in the target hardware platform; the access constraint rules refer to the constraint rules followed by the source code when accessing hardware resources; the data dependency relationship refers to the dependency relationship between hardware resource access points caused by data transmission; the control dependency relationship refers to the dependency relationship between hardware resource access points caused by program control flow conditions; and the access dependency graph refers to the directed graph of the mutual influence paths between hardware resource access points.

[0039] In practice, determining the resource configuration characteristics of a hardware platform can be achieved in the following way: obtain the hardware description file provided by the target hardware platform. The hardware description file is usually in a structured format, so an XML/JSON parser is used to load and parse it. During parsing, extract the key hardware resource information defined in the file, including: the absolute address range and access attributes of all memory-mapped input/output registers; the interrupt vector table configuration (including interrupt number, default entry address of the interrupt service routine, interrupt priority, and grouping information); and the resource configuration of complex peripherals such as DMA channels and timers. This information is converted and organized into structured, machine-readable resource configuration information, which is used as the resource configuration characteristics of the hardware platform.
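As a hedged sketch of this parsing step, the following example loads a small JSON hardware description and organizes it into a lookup structure. The JSON schema, the key names, and the `load_resource_config` function are assumptions made for illustration; real hardware description files differ by vendor.

```python
import json

# Hypothetical hardware description; the schema is an assumption for illustration.
HW_DESCRIPTION = """
{
  "peripherals": [
    {"name": "UART0",
     "registers": [
       {"name": "SR", "address": "0x40001000", "access": "read-only"},
       {"name": "DR", "address": "0x40001004", "access": "read-write"}
     ]}
  ],
  "interrupts": [{"number": 5, "handler": "UART0_IRQHandler", "priority": 2}]
}
"""

def load_resource_config(text):
    """Parse the description and index registers and interrupts for later lookup."""
    doc = json.loads(text)
    registers = {}
    for periph in doc.get("peripherals", []):
        for reg in periph.get("registers", []):
            key = f'{periph["name"]}_{reg["name"]}'
            registers[key] = {
                "peripheral": periph["name"],
                "address": int(reg["address"], 16),  # absolute address as an integer
                "access": reg["access"],
            }
    interrupts = {irq["number"]: irq for irq in doc.get("interrupts", [])}
    return {"registers": registers, "interrupts": interrupts}

config = load_resource_config(HW_DESCRIPTION)
```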

[0040] In specific implementation, the access constraint rules of the source code on hardware resources can be determined according to the resource configuration characteristics of the hardware platform in the following way: traverse all peripheral registers in the resource configuration characteristics of the hardware platform. For each register, extract its access attributes and bit field description text. Pre-set a keyword knowledge base containing status indicators and operation indicators. Analyze the bit field description of each register using a text pattern matching algorithm (e.g., word segmentation and keyword extraction based on regular expressions or natural language processing). When the description of a bit field contains a status indicator, mark the bit field as a status bit. When the description of a register contains an operation indicator and its access attribute is writable, mark the register as an operation register. When there is a status bit and an operation register in the same peripheral, generate the rule "Before writing operation register X, you must read the register containing status bit Y and verify that Y is in a valid state." Then, formally represent all generated rules as predicate logic assertions, and use all predicate logic assertions as the access constraint rules of the source code on hardware resources.
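The keyword-matching rule generation above might look like the following sketch. The keyword lists, the register dictionary layout, and the textual form of the assertion are assumptions; the application does not specify them.

```python
import re

# Illustrative keyword knowledge base; the actual indicators are not given in the text.
STATUS_WORDS = re.compile(r"\b(ready|busy|empty|full|done)\b", re.I)
OPERATION_WORDS = re.compile(r"\b(transmit|send|start|enable)\b", re.I)

def derive_rules(peripheral):
    """Return one precondition assertion per (operation register, status bit) pair."""
    status_bits, op_regs = [], []
    for reg in peripheral["registers"]:
        for bit in reg.get("bitfields", []):
            if STATUS_WORDS.search(bit["description"]):
                status_bits.append((reg["name"], bit["name"]))
        writable = "write" in reg["access"]
        if writable and OPERATION_WORDS.search(reg.get("description", "")):
            op_regs.append(reg["name"])
    # Hypothetical textual encoding of the predicate logic assertion.
    return [f"write({op}) -> read_before({sreg}) and valid({sbit})"
            for op in op_regs for sreg, sbit in status_bits]

# Hypothetical peripheral description used only for this example.
uart = {"registers": [
    {"name": "SR", "access": "read-only", "description": "Status register",
     "bitfields": [{"name": "TXE", "description": "Transmit buffer empty"}]},
    {"name": "DR", "access": "read-write", "description": "Transmit data register"},
]}
rules = derive_rules(uart)
```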

[0041] In specific implementation, resource flow access dependency analysis is performed on the source code according to the access constraint rules, and the data dependency and control dependency relationships between hardware resource access points in the source code can be obtained in the following way. First, a definition-use chain is constructed for each function in the source code using the worklist method, thereby establishing data dependencies between hardware resource access points with direct value passing. Second, a reverse graph of the control flow graph is constructed (by reversing the direction of all edges in the graph). On this reverse graph, the dominance relationship of each basic block, that is, the post-dominance relationship of the original graph, is calculated using the worklist method and represented as a tree structure. Subsequently, the post-dominance frontier is calculated for each basic block. The calculation rule is: for a basic block Y, if there exists a predecessor basic block X such that Y does not strictly post-dominate X, then Y is added to the post-dominance frontier set of X. Next, the control dependency relationship is determined by traversing each edge of the control flow graph. For each edge pointing from basic block A to basic block B, if B is not in the post-dominance frontier of A, the transition from A to B is unconditional or part of the post-dominance path, and no control dependency is generated. Conversely, if B is in the post-dominance frontier of A, the execution of B depends on the branch conditions in A. In this case, basic block B and all basic blocks post-dominated by B are control dependent on A. For each pair of basic blocks with a control dependency relationship (control block A, controlled block C), each hardware resource access point in control block A is associated with each hardware resource access point in controlled block C, establishing a control dependency edge from the access points in A to the access points in C, thereby generating the control dependencies between all hardware resource access points. The hardware access constraint rules are integrated into the above analysis process: when the analysis encounters a target hardware resource access point, the system queries whether there is a constraint targeting that point (e.g., "register B must be read and a specific bit verified before writing register A"). If such a constraint exists, it is transformed into a logical precondition that must be satisfied. The system then searches backwards along the control flow path of the current function for a source resource access point that satisfies the condition. If such a source resource access point is found on a reachable path, an enhanced data dependency edge enforced by the hardware rules is established between the two access points. If no such access point is found on a reachable path, the system records that the path violates the hardware constraint, but the violation itself does not affect the dependency relationships already established.
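The control dependency computation can be illustrated with the sketch below. It is a simplified stand-in, not the procedure described above: instead of building a post-dominance tree and frontier sets, it computes post-dominator sets with a worklist and applies the textbook definition of control dependence (B depends on A if B post-dominates some successor of A but does not strictly post-dominate A). The function names, the example graph, and the assumption that every non-exit block has at least one successor are introduced for the example.

```python
def post_dominators(succ, exit_node):
    """Worklist computation of the (reflexive) post-dominator set of every block."""
    nodes = set(succ)
    pdom = {n: set(nodes) for n in nodes}
    pdom[exit_node] = {exit_node}
    work = [n for n in nodes if n != exit_node]
    while work:
        n = work.pop()
        # Assumes every non-exit block has at least one successor.
        new = {n} | set.intersection(*(pdom[s] for s in succ[n]))
        if new != pdom[n]:
            pdom[n] = new
            # Re-examine predecessors whose sets may now shrink.
            work.extend(p for p in nodes if n in succ[p] and p != exit_node)
    return pdom

def control_dependences(succ, pdom):
    """Return the set of pairs (A, B) meaning block B is control dependent on block A."""
    deps = set()
    for a in succ:
        for s in succ[a]:
            for b in pdom[s]:                   # b post-dominates s (or is s)
                if b == a or b not in pdom[a]:  # b does not strictly post-dominate a
                    deps.add((a, b))
    return deps

# Hypothetical diamond-shaped control flow graph: an if/else that rejoins.
cfg = {"cond": ["then", "else"], "then": ["join"], "else": ["join"],
       "join": ["exit"], "exit": []}
pdom = post_dominators(cfg, "exit")
deps = control_dependences(cfg, pdom)
```

In this example only the two branch arms depend on `cond`; the `join` block runs on every path and so carries no control dependency.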

[0042] In a specific implementation, the access dependency graph of the source code to hardware resources can be constructed by the following method based on the data dependencies and control dependencies between the hardware resource access points in the source code: taking all hardware resource access points as a set of nodes, taking the data dependencies and control dependencies between the hardware resource access points in the source code as the source of the edge set, using an incremental graph construction algorithm based on adjacency lists to construct a directed graph data structure, and using the resulting directed graph data structure as the access dependency graph of the source code to hardware resources.
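A minimal sketch of an adjacency-list graph built incrementally is shown below. The class name, the edge-kind labels, and the node naming scheme are assumptions for illustration only.

```python
class AccessDependencyGraph:
    """Directed graph stored as adjacency lists; nodes and edges are added one at a time."""

    def __init__(self):
        self.succ = {}   # node -> {successor node: set of edge kinds}
        self.pred = {}   # node -> set of predecessor nodes

    def add_node(self, node):
        self.succ.setdefault(node, {})
        self.pred.setdefault(node, set())

    def add_edge(self, src, dst, kind):
        # kind is "data" or "control"; both kinds may coexist on one edge.
        self.add_node(src)
        self.add_node(dst)
        self.succ[src].setdefault(dst, set()).add(kind)
        self.pred[dst].add(src)

# Hypothetical access points used as node labels.
graph = AccessDependencyGraph()
graph.add_edge("read UART0_SR", "write UART0_DR", "data")
graph.add_edge("read UART0_SR", "write UART0_DR", "control")
graph.add_node("write GPIO_ODR")
```

Storing predecessors alongside successors keeps the later backward traversal cheap, at the cost of some extra memory.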

[0043] In step S4, resource conflict instances and defective call chains that violate concurrency security rules in concurrent scenarios are identified based on the context features and the access dependency graph.

[0044] Preferably, in this embodiment, resource conflict instances and defective call chains that violate concurrency security rules in concurrent scenarios are identified based on the context features and the access dependency graph. Referring to Figure 3, which is a flowchart illustrating the process of determining resource conflict instances and defective call chains in some embodiments of this application, determining resource conflict instances and defective call chains can be achieved through the following steps: in step S41, a predefined concurrency security rule base is loaded; in step S42, concurrent scenario modeling is performed on the execution units in the source code based on the context features to obtain a set of hardware resource access points that are executed concurrently; in step S43, the set of hardware resource access points is matched against the concurrency security rule base to identify resource conflict instances in which the source code violates concurrency security rules in a concurrent scenario; in step S44, path backtracking is performed on the identified resource conflict instances based on the access dependency graph to obtain the defective call chain of the source code that violates the concurrency security rules in a concurrent scenario.

[0045] It should be noted that the concurrency security rule base mentioned in this application refers to a rule base for determining the security standards of concurrent access in a concurrent execution environment; the resource conflict instance is an instance of hardware resource access conflict caused by violation of concurrency security rules; and the defective call chain is the function call path from the program entry point to the resource conflict instance.

[0046] In specific implementation, loading the predefined concurrency safety rule library can be achieved in the following way: the concurrency safety rule library is loaded by reading a structured rule configuration file, which is in YAML or JSON format. The concurrency safety rule library defines common concurrency safety rules in embedded systems. Each rule contains the following elements: rule unique identifier, rule description, and violation conditions. The violation conditions are usually described using a declarative language, such as: "If there are two or more access points to the same writable hardware resource, and these access points may be executed concurrently in different execution contexts, and the access operations are all non-atomic writes or one read and one write, and are not protected by the same synchronization primitive, then this rule is violated."
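Loading such a rule base from JSON could be sketched as follows. The rule identifiers, the field names, and the structured encoding of the violation conditions are assumptions; the application only states that each rule has an identifier, a description, and violation conditions.

```python
import json

# Hypothetical rule file content; identifiers and the condition encoding are assumptions.
RULES_JSON = """
[
  {"id": "CS-001",
   "description": "Non-atomic write-write conflict on a shared hardware resource",
   "violation": {"ops": [["write", "write"]], "atomic": false, "shared_lock": false}},
  {"id": "CS-002",
   "description": "Unsynchronized read-write conflict on a shared hardware resource",
   "violation": {"ops": [["read", "write"], ["write", "read"]],
                 "atomic": false, "shared_lock": false}}
]
"""

REQUIRED_FIELDS = {"id", "description", "violation"}

def load_rule_base(text):
    """Parse the rule file and reject rules that lack one of the three required elements."""
    rules = json.loads(text)
    for rule in rules:
        missing = REQUIRED_FIELDS - set(rule)
        if missing:
            raise ValueError(f"rule is missing fields: {sorted(missing)}")
    return {rule["id"]: rule for rule in rules}

rule_base = load_rule_base(RULES_JSON)
```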

[0047] In specific implementation, the concurrent scenario modeling of the execution units in the source code based on the context features to obtain the set of concurrently executed hardware resource access points can be achieved in the following way: According to the operating environment of the target embedded system, define a set of concurrent execution body types, such as: main loop, multiple interrupt service routines with different priorities, and multiple RTOS tasks; secondly, traverse the execution context attributes of each hardware resource access point in the context features (e.g., function type marked as interrupt service routine, task entry function, etc.), and map the hardware resource access point to the corresponding concurrent execution body. For example, if a hardware resource access point is marked as ISR, it is classified into the "interrupt service routine" execution body; if its function attribute is marked as RTOS_TASK, it is classified into a specific task execution body according to the task ID; then, according to the concurrent scenario, determine whether any two hardware resource access points may be executed concurrently: if two hardware resource access points belong to different execution bodies and their execution times may overlap, then it is considered that these two hardware resource access points may be executed concurrently. Finally, group and organize all hardware resource access points that may be executed concurrently to form a set of concurrently executed hardware resource access points.
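The mapping and pairing steps above can be sketched as follows. The dictionary keys, the choice to distinguish interrupt service routines by priority, and the simplification that any two distinct execution bodies may overlap in time are assumptions for this example; a real model would refine the overlap test (for instance, with priority and masking information).

```python
from itertools import combinations

def execution_body(point):
    """Map an access point to its concurrent execution body (keys are illustrative)."""
    if point["context"] == "ISR":
        return ("isr", point.get("priority", 0))
    if point["context"] == "RTOS_TASK":
        return ("task", point["task_id"])
    return ("main", 0)

def may_run_concurrently(a, b):
    # Simplifying assumption: any two distinct execution bodies may overlap in time.
    return execution_body(a) != execution_body(b)

def concurrent_sets(points):
    """Group pairs of possibly concurrent access points by the resource they share."""
    by_resource = {}
    for a, b in combinations(points, 2):
        if a["resource_id"] == b["resource_id"] and may_run_concurrently(a, b):
            by_resource.setdefault(a["resource_id"], []).append((a["name"], b["name"]))
    return by_resource

# Hypothetical access points.
points = [
    {"name": "p1", "resource_id": "UART0_DR", "context": "MAIN"},
    {"name": "p2", "resource_id": "UART0_DR", "context": "ISR", "priority": 2},
    {"name": "p3", "resource_id": "GPIO_ODR", "context": "MAIN"},
]
concurrent = concurrent_sets(points)
```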

[0048] In specific implementation, the hardware resource access point set is matched against the concurrency security rule base to identify resource conflict instances in which the source code violates concurrency security rules under concurrent scenarios. This can be achieved in the following way: traverse the hardware resource access point set, and for each shared hardware resource, obtain its corresponding list of hardware resource access points that may be accessed concurrently; then, for each pair of hardware resource access points (denoted as point A and point B) in the list, perform the following operations: first, obtain the operation type and synchronization protection information of the two hardware resource access points from the context features; then, match the information of the pair of hardware resource access points against each rule in the concurrency security rule base. The matching process essentially checks whether the information meets the violation conditions of a given rule. For example, if the concurrency security rule base contains a "non-atomic write-write conflict" rule, its conditions are: the operation type of both hardware resource access points is write, neither is an atomic operation, they have no shared synchronization protection, and they may be executed concurrently. When points A and B meet all these conditions, the rule is determined to be violated. Once a violation is found, the pair of hardware resource access points A and B is taken as a resource conflict instance of the source code violating the concurrency security rule in the concurrent scenario.
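The "non-atomic write-write conflict" check given as an example above can be sketched as a predicate over a pair of access points. The dictionary keys and the lock names are assumptions introduced for the example.

```python
def violates_write_write(a, b, may_overlap):
    """True when both points write non-atomically, share no lock, and may overlap."""
    shared_locks = set(a["locks"]) & set(b["locks"])
    return (a["op"] == "write" and b["op"] == "write"
            and not a["atomic"] and not b["atomic"]
            and not shared_locks and may_overlap)

def find_conflicts(pairs):
    """Collect the names of every pair that violates the write-write rule."""
    return [(a["name"], b["name"])
            for a, b, overlap in pairs if violates_write_write(a, b, overlap)]

# Hypothetical access points: A/B are unprotected, C/D share a mutex.
a = {"name": "A", "op": "write", "atomic": False, "locks": []}
b = {"name": "B", "op": "write", "atomic": False, "locks": []}
c = {"name": "C", "op": "write", "atomic": False, "locks": ["uart_mutex"]}
d = {"name": "D", "op": "write", "atomic": False, "locks": ["uart_mutex"]}
conflicts = find_conflicts([(a, b, True), (c, d, True), (a, b, False)])
```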

[0049] In specific implementation, the defective call chain of the source code violating concurrency security rules in a concurrent scenario can be obtained by backtracking the path of the identified resource conflict instances according to the access dependency graph. This can be achieved in the following way: using the access dependency graph as input, a hardware resource access point is randomly selected from the identified resource conflict instances, and a reverse depth-first traversal is performed along the edges of the access dependency graph. During backtracking, the call dependency edges pointing to the current node are searched in reverse, tracing upwards level by level to the caller of each function until a program entry point (e.g., the main function or an entry function associated with the interrupt vector table) is reached. During the backtracking process, all function nodes passed are recorded to form a function call chain from the entry point to the hardware resource access point. The obtained function call chain is used as the defective call chain of the source code violating concurrency security rules in a concurrent scenario.
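The reverse depth-first backtracking can be sketched as follows. The `callers` mapping (function to the functions that call it), the function names, and the choice to stop at the first entry point reached are assumptions for the example.

```python
def defective_call_chain(callers, start, entry_points):
    """Reverse depth-first search from `start` up to the first reachable entry point."""
    stack = [(start, [start])]
    visited = set()
    while stack:
        node, path = stack.pop()
        if node in entry_points:
            return list(reversed(path))   # report the chain entry point first
        if node in visited:
            continue
        visited.add(node)
        for caller in callers.get(node, []):
            stack.append((caller, path + [caller]))
    return None                           # no entry point reaches this function

# Hypothetical reverse call edges: each function maps to the functions that call it.
callers = {
    "uart_write": ["send_frame"],
    "send_frame": ["main", "log_msg"],
    "log_msg": [],
}
chain = defective_call_chain(callers, "uart_write", {"main", "UART0_IRQHandler"})
```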

[0050] In step S5, the source code is subjected to concurrent conflict verification by identifying resource conflict instances and defective call chains to obtain concurrent conflict verification results of the source code, and then a defect detection report of industrial Internet malware on hardware resource concurrent conflicts is generated based on the concurrent conflict verification results.

[0051] In this embodiment, the concurrent conflict verification of the source code is performed using the identified resource conflict instances and defective call chains, and the concurrent conflict verification result of the source code can be obtained by the following steps: concurrent test stub functions are generated from the identified resource conflict instances and defective call chains; the source code is then subjected to conflict verification based on the concurrent test stub functions to obtain the concurrent conflict verification result of the source code.

[0052] It should be noted that the concurrent test stub function mentioned in this application represents a simulated execution body function for verifying concurrent conflicts, used to reproduce concurrent access scenarios in a controlled environment to detect potential resource conflicts; the concurrent conflict verification result represents structured verification data obtained after verifying concurrent conflicts in the source code.

[0053] In practical implementation, generating concurrent test stub functions by identifying resource conflict instances and defective call chains can be achieved in the following way: First, for the identified resource conflict instances, the Jinja2 template engine is used to encapsulate the functions containing the two hardware resource access points in the resource conflict instance and all functions on the defective call chain. An independent simulated execution body function is created for each hardware resource access point. For example, for hardware resource access points in interrupt service routines, a function simulating interrupt triggering is generated; for hardware resource access points in the main loop or task, a function simulating normal execution is generated. At the same time, synchronization control points are inserted into these simulated execution body functions, such as using semaphores or event flags to precisely control the concurrent execution sequence, ensuring that the two hardware resource access points can be executed simultaneously or interleaved in the test to reproduce potential concurrent conflict scenarios. All generated simulated execution body functions are used as concurrent test stub functions.
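Template-based stub generation can be illustrated with the sketch below. It uses Python's standard-library `string.Template` in place of the Jinja2 engine named above, so that the example has no external dependency. The emitted C stub, the `sync_wait` hook, and the dictionary keys are hypothetical.

```python
from string import Template

# Template for a C stub; sync_wait() is a hypothetical synchronization hook.
STUB_TEMPLATE = Template("""\
void stub_${name}(void) {
    sync_wait(${sync_point});   /* block until the test scheduler releases this body */
    ${target}();                /* function containing the hardware access point */
}
""")

def make_stub(point):
    """Render one simulated execution body for an access point."""
    prefix = "irq" if point["context"] == "ISR" else "task"
    return STUB_TEMPLATE.substitute(
        name=f'{prefix}_{point["function"]}',
        sync_point=point["sync_point"],
        target=point["function"],
    )

# Hypothetical conflict instance: one access in an ISR, one in the main loop.
isr_stub = make_stub({"context": "ISR", "function": "UART0_IRQHandler", "sync_point": 1})
main_stub = make_stub({"context": "MAIN", "function": "send_frame", "sync_point": 2})
```

Giving each stub its own synchronization point lets a test harness release the two bodies in a chosen order to force an interleaving.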

[0054] In specific implementation, the source code is subjected to conflict verification based on the concurrent test stub functions, and the concurrent conflict verification result of the source code can be obtained in the following way: first, modify the Makefile build script of the industrial internet malware project, add the generated concurrent test stub functions to the compilation target list, and specify a symbol-wrapping option of the GCC/Clang toolchain (for example, the GNU linker's --wrap option) in the compile and link options to redirect the critical hardware access functions to monitoring stub functions. This links the concurrent test stub functions with the target source code, building an executable test program. In this test program, the concurrent test stub functions are called to drive execution of the source code, simulating a real concurrent scenario. During test execution, a hardware resource access monitor captures all requests to the target hardware in real time. The monitor intercepts read and write operations on resources by trapping accesses to memory-mapped I/O addresses or by replacing the peripheral register structures, recording the timestamp, operation type, and access value of each access. The monitoring data is then analyzed to check for violations of concurrency security rules. For example, for the same hardware resource, if two write operations overlap in time and are not protected by atomic operations, a write-write conflict is identified; similarly, if a write operation overlaps with a read operation and is not protected by synchronization, a read-write conflict is identified. Finally, all test execution results are summarized, and the conflict occurrence status and detailed monitoring evidence for each resource conflict instance are recorded in structured form. This record serves as the concurrent conflict verification result of the source code.
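The analysis of monitoring data can be sketched as an interval-overlap check. The event layout is an assumption: each access is given a start and end timestamp, an execution body, and a single `protected` flag that stands for both atomic and synchronization protection. The application itself only states that a timestamp, operation type, and access value are recorded.

```python
from itertools import combinations

def classify_conflicts(events):
    """Report overlapping, unprotected accesses to one resource from different bodies."""
    conflicts = []
    for a, b in combinations(events, 2):
        same_resource = a["resource"] == b["resource"]
        different_body = a["body"] != b["body"]
        overlap = a["start"] < b["end"] and b["start"] < a["end"]
        if not (same_resource and different_body and overlap):
            continue
        if a["protected"] and b["protected"]:
            continue                      # assumed: both sides hold the same protection
        ops = {a["op"], b["op"]}
        if ops == {"write"}:
            conflicts.append(("write-write", a["resource"]))
        elif ops == {"read", "write"}:
            conflicts.append(("read-write", a["resource"]))
    return conflicts

def ev(resource, body, op, start, end):
    """Build one hypothetical, unprotected monitor record."""
    return {"resource": resource, "body": body, "op": op,
            "start": start, "end": end, "protected": False}

events = [
    ev("UART0_DR", "main", "write", 0, 5),
    ev("UART0_DR", "isr", "write", 3, 4),
    ev("UART0_DR", "task1", "read", 10, 12),
    ev("UART0_DR", "main", "write", 11, 13),
    ev("GPIO_ODR", "isr", "write", 0, 20),
]
found = classify_conflicts(events)
```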

[0055] In this embodiment, generating a defect detection report on concurrent conflicts of hardware resources by industrial internet malware based on the concurrent conflict verification results can be achieved through the following steps: the defect impact level of industrial internet malware on hardware resource concurrency conflicts is extracted from the concurrent conflict verification results; based on the defect impact level, a defect detection report is generated regarding the concurrent conflicts of hardware resources caused by industrial internet malware.

[0056] It should be noted that the defect impact level mentioned in this application represents the severity of a defect in which industrial internet malware causes a concurrent access conflict on hardware resources; the defect detection report represents the detection report on such defects.

[0057] In specific implementation, the defect impact level of industrial internet malware on hardware resource concurrency conflicts can be extracted from the concurrency conflict verification results in the following way: load a preset defect impact level classification model, input the concurrency conflict verification results into the defect impact level classification model, and use the output of the defect impact level classification model as the defect impact level of industrial internet malware on hardware resource concurrency conflicts.

[0058] It should be noted that the defect impact level classification model described in this application can be constructed in the following way: first, collect confirmed hardware resource concurrency conflict cases and their subsequent impact assessment data from historical embedded software projects to form a sample set. Each sample contains the concurrency conflict verification results and an expert-annotated defect impact level label (high, medium, or low). Second, use a classification algorithm based on gradient boosting decision trees (e.g., XGBoost) to train the model. Specifically, divide the sample set into a training set and a validation set in a 7:3 ratio, perform standardized preprocessing on the features of the training set samples, then initialize the XGBoost classifier and set the hyperparameters (including a maximum tree depth of 5, a learning rate of 0.1, and 100 iterations). Using multi-class log loss as the optimization objective, iteratively generate multiple decision trees through forward stagewise additive modeling, in which each tree reduces the loss function by fitting the negative gradient of the current model. During training, five-fold cross-validation is used to prevent overfitting, and early stopping terminates training when the validation set loss does not decrease for 5 consecutive rounds. Finally, use the trained model as the defect impact level classification model.

[0059] In specific implementation, generating a defect detection report for concurrent hardware resource conflicts caused by industrial internet malware based on the defect impact level can be achieved in the following way: first, load a preset structured defect detection report template and fill in the basic information in the report header, such as the name and version number of the industrial internet malware, the hardware platform model, the detection tool version, and the detection time, as well as the detection scope, the concurrency security rule base, and the hardware access constraint rules used; then, iterate through all resource conflict instances that have undergone concurrent conflict verification, and generate a structured detail record for each instance containing a unique defect identifier, the defect type, information on the hardware resources involved, the source code location and execution context of the conflicting access points, the defective call chain, the verification results, and a summary of the monitoring data. The risk level is differentiated based on the defect impact level: a high impact level is marked as "may cause hardware resource data corruption, system crash, or peripheral device malfunction"; a medium impact level is marked as "may cause abnormal peripheral data transmission or occasional functional failures"; and a low impact level is marked as "may cause non-critical data reading deviations, with no impact on core system functions." Finally, the filled-in report template is formatted, and the formatted report serves as the defect detection report for concurrent conflicts of hardware resources by industrial internet malware.
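Report assembly can be sketched as filling a structured record and serializing it. The JSON layout, the `DEF-NNNN` identifier scheme, and the field names are assumptions; only the three risk descriptions are taken from the text above.

```python
import json

# Risk descriptions quoted from the paragraph above, keyed by impact level.
RISK_TEXT = {
    "high": "may cause hardware resource data corruption, system crash, "
            "or peripheral device malfunction",
    "medium": "may cause abnormal peripheral data transmission "
              "or occasional functional failures",
    "low": "may cause non-critical data reading deviations, "
           "with no impact on core system functions",
}

def build_report(header, conflicts):
    """Fill one detail record per verified conflict and serialize the report as JSON."""
    details = []
    for index, conflict in enumerate(conflicts, start=1):
        details.append({
            "defect_id": f"DEF-{index:04d}",        # hypothetical identifier scheme
            "defect_type": conflict["type"],
            "resource": conflict["resource"],
            "call_chain": conflict["call_chain"],
            "impact_level": conflict["impact_level"],
            "risk": RISK_TEXT[conflict["impact_level"]],
        })
    return json.dumps({"header": header, "defects": details}, indent=2)

# Hypothetical header and conflict instance.
report_text = build_report(
    {"hardware_platform": "example-mcu", "tool_version": "0.1"},
    [{"type": "write-write", "resource": "UART0_DR",
      "call_chain": ["main", "send_frame", "uart_write"], "impact_level": "high"}],
)
report = json.loads(report_text)
```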

[0060] Therefore, this application can generate a defect detection report on concurrent conflicts of hardware resources by industrial internet malware based on the concurrent conflict verification results. First, through static analysis of the source code of industrial internet malware, an abstract syntax tree and control flow graph are constructed, and the contextual features of hardware resource access are then extracted. This process establishes a static semantic association between code structure, execution logic, and hardware access behavior, enabling the system to accurately understand the context of each resource operation and thereby avoiding missed detections due to incomplete structural analysis. Second, hardware resource flow access analysis is performed in conjunction with the resource configuration characteristics of the hardware platform to construct a hardware resource access dependency graph. This step directly compensates for the shortcoming of traditional static analysis being detached from the hardware environment, making the detection process consistent with key characteristics of the specific hardware such as access timing, priority, and interrupt nesting, so that the system identifies resource dependencies under actual execution semantics and reduces false positives and false negatives. Next, based on the contextual features and the access dependency graph, resource conflict instances that violate concurrency security rules in concurrent scenarios and their corresponding defective call chains are identified. This step enables automatic inference from resource access behavior to potential conflict relationships, exposing typical concurrency defects such as race conditions and non-atomic accesses in advance, and effectively addressing the pain point that manual review struggles to cover complex concurrent interactions. Then, the system verifies concurrency conflicts in the source code using the identified resource conflict instances and defective call chains, enabling semantic-level confirmation of suspected problems, filtering out false positives, and improving the credibility of the detection results. Finally, based on the verification results, a hardware resource concurrency conflict defect report is generated, providing information security personnel with traceable and explainable defect causes, and allowing the defect information to be used directly in the software security testing process.

[0061] In summary, the technical solution adopted in this application can detect resource conflict defects caused by concurrent access to hardware resources by malicious software in the industrial Internet.

[0062] This application provides an automatic malware detection system. Referring to Figure 4, which is a module structure diagram of an automatic malware detection system according to this embodiment of the present application, the automatic detection system includes: a code acquisition module 100, used to acquire the source code of industrial internet malware; a program analysis module 200, used to perform static analysis on the source code to obtain the abstract syntax tree and control flow graph of the source code, and to determine the context features of the source code's access to hardware resources based on the abstract syntax tree and the control flow graph; a resource access analysis module 300, used to perform hardware resource flow access analysis on the source code through the resource configuration characteristics of the hardware platform to obtain the access dependency graph of the source code to hardware resources; a concurrency defect identification module 400, used to identify resource conflict instances and defective call chains in the source code that violate concurrency security rules in a concurrent scenario based on the context features and the access dependency graph; and a detection report generation module 500, used to perform concurrent conflict verification on the source code using the identified resource conflict instances and defective call chains, obtain the concurrent conflict verification results of the source code, and then generate a defect detection report of industrial internet malware on hardware resource concurrent conflicts based on the concurrent conflict verification results.

[0063] In addition, this application also provides a computer device, the computer device including a memory and a processor, the memory storing code, and the processor being configured to acquire the code and execute the above-described automatic malware detection method.

[0064] In some embodiments, referring to Figure 5, which is a schematic diagram of the structure of a computer device employing an automatic malware detection method according to some embodiments of this application, the automatic malware detection method in the above embodiments can be implemented by the computer device shown in Figure 5. The computer device 500 includes at least one processor 501, a communication bus 502, a memory 503, and at least one communication interface 504.

[0065] The processor 501 may be a general-purpose central processing unit (CPU), an application-specific integrated circuit (ASIC), or one or more devices used to control the execution of the automatic malware detection method in this application.

[0066] The communication bus 502 can be used to transmit information between the aforementioned components.

[0067] Memory 503 may be a read-only memory (ROM) or another type of static storage device capable of storing static information and instructions, a random access memory (RAM) or another type of dynamic storage device capable of storing information and instructions, an electrically erasable programmable read-only memory (EEPROM), a compact disc read-only memory (CD-ROM) or other optical disc storage (including compact discs, laser discs, digital versatile discs, Blu-ray discs, etc.), magnetic disks or other magnetic storage devices, or any other medium capable of carrying or storing desired program code in the form of instructions or data structures and accessible by a computer, but is not limited thereto. Memory 503 may exist independently and be connected to processor 501 via communication bus 502. Memory 503 may also be integrated with processor 501.

[0068] The memory 503 stores program code for executing the scheme of this application, and its execution is controlled by the processor 501. The processor 501 executes the program code stored in the memory 503. The program code may include one or more software modules. The method described in the above method embodiments can be implemented by the processor 501 and one or more software modules in the program code in the memory 503.

[0069] Communication interface 504 uses any transceiver-like device to communicate with other devices or communication networks, such as Ethernet, radio access network (RAN), wireless local area networks (WLAN), etc.

[0070] In a specific implementation, as one example, a computer device may include multiple processors, each of which may be a single-core (single-CPU) processor or a multi-core (multi-CPU) processor. Here, a processor may refer to one or more devices, circuits, and / or processing cores used to process data (e.g., computer program instructions).

[0071] The aforementioned computer device can be a general-purpose computer device or a special-purpose computer device. In specific implementations, the computer device can be a desktop computer, a portable computer, a network server, a personal digital assistant (PDA), a mobile phone, a tablet computer, a wireless terminal device, a communication device, or an embedded device. This application does not limit the type of computer device.

[0072] In addition, this application also provides a computer-readable storage medium storing a computer program that, when executed by a processor, implements the above-described automatic malware detection method.

[0073] Although preferred embodiments of this application have been described, those skilled in the art, upon learning the basic inventive concept, can make other changes and modifications to these embodiments. Therefore, the appended claims are intended to be interpreted as including the preferred embodiments as well as all changes and modifications falling within the scope of this application.

[0074] Obviously, those skilled in the art can make various modifications and variations to this application without departing from the spirit and scope of this application. Therefore, if such modifications and variations fall within the scope of the claims of this application and their equivalents, this application also intends to include such modifications and variations.

Claims

1. An automatic malware detection method, characterized in that the automatic detection method comprises the following steps: obtaining the source code of industrial internet malware; performing static analysis on the source code to obtain an abstract syntax tree and a control flow graph of the source code, and determining contextual features of the source code's access to hardware resources based on the abstract syntax tree and the control flow graph; performing hardware resource flow access analysis on the source code using the resource configuration characteristics of the hardware platform to obtain an access dependency graph of the source code on hardware resources; identifying, based on the contextual features and the access dependency graph, resource conflict instances and defective call chains in the source code that violate concurrency security rules in concurrent scenarios; and performing concurrent conflict verification on the source code using the identified resource conflict instances and defective call chains to obtain concurrent conflict verification results, and then generating, based on the concurrent conflict verification results, a defect detection report on the hardware resource concurrency conflicts of the industrial internet malware.

2. The automatic malware detection method according to claim 1, characterized in that performing static analysis on the source code to obtain an abstract syntax tree and a control flow graph specifically includes: performing lexical analysis and syntactic analysis on the source code to generate the abstract syntax tree of the source code; and performing control flow analysis based on the abstract syntax tree to construct the control flow graph of the source code.
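As an illustration only, the two steps of claim 2 can be sketched in Python with the standard `ast` module. The sketch assumes the analyzed source is Python, which the application does not state; the sample program, the `read_reg` and `write_reg` names, and the `build_cfg` helper are hypothetical. Only sequential statements, `if` and `while` are handled, and early exits such as `return` inside a branch are not modeled.

```python
import ast

# Hypothetical sample input; read_reg/write_reg are illustrative names only.
SOURCE = """
def task(port):
    v = read_reg(port)
    if v > 0:
        write_reg(port, 0)
    else:
        write_reg(port, 1)
    return v
"""

def build_cfg(stmts, cfg, preds):
    """Add CFG edges for a statement list.

    `preds` are the nodes whose control flows into the list; the function
    returns the nodes whose control flows out of it.
    """
    for stmt in stmts:
        node = f"{stmt.lineno}:{type(stmt).__name__}"
        cfg.setdefault(node, set())
        for p in preds:
            cfg[p].add(node)
        if isinstance(stmt, ast.If):
            then_out = build_cfg(stmt.body, cfg, [node])
            # Without an else branch, control falls through from the test.
            else_out = build_cfg(stmt.orelse, cfg, [node]) if stmt.orelse else [node]
            preds = then_out + else_out
        elif isinstance(stmt, ast.While):
            for p in build_cfg(stmt.body, cfg, [node]):
                cfg[p].add(node)  # back edge to the loop test
            preds = [node]
        else:
            preds = [node]
    return preds

# Step 1: lexical and syntactic analysis produce the abstract syntax tree.
tree = ast.parse(SOURCE)
# Step 2: control flow analysis over the AST of the first function.
cfg = {}
build_cfg(tree.body[0].body, cfg, [])
```

Each node is labeled with its source line and statement type, so the `if` on line 4 ends up with two successors, one per branch, and both branches rejoin at the `return`.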

3. The automatic malware detection method according to claim 1, characterized in that determining the contextual features of the source code's access to hardware resources based on the abstract syntax tree and the control flow graph specifically includes: identifying the hardware resource access points in the source code through the abstract syntax tree; determining, for each hardware resource access point, the unique identifier of the hardware resource and the operation type; analyzing, based on the control flow graph, the function attributes of each hardware resource access point and its position in the control flow, and thereby determining the execution context attributes of each hardware resource access point; and generating the contextual features of the source code's access to hardware resources from all of the unique identifiers of the hardware resources, the operation types, and the execution context attributes.
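A minimal sketch of the access-point extraction in claim 3 follows, again assuming Python source for illustration. The access functions `read_reg` and `write_reg`, the `isr_` naming convention used to mark interrupt context, and the `AccessPoint` record are all assumptions introduced here, not part of the application. The position in the control flow is reduced to a line number for brevity.

```python
import ast
from dataclasses import dataclass

# Hypothetical mapping from access function name to operation type.
ACCESS_CALLS = {"read_reg": "read", "write_reg": "write"}

@dataclass(frozen=True)
class AccessPoint:
    resource: str   # unique identifier of the hardware resource
    op: str         # operation type: "read" or "write"
    function: str   # enclosing function
    line: int       # source position of the access
    context: str    # execution context attribute: "interrupt" or "task"

def extract_access_points(source):
    """Collect hardware access points from the AST of `source`."""
    points = []
    for func in ast.walk(ast.parse(source)):
        if not isinstance(func, ast.FunctionDef):
            continue
        # Assumed convention: functions named isr_* run in interrupt context.
        context = "interrupt" if func.name.startswith("isr_") else "task"
        for node in ast.walk(func):
            if (isinstance(node, ast.Call)
                    and isinstance(node.func, ast.Name)
                    and node.func.id in ACCESS_CALLS
                    and node.args
                    and isinstance(node.args[0], ast.Constant)):
                points.append(AccessPoint(str(node.args[0].value),
                                          ACCESS_CALLS[node.func.id],
                                          func.name, node.lineno, context))
    return points

SAMPLE = '''
def isr_timer():
    write_reg("GPIO_A", 1)

def control_loop():
    v = read_reg("GPIO_A")
    write_reg("UART0", v)
'''

points = extract_access_points(SAMPLE)
features = {(p.resource, p.op, p.function, p.context) for p in points}
```

Nested function definitions would be visited twice by this walk; a fuller implementation would track scopes explicitly.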

4. The automatic malware detection method according to claim 1, characterized in that performing hardware resource flow access analysis on the source code using the resource configuration characteristics of the hardware platform to obtain the access dependency graph of the source code on hardware resources specifically includes: determining the resource configuration characteristics of the hardware platform; determining, based on the resource configuration characteristics of the hardware platform, the access constraint rules that govern the source code's access to hardware resources; performing, based on the access constraint rules, resource flow access dependency analysis on the source code to obtain the data dependencies and control dependencies between the hardware resource access points in the source code; and constructing the access dependency graph of the source code on hardware resources from the data dependencies and control dependencies between the hardware resource access points in the source code.
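One possible reading of the dependency analysis in claim 4 is sketched below. The application does not define the access constraint rules, so the sketch makes two loud assumptions: the only constraint is that a resource must appear in the platform configuration to be tracked, and a data dependency exists between two accesses to the same resource when at least one of them is a write. The `Access` record, its `guarded_by` field, and the `PLATFORM` table are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass(frozen=True)
class Access:
    name: str                         # access point label
    resource: str                     # hardware resource identifier
    op: str                           # "read" or "write"
    guarded_by: Optional[str] = None  # access whose result controls this one

def build_dependency_graph(accesses, platform):
    """Return labeled edges (source, target, kind) in program order."""
    # Assumed constraint rule: only resources the platform declares are tracked.
    tracked = [a for a in accesses if a.resource in platform]
    known = {a.name for a in tracked}
    edges = set()
    for i, later in enumerate(tracked):
        for earlier in tracked[:i]:
            # Data dependency: same resource, at least one side writes.
            if earlier.resource == later.resource and "write" in (earlier.op, later.op):
                edges.add((earlier.name, later.name, "data"))
        # Control dependency: this access executes only under a guarding access.
        if later.guarded_by in known:
            edges.add((later.guarded_by, later.name, "control"))
    return edges

# Hypothetical platform configuration and access sequence.
PLATFORM = {"GPIO_A": {"exclusive": True}, "UART0": {"exclusive": False}}
ACCESSES = [
    Access("a1", "GPIO_A", "read"),
    Access("a2", "GPIO_A", "write", guarded_by="a1"),
    Access("a3", "UART0", "write"),
    Access("a4", "DMA9", "write"),  # not in PLATFORM, so it is ignored
]

graph = build_dependency_graph(ACCESSES, PLATFORM)
```

Storing the dependency kind on each edge keeps data and control dependencies in one graph while still letting later steps filter by kind.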

5. The automatic malware detection method according to claim 1, characterized in that identifying, based on the contextual features and the access dependency graph, the resource conflict instances and defective call chains in the source code that violate concurrency security rules in concurrent scenarios specifically includes: loading a predefined concurrency security rule library; performing, based on the contextual features, concurrency analysis on the execution units in the source code to obtain a set of hardware resource access points that are executed concurrently; matching the set of hardware resource access points against the concurrency security rule library to identify the resource conflict instances in the source code that violate concurrency security rules in concurrent scenarios; and performing, based on the access dependency graph, path backtracking from the identified resource conflict instances to obtain the defective call chains in the source code that violate concurrency security rules in concurrent scenarios.
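The rule matching and path backtracking of claim 5 might look like the following sketch. The single rule used here (two accesses in different execution units touch the same resource, at least one writes, and they share no lock) is a common data-race condition chosen for illustration; the application's actual rule library is not specified. The dictionary fields, the `PREDS` predecessor map, and both function names are hypothetical.

```python
from itertools import combinations

def find_conflicts(points):
    """Match concurrently executed access points against one assumed rule."""
    conflicts = []
    for a, b in combinations(points, 2):
        if (a["unit"] != b["unit"]
                and a["resource"] == b["resource"]
                and "write" in (a["op"], b["op"])
                and not set(a["locks"]) & set(b["locks"])):
            conflicts.append((a["id"], b["id"]))
    return conflicts

def backtrack(node, preds):
    """Walk predecessor edges back to an entry point and return the call chain.

    Only the first predecessor is followed at each step; a fuller version
    would enumerate every path.
    """
    chain, seen = [node], {node}
    while preds.get(chain[-1]):
        nxt = preds[chain[-1]][0]
        if nxt in seen:  # stop on cycles
            break
        seen.add(nxt)
        chain.append(nxt)
    return list(reversed(chain))

# Hypothetical access points and predecessor map.
POINTS = [
    {"id": "p1", "unit": "isr_timer", "resource": "GPIO_A", "op": "write", "locks": []},
    {"id": "p2", "unit": "control_loop", "resource": "GPIO_A", "op": "read", "locks": []},
    {"id": "p3", "unit": "control_loop", "resource": "UART0", "op": "write", "locks": ["uart_lock"]},
    {"id": "p4", "unit": "logger", "resource": "UART0", "op": "write", "locks": ["uart_lock"]},
]
PREDS = {"p2": ["control_loop"], "control_loop": ["main"]}

conflicts = find_conflicts(POINTS)
chain = backtrack("p2", PREDS)
```

The two `UART0` writes are not reported because both hold `uart_lock`, whereas the unprotected `GPIO_A` pair is flagged and traced back through `control_loop` to `main`.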

6. The automatic malware detection method according to claim 1, characterized in that performing concurrent conflict verification on the source code using the identified resource conflict instances and defective call chains to obtain the concurrent conflict verification results specifically includes: generating concurrent test stub functions from the identified resource conflict instances and defective call chains; and performing conflict verification on the source code based on the concurrent test stub functions to obtain the concurrent conflict verification results of the source code.
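The application does not say how the test stubs are built or executed. One deterministic option, sketched here as an assumption, is to model each conflicting access as a generator that pauses between its read and its write, then replay every interleaving of the stubs and check whether the final register value depends on the schedule. The simulated register dictionary and the `make_stub` and `verify` names are hypothetical.

```python
from itertools import permutations

def make_stub(registers, resource):
    """Build a stub performing a read-modify-write with a pause in between."""
    def stub():
        value = registers[resource]       # step 1: read
        yield                             # scheduling point
        registers[resource] = value + 1   # step 2: write
    return stub

def verify(resource, n_units=2):
    """Replay every interleaving of the stubs and collect final values."""
    steps = [unit for unit in range(n_units) for _ in range(2)]
    outcomes = set()
    for schedule in set(permutations(steps)):
        registers = {resource: 0}  # fresh simulated register per schedule
        stubs = [make_stub(registers, resource)() for _ in range(n_units)]
        for unit in schedule:
            next(stubs[unit], None)  # advance that unit by one step
        outcomes.add(registers[resource])
    # More than one distinct outcome means the result depends on scheduling.
    return {"resource": resource,
            "outcomes": sorted(outcomes),
            "confirmed": len(outcomes) > 1}

result = verify("GPIO_A")
```

Enumerating schedules avoids the flakiness of real threads: the lost update (final value 1 instead of 2) shows up whenever both reads happen before either write.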

7. The automatic malware detection method according to claim 1, characterized in that generating, based on the concurrent conflict verification results, the defect detection report on the hardware resource concurrency conflicts of the industrial internet malware specifically includes: extracting, from the concurrent conflict verification results, the defect impact level of the hardware resource concurrency conflicts of the industrial internet malware; and generating, based on the defect impact level, the defect detection report on the hardware resource concurrency conflicts of the industrial internet malware.
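How the defect impact level is encoded in the verification results is not specified. The sketch below assumes, purely for illustration, that each result records whether the conflict was confirmed and which resource it concerns, and that a level is derived from those two fields using a hypothetical set of safety-critical resources. The three level names and the JSON report layout are likewise assumptions.

```python
import json

SEVERITY_ORDER = {"high": 0, "medium": 1, "low": 2}

def impact_level(result, critical_resources):
    """Derive an assumed impact level from one verification result."""
    if not result["confirmed"]:
        return "low"
    return "high" if result["resource"] in critical_resources else "medium"

def build_report(results, critical_resources):
    """Produce a JSON report with defects ordered from most to least severe."""
    defects = [
        {
            "resource": r["resource"],
            "call_chain": r["call_chain"],
            "impact": impact_level(r, critical_resources),
        }
        for r in results
    ]
    defects.sort(key=lambda d: SEVERITY_ORDER[d["impact"]])
    return json.dumps({"defects": defects}, indent=2)

# Hypothetical verification results.
RESULTS = [
    {"resource": "UART0", "confirmed": False, "call_chain": ["main", "logger"]},
    {"resource": "GPIO_A", "confirmed": True, "call_chain": ["main", "control_loop"]},
    {"resource": "SPI1", "confirmed": True, "call_chain": ["main", "sensor_poll"]},
]

report = build_report(RESULTS, critical_resources={"GPIO_A"})
```

Carrying the call chain into each report entry keeps the cause of every defect traceable to its entry point.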

8. An automatic malware detection system, characterized in that the automatic detection system includes: a code acquisition module, configured to acquire the source code of industrial internet malware; a program analysis module, configured to perform static analysis on the source code to obtain the abstract syntax tree and control flow graph of the source code, and to determine the contextual features of the source code's access to hardware resources based on the abstract syntax tree and the control flow graph; a resource access analysis module, configured to perform hardware resource flow access analysis on the source code using the resource configuration characteristics of the hardware platform to obtain the access dependency graph of the source code on hardware resources; a concurrency defect identification module, configured to identify, based on the contextual features and the access dependency graph, the resource conflict instances and defective call chains in the source code that violate concurrency security rules in concurrent scenarios; and a detection report generation module, configured to perform concurrent conflict verification on the source code using the identified resource conflict instances and defective call chains to obtain the concurrent conflict verification results of the source code, and then to generate, based on the concurrent conflict verification results, a defect detection report on the hardware resource concurrency conflicts of the industrial internet malware.

9. A computer device comprising a memory and a processor, wherein the memory stores a computer program, characterized in that, when the processor executes the computer program, it implements the automatic malware detection method according to any one of claims 1 to 7.

10. A computer-readable storage medium storing a computer program, characterized in that, when the computer program is executed by a processor, it implements the automatic malware detection method according to any one of claims 1 to 7.