A memory vulnerability detection method and device, electronic equipment and storage medium

By converting the program source code into a syntax tree and analyzing the state information of memory operation nodes, the accuracy problem of memory vulnerability detection in existing technologies is solved, and more efficient memory vulnerability detection is achieved.

CN122241703A · Pending · Publication Date: 2026-06-19 · TENCENT TECHNOLOGY (SHENZHEN) CO LTD

Patent Information

Authority / Receiving Office
CN · China
Patent Type
Applications(China)
Current Assignee / Owner
TENCENT TECHNOLOGY (SHENZHEN) CO LTD
Filing Date
2024-12-18
Publication Date
2026-06-19

AI Technical Summary

Technical Problem

Existing technologies are prone to missed or false detections when detecting memory vulnerabilities in programs, making it difficult to accurately detect memory vulnerabilities.

Method used

The source code of the target program is converted into a target syntax tree. The syntax tree is traversed to filter out the set of nodes involving memory operations. Memory vulnerability detection is performed based on memory state information, and corresponding detection results are generated.

Benefits of technology

It improves the accuracy of memory vulnerability detection, enabling accurate detection of memory vulnerabilities in memory regions.

✦ Generated by Eureka AI based on patent content.


Abstract

This application relates to the field of vulnerability detection, and more particularly to a method, apparatus, electronic device, and storage medium for detecting memory vulnerabilities, to improve the accuracy of memory vulnerability detection. The method includes: converting the source code of a target program into a target syntax tree, recording node attributes for each node; traversing the target syntax tree, and for each node that meets set conditions, searching the target syntax tree for other nodes connected to that node to obtain a node set; selecting a target node set from the obtained node sets, the target node set containing target nodes representing memory operations; for each target node in the target node set, analyzing the node attributes of each node in the node connection path to which the target node belongs, obtaining memory state information of the memory region operated on by the target node, and combining this with set vulnerability detection rules to perform memory vulnerability detection on the node connection path, obtaining memory vulnerability detection results.

Description

Technical Field

[0001] This application relates to the field of vulnerability detection technology, and in particular to a memory vulnerability detection method, apparatus, electronic device, and storage medium.

Background Technology

[0002] Memory vulnerabilities in programs typically refer to security or stability issues that arise during program execution due to improper memory management and manipulation. Memory vulnerabilities often stem from programming errors, especially in low-level programming languages (such as C and C++), because these languages allow direct access to and manipulation of memory.

[0003] In related technologies, memory vulnerabilities are typically detected by traversing the program's source code. However, because the source code usually contains a large number of code segments involving memory operations, and the syntax of the source code is relatively complex, it is easy to miss or falsely detect memory vulnerabilities when analyzing the source code.

[0004] Therefore, accurately detecting memory vulnerabilities in programs is an urgent problem to be solved.

Summary of the Invention

[0005] This application provides a memory vulnerability detection method, apparatus, electronic device, and storage medium to improve the accuracy of memory vulnerability detection.

[0006] In one aspect, this application provides a memory vulnerability detection method, the method comprising:

[0007] The source code of the target program is converted into a target syntax tree; each node in the target syntax tree represents a code element in the source code, and each node records node attributes containing the syntactic functions of the corresponding code element.

[0008] Traverse the target syntax tree; wherein, for each node traversed, when a node meets the set conditions based on the corresponding node attributes, search for at least one other node in the target syntax tree that has a connection relationship with the node, and obtain a set of nodes containing the connection relationship;

[0009] From the obtained multiple sets of nodes, at least one set of target nodes is selected, and each set of target nodes contains at least one target node representing the execution of memory operations;

[0010] For each target node in each target node set, perform the following operations: analyze the node attributes of each node in the node connection path to which the target node belongs, obtain the memory state information of the memory region operated by the target node, and perform memory vulnerability detection on the node connection path based on the memory state information and the set vulnerability detection rules to obtain the corresponding memory vulnerability detection results.

[0011] In another aspect, this application provides a memory vulnerability detection device, the device comprising:

[0012] A conversion module is used to convert the source code of a target program into a target syntax tree; each node in the target syntax tree represents a code element in the source code, and each node records node attributes containing the syntactic functions of the corresponding code element.

[0013] The traversal module is used to traverse the target syntax tree; wherein, for each node traversed, when it is determined that the node meets the set conditions according to the corresponding node attributes, at least one other node with a connection relationship with the node is searched from the target syntax tree to obtain a set of nodes containing the connection relationship.

[0014] The filtering module is used to filter out at least one target node set from multiple obtained node sets, each target node set containing at least one target node representing the execution of memory operations;

[0015] The detection module is used to perform the following operations for each target node in each target node set: analyze the node attributes of each node in the node connection path to which the target node belongs, obtain the memory state information of the memory region operated by the target node, and perform memory vulnerability detection on the node connection path based on the memory state information and the set vulnerability detection rules, and obtain the corresponding memory vulnerability detection results.

[0016] Optionally, the traversal module is specifically used for:

[0017] When a node is determined to be a node that performs memory operations based on its corresponding node attributes, at least one other node with a connection relationship to the node is searched in the target syntax tree to obtain a set of nodes containing the connection relationship.

[0018] When a node is determined to represent a key code element of a given syntax structure based on its corresponding node attributes, at least one other node contained in the given syntax structure is searched from the target syntax tree to obtain the corresponding node set.

[0019] Optionally, the given syntax structure includes any of the following:

[0020] Judgment syntax structure;

[0021] Selection syntax structure;

[0022] Loop syntax structure.
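As a non-limiting illustration of the two traversal conditions above, the following Python sketch groups nodes into node sets. The `Node` class, the keyword sets, and the choice of which nodes count as "connected" (the whole subtree for a syntax structure, the direct children for a memory operation) are assumptions made for this example and are not specified in the application.

```python
from dataclasses import dataclass, field
from typing import List

MEMORY_OPS = {"malloc", "free", "new", "delete"}    # memory operation functions named in the text
STRUCTURE_KEYS = {"if", "switch", "for", "while"}   # assumed key elements of the given structures


@dataclass
class Node:
    text: str                                        # the code element this node represents
    children: List["Node"] = field(default_factory=list)


def subtree(node: Node) -> List[Node]:
    """Return the node and all of its descendants in pre-order."""
    out = [node]
    for child in node.children:
        out.extend(subtree(child))
    return out


def collect_node_sets(root: Node) -> List[List[Node]]:
    """Build one node set for every node that meets a set condition."""
    node_sets = []
    for node in subtree(root):
        if node.text in STRUCTURE_KEYS:
            # Key element of a syntax structure: take every node in the structure.
            node_sets.append(subtree(node))
        elif node.text in MEMORY_OPS:
            # Memory operation: take the node and the nodes directly connected below it.
            node_sets.append([node] + node.children)
    return node_sets


# Hypothetical tree for:  if (c) { free(p); }  x = 1;
tree = Node("block", [
    Node("if", [Node("c"), Node("free", [Node("p")])]),
    Node("assign", [Node("x"), Node("1")]),
])
texts = [[n.text for n in s] for s in collect_node_sets(tree)]
```

In this sketch the `if` structure and the `free` call each yield a node set, while the plain assignment yields none.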

[0023] Optionally, the device further includes a recording module for recording the subset of target nodes contained in each set of nodes;

[0024] The filtering module is specifically used for:

[0025] For each of the multiple node sets, the following operations are performed: if a node set contains a non-empty subset of target nodes, then that node set is determined to be the target node set.
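The filtering rule above can be sketched in a few lines of Python. For brevity, nodes are represented here by the text of their code elements, and the set of memory operation names is an assumption taken from the examples given elsewhere in this application.

```python
from typing import List

MEMORY_OPS = {"malloc", "free", "new", "delete"}   # assumed set of memory operations


def target_subset(node_set: List[str]) -> List[str]:
    """Record the target nodes (memory operations) contained in a node set."""
    return [node for node in node_set if node in MEMORY_OPS]


def select_target_sets(node_sets: List[List[str]]) -> List[List[str]]:
    """Keep only the node sets whose recorded target subset is non-empty."""
    return [node_set for node_set in node_sets if target_subset(node_set)]


node_sets = [["if", "c", "free", "p"], ["for", "i", "x"], ["malloc", "p"]]
selected = select_target_sets(node_sets)
```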

[0026] Optionally, the detection module is specifically used for:

[0027] Obtain the node connection path to which the target node belongs from the set of target nodes to which the target node belongs;

[0028] Based on the node attributes of each node in the node connection path, determine the pointer variable information of the memory region to be operated on by the target node, and the pointer variable release information of the memory region;

[0029] Based on the pointer variable information pointing to the memory region and the pointer variable release information of the memory region, the memory state information of the memory region operated by the target node is obtained.
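One possible way to derive memory state information from a node connection path is sketched below. The event tuples `(operation, pointer)` and the `RegionState` fields are hypothetical simplifications of the node attributes, and the sketch assumes that a path concerns a single memory region.

```python
from dataclasses import dataclass, field
from typing import List, Set, Tuple


@dataclass
class RegionState:
    pointers: Set[str] = field(default_factory=set)      # pointer variables pointing to the region
    releases: List[str] = field(default_factory=list)    # pointers through which it was released


def region_state(path: List[Tuple[str, str]]) -> RegionState:
    """Walk the path once and collect pointer information and release information."""
    state = RegionState()
    for op, pointer in path:
        if op == "malloc":
            state.pointers.add(pointer)
        elif op == "free":
            state.releases.append(pointer)
    return state


state = region_state([("malloc", "pointer1"), ("use", "pointer1"), ("free", "pointer1")])
```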

[0030] Optionally, the detection module is specifically used for:

[0031] When it is detected, based on the memory state information, that no memory release operation is performed on a pointer variable after the pointer variable is allocated for the memory region operated on by the target node, it is determined that the code segment corresponding to the node connection path has a memory allocation failure vulnerability, that is, allocated memory that is never released (a memory leak).

[0032] A first error message for the memory allocation failure vulnerability is generated. Based on the first error message and the location information of the code segment corresponding to the node connection path in the source code, a memory vulnerability detection result is obtained.
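A minimal sketch of this rule, assuming that each node on the path has been reduced to a hypothetical `(operation, pointer, line)` event, might look as follows. The message wording and the result layout are illustrative only.

```python
from typing import Dict, List, Tuple

Event = Tuple[str, str, int]   # (operation, pointer variable, source line); assumed layout


def find_unreleased(path: List[Event]) -> List[Dict[str, object]]:
    """Report every pointer that is allocated on the path and never released."""
    allocated: Dict[str, int] = {}             # pointer -> line of its allocation
    for op, pointer, line in path:
        if op == "malloc":
            allocated[pointer] = line
        elif op == "free":
            allocated.pop(pointer, None)
    # Combine an error message with the location of the offending allocation.
    return [
        {"error": f"{pointer} is allocated but never released", "line": line}
        for pointer, line in allocated.items()
    ]


path = [("malloc", "pointer1", 3), ("malloc", "pointer2", 4), ("free", "pointer2", 9)]
findings = find_unreleased(path)
```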

[0033] Optionally, the detection module is specifically used for:

[0034] When it is detected, based on the memory state information, that after a pointer variable is allocated for the memory region operated on by the target node, no pointer variable points to the memory region any longer, it is determined that the code segment corresponding to the node connection path has a pointer relocation vulnerability.

[0035] A second error message for the pointer relocation vulnerability is generated. Based on the second error message and the location information of the code segment corresponding to the node connection path in the source code, a memory vulnerability detection result is obtained.
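The pointer relocation rule can be illustrated with the sketch below, which tracks which pointer variables still point to each allocated region. The `reassign` operation (the pointer is made to point elsewhere) and the use of the allocation's position in the path as a region identifier are assumptions of this example.

```python
from typing import Dict, List, Set, Tuple


def find_lost_regions(path: List[Tuple[str, str]]) -> List[int]:
    """Return the regions that are left with no pointer while still unreleased."""
    holders: Dict[int, Set[str]] = {}     # region id -> pointers that point to it
    points_to: Dict[str, int] = {}        # pointer -> region id
    freed: Set[int] = set()
    lost: List[int] = []
    for index, (op, pointer) in enumerate(path):
        if op in ("malloc", "reassign"):
            # The pointer stops pointing to whatever region it held before.
            old = points_to.pop(pointer, None)
            if old is not None:
                holders[old].discard(pointer)
                if not holders[old] and old not in freed:
                    lost.append(old)
            if op == "malloc":
                points_to[pointer] = index
                holders[index] = {pointer}
        elif op == "free" and pointer in points_to:
            freed.add(points_to[pointer])
    return lost


lost = find_lost_regions([("malloc", "pointer1"), ("reassign", "pointer1")])
```

Region 0 is reported because its only pointer is redirected before the region is released.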

[0036] Optionally, the detection module is specifically used for:

[0037] When it is detected, based on the memory state information, that multiple memory release operations are performed on a pointer variable after the pointer variable is allocated for the memory region operated on by the target node, it is determined that the code segment corresponding to the node connection path has a memory double release vulnerability.

[0038] A third error message corresponding to the memory double-release vulnerability is generated. Based on the third error message and the location information of the code segment corresponding to the node connection path in the source code, the memory vulnerability detection result is obtained.
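A sketch of the double release check, again over hypothetical `(operation, pointer, line)` events, is given below. A repeated release is reported at the line of the second release, which is one possible choice of location information.

```python
from typing import List, Set, Tuple


def find_double_release(path: List[Tuple[str, str, int]]) -> List[Tuple[str, int]]:
    """Report pointers that are released again without an intervening allocation."""
    released: Set[str] = set()
    findings: List[Tuple[str, int]] = []
    for op, pointer, line in path:
        if op == "malloc":
            released.discard(pointer)          # a fresh allocation resets the state
        elif op == "free":
            if pointer in released:
                findings.append((pointer, line))
            released.add(pointer)
    return findings


doubles = find_double_release(
    [("malloc", "pointer1", 2), ("free", "pointer1", 5), ("free", "pointer1", 8)]
)
```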

[0039] Optionally, the detection module is specifically used for:

[0040] When it is detected, based on the memory state information, that a pointer variable is allocated for the memory region operated on by the target node and that, after a memory release operation is performed on the pointer variable, a call operation continues to be performed on the pointer variable, it is determined that the code segment corresponding to the node connection path has a vulnerability in which the pointer is not set to null after memory release.

[0041] A fourth error message corresponding to the vulnerability of not setting the pointer to null after memory release is generated. Based on the fourth error message and the location information of the code segment corresponding to the node connection path in the source code, the memory vulnerability detection result is obtained.
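The following sketch illustrates this rule. The `use` and `set_null` operations are hypothetical event names: `use` stands for any call operation on the pointer, and `set_null` for assigning null to it after release.

```python
from typing import List, Set, Tuple


def find_use_after_release(path: List[Tuple[str, str, int]]) -> List[Tuple[str, int]]:
    """Report call operations on a pointer that was released and not set to null."""
    dangling: Set[str] = set()
    findings: List[Tuple[str, int]] = []
    for op, pointer, line in path:
        if op == "free":
            dangling.add(pointer)
        elif op in ("malloc", "set_null"):
            dangling.discard(pointer)          # the pointer no longer refers to freed memory
        elif op == "use" and pointer in dangling:
            findings.append((pointer, line))
    return findings


uses = find_use_after_release(
    [("malloc", "pointer1", 1), ("free", "pointer1", 4), ("use", "pointer1", 6)]
)
```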

[0042] Optionally, the detection module is specifically used for:

[0043] When it is detected, based on the memory state information, that multiple pointer variables are allocated for the memory region operated on by the target node and that, after a memory release operation is performed on one of the pointer variables, a call operation is performed on another of them, it is determined that the code segment corresponding to the node connection path has a vulnerability in which multiple pointers point to the same memory region.

[0044] A fifth error message is generated indicating a vulnerability where multiple pointers point to the same memory region. Based on the fifth error message and the location information of the pointer variable in the source code, a memory vulnerability detection result is obtained.
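To illustrate the case of multiple pointers sharing one memory region, the sketch below adds a hypothetical `alias` event of the form `("alias", new_pointer, existing_pointer)`. Region identifiers and event names are assumptions of this example.

```python
from typing import Dict, List, Tuple


def find_alias_use_after_release(path: List[Tuple[str, ...]]) -> List[str]:
    """Report pointers used after their shared region was released via another pointer."""
    region_of: Dict[str, int] = {}       # pointer -> region id
    released_by: Dict[int, str] = {}     # region id -> pointer that released it
    findings: List[str] = []
    next_region = 0
    for event in path:
        op, pointer = event[0], event[1]
        if op == "malloc":
            region_of[pointer] = next_region
            next_region += 1
        elif op == "alias":
            source = event[2]
            if source in region_of:
                region_of[pointer] = region_of[source]
        elif op == "free" and pointer in region_of:
            released_by[region_of[pointer]] = pointer
        elif op == "use" and pointer in region_of:
            region = region_of[pointer]
            # Only the cross-pointer case is reported here.
            if region in released_by and released_by[region] != pointer:
                findings.append(pointer)
    return findings


aliased = find_alias_use_after_release([
    ("malloc", "pointer1"),
    ("alias", "pointer2", "pointer1"),
    ("free", "pointer1"),
    ("use", "pointer2"),
])
```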

[0045] Optionally, the source code includes custom code elements, and the device further includes an extraction module for:

[0046] For each of the multiple nodes in the target syntax tree, perform the following operations:

[0047] If a node represents a custom code element, then the first character format of the custom code element is converted into the second character format corresponding to the element type of the custom code element. Based on the syntactic meaning of the custom code element and the second character format, the syntactic function of the custom code element is recorded.

[0048] Optionally, the apparatus further includes a generation module for:

[0049] After obtaining the memory vulnerability detection results of at least one node connection path in each target node set, a detection report is generated based on at least one memory vulnerability detection result.

[0050] In another aspect, an electronic device provided in this application includes a processor and a memory, wherein the memory stores a computer program, and when the computer program is executed by the processor, the processor performs the steps of any of the above-described memory vulnerability detection methods.

[0051] In another aspect, embodiments of this application provide a computer-readable storage medium including a computer program, which, when run on an electronic device, causes the electronic device to perform the steps of any of the above-described memory vulnerability detection methods.

[0052] In another aspect, embodiments of this application provide a computer program product, the computer program product including a computer program stored in a computer-readable storage medium; when the processor of an electronic device reads the computer program from the computer-readable storage medium, the processor executes the computer program, causing the electronic device to perform the steps of any of the above-described memory vulnerability detection methods.

[0053] The solution in this application embodiment has at least the following beneficial effects:

[0054] In this embodiment, to facilitate memory vulnerability detection of the target program, the source code of the target program is converted into a target syntax tree. Then, the target syntax tree is traversed, and for nodes that meet the set conditions, the node and other nodes connected to it are grouped into a node set. Since the memory vulnerability occurs in the code segment that performs memory operations, the target node set involving memory operations is analyzed. First, the target node set containing the target node that performs memory operations is selected from the multiple node sets obtained. Then, for each target node set, the node connection path to which each target node belongs is determined, the node attributes of each node in the node connection path are analyzed, the memory state information of the memory region being operated on is determined, and then the existence of a memory vulnerability in the memory region is determined based on the memory state information.

[0055] This application embodiment can perform memory vulnerability detection on each node connection path involving memory operations. Since memory vulnerabilities originate from changes in memory state during the execution of the target program, based on the memory state information of the memory region involved in each node connection path, it is possible to accurately detect whether there are memory vulnerabilities in the corresponding memory region, thereby improving the accuracy of memory vulnerability detection.

[0056] Other features and advantages of this application will be set forth in the description which follows, and will be apparent in part from the description, or may be learned by practicing the application. The objectives and other advantages of this application may be realized and obtained by means of the structures particularly pointed out in the written description, claims, and drawings.

Description of the Drawings

[0057] The accompanying drawings, which are included to provide a further understanding of this application and form part of this application, illustrate exemplary embodiments and are used to explain this application, but do not constitute an undue limitation of this application. In the drawings:

[0058] Figure 1 is a schematic diagram illustrating an application scenario of a memory vulnerability detection method according to an embodiment of this application;

[0059] Figure 2 is a flowchart of a memory vulnerability detection method according to an embodiment of this application;

[0060] Figure 3 is a schematic diagram of node connections in the target syntax tree for a code statement in an embodiment of this application;

[0061] Figure 4 is a schematic diagram of node connections in the target syntax tree for a judgment syntax structure in an embodiment of this application;

[0062] Figure 5 is a schematic diagram of node connections in the target syntax tree for a selection syntax structure in an embodiment of this application;

[0063] Figure 6 is a schematic diagram of node connections in the target syntax tree for a loop syntax structure in an embodiment of this application;

[0064] Figure 7 is a schematic diagram illustrating the generation process of a memory-related control flow diagram in an embodiment of this application;

[0065] Figure 8 is a schematic diagram illustrating the generation process of a six-tuple in an embodiment of this application;

[0066] Figure 9 is an overall flowchart of a memory vulnerability detection method according to an embodiment of this application;

[0067] Figure 10 is a schematic diagram of the composition structure of a memory vulnerability detection device according to an embodiment of this application;

[0068] Figure 11 is a schematic diagram of the composition structure of an electronic device using an embodiment of this application;

[0069] Figure 12 is a schematic diagram of the composition structure of another electronic device using an embodiment of this application.

Detailed Implementation

[0070] To make the objectives, technical solutions, and advantages of the embodiments of this application clearer, the technical solutions of this application will be clearly and completely described below with reference to the accompanying drawings of the embodiments of this application. Obviously, the described embodiments are only some embodiments of the technical solutions of this application, and not all embodiments. Based on the embodiments recorded in this application, all other embodiments obtained by those skilled in the art without creative effort are within the scope of protection of the technical solutions of this application.

[0071] The following describes some of the concepts involved in the embodiments of this application.

[0072] Memory vulnerabilities: The memory vulnerabilities detected in this application embodiment include memory leaks, double release of memory, and use of memory after it has been released.

[0073] Risk factors: Memory operation functions that may cause memory vulnerabilities, such as malloc, free, new, delete, etc.

[0074] Abstract Syntax Tree (AST): An abstract representation of the syntactic structure of source code. It represents the syntactic structure of source code in a tree-like form, where each node in the tree represents a code element in the source code.

[0075] Control flow diagram: A representation in computer science that uses graphs from mathematics to indicate all paths traversed during the execution of a computer program. In this application's embodiments, the memory-related control flow diagram is used to represent the connection relationships between nodes in a node set.

[0076] The preferred embodiments of this application are described below with reference to the accompanying drawings. It should be understood that the preferred embodiments described herein are for illustration and explanation only and are not intended to limit this application. Furthermore, the embodiments and features in the embodiments of this application can be combined with each other without conflict.

[0077] In current technologies, memory vulnerabilities are typically detected by traversing the program's source code. However, because program source code usually contains a large number of code segments involving memory operations, and its syntax is often complex, analyzing the source code directly can easily result in missed detections or false positives. Therefore, accurately detecting memory vulnerabilities in programs is a problem that urgently needs to be solved.

[0078] In view of this, embodiments of this application provide a memory vulnerability detection method, apparatus, electronic device, and storage medium. The method involves converting the source code of a target program into a target syntax tree (AST). Then, the AST is traversed, and for nodes that meet set conditions, a node set is formed, consisting of that node and other nodes connected to it. From the obtained node sets, a target node set containing the target node performing memory operations is selected. Next, for each target node set, the node connection path to which each target node belongs is determined. The node attributes of each node in the connection path are analyzed to determine the memory state information of the memory region being operated on. Based on this memory state information, it is then determined whether a memory vulnerability exists in that memory region. Embodiments of this application can perform memory vulnerability detection on each node connection path involving memory operations. Since memory vulnerabilities originate from changes in memory state during the execution of the target program, based on the memory state information of the memory region involved in each node connection path, the existence of memory vulnerabilities in the corresponding memory region can be accurately detected, thereby improving the accuracy of memory vulnerability detection.

[0079] Figure 1 is a schematic diagram of an application scenario of an embodiment of this application. The application scenario includes a terminal device 110 and a server 120. The terminal device 110 and the server 120 can communicate via a communication network.

[0080] In one alternative implementation, the communication network can be a wired network or a wireless network. Therefore, the terminal device 110 and the server 120 can be directly or indirectly connected via wired or wireless communication. This application embodiment does not impose specific limitations here.

[0081] In this application embodiment, the terminal device 110 includes, but is not limited to, mobile phones, tablets, laptops, desktop computers, e-book readers, smart voice interaction devices, smart home appliances, and in-vehicle terminals. The server 120 can be a backend server corresponding to an application, or a server specifically used for memory vulnerability detection; this application does not impose specific limitations. The server 120 can be an independent physical server, a server cluster or distributed system composed of multiple physical servers, or a cloud server providing basic cloud computing services such as cloud services, cloud databases, cloud computing, cloud functions, cloud storage, network services, cloud communication, middleware services, domain name services, security services, content delivery networks (CDNs), and big data and artificial intelligence platforms.

[0082] It should be noted that the memory vulnerability detection methods in the various embodiments of this application can be executed by the terminal device 110 or the server 120.

[0083] In some embodiments, taking server 120 as an example, after the developer designs the target program in terminal device 110, the target program needs to be tested for memory vulnerabilities. The target program can be an application program or an operating system program, etc. The terminal device 110 can send the source code of the target program to server 120, and server 120 uses the memory vulnerability detection method of the embodiment of this application to perform memory vulnerability detection on the source code of the target program.

[0084] In other embodiments, taking terminal device 110 as an example, after the developer has designed the target program in terminal device 110, they can perform memory vulnerability detection on the target program through terminal device 110. Terminal device 110 uses the memory vulnerability detection method of the embodiments of this application to perform memory vulnerability detection on the source code of the target program.

[0085] It should be noted that Figure 1 is merely illustrative; in practice, the number of terminal devices and servers is not specifically limited in the embodiments of this application.

[0086] The memory vulnerability detection method provided by the exemplary embodiments of this application will be described below with reference to the accompanying drawings and the application scenarios described above. It should be noted that the above application scenarios are only shown to facilitate understanding of the spirit and principles of this application, and the embodiments of this application are not limited in any way.

[0087] Figure 2 is an implementation flowchart of a memory vulnerability detection method provided in this application embodiment. Taking the server as the execution subject as an example, the specific implementation process of this method includes the following S21-S24.

[0088] S21. Convert the source code of the target program into a target syntax tree; each node in the target syntax tree represents a code element in the source code, and each node records node attributes containing the syntax functions of the corresponding code element.

[0089] The source code of the target program can be written in an existing programming language, such as, but not limited to, C++ and C. This embodiment of the application can use existing conversion tools to convert the source code of the target program into a target syntax tree. For example, conversion tools include, but are not limited to, Another Tool for Language Recognition (ANTLR). ANTLR is a parser generator rather than a compiler: given a user-defined grammar file (for example, a grammar for C or C++), it automatically generates a language recognizer and a traversal framework. The language recognizer generates an abstract syntax tree by performing lexical and syntactic analysis on the source code, and the traversal framework makes it easy to traverse this abstract syntax tree. The source code file, as a character stream, is split into lexical symbols by the lexical analyzer, and the lexical symbols, as a token stream, are parsed into an abstract syntax tree by the syntax analyzer. Before converting the source code of the target program into a target syntax tree, non-ASCII characters and comments in the source code can be removed.
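The optional clean-up step mentioned at the end of the preceding paragraph can be sketched as follows. This is a deliberately naive illustration: the regular expression and the function name `preprocess` are assumptions, and the sketch does not handle comment markers that appear inside string literals.

```python
import re

# Matches C-style block comments and line comments (naive: ignores string literals).
_COMMENT = re.compile(r"/\*.*?\*/|//[^\n]*", re.DOTALL)


def preprocess(source: str) -> str:
    """Remove comments, then drop every non-ASCII character."""
    without_comments = _COMMENT.sub("", source)
    return without_comments.encode("ascii", "ignore").decode("ascii")


cleaned = preprocess("int x = 1; // note\n/* block */int y;\u00e9")
```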

[0090] Based on the above abstract syntax tree, the target syntax tree can be obtained. The target syntax tree is a tree structure composed of multiple nodes. Each node can have a parent node, child nodes, or both. A node's parent node is the node that points to it, and a node's child nodes are the nodes that it points to. Each node represents a code element in the source code, that is, a construct in the source code (such as an expression, statement, or declaration). For example, an expression node represents arithmetic expressions, logical expressions, etc.; a statement node represents assignment statements, conditional statements, loop statements, etc.; a declaration node represents variable declarations, function declarations, etc.; a literal node represents literals such as strings, numbers, and boolean values; and an identifier node represents identifiers such as variable names and function names. For example, as shown in Figure 3, in the code statement "int x = 1 + 2", "int", "x", "assign", "+", "1", and "2" are each treated as a node: "int" represents a variable declaration statement; "x" represents the specific part of the variable declaration, with the variable name "x"; "assign" represents an assignment operation; "+" represents a binary expression with the operator "+"; "1" represents a numeric literal with a value of 1; and "2" represents a numeric literal with a value of 2. Figure 3 shows the connection relationships of these nodes in the target syntax tree.
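The node structure described above can be sketched in Python for the statement "int x = 1 + 2". The `Node` fields and, in particular, the parent and child layout chosen here are assumptions made for illustration; the actual connection relationships are those shown in Figure 3.

```python
from dataclasses import dataclass, field
from typing import Iterator, List, Optional


@dataclass
class Node:
    kind: str                                   # e.g. "declaration", "literal"
    text: str                                   # the code element itself
    children: List["Node"] = field(default_factory=list)
    parent: Optional["Node"] = field(default=None, repr=False, compare=False)

    def add(self, child: "Node") -> "Node":
        """Attach a child node and record this node as its parent."""
        child.parent = self
        self.children.append(child)
        return child


def preorder(node: Node) -> Iterator[Node]:
    """Visit a node first, then each of its subtrees from left to right."""
    yield node
    for child in node.children:
        yield from preorder(child)


# Assumed layout: int -> x -> assign -> + -> (1, 2)
decl = Node("declaration", "int")
var = decl.add(Node("variable", "x"))
assign = var.add(Node("assignment", "assign"))
plus = assign.add(Node("binary_expression", "+"))
plus.add(Node("literal", "1"))
plus.add(Node("literal", "2"))

order = [n.text for n in preorder(decl)]
```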

[0091] Each node records corresponding node attributes, which include at least the syntactic function of its code element. The syntactic function refers to the role that the code element (such as an operator, identifier, function, or variable) plays in the source code. For example, the syntactic function of a code element representing a function keyword (such as `fun`) may be to declare a function, recorded as `DECL_fun`. As another example, the syntactic function of a code element representing a memory allocation keyword (such as `malloc` or `new`) may be to allocate a pointer variable, recorded as `malloc_pointer`. The pointer variable itself stores the memory address of another variable (or object).

[0092] In some embodiments, after the source code of the target program is converted into a target syntax tree in S21 above, the syntactic function of the code element corresponding to each node in the target syntax tree can also be extracted. The source code usually contains custom code elements, such as function keywords and variable keywords (including numeric variable keywords and pointer variable keywords). This application embodiment does not need to focus on the names of function keywords and variable keywords, but mainly on their role (i.e., syntactic function) in the source code. Therefore, to facilitate recording the syntactic function of these nodes, the character format of custom code elements of the same element type can be unified. For example, function keywords and variable keywords can be replaced with a unified character format.

[0093] For each node in the target syntax tree, perform the following operations:

[0094] If a node represents a custom code element, then the first character format of the custom code element is converted into the second character format corresponding to the element type of the custom code element. Based on the syntactic meaning of the custom code element and the second character format, the syntactic function of the custom code element is recorded.

[0095] This allows for the pre-setting of uniform character formats for custom code elements of different element types. For example, the uniform character format for numeric variable keywords is "var," and different numeric variable keywords can be numbered, such as "var1," "var2," "var3," etc. The uniform character format for pointer variable keywords is "pointer," and different pointer variable keywords can be numbered, such as "pointer1," "pointer2," "pointer3," etc. Similarly, the uniform character format for function keywords is "fun," and different function keywords can be numbered, such as "fun1," "fun2," "fun3," etc.

[0096] Specifically, the process can traverse each node in the initial syntax tree. For each node representing a custom code element, the first character format of the custom code element is converted to the second character format, that is, the pre-defined unified character format corresponding to the element type of that custom code element. Furthermore, different custom code elements of the same element type are distinguished by different numbers. For example, if a custom code element is Judge (representing a function), Judge is converted to fun1; if another custom code element is Add (representing a function), Add is converted to fun2. Similarly, if a custom code element is x (representing a numeric variable), x is converted to var1; if another custom code element is y (representing a numeric variable), y is converted to var2.
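The renaming described above can be sketched as follows. This is only an illustrative sketch, not the implementation of the embodiment: the names ElementType and IdentifierNormalizer are hypothetical, and it is assumed that numbers are assigned in order of first appearance and that a repeated identifier keeps the unified name it was first given.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <utility>

// Element types of custom code elements named in the text.
enum class ElementType { Function, NumericVariable, PointerVariable };

// Hypothetical helper: converts a custom identifier (first character format)
// into "<unified prefix><number>" (second character format).
class IdentifierNormalizer {
public:
    std::string normalize(const std::string& name, ElementType type) {
        const auto key = std::make_pair(type, name);
        const auto found = assigned_.find(key);
        if (found != assigned_.end()) {
            return found->second;  // identifier already seen: reuse its unified name
        }
        // Assumption: numbering starts at 1 and follows order of first appearance.
        const std::string unified = prefix(type) + std::to_string(++counters_[type]);
        assigned_[key] = unified;
        return unified;
    }

private:
    // Unified character formats given in the text: "fun", "var", "pointer".
    static std::string prefix(ElementType type) {
        switch (type) {
            case ElementType::Function:        return "fun";
            case ElementType::NumericVariable: return "var";
            case ElementType::PointerVariable: return "pointer";
        }
        return "";
    }

    std::map<ElementType, int> counters_;
    std::map<std::pair<ElementType, std::string>, std::string> assigned_;
};
```

With this sketch, normalizing Judge and then Add as functions yields fun1 and fun2, and normalizing x and then y as numeric variables yields var1 and var2, matching the example in the text.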

[0097] In this embodiment, for each node representing a custom code element, the character format of the custom code element can be converted, and its syntactic function can be extracted. Specifically, when extracting the syntactic function of each custom code element, appropriate function keywords can be selected from a predefined function keyword table based on the specific meaning of the custom code element, and the corresponding syntactic function can be represented based on the function keywords and a unified character format. For non-custom code elements, appropriate function keywords can be selected from a predefined function keyword table based on the syntactic meaning of the non-custom code element, and the corresponding syntactic function can be recorded based on the function keywords.

[0098] For example, Table 1 shows the process of extracting the syntactic functions of the code elements corresponding to nodes.

[0099] Table 1

[0100]

[0101] The first column, "Source Code Statements," in Table 1 above shows the program's source code statements. The second column, "Converted Source Code Statements," shows the result of character format conversion of the custom code elements in the source code. The third column, "Node Names," shows the node names corresponding to the source code statements in the target syntax tree. The fourth column, "Extracted Syntax Functions," shows the syntax functions corresponding to the nodes in the target syntax tree. From the results of the fourth column, it can be seen that during the process of extracting syntax functions from the nodes in the target syntax tree, custom code elements in the source code (such as function keywords and variable keywords) are converted into a unified character format. For example, in the first line, "Judge" is converted to "fun1" and "x" is converted to "var1". The syntax function of "fun1" is DECL_fun1, which means that a function is declared with the unified character format fun1, and the specific name of the function in the source code can be ignored. The syntax function of "var1" is DECL_var1, which means that a numeric variable is declared with the unified character format var1, and the specific name of the numeric variable in the source code can be ignored. In the second line, "data" is converted to "pointer1". The syntax function of "pointer1" is DECL_pointer1, which means that a pointer variable is declared with the unified character format pointer1, and the specific name of the pointer variable in the source code can be ignored.

[0102] In this embodiment of the application, in order to facilitate the extraction of the syntax functions of custom code elements, the character format of custom code elements of the same element type is unified, so as to record the syntax functions of custom code elements of the same element type according to the unified character format.

[0103] In this embodiment of the application, the node attributes of each node in the target syntax tree include not only the syntax function (which can be understood as node features), but also some or all of the following: position in the source code (such as line number), position in the target syntax tree (such as node identifier), node type, node value, and set of child nodes; wherein, the node type is the element type of the code element represented by the node, such as string, declaration keyword, etc., and the node value can be the value of the code element represented by the node itself, such as "hello world", "nullptr", etc.

[0104] For example, the node attributes of each node in the target syntax tree can be stored according to a defined data structure, which can be represented as follows:

[0105] int id; // Node identifier

[0106] int no; // The position of the node in the source code (e.g., line number)

[0107] string kind; // Node type

[0108] string attr; // Syntax function

[0109] string value; // The value of the node

[0110] vector<ASTNode*> children; // Collection of child nodes

[0111] Assuming one node has a node identifier of 1 and another node has a node identifier of 9, the data structures for these two nodes are represented as follows:

[0112] Node1: id=1; no=2; kind="ASSIGN"; attr="ASSIGN"; value="nullptr";

[0113] children = {Node2, Node3};

[0114] Node9: id=9; no=4; kind="string"; attr="value"; value="hello world";

[0115] children = {Node10, Node11}.

[0116] S22. Traverse the target syntax tree; wherein, for each node traversed, when it is determined that the node meets the set conditions according to the corresponding node attributes, search for at least one other node in the target syntax tree that has a connection relationship with the node, and obtain a set of nodes containing the connection relationship.

[0117] The target syntax tree can be traversed using a defined traversal method (e.g., depth-first traversal, breadth-first traversal, etc.). The target syntax tree contains node sets related to memory operations and node sets corresponding to various syntax structures; a node set corresponding to a syntax structure can itself contain nodes related to memory operations. Optionally, the set conditions can be: the node performs a memory operation, or the node represents a key code element of a set syntax structure. This allows the node sets containing memory operations and the node sets corresponding to set syntax structures to be separated from the target syntax tree, facilitating subsequent memory vulnerability detection for each node set.

[0118] In some embodiments, when S22 above determines that a node meets the set conditions based on the corresponding node attributes, it searches the target syntax tree for at least one other node that has a connection relationship with that node to obtain a set of nodes containing the connection relationship. This can include the following two cases:

[0119] In the first case, when a node is determined to be the node that performs memory operations based on the corresponding node attributes, at least one other node with a connection relationship with that node is searched in the target syntax tree to obtain a set of nodes containing the connection relationship.

[0120] Memory operations include, but are not limited to, memory allocation, memory release, and memory usage. For example, memory allocation functions include, but are not limited to, malloc, calloc, realloc, and new; memory release functions include, but are not limited to, free and delete; and memory usage functions include, but are not limited to, use. The keywords for various memory operations can be determined according to the specific programming language, and are not limited thereto. In this embodiment, the syntax function of a node performing a memory operation may include the aforementioned memory operation keywords. For each traversed node, its syntax function can be used to determine whether it is a node that performs a memory operation.

[0121] When traversing to a node performing a memory operation, the successor nodes of that node can be searched for in the target syntax tree, that is, the node's direct and indirect children (i.e., children of children), to obtain the corresponding node set. Traversal then continues with the next node after this node set. For example, when traversing to node 1, which performs a memory operation, if nodes 2, 3, and 4, which are connected to node 1, are found in the target syntax tree, then these nodes are combined into a node set. The process then continues with the next node 5, and so on.
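The first case can be illustrated with a minimal sketch. It assumes the ASTNode fields listed earlier, assumes that a memory operation node is recognized by a keyword appearing in its syntax function (attr), and uses a pre-order depth-first walk; the function names isMemoryOperation and collectNodeSet are hypothetical.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Node attributes as listed in the data structure above.
struct ASTNode {
    int id = 0;                      // Node identifier
    int no = 0;                      // Position in the source code (e.g., line number)
    std::string kind;                // Node type
    std::string attr;                // Syntax function
    std::string value;               // Node value
    std::vector<ASTNode*> children;  // Collection of child nodes
};

// Assumption: the syntax function of a memory operation node contains one of
// these keywords. Plain substring matching is used only to keep the sketch short.
bool isMemoryOperation(const ASTNode& node) {
    static const std::vector<std::string> keywords = {
        "malloc", "calloc", "realloc", "new", "free", "delete"};
    for (const std::string& keyword : keywords) {
        if (node.attr.find(keyword) != std::string::npos) {
            return true;
        }
    }
    return false;
}

// Collects the node together with its direct and indirect children (pre-order DFS).
void collectNodeSet(ASTNode* node, std::vector<ASTNode*>& out) {
    if (node == nullptr) {
        return;
    }
    out.push_back(node);
    for (ASTNode* child : node->children) {
        collectNodeSet(child, out);
    }
}
```

For a node 1 whose children are nodes 2 and 3, with node 4 as a child of node 3, collectNodeSet gathers nodes 1, 2, 3, and 4 into one node set, after which an outer traversal would continue with node 5.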

[0122] In the second case, when a node is determined to represent a key code element of the syntax structure based on the corresponding node attributes, the target syntax tree is searched for at least one other node contained in the syntax structure to obtain the corresponding node set.

[0123] Among them, the key code elements that define the syntax structure can be understood as the starting points for defining the syntax structure.

[0124] In some embodiments, setting a syntax structure may include any of the following: a conditional syntax structure; a selection syntax structure; or a loop syntax structure.

[0125] For example, the key code elements of conditional syntax structures include if, and the syntax function extracted from this key code element can be represented as cond; the key code elements of selection syntax structures include switch, and the syntax function extracted from this key code element can be represented as select; the key code elements of loop syntax structures include for, while, do, etc., and the syntax function extracted from these key code elements can be represented as loop.
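The keyword-to-syntax-function mapping in this paragraph can be written as a small lookup. The function name structureSyntaxFunction is hypothetical, and returning an empty string for any other keyword is an assumption made for this sketch.

```cpp
#include <cassert>
#include <string>

// Maps a key code element to the syntax function named in the text.
// Assumption: an empty string means "not the start of a set syntax structure".
std::string structureSyntaxFunction(const std::string& keyword) {
    if (keyword == "if") {
        return "cond";    // conditional syntax structure
    }
    if (keyword == "switch") {
        return "select";  // selection syntax structure
    }
    if (keyword == "for" || keyword == "while" || keyword == "do") {
        return "loop";    // loop syntax structure
    }
    return "";
}
```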

[0126] For example, the code statement corresponding to a conditional syntax structure is: if(x>0){y=1;}else{y=2;}. The node connection relationship of this code statement in the target syntax tree is shown in Figure 4.

[0127] For example, the code statement corresponding to a selection syntax structure is: switch(x){case 1:printf("One"); break; case 2:printf("Two"); break; case 3:printf("Three"); break;}. The node connection relationship of this code statement in the target syntax tree is shown in Figure 5.

[0128] It should be noted that each of the conditional, selection, and loop structures can be nested within other structures. For example, in the following code statement, a conditional structure is nested within a loop structure: `while(a != b) { if (a > b) { a = a - b;}} return a;`. The node connection relationship of this code statement in the target syntax tree is shown in Figure 6.

[0129] The conditional syntax, selection syntax, and loop syntax structures described above may contain nodes related to memory operations. For example, the following code snippet illustrates a conditional syntax structure that includes memory operations:

[0130] // Use an if statement to perform conditional checks and allocate memory.

[0131] if (size > 0) {

[0132] // Dynamically allocate memory

[0133] array = new int[size]; // Allocate memory space for size integers

[0134] // Initialize array elements

[0135] for (int i = 0; i < size; ++i) {

[0136] array[i] = i + 1; // Assuming initialization is from 1 to size

[0137] } ...

[0139] // Release allocated memory

[0140] if (array != nullptr) {

[0141] delete[] array; // Ensure that dynamically allocated memory is released.

[0142] array = nullptr; // Set the pointer to nullptr to prevent dangling pointers.

[0143] }

[0144] Optionally, when the traversed node does not belong to the node performing memory operations, nor to the key code element of the above-mentioned syntax structure, the node set to which the node belongs can also be searched. This can be understood as the node set corresponding to other syntax structures. These other syntax structures are ordinary syntax structures other than judgment syntax structures, selection syntax structures, and loop syntax structures.

[0145] In this embodiment of the application, when the key code element of the set syntax structure is traversed, at least one other node contained in the set syntax structure is searched from the target syntax tree to obtain the corresponding node set. Then, the next node of the node set can be traversed.

[0146] Optionally, for each node set, the connection relationships between the nodes in the node set can be stored; for example, the node set can be stored using a Memory-Related Control Flow Graph (MRCFG).

[0147] First, the MRCFG is defined mathematically as follows: $MRCFG = (N, E, Entry, Exit, \sigma)$. Here, $N$ is the set of nodes in the MRCFG, $N = \{n_1, n_2, n_3, \ldots, n_m\}$, where $n_i$ is the current node in the MRCFG, $1 \le i \le m$, and $m$ is the number of nodes. Each node is $n_i = \{id, no, PRE\_ID, NEXT\_ID\}$, where $id$ is the node identifier of the current node $n_i$ in the MRCFG, $no$ is the position of the current node $n_i$ in the source code (e.g., line number), $PRE\_ID$ is the set of node identifiers of all predecessor nodes pointing to the current node $n_i$, $PRE\_ID = \{id_1, id_2, \ldots, id_k\}$, $k \in [1, m]$, and $NEXT\_ID$ is the set of successor nodes pointed to by the current node $n_i$, $NEXT\_ID = \{id_1, id_2, \ldots, id_k\}$, $k \in [1, m]$. For a node $n$ in the MRCFG, we have ... $E$ is the set of edges in the MRCFG, representing the pointing relationships between nodes in the MRCFG, $E = \{\langle n_1, n_2 \rangle, \langle n_2, n_3 \rangle, \ldots, \langle n_{m-1}, n_m \rangle\}$, where $\langle n_1, n_2 \rangle$ is an ordered pair of nodes with the pointing relationship $n_1 \rightarrow n_2$. $Entry$ is the entry node of the MRCFG, $Exit$ is the exit node of the MRCFG, and $\sigma$ is the set of hazard factors. Hazard factors are nodes that perform memory operations (such as memory allocation, memory deallocation, etc.). For $\varepsilon$ in the MRCFG, we have ... $\sigma = \{\varepsilon_1, \varepsilon_2, \varepsilon_3, \ldots, \varepsilon_k\}$, where $k$ is the number of hazard factors; if the MRCFG does not contain any hazard factors, then $\sigma$ is empty.

[0148] The following section explains the executable path, predecessor node, successor node, and dependencies involved in MRCFG.

[0149] Executable path: In the MRCFG, if there exists a node sequence $P = \{n_1, n_2, n_3, \ldots, n_k\}$ in which a node can point to one or more other nodes (i.e., child nodes), and any two adjacent nodes $n_i, n_{i+1}$ have a pointing relationship, that is, $\langle n_i, n_{i+1} \rangle \in E$, then $P$ is an executable path in the MRCFG, denoted as $P(n_i, n_j)$.

[0150] An executable path specifically refers to the sequence of AST nodes that a program will actually traverse during execution, depending on the choice of conditional branches (such as if statements and switch statements), loop structures (such as for and while statements), and other control flow statements. Each executable path corresponds to a possible behavior pattern of the program during runtime. An MRCFG can contain one or more executable paths. For example, the loop syntax structure shown in Figure 6 includes two executable paths. One executable path is: in the while loop, if a and b are not equal, the loop body is entered. The if condition is evaluated first; if a is greater than b, the assign operation is executed, assigning the result of a - b to a. The other executable path is: if a and b are equal, the while loop is exited. Outside the while loop, the return operation is executed, returning the value of a.

[0151] Predecessor and successor nodes: In the MRCFG, if there exists an executable path $P(n_i, n_j)$, then $n_i$ is a predecessor node of $n_j$, denoted as $PRE(n_i, n_j)$, and $n_j$ is a successor node of $n_i$, denoted as $SUC(n_j, n_i)$.

[0152] It should be noted that a node's predecessor node can be its direct parent node or indirect parent node (i.e., the parent node's parent node), and a node's successor node can be its direct child node or indirect child node (i.e., the child node's child node).

[0153] Necessary successor node: If $n_j$ is a successor node of $n_i$ and, regardless of how many branches exist in between, execution after $n_i$ will definitely pass through $n_j$, then $n_j$ is called a necessary successor node of $n_i$, denoted as $CV(n_j, n_i)$.

[0154] Dependency: If there exists an executable path $P$ from $n_i$ to $n_j$, and the pointer variable pointer is defined at $n_i$ and is not redefined at $n_j$, then node $n_j$ is said to depend on node $n_i$, denoted as $DEP(n_i, n_j, pointer)$.
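The definitions of executable path and successor node can be sketched as simple graph checks. This is an illustrative sketch under stated assumptions: nodes are identified by integer ids, the edge set E is stored as adjacency sets (the NEXT_ID of each node), and the type name Mrcfg and its member names are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <queue>
#include <set>
#include <vector>

// Assumption: only the edge set E is modeled here, as id -> NEXT_ID.
struct Mrcfg {
    std::map<int, std::set<int>> next;

    void addEdge(int from, int to) { next[from].insert(to); }

    // True if <from, to> is in E.
    bool hasEdge(int from, int to) const {
        const auto it = next.find(from);
        return it != next.end() && it->second.count(to) > 0;
    }

    // P is an executable path if every adjacent pair <n_i, n_{i+1}> is in E.
    bool isExecutablePath(const std::vector<int>& path) const {
        if (path.size() < 2) {
            return false;
        }
        for (std::size_t i = 0; i + 1 < path.size(); ++i) {
            if (!hasEdge(path[i], path[i + 1])) {
                return false;
            }
        }
        return true;
    }

    // SUC(n_j, n_i): n_j can be reached from n_i along some executable path
    // (breadth-first search over NEXT_ID).
    bool isSuccessor(int nj, int ni) const {
        std::set<int> seen;
        std::queue<int> work;
        work.push(ni);
        while (!work.empty()) {
            const int current = work.front();
            work.pop();
            const auto it = next.find(current);
            if (it == next.end()) {
                continue;
            }
            for (int candidate : it->second) {
                if (candidate == nj) {
                    return true;
                }
                if (seen.insert(candidate).second) {
                    work.push(candidate);
                }
            }
        }
        return false;
    }
};
```

As a usage example, a graph loosely modeled on a while loop containing an if statement could use edges 1→2, 2→3, 3→1, and 1→4 (the node ids are made up for illustration): the sequence 1, 2, 3 is then an executable path, while 1, 3 is not, and node 4 is a successor of node 2 through the loop back edge.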

[0155] Secondly, define the data structure of nodes in the MRCFG. The data structure of a node in the MRCFG can include: id representing the node identifier of a node in the MRCFG, no representing the line number of a node in the source code, PRE_ID representing the set of node identifiers pointing to the predecessor nodes of a node, NEXT_ID representing the set of all next nodes pointed to by a node, risks representing the risk factor characteristics of a node (i.e., the syntactic functions in the above embodiment), isCond representing whether a node is the starting node of a judgment syntax structure (referred to as the judgment node), isSelect representing whether a node is the starting node of a selection syntax structure (referred to as the selection node), isLoop representing whether a node is the starting node of a loop syntax structure (referred to as the loop node), isEntry representing whether a node is the MRCFG entry node, and isExit representing whether a node is the MRCFG exit node. For example, the data structure of a node in the MRCFG is represented as follows:

[0156] int id; // Node identifier of a node

[0157] int no; // Line number of a node in the source code

[0158] vector<string> PRE_ID; // Set of node identifiers of the predecessor nodes pointing to a node

[0159] vector<string> NEXT_ID; // Set of successor nodes pointed to by a node

[0160] vector<string> risks; // Risk factor characteristics of a node

[0161] bool isCond = false; // Not a judgment node

[0162] bool isSelect = false; // Not a selection node

[0163] bool isLoop = false; // Not a loop node

[0164] bool isEntry = false; // Not an MRCFG entry node

[0165] bool isExit = false; // Not an MRCFG exit node

[0166] When node sets are stored using the MRCFG described above, as shown in Figure 7, during the traversal of the target syntax tree, if a node performing a memory operation (such as malloc, realloc, new, free, etc.) is encountered, the successor nodes of that node can be obtained, generating a memory-related control flow graph of a normal structure; if a judgment node is encountered, the successor nodes of that node can be obtained, generating a memory-related control flow graph of a judgment structure; if a selection node is encountered, the successor nodes of that node can be obtained, generating a memory-related control flow graph of a selection structure; if a loop node is encountered, the successor nodes of that node can be obtained, generating a memory-related control flow graph of a loop structure.

[0167] In addition, when traversing to a node that neither performs a memory operation nor is a loop node, selection node, or judgment node, the successor nodes of that node can also be obtained to generate a memory-related control flow graph of a normal structure.

[0168] In this embodiment, the memory-related control flow graph includes not only the flow relationship between code statements, but also the information of nodes related to memory operations, which can clearly show all control dependencies and data dependencies related to memory operations in the source code.

[0169] S23. Select at least one target node set from the obtained multiple node sets, each target node set containing at least one target node representing the execution of memory operations.

[0170] On one hand, after generating multiple node sets, for each node set, it can be checked whether the target node is contained in that node set. On the other hand, during the traversal of the target syntax tree in S22 above, for each node set obtained, it can be recorded whether the node set contains the target node to perform the memory operation.

[0171] In some embodiments, in S22 above, for each traversed node, after searching the target syntax tree to find at least one other node that has a connection relationship with the node and obtaining the corresponding node set, the target node subset contained in the node set can also be recorded; wherein, when a node set contains a target node, the target node subset contains one or more target nodes, and when a node set does not contain a target node, the target node subset is empty.

[0172] Optionally, when recording a subset of target nodes included in the node set, the node identifier of each target node in the subset of target nodes can be recorded.

[0173] At this point, in S23 above, when selecting at least one target node set containing the target node from the multiple node sets obtained, the following operations can be performed on each of the multiple node sets: if a node set contains a subset of target nodes that is not empty, then that node set is determined to be the target node set.

[0174] Specifically, when the node set is stored using the memory-related control flow graph described above, the target node subset can be understood as the set of hazard factors in that memory-related control flow graph.

[0175] In this embodiment of the application, during the traversal of the target syntax tree, each time a node set is obtained, the subset of target nodes contained in that node set can be recorded, so that the target node sets can be filtered out of the multiple node sets more quickly. Furthermore, subsequent memory vulnerability detection only needs to be performed on the target node sets, thus improving the efficiency of memory vulnerability detection.
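The selection in S23 can be sketched as a filter over the recorded target node subsets. The structure NodeSet and the function selectTargetNodeSets are hypothetical names, and storing nodes by integer identifier is an assumption of this sketch.

```cpp
#include <cassert>
#include <vector>

// Hypothetical record of one node set obtained while traversing the syntax tree.
struct NodeSet {
    std::vector<int> nodeIds;        // identifiers of all nodes in the set
    std::vector<int> targetNodeIds;  // recorded subset of nodes performing memory operations
};

// A node set is a target node set if its recorded target node subset is not empty.
std::vector<NodeSet> selectTargetNodeSets(const std::vector<NodeSet>& nodeSets) {
    std::vector<NodeSet> targets;
    for (const NodeSet& nodeSet : nodeSets) {
        if (!nodeSet.targetNodeIds.empty()) {
            targets.push_back(nodeSet);
        }
    }
    return targets;
}
```

Because the subset is recorded when each node set is built, this step does not need to inspect the nodes again; it only checks whether the recorded subset is empty.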

[0176] S24. For each target node in each target node set, perform the following operations: analyze the node attributes of each node in the node connection path to which the target node belongs, obtain the memory state information of the memory region operated by the target node, and perform memory vulnerability detection on the node connection path based on the memory state information and the set vulnerability detection rules, and obtain the corresponding memory vulnerability detection results.

[0177] In this embodiment of the application, considering that the cause of memory vulnerabilities is the change in memory state during program execution, and that the target nodes in the target node set are the nodes that perform memory operations, the node connection path (such as the executable path in the above embodiment) of each target node in each target node set can be determined to obtain the memory state information of the memory region operated by the target node.

[0178] In some embodiments, when analyzing the node attributes of each node in the node connection path to which the target node belongs in S24 above to obtain the memory state information of the memory region operated by the target node, the node connection path to which the target node belongs can be obtained from the target node set to which the target node belongs; based on the node attributes of each node in the node connection path, the pointer variable information pointing to the memory region operated by the target node and the pointer variable release information of the memory region are determined; based on the pointer variable information pointing to the memory region and the pointer variable release information of the memory region, the memory state information of the memory region operated by the target node is obtained.

[0179] In the process of traversing the target node set, when a target node is encountered, the node connection path to which the target node belongs (i.e., the executable path in the above embodiment) is searched in the target node set. Then, the syntax function of each node in the node connection path is analyzed to determine the memory region involved in the memory operation performed by the target node, the pointer variable information pointing to the memory region (such as the name of the pointer variable, the value of the pointer variable, the number of pointer variables, etc.), and the pointer variable release information of the memory region (such as the number of times it has been released).

[0180] In this embodiment, on the one hand, considering that the main cause of memory vulnerabilities is the change in memory state during program execution, a method that only focuses on pointer data streams cannot effectively and intuitively analyze whether memory vulnerabilities exist in the program. On the other hand, the cause of memory state changes may be caused by pointer variables. In order to establish the relationship between pointer variables and the memory regions they operate on, and to simultaneously focus on the information of pointer variables and the information of the memory regions they operate on, this embodiment uses a data structure to store the memory state information of the memory regions operated on by the target node. Optionally, the data structure can be a six-tuple, which will be described below.

[0181] First, define the data structure for a six-tuple as follows: $six\_tuple = \{n_i, pointer/value, F, RegionNum, Pcnt, FreeTimes\}$, where $six\_tuple$ represents a six-tuple and $n_i$ represents the node identifier of the target node. $pointer/value$ represents the name and value of the pointer variable within the target node: the pointer variable name identifies the specific pointer variable, and the value identifies the memory region (i.e., memory address) it points to. $F$ represents the constraints between nodes, recording dependencies, predecessor nodes, successor nodes, etc., along the executable path. $RegionNum$ represents the identifier of the memory region pointed to by the pointer variables in the target node. $Pcnt$ represents the number of pointer variables pointing to $RegionNum$, and $FreeTimes$ represents the number of times the pointer variables corresponding to the memory region $RegionNum$ have been freed.
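One possible in-memory layout of the six-tuple is sketched below. The field types, the helper makeAllocated, and the helper recordFree are assumptions made for illustration; the text only fixes the six components and their meaning.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Sketch of the six-tuple {n_i, pointer/value, F, RegionNum, Pcnt, FreeTimes}.
struct SixTuple {
    int nodeId = 0;                        // n_i: node identifier of the target node
    std::string pointerName;               // pointer: name of the pointer variable
    std::string pointerValue;              // value: memory region (address) it points to
    std::vector<std::string> constraints;  // F: constraints along the executable path
    int regionNum = 0;                     // RegionNum: identifier of the memory region
    int pcnt = 0;                          // Pcnt: number of pointers to RegionNum
    int freeTimes = 0;                     // FreeTimes: number of releases of RegionNum
};

// Hypothetical helper: state right after one pointer is given a fresh region.
SixTuple makeAllocated(int nodeId, const std::string& name,
                       const std::string& value, int regionNum) {
    SixTuple tuple;
    tuple.nodeId = nodeId;
    tuple.pointerName = name;
    tuple.pointerValue = value;
    tuple.regionNum = regionNum;
    tuple.pcnt = 1;  // assumption: exactly one pointer refers to a new region
    return tuple;
}

// Hypothetical helper: records one release of the region.
void recordFree(SixTuple& tuple) { ++tuple.freeTimes; }
```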

[0182] Secondly, a method for constructing six-tuples is designed. As shown in Figure 8, based on each target node set obtained in the above embodiments of this application, each node in the target node set is traversed. When a target node performing memory operations (such as memory operation functions like malloc, realloc, new, and free) is encountered, the syntax structure of the code statement containing the target node is determined. If it is a judgment syntax structure, the six-tuple information corresponding to the target node in the judgment syntax structure is extracted, including the node identifier of the target node, the name and value of the pointer variable in the target node, the constraint relationship, the identifier number of the memory region, the number of pointer variables pointing to the memory region, and the number of times the pointer variables of the memory region are released, etc., and added to the six-tuple corresponding to the target node. If it is a selection syntax structure, the six-tuple information corresponding to the target node in the selection syntax structure is extracted and added to the six-tuple corresponding to the target node. If it is a loop syntax structure, the six-tuple information corresponding to the target node in the loop syntax structure is extracted and added to the six-tuple corresponding to the target node. If it is a normal syntax structure, the six-tuple information corresponding to the target node in the normal syntax structure is extracted and added to the six-tuple corresponding to the target node.

[0183] After obtaining the memory state information corresponding to the target nodes in each target node set, memory vulnerability detection rules can be set to perform memory vulnerability detection on the node connection paths, obtaining the corresponding memory vulnerability detection results. The vulnerability detection rules can be configured according to the memory vulnerabilities to be detected. For example, memory vulnerabilities include memory leaks, double-free vulnerabilities, and memory read/write vulnerabilities after freeing. Different vulnerability detection rules are set for different memory vulnerabilities. The memory state information corresponding to the target nodes is matched with the constraints of the different vulnerability detection rules to determine whether a memory vulnerability exists and, if so, the type of vulnerability. When the memory vulnerability detection result indicates the presence of a memory vulnerability, it specifically includes the type of memory vulnerability and the location where the memory vulnerability occurs (e.g., the line number of the node in the source code).

[0184] In this embodiment, to facilitate memory vulnerability detection of the target program, the source code of the target program is converted into a target syntax tree. Then, the target syntax tree is traversed, and for nodes that meet the set conditions, a node set is formed, along with other nodes connected to that node. Since memory vulnerabilities occur in code segments that perform memory operations, the target node set involving memory operations is analyzed. First, a target node set containing the target node performing memory operations is selected from the multiple node sets obtained. Next, for each target node set, the node connection path to which each target node belongs is determined, and the node attributes of each node in the connection path are analyzed to determine the memory state information of the memory region being operated on. Based on this memory state information, it is then determined whether a memory vulnerability exists in that memory region. This embodiment can perform memory vulnerability detection on each node connection path involving memory operations. Since memory vulnerabilities originate from changes in memory state during the execution of the target program, based on the memory state information of the memory region involved in each node connection path, it is possible to accurately detect whether a memory vulnerability exists in the corresponding memory region, thereby improving the accuracy of memory vulnerability detection.

[0185] The memory vulnerabilities detected in this application's embodiments fall into three main categories: memory leak vulnerabilities, memory double-free vulnerabilities, and memory read/write vulnerabilities after memory freeing. Memory leak vulnerabilities include two scenarios: memory not being freed after allocation and pointer relocation. Memory read/write vulnerabilities after memory freeing include two scenarios: pointers not being set to null after memory freeing and multiple pointers pointing to the same memory region.

[0186] The following embodiments of this application will describe the detection process of various memory vulnerabilities. First, the memory operation functions involved in the detection process will be defined.

[0187] Memory allocation function (MemoryAllocation): $MA(n_i, pointer)$ indicates that memory is allocated to the pointer variable pointer at node $n_i$. For example, memory allocation functions in C/C++ include malloc, calloc, realloc, new, and new[].

[0188] Declare a pointer (DefinePointer): $DPO(n_x, pointer)$ indicates that the pointer or object is declared at node $n_x$.

[0189] Memory release function (MemoryFree): $MF(n_x, pointer)$ indicates that the pointer variable pointer is released at node $n_x$. Memory release functions include free, delete, and delete[].

[0190] Use a pointer (UsePointer): $Use(n_x, pointer)$ indicates that the pointer variable pointer is used at node $n_x$.
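The allocation and release functions named above can be mapped to the MA and MF operations with a small lookup. The enum MemOp and the function classifyMemoryFunction are hypothetical names, and treating every other function name as Other is an assumption of this sketch.

```cpp
#include <cassert>
#include <set>
#include <string>

// MA = memory allocation, MF = memory release, Other = anything else.
enum class MemOp { MA, MF, Other };

// Classifies a C/C++ function or operator name using the lists given in the text.
MemOp classifyMemoryFunction(const std::string& name) {
    static const std::set<std::string> allocation = {
        "malloc", "calloc", "realloc", "new", "new[]"};
    static const std::set<std::string> release = {"free", "delete", "delete[]"};
    if (allocation.count(name) > 0) {
        return MemOp::MA;
    }
    if (release.count(name) > 0) {
        return MemOp::MF;
    }
    return MemOp::Other;
}
```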

[0191] The following examples first introduce the detection process for memory leak vulnerabilities (including the vulnerability of memory not being freed after allocation and the pointer relocation vulnerability).

[0192] In C/C++ programs, forgetting to free a memory region after allocating it causes a memory leak. Frequent memory leaks can rapidly consume memory resources, leading to system crashes and shutdowns. Memory leaks are therefore a major cause of program crashes.

[0193] In some embodiments, the memory vulnerability detection performed on the node connection path in S24 above, based on memory state information and set vulnerability detection rules, to obtain the corresponding memory vulnerability detection results, includes:

[0194] Based on the memory state information, when it is detected that, after memory is allocated to a pointer variable in the memory region operated on by the target node, no memory release operation is performed on that pointer variable, it is determined that the code segment corresponding to the node connection path has a memory allocation failure vulnerability, that is, memory that is allocated but never released. A first error message for the memory allocation failure vulnerability is generated, and the memory vulnerability detection result is obtained based on the first error message and the location information, in the source code, of the code segment corresponding to the node connection path.

[0195] This embodiment of the application defines a first vulnerability detection rule for the vulnerability of memory not being released after allocation, as follows:

[0196]

[0197] Where $n_i$ and $n_j$ belong to the node set $N$ and the executable path $P$; $n_i$ is the predecessor node of $n_j$, $n_j$ is the successor node of $n_i$, and $n_j$ lies on the necessary path of $n_i$. The pointer variable `pointer` is declared at node $n_x$, and a memory region is allocated at node $n_y$. $n_i$ and $n_j$ have a dependency relationship. At this point, it can be determined that the allocated memory area has not been released.

[0198] Based on the first vulnerability detection rule above, the system checks whether allocated memory is left unreleased. First, a data structure for the memory vulnerability detection results (e.g., error) is defined, which can include the line number where the vulnerability occurs and the vulnerability type. Then, a stack is defined to store the memory state information (e.g., the six-tuple in the above embodiment) of the target nodes in the target node set (e.g., the memory-related control flow graph). Memory vulnerability detection is performed on the executable path of each target node in the target node set. Specifically, when the target node is the last node in the target node set and the memory state information shows that an MA (memory allocation) operation was performed on a pointer variable but no MF (memory release) operation followed, the first type of memory leak is identified: the memory allocation failure vulnerability. The location of the target node (e.g., the line number in the source code) is read, and the error information of the memory allocation failure vulnerability (i.e., the first error message mentioned above) and the location of the target node are added to the data structure of the memory vulnerability detection results.
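The check described above can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the implementation of this application: the path is given as hypothetical `(op, line, pointer)` tuples rather than six-tuples, a dictionary stands in for the stack of memory state information, and the names `Error` and `detect_unreleased` are invented for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Error:
    line: int  # line number in the source code
    kind: str  # vulnerability type


def detect_unreleased(path):
    """path: (op, line, pointer) tuples in execution order along one path."""
    pending = {}  # pointer -> line of an MA that has no later MF yet
    for op, line, pointer in path:
        if op == "MA":
            pending[pointer] = line
        elif op == "MF":
            pending.pop(pointer, None)
    # Reaching the last node with allocations still pending means a leak.
    return [Error(line, "memory not released after allocation")
            for line in pending.values()]


leaky = [("DPO", 1, "p"), ("MA", 2, "p"), ("Use", 3, "p")]
clean = leaky + [("MF", 4, "p")]
print(detect_unreleased(leaky))
print(detect_unreleased(clean))
```

The sketch reports the line of the allocation; reporting the line of the last node on the path instead would be an equally plausible reading of the paragraph above.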

[0199] In some embodiments, the memory vulnerability detection performed on the node connection path in S24 above, based on memory state information and set vulnerability detection rules, to obtain the corresponding memory vulnerability detection results, includes:

[0200] Based on the memory state information, when it is detected that, after memory is allocated to a pointer variable in the memory region operated on by the target node, no pointer variable points to that memory region any longer, it is determined that the code segment corresponding to the node connection path has a pointer relocation vulnerability. A second error message for the pointer relocation vulnerability is generated, and the memory vulnerability detection result is obtained based on the second error message and the location information, in the source code, of the code segment corresponding to the node connection path.

[0201] This embodiment of the application defines a second vulnerability detection rule for pointer relocation vulnerabilities, as follows:

[0202]

[0203] Where $n_i$, $n_j$, and $n_k$ belong to the node set $N$ and the executable path $P$; $n_i$ is the predecessor node of $n_j$, $n_j$ is the successor node of $n_i$, and $n_j$ lies on the necessary path of $n_i$. At node $n_k$ a memory region exists, that is, RegionNum is not -1, but no pointer variable points to this memory region: the count Pcnt of pointer variables pointing to this memory region is 0. This indicates that, after the memory region was allocated, the pointer variable that pointed to it was redirected to another memory region, so the memory region can no longer be released, which causes a memory leak.

[0204] Based on the second vulnerability detection rule above, the system checks whether a pointer relocation vulnerability exists. First, a data structure (e.g., error) for the memory vulnerability detection results is defined, which can include the location where the vulnerability occurs (e.g., the line number in the source code) and the type of vulnerability. Then, a stack is defined to store the memory state information (e.g., the six-tuple in the above embodiment) of the target nodes in the target node set (e.g., the memory-related control flow graph). Memory vulnerability detection is performed on the executable path of each target node in the target node set. Specifically, when the memory state information shows that a memory region was allocated for the pointer variable corresponding to the target node, that the count of pointer variables pointing to this memory region is currently 0, and that no memory release operation has been performed on it, the second type of memory leak is identified: the pointer relocation vulnerability. The location of the target node (e.g., the line number in the source code) is read, and the error information of the pointer relocation vulnerability (i.e., the second error message mentioned above) and the location of the target node are added to the data structure of the memory vulnerability detection results.

[0205] The embodiments of this application analyze memory leak vulnerabilities, give precise definitions of them, and formally describe their characteristics so that their essential features are expressed clearly. Based on the six-tuple of the target nodes obtained in the above embodiments, the nodes on the executable path in the memory-related control flow graph are evaluated to detect whether the code segment corresponding to the executable path contains memory vulnerabilities.
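A minimal Python sketch of the pointer relocation check follows. `RegionNum` and `Pcnt` are the quantities named in the rule above; everything else (the `(op, line, pointer)` event tuples, the dictionaries, and the function name `detect_relocation`) is an assumption made for illustration, and only reallocation through the same pointer is modeled.

```python
def detect_relocation(path):
    """path: (op, line, pointer) tuples in execution order along one path."""
    region_of = {}    # pointer -> RegionNum it currently points to
    pcnt = {}         # RegionNum -> Pcnt, the number of pointers to the region
    released = set()  # RegionNums that have been released
    errors = []
    for op, line, pointer in path:
        if op == "MA":
            old = region_of.get(pointer, -1)  # -1 means "no region"
            if old != -1:
                pcnt[old] -= 1
                # The old region still exists but nothing points to it.
                if pcnt[old] == 0 and old not in released:
                    errors.append((line, "pointer relocation"))
            new = len(pcnt)  # next unused RegionNum
            region_of[pointer] = new
            pcnt[new] = 1
        elif op == "MF":
            region = region_of.get(pointer, -1)
            if region != -1:
                released.add(region)
    return errors


# p = malloc(8); p = malloc(16);           first region becomes unreachable
relocated = [("DPO", 1, "p"), ("MA", 2, "p"), ("MA", 3, "p")]
# p = malloc(8); free(p); p = malloc(16);  no relocation leak
reused = [("DPO", 1, "p"), ("MA", 2, "p"), ("MF", 3, "p"), ("MA", 4, "p")]
print(detect_relocation(relocated))
print(detect_relocation(reused))
```

Tracking a per-region count rather than a per-pointer flag is what distinguishes this rule from the unreleased-allocation rule: the leak is a property of the region, not of the variable.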

[0206] The following section will introduce the detection process for the memory double-free vulnerability.

[0207] In C / C++ programs, a memory region can be freed only once. If the memory release function is called again on a region that has already been freed, an error occurs. For example, if a class defines no copy constructor, the compiler automatically generates a default copy constructor. If the class contains a pointer variable pointing to a memory region, calling the default copy constructor produces a shallow copy, so the pointer variables in the two objects actually point to the same memory region. Releasing both pointer variables frees that region twice, which can cause system errors or even system crashes.

[0208] In some embodiments, the memory vulnerability detection performed on the node connection path in S24 above, based on memory state information and set vulnerability detection rules, to obtain the corresponding memory vulnerability detection results, includes:

[0209] Based on the memory state information, when it is detected that, after memory is allocated to a pointer variable in the memory region operated on by the target node, multiple memory release operations are performed on that pointer variable, it is determined that the code segment corresponding to the node connection path has a memory double-free vulnerability. A third error message corresponding to the memory double-free vulnerability is generated, and the memory vulnerability detection result is obtained based on the third error message and the location information, in the source code, of the code segment corresponding to the node connection path.

[0210] This application's embodiments design a third vulnerability detection rule for memory double-free vulnerabilities, as follows:

[0211]

[0212] Where $n_i$ and $n_j$ belong to the node set $N$ and the executable path $P$; $n_i$ is the predecessor node of $n_j$, $n_j$ is the successor node of $n_i$, and $n_j$ lies on the necessary path of $n_i$. The pointer variable `pointer` is declared at node $n_x$, a memory region is allocated at node $n_y$, and the memory region is released at node $n_u$. $n_i$ and $n_j$ have a dependency relationship, but the memory area pointed to by the pointer variable `pointer` has been freed more than once. In this case, it is determined that the memory has been freed repeatedly.

[0213] Based on the third vulnerability detection rule above, the system checks whether a memory double-free vulnerability exists. First, a data structure (e.g., error) for the memory vulnerability detection results is defined, which can include the location of the vulnerability (e.g., the line number in the source code) and the type of vulnerability. Then, a stack is defined to store the memory state information (e.g., the six-tuple in the above embodiment) of the target nodes in the target node set (e.g., the memory-related control flow graph). Memory vulnerability detection is performed on the executable path of each target node in the target node set. Specifically, when the memory state information shows that the pointer variable corresponding to the target node has been released more than once, a double-free vulnerability is identified. The location of the target node (e.g., the line number in the source code) is read, and the error information of the memory double-free vulnerability (i.e., the third error message mentioned above) and the location of the target node are added to the data structure of the memory vulnerability detection results.

[0214] This application analyzes memory double-free vulnerabilities, gives precise definitions of them, and formally describes their characteristics so that their essential features are expressed clearly. Based on the six-tuple of the target nodes obtained in the above embodiments, the nodes on the executable path in the memory-related control flow graph are evaluated to detect whether the code segment corresponding to the executable path contains memory vulnerabilities.
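The release-count test described above can be sketched as follows. This is an illustrative Python sketch only: the `(op, line, pointer)` tuples and the name `detect_double_free` are assumptions, and the count is kept per pointer variable, which does not cover two different pointers releasing the same region.

```python
from collections import Counter


def detect_double_free(path):
    """path: (op, line, pointer) tuples in execution order along one path."""
    releases = Counter()  # pointer -> releases since its last allocation
    errors = []
    for op, line, pointer in path:
        if op == "MA":
            releases[pointer] = 0  # a fresh region may be released once
        elif op == "MF":
            releases[pointer] += 1
            if releases[pointer] > 1:
                errors.append((line, "double free"))
    return errors


# p = malloc(8); free(p); free(p);
double = [("DPO", 1, "p"), ("MA", 2, "p"), ("MF", 3, "p"), ("MF", 4, "p")]
# p = malloc(8); free(p); p = malloc(8); free(p);
paired = [("MA", 1, "p"), ("MF", 2, "p"), ("MA", 3, "p"), ("MF", 4, "p")]
print(detect_double_free(double))
print(detect_double_free(paired))
```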

[0215] The read / write after release vulnerability addressed in the embodiments of this application covers the following two situations: the pointer is not set to null after its memory is released, and multiple pointers point to the same memory area. These two situations are described in detail below.

[0216] In some embodiments, the memory vulnerability detection performed on the node connection path in S24 above, based on memory state information and set vulnerability detection rules, to obtain the corresponding memory vulnerability detection results, includes:

[0217] Based on the memory state information, when it is detected that memory is allocated to a pointer variable in the memory region operated on by the target node and that, after a memory release operation is performed on that pointer variable, a call operation is performed on that pointer variable, it is determined that the code segment corresponding to the node connection path has the vulnerability in which the pointer is not set to null after memory release. A fourth error message corresponding to this vulnerability is generated, and the memory vulnerability detection result is obtained based on the fourth error message and the location information, in the source code, of the code segment corresponding to the node connection path.

[0218] This application embodiment designs a fourth vulnerability detection rule for the vulnerability where the pointer is not set to null after memory is freed, as follows:

[0219] Fourth vulnerability detection rule:

[0220] Where $n_i$, $n_j$, and $n_k$ belong to the node set $N$ and the executable path $P$; $n_i$ is the predecessor node of $n_j$, $n_j$ is the predecessor node of $n_k$, $n_j$ is the successor node of $n_i$, and $n_k$ is the successor node of $n_j$. The pointer variable `pointer` is declared at node $n_i$, and a memory region is allocated for `pointer` at node $n_i$. The memory pointed to by `pointer` is released at node $n_j$, but `pointer` is still used at node $n_k$. At this point, an error of reading or writing memory after it has been freed is detected.

[0221] Based on the fourth vulnerability detection rule above, the system checks for the vulnerability in which a pointer is not set to null after memory release. First, a data structure `error` is defined for the memory vulnerability detection results; `error` contains the location of the vulnerability (e.g., the line number in the source code) and the type of vulnerability. Then, a stack is defined to store the memory state information (e.g., the six-tuple in the above embodiment) of the target nodes in the target node set (e.g., the memory-related control flow graph). Memory vulnerability detection is performed on the executable path of each target node in the target node set. Specifically, when the memory state information shows that the pointer variable corresponding to the target node has been released but has not been set to null (Nullptr), the system obtains the successor node of the target node. If the successor node continues to use the pointer variable, the first case of the read / write after release vulnerability is identified: the pointer is not set to null after memory release. The location of the target node (e.g., the line number in the source code) is read, and the error information of this vulnerability (i.e., the fourth error message mentioned above) and the location of the target node are added to the data structure of the memory vulnerability detection results.
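The following Python sketch illustrates the released-but-not-nulled check. It is a sketch under assumptions: the event tuples, the `"NULL"` event (standing for an assignment such as `p = NULL`), and the name `detect_use_after_free` are invented for the example, and the reported line is that of the offending use rather than of the release.

```python
def detect_use_after_free(path):
    """path: (op, line, pointer) tuples in execution order along one path."""
    dangling = set()  # pointers released but neither nulled nor reallocated
    errors = []
    for op, line, pointer in path:
        if op == "MF":
            dangling.add(pointer)
        elif op in ("MA", "NULL"):  # "NULL" is this sketch's own event name
            dangling.discard(pointer)
        elif op == "Use" and pointer in dangling:
            errors.append((line, "pointer not set to null after release"))
    return errors


# p = malloc(8); free(p); *p = 1;
bad = [("DPO", 1, "p"), ("MA", 1, "p"), ("MF", 2, "p"), ("Use", 3, "p")]
# p = malloc(8); free(p); p = NULL; if (p) ...
good = [("DPO", 1, "p"), ("MA", 1, "p"), ("MF", 2, "p"),
        ("NULL", 3, "p"), ("Use", 4, "p")]
print(detect_use_after_free(bad))
print(detect_use_after_free(good))
```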

[0222] In some embodiments, the memory vulnerability detection performed on the node connection path in S24 above, based on memory state information and set vulnerability detection rules, to obtain the corresponding memory vulnerability detection results, includes:

[0223] Based on the memory state information, when it is detected that multiple pointer variables point to the memory region operated on by the target node and that, after a memory release operation is performed on one of the pointer variables, a call operation is performed on another of the pointer variables, it is determined that the code segment corresponding to the node connection path has the vulnerability in which multiple pointers point to the same memory region. A fifth error message for this vulnerability is generated, and the memory vulnerability detection result is obtained based on the fifth error message and the location information of a pointer variable in the source code.

[0224] The embodiments of this application define a fifth vulnerability detection rule for the vulnerability of multiple pointers pointing to the same memory region, as detailed below:

[0225] When the memory area pointed to by a pointer variable is released with a memory release function and the pointer variable is not set to null (i.e., nullptr), the pointer becomes a dangling pointer that points to a memory area whose contents are no longer defined. If the freed memory area is later filled with data constructed by malicious code, this can lead to serious consequences.

[0226] Fifth vulnerability detection rule:

[0227] Where $n_{i1}$, $n_{i2}$, $n_j$, and $n_k$ belong to the node set $N$ and the executable path $P$; $n_{i1}$ and $n_{i2}$ are predecessor nodes of $n_j$, $n_j$ is the predecessor node of $n_k$, $n_j$ is the successor node of $n_{i1}$ and $n_{i2}$, and $n_k$ is the successor node of $n_j$. The pointer variable `pointer1` is declared at node $n_{i1}$, and a memory region is allocated for `pointer1` at node $n_{i1}$. The pointer variable `pointer2` is declared at node $n_{i2}$. At this point, `pointer1` and `pointer2` point to the same memory region. At node $n_j$ the memory release function is called to release the pointer variable $pointer_x$, $x \in \{1, 2\}$, meaning that either `pointer1` or `pointer2` has been released. At node $n_k$ the other pointer variable is used (`pointer2` if `pointer1` was released, `pointer1` if `pointer2` was released). At this point, that pointer points to an unknown region of heap memory, indicating that memory is read or written after it has been freed.

[0228] Based on the fifth vulnerability detection rule above, the system checks for the vulnerability in which multiple pointers point to the same memory region. First, a data structure (e.g., error) for the memory vulnerability detection results is defined, which can include the location of the vulnerability (e.g., the line number in the source code) and the type of vulnerability. Then, a stack is defined to store the memory state information (e.g., the six-tuple in the above embodiment) of the target nodes in the target node set (e.g., the memory-related control flow graph). Memory vulnerability detection is performed on the executable path of each target node in the target node set. Specifically, when the memory state information shows that the memory region operated on by the target node is pointed to by multiple pointer variables, that one of these pointer variables has been released, and that another of them is subsequently used, the second case of the read / write after release vulnerability is identified: multiple pointers point to the same memory region. The location of the target node (e.g., the line number in the source code) is read, and the error information of this vulnerability (i.e., the fifth error message mentioned above) and the location of the target node are added to the data structure of the memory vulnerability detection results.

[0229] This application analyzes the vulnerability of reading or writing memory after it has been freed, gives precise definitions of its cases, and formally describes their characteristics so that their essential features are expressed clearly. Based on the six-tuple of the target nodes obtained in the above embodiments, the nodes on the executable path in the memory-related control flow graph are evaluated to detect whether the code segment corresponding to the executable path contains memory vulnerabilities.
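A Python sketch of the aliasing check is given below as an illustration. The `"ALIAS"` event (standing for an assignment such as `p2 = p1`), the tuple layout, and the name `detect_alias_use_after_free` are assumptions introduced for the example and are not defined in this application.

```python
def detect_alias_use_after_free(path):
    """path: tuples (op, line, pointer[, source]) in execution order."""
    region_of = {}    # pointer -> id of the region it points to
    released_by = {}  # region id -> pointer through which it was released
    errors = []
    next_region = 0
    for event in path:
        op, line, pointer = event[0], event[1], event[2]
        if op == "MA":
            region_of[pointer] = next_region
            next_region += 1
        elif op == "ALIAS":  # hypothetical event for `pointer = source`
            source = event[3]
            if source in region_of:
                region_of[pointer] = region_of[source]
        elif op == "MF":
            region = region_of.get(pointer)
            if region is not None:
                released_by.setdefault(region, pointer)
        elif op == "Use":
            region = region_of.get(pointer)
            # Flag only uses through a pointer other than the released one.
            if region in released_by and released_by[region] != pointer:
                errors.append((line, "multiple pointers to the same region"))
    return errors


# p1 = malloc(8); p2 = p1; free(p1); *p2 = 1;
aliased = [("MA", 1, "p1"), ("ALIAS", 2, "p2", "p1"),
           ("MF", 3, "p1"), ("Use", 4, "p2")]
print(detect_alias_use_after_free(aliased))
```

A use through the released pointer itself is deliberately left to the fourth rule, so the two cases produce distinct error messages.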

[0230] In some embodiments, after obtaining the memory vulnerability detection results for at least one node connection path in each set of target nodes, a detection report can be generated based on at least one memory vulnerability detection result. This detection report may include: error information for each detected memory vulnerability and the location of the target node where the memory vulnerability occurred. Furthermore, the detection report may also be displayed.
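As a small illustration of such a report, the Python sketch below formats detection results into text. The `(line, message)` result shape, the output wording, and the name `build_report` are assumptions; this application does not specify a report format.

```python
def build_report(results):
    """results: iterable of (line, message) pairs from the detection rules."""
    entries = sorted(set(results))  # drop duplicates and order by line
    if not entries:
        return "No memory vulnerabilities detected."
    return "\n".join(f"line {line}: {message}" for line, message in entries)


print(build_report([(9, "double free"),
                    (4, "memory not released after allocation"),
                    (9, "double free")]))
```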

[0231] Figure 9 is a flowchart of the overall process of a memory vulnerability detection method according to an embodiment of this application.

[0232] As shown in Figure 9, the overall process of the memory vulnerability detection method includes the following steps S91 to S101:

[0233] S91. Import the source code of the target program.

[0234] S92. Generate the target syntax tree.

[0235] Specifically, the source code is compiled with GCC and then lexical analysis is performed to generate the target syntax tree.

[0236] S93. Traverse the target syntax tree.

[0237] S94. Determine whether a node related to memory operation has been traversed. If yes, execute S98; otherwise, execute S95.

[0238] S95. Determine if the conditional syntax structure has been traversed. If yes, execute S98; otherwise, execute S96.

[0239] S96. Determine if the selection syntax structure has been traversed. If yes, execute S98; otherwise, execute S97.

[0240] S97. Determine if a loop syntax structure has been encountered. If so, execute S98.

[0241] S98. Generate the corresponding memory-related control flow graph.

[0242] The process of generating the memory-related control flow graph is described in the above embodiments of this application.

[0243] S99. Generate a six-tuple of the target node.

[0244] Specifically, after traversing to the target node in the memory-related control flow graph, a six-tuple of the target node is generated.

[0245] S100. Perform memory vulnerability detection based on the six-tuple.

[0246] Specifically, based on the memory state information corresponding to the target node in the six-tuple, and combined with the set vulnerability detection rules, memory vulnerability detection is performed on the corresponding node connection path.

[0247] S101. Generate a detection report.
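The decision chain of steps S93 to S98 can be sketched in Python as follows. This is an illustration under several assumptions: the `Node` class is a toy stand-in for a syntax tree node (the method itself builds the tree with GCC), the mapping of the conditional, selection, and loop structures to `if`, `switch`, and `for` / `while` / `do` is a guess, and the sketch does not descend into a node once it has matched.

```python
from dataclasses import dataclass, field

MEMORY_CALLS = {"malloc", "calloc", "realloc", "new", "new[]",
                "free", "delete", "delete[]"}


@dataclass
class Node:
    kind: str                 # e.g. "call", "if", "switch", "while"
    name: str = ""            # callee name when kind == "call"
    children: list = field(default_factory=list)


def matched_step(node):
    """Return the flowchart step whose condition the node satisfies."""
    if node.kind == "call" and node.name in MEMORY_CALLS:
        return "S94"
    if node.kind == "if":                    # assumed: conditional structure
        return "S95"
    if node.kind == "switch":                # assumed: selection structure
        return "S96"
    if node.kind in {"for", "while", "do"}:  # assumed: loop structure
        return "S97"
    return None


def subtree(node):
    yield node
    for child in node.children:
        yield from subtree(child)


def collect_node_sets(root):
    """S93: traverse; S94 to S97: test each node; gather input for S98."""
    node_sets, stack = [], [root]
    while stack:
        node = stack.pop()
        step = matched_step(node)
        if step is not None:
            node_sets.append((step, list(subtree(node))))
        else:
            stack.extend(reversed(node.children))
    return node_sets


tree = Node("function", children=[
    Node("call", "malloc"),
    Node("if", children=[Node("call", "free")]),
    Node("while", children=[Node("expr")]),
])
for step, nodes in collect_node_sets(tree):
    print(step, [n.kind for n in nodes])
```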

[0248] The memory vulnerability detection method of this application can be applied to any scenario requiring memory vulnerability detection of the source code of a target program. The application scenarios for memory vulnerability detection in source code are very broad, especially in the development and maintenance of high-performance, security-critical applications. Memory vulnerabilities can lead to program crashes, data leaks, or malicious exploitation; therefore, rigorous memory vulnerability detection throughout the software development lifecycle is crucial. The following are some typical application scenarios:

[0249] 1. Embedded System Development Scenario: Embedded devices (such as routers, automotive control systems, and medical devices) typically have limited resources and extremely high reliability requirements. Improper memory management can lead to device instability or complete failure, potentially causing serious security issues. Memory vulnerability detection ensures that all dynamically allocated memory is correctly released.

[0250] 2. Operating System Kernel Programming Scenario: The operating system kernel directly manages hardware resources, and any error in memory management can lead to system instability. Kernel space does not allow garbage collection, so programmers must manually manage memory. Memory vulnerability detection ensures the security and stability of the kernel.

[0251] 3. Network Services and Web Application Scenarios: Web servers and online service platforms handle a large number of concurrent requests, requiring efficient memory management and release. Long-running services with memory leaks will gradually consume more memory, eventually leading to service unavailability. Memory leak detection involves regularly scanning the codebase to find unreleased memory.

[0252] 4. Game Development Scenario: Video games often involve complex graphics rendering and physics simulation logic, all of which rely on meticulous memory management. Frequent object creation and destruction during game execution can easily lead to memory fragmentation or leaks. Memory vulnerability detection involves regularly scanning the codebase to find unreleased memory.

[0253] 5. Financial Trading System Scenario: High-frequency trading and other financial applications require extremely high response speeds and accuracy. Even minor memory management flaws can impact trading speed and even cause financial losses. Memory vulnerability detection ensures that each module correctly handles memory allocation and deallocation.

[0254] 6. Mobile Application Development Scenarios: Applications on smartphones and tablets often face memory limitations. Due to limited device memory, improper memory usage can negatively impact user experience and even cause application crashes. Memory vulnerability detection can improve memory usage efficiency.

[0255] Based on the same inventive concept, this application also provides a memory vulnerability detection device. The principle of this device in solving the problem is similar to the method in the above embodiments. Therefore, the implementation of this device can refer to the implementation of the above method, and the repeated parts will not be described again.

[0256] Figure 10 is a schematic structural diagram of a memory vulnerability detection device 1000, which includes:

[0257] The conversion module 1001 is used to convert the source code of the target program into a target syntax tree; each node in the target syntax tree represents a code element in the source code, and each node records node attributes containing the syntactic functions of the corresponding code element.

[0258] The traversal module 1002 is used to traverse the target syntax tree; wherein, for each node traversed, when the node is determined to meet the set conditions according to its node attributes, at least one other node having a connection relationship with the node is searched for in the target syntax tree to obtain a node set containing the connection relationship.

[0259] The filtering module 1003 is used to filter at least one target node set from the obtained multiple node sets, each target node set containing at least one target node representing the execution of memory operations;

[0260] The detection module 1004 is used to perform the following operations for each target node in each target node set: analyze the node attributes of each node in the node connection path to which the target node belongs, obtain the memory state information of the memory region operated by the target node, and perform memory vulnerability detection on the node connection path based on the memory state information and the set vulnerability detection rules, and obtain the corresponding memory vulnerability detection results.

[0261] In some embodiments, the traversal module 1002 is specifically used for:

[0262] When a node is determined to be the node that performs memory operations based on the corresponding node attributes, at least one other node with a connection relationship with that node is searched from the target syntax tree to obtain a set of nodes containing the connection relationship.

[0263] When a node is determined to represent a key code element of the defined syntax structure based on its corresponding node attributes, the target syntax tree is searched for at least one other node contained in the defined syntax structure to obtain the corresponding node set.

[0264] In some embodiments, the syntax structure includes any of the following:

[0265] A conditional syntax structure;

[0266] A selection syntax structure;

[0267] A loop syntax structure.

[0268] In some embodiments, the apparatus further includes a recording module for recording a subset of target nodes contained in each node set;

[0269] The filtering module is specifically used to perform the following operation for each of the multiple node sets: if the subset of target nodes recorded for a node set is not empty, determine that node set to be a target node set.
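The filtering step amounts to keeping the node sets whose recorded subset of target nodes is non-empty. A minimal Python sketch, assuming each node set is a dictionary with hypothetical `nodes` and `targets` keys:

```python
def select_target_sets(node_sets):
    """Keep the node sets whose recorded target-node subset is non-empty."""
    return [node_set for node_set in node_sets if node_set["targets"]]


node_sets = [
    {"nodes": ["if", "free(p)"], "targets": ["free(p)"]},
    {"nodes": ["while", "i++"], "targets": []},
]
print(select_target_sets(node_sets))
```

Recording the subset during traversal means the filter does not need to inspect node attributes a second time.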

[0270] In some embodiments, the detection module 1004 is specifically used for:

[0271] Obtain the node connection path to which the target node belongs from the set of target nodes to which the target node belongs;

[0272] Based on the node attributes of each node in the node connection path, determine the pointer variable information of the memory region to be operated on by the target node, as well as the pointer variable release information of the memory region;

[0273] Based on the pointer variable information pointing to the memory region and the pointer variable release information of the memory region, the memory state information of the memory region operated on by the target node is obtained.

[0274] In some embodiments, the detection module 1004 is specifically used for:

[0275] Based on the memory state information, when it is detected that, after memory is allocated to a pointer variable in the memory region operated on by the target node, no memory release operation is performed on that pointer variable, determine that the code segment corresponding to the node connection path has a memory allocation failure vulnerability, that is, memory that is allocated but never released.

[0276] The system generates the first error message for the memory allocation failure vulnerability. Based on the first error message and the location information of the code segment corresponding to the node connection path in the source code, it obtains the memory vulnerability detection result.

[0277] In some embodiments, the detection module 1004 is specifically used for:

[0278] Based on the memory state information, when it is detected that, after memory is allocated to a pointer variable in the memory region operated on by the target node, no pointer variable points to that memory region any longer, determine that the code segment corresponding to the node connection path has a pointer relocation vulnerability.

[0279] A second error message for the pointer relocation vulnerability is generated. Based on the second error message and the location information of the code segment corresponding to the node connection path in the source code, the memory vulnerability detection result is obtained.

[0280] In some embodiments, the detection module 1004 is specifically used for:

[0281] Based on the memory state information, when it is detected that, after memory is allocated to a pointer variable in the memory region operated on by the target node, multiple memory release operations are performed on that pointer variable, determine that the code segment corresponding to the node connection path has a memory double-free vulnerability.

[0282] A third error message corresponding to the memory double-free vulnerability is generated. Based on the third error message and the location information of the code segment corresponding to the node connection path in the source code, the memory vulnerability detection result is obtained.

[0283] In some embodiments, the detection module 1004 is specifically used for:

[0284] Based on the memory state information, when it is detected that memory is allocated to a pointer variable in the memory region operated on by the target node and that, after a memory release operation is performed on that pointer variable, a call operation is performed on that pointer variable, determine that the code segment corresponding to the node connection path has the vulnerability in which the pointer is not set to null after memory release.

[0285] A fourth error message corresponding to the vulnerability of not setting the pointer to null after memory release is generated. Based on the fourth error message and the location information of the code segment corresponding to the node connection path in the source code, the memory vulnerability detection result is obtained.

[0286] In some embodiments, the detection module 1004 is specifically used for:

[0287] Based on the memory state information, when it is detected that multiple pointer variables point to the memory region operated on by the target node and that, after a memory release operation is performed on one of the multiple pointer variables, a call operation is performed on another of the multiple pointer variables, determine that the code segment corresponding to the node connection path has the vulnerability in which multiple pointers point to the same memory region.

[0288] A fifth error message is generated indicating a vulnerability where multiple pointers point to the same memory region. Based on the fifth error message and the location information of a pointer variable in the source code, the memory vulnerability detection result is obtained.

[0289] In some embodiments, the source code includes custom code elements, and the apparatus further includes an extraction module for:

[0290] For each of the multiple nodes in the target syntax tree, perform the following operations:

[0291] If a node represents a custom code element, then the first character format of the custom code element is converted into the second character format corresponding to the element type of the custom code element. Based on the syntactic meaning and the second character format of the custom code element, the syntactic function of the custom code element is recorded.

[0292] In some embodiments, the apparatus further includes a generation module for:

[0293] After obtaining the memory vulnerability detection results of at least one node connection path in each target node set, a detection report is generated based on at least one memory vulnerability detection result.
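One possible way to assemble such a report is sketched below. The result fields (`type`, `file`, `line`, `message`) and the report layout are hypothetical, since the application does not specify a report format.

```python
import json
from collections import Counter


def build_report(results):
    """Combine per-path detection results into a single report dictionary."""
    by_type = Counter(r["type"] for r in results)
    # Order findings by location so the report follows the source layout.
    ordered = sorted(results, key=lambda r: (r["file"], r["line"]))
    return {"total": len(results), "by_type": dict(by_type), "findings": ordered}


results = [
    {"type": "double_free", "file": "b.c", "line": 9, "message": "double free of 'p'"},
    {"type": "use_after_release", "file": "a.c", "line": 14, "message": "'buf' used after release"},
    {"type": "double_free", "file": "a.c", "line": 3, "message": "double free of 'q'"},
]
report = build_report(results)
print(json.dumps(report, indent=2))
```

Grouping by vulnerability type and sorting by location are design choices of this sketch; any stable ordering would serve the same purpose.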

[0294] For ease of description, the above sections are divided into modules (or units) according to their functions and described separately. Of course, in implementing this application, the functions of each module (or unit) can be implemented in one or more software or hardware components.

[0295] In this application embodiment, the terms "module" or "unit" refer to a computer program or part of a computer program that has a predetermined function and works with other related parts to achieve a predetermined goal, and can be implemented wholly or partially using software, hardware (such as processing circuitry or memory), or a combination thereof. Similarly, a processor (or multiple processors or memory) can be used to implement one or more modules or units. Furthermore, each module or unit can be part of an overall module or unit that includes the functionality of that module or unit.

[0296] Having introduced the memory vulnerability detection method and apparatus according to exemplary embodiments of this application, we will now introduce an electronic device according to another exemplary embodiment of this application.

[0297] Based on the same inventive concept as the above-described method embodiments, this application also provides an electronic device. In one embodiment, the electronic device may be a server, such as the server 120 shown in Figure 1. In this embodiment, the electronic device can be structured as shown in Figure 11, and includes a memory 1101, a communication module 1103, and one or more processors 1102.

[0298] The memory 1101 is used to store computer programs executed by the processor 1102. The memory 1101 may mainly include a program storage area and a data storage area. The program storage area may store the operating system and programs required to run instant messaging functions, etc.; the data storage area may store various instant messaging information and operation instruction sets, etc.

[0299] Memory 1101 may be volatile memory, such as random-access memory (RAM); memory 1101 may also be non-volatile memory, such as read-only memory, flash memory, hard disk drive (HDD), or solid-state drive (SSD); or memory 1101 may be any other medium capable of carrying or storing a desired computer program having the form of instructions or data structures and accessible by a computer, but is not limited thereto. Memory 1101 may be a combination of the above-described memories.

[0300] Processor 1102 may include one or more central processing units (CPUs) or digital processing units, etc. Processor 1102 is used to implement the above-described memory vulnerability detection method when calling computer programs stored in memory 1101.

[0301] The communication module 1103 is used to communicate with terminal devices and other servers.

[0302] This application embodiment does not limit the specific connection medium between the memory 1101, communication module 1103, and processor 1102. In Figure 11 of this application embodiment, the memory 1101 and the processor 1102 are connected via a bus 1104, which is depicted in Figure 11 with a thick line; the connections between other components are for illustrative purposes only and should not be considered limiting. Bus 1104 can be divided into address bus, data bus, control bus, etc. For ease of description, Figure 11 depicts the bus using only one thick line, but this does not indicate that there is only one bus or one type of bus.

[0303] The memory 1101 stores a computer storage medium, which stores computer-executable instructions. These instructions are used to implement the memory vulnerability detection method of this application embodiment. The processor 1102 is used to execute the aforementioned memory vulnerability detection method, such as the method shown in Figure 2.

[0304] In another embodiment, the electronic device may also be another electronic device, such as the terminal device 112 shown in Figure 1. In this embodiment, the electronic device can be structured as shown in Figure 12, and includes components such as a communication component 1210, memory 1220, display unit 1230, camera 1240, sensor 1250, audio circuit 1260, Bluetooth module 1270, and processor 1280.

[0305] The communication component 1210 is used to communicate with the server. In some embodiments, it may include a Wireless Fidelity (WiFi) module; WiFi is a short-range wireless transmission technology, and the electronic device can use the WiFi module to help users send and receive information.

[0306] The memory 1220 can be used to store software programs and data. The processor 1280 executes various functions of the terminal device 112 and performs data processing by running the software programs or data stored in the memory 1220. The memory 1220 may include high-speed random access memory, and may also include non-volatile memory, such as at least one disk storage device, flash memory device, or other volatile solid-state storage device. The memory 1220 stores an operating system that enables the terminal device 112 to run. In this application, the memory 1220 can store the operating system and various applications, and may also store a computer program that executes the memory vulnerability detection method of the embodiments of this application.

[0307] The display unit 1230 can also be used to display information input by the user or information provided to the user, as well as various menus of the terminal device 112, in a graphical user interface (GUI). Specifically, the display unit 1230 may include a display screen 1232 disposed on the front of the terminal device 112. The display screen 1232 may be configured as a liquid crystal display, a light-emitting diode, or the like. The display unit 1230 can be used to display memory vulnerability detection results, etc., as described in the embodiments of this application.

[0308] The display unit 1230 can also be used to receive input digital or character information and generate signal inputs related to user settings and function control of the terminal device 112. Specifically, the display unit 1230 may include a touch screen 1231 disposed on the front of the terminal device 112, which can collect touch operations of the user on or near it, such as clicking buttons, dragging scroll boxes, etc.

[0309] The touchscreen 1231 can be placed on top of the display screen 1232, or the touchscreen 1231 and the display screen 1232 can be integrated to realize the input and output functions of the terminal device 112. After integration, it can be referred to as a touch display screen. In this application, the display unit 1230 can display the application and the corresponding operation steps.

[0310] Camera 1240 can be used to capture still images, which users can then share via an application. There can be one or multiple cameras 1240. An object is projected onto a photosensitive element through a lens, generating an optical image. This photosensitive element can be a charge-coupled device (CCD) or a complementary metal-oxide-semiconductor (CMOS) phototransistor. The photosensitive element converts the light signal into an electrical signal, which is then transmitted to the processor 1280 for conversion into a digital image signal.

[0311] The terminal device may also include at least one sensor 1250, such as an accelerometer 1251, a proximity sensor 1252, a fingerprint sensor 1253, and a temperature sensor 1254. The terminal device may also be equipped with other sensors such as a gyroscope, barometer, hygrometer, thermometer, infrared sensor, light sensor, and motion sensor.

[0312] Audio circuitry 1260, speaker 1261, and microphone 1262 provide an audio interface between the user and terminal device 112. Audio circuitry 1260 converts received audio data into electrical signals, which are then transmitted to speaker 1261, where they are converted into sound signals for output. Terminal device 112 may also be equipped with volume buttons for adjusting the volume of the sound signal. Conversely, microphone 1262 converts collected sound signals into electrical signals, which are then received by audio circuitry 1260, converted back into audio data, and output to communication component 1210 for transmission to, for example, another terminal device 112, or to memory 1220 for further processing.

[0313] Bluetooth module 1270 is used to interact with other Bluetooth devices that also have Bluetooth modules via the Bluetooth protocol. For example, a terminal device can establish a Bluetooth connection with a wearable electronic device (such as a smartwatch) that also has a Bluetooth module through Bluetooth module 1270, thereby exchanging data.

[0314] The processor 1280 is the control center of the terminal device, connecting various parts of the terminal through various interfaces and lines. It executes various functions and processes data by running or executing software programs stored in the memory 1220 and calling data stored in the memory 1220. In some embodiments, the processor 1280 may include one or more processing units; the processor 1280 may also integrate an application processor and a baseband processor, wherein the application processor mainly handles the operating system, user interface, and applications, and the baseband processor mainly handles wireless communication. It is understood that the baseband processor may not be integrated into the processor 1280. In this application, the processor 1280 can run the operating system, applications, user interface display and touch response, as well as the memory vulnerability detection method of this embodiment. Furthermore, the processor 1280 is coupled to the display unit 1230.

[0315] In some possible implementations, various aspects of the memory vulnerability detection method provided in this application can also be implemented in the form of a computer program product, which includes a computer program that, when run on an electronic device, causes the electronic device to perform the steps of the memory vulnerability detection method according to the various exemplary embodiments of this application described above. For example, the electronic device can perform the steps shown in Figure 2.

[0316] Computer program products may employ any combination of one or more readable media. A readable medium may be a readable signal medium or a readable storage medium. A readable storage medium may be, for example, but not limited to, an electrical, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any combination thereof. More specific examples (a non-exhaustive list) of readable storage media include: electrical connections having one or more wires, portable disks, hard disks, random access memory (RAM), read-only memory (ROM), erasable programmable read-only memory (EPROM or flash memory), optical fiber, portable compact disk read-only memory (CD-ROM), optical storage devices, magnetic storage devices, or any suitable combination thereof.

[0317] The program product of the embodiments of this application may employ a portable compact disc read-only memory (CD-ROM) and include a computer program, and may run on an electronic device. However, the program product of this application is not limited thereto. In this document, the readable storage medium may be any tangible medium that contains or stores a program that may be used by or in conjunction with a command execution system, apparatus, or device.

[0318] A readable signal medium may include a data signal propagated in baseband or as part of a carrier wave, carrying a readable computer program. This propagated data signal may take various forms, including but not limited to electromagnetic signals, optical signals, or any suitable combination thereof. A readable signal medium may also be any readable medium other than a readable storage medium, capable of sending, propagating, or transmitting a program for use by or in conjunction with a command execution system, apparatus, or device.

[0319] Computer programs contained on readable media may be transmitted using any suitable medium, including but not limited to wireless, wired, optical fiber, RF, etc., or any suitable combination thereof.

[0320] Computer programs for performing the operations of this application can be written in any combination of one or more programming languages, including object-oriented programming languages such as Java and C++, and conventional procedural programming languages such as C or similar languages. The computer program can execute entirely on the user's electronic device, partially on the user's electronic device, as a standalone software package, partially on the user's electronic device and partially on a remote electronic device, or entirely on a remote electronic device or server. In cases involving remote electronic devices, the remote electronic device can be connected to the user's electronic device via any type of network, including a local area network (LAN) or a wide area network (WAN), or it can be connected to an external electronic device (e.g., via the Internet using an Internet service provider).

[0321] It should be noted that although several units or sub-units of the device have been mentioned in the detailed description above, this division is merely exemplary and not mandatory. In fact, according to embodiments of this application, the features and functions of two or more units described above can be embodied in one unit. Conversely, the features and functions of one unit described above can be further divided and embodied by multiple units.

[0322] Furthermore, although the operations of the method of this application are described in a specific order in the accompanying drawings, this does not require or imply that these operations must be performed in that specific order, or that all the operations shown must be performed to achieve the desired result. Additionally or alternatively, certain steps may be omitted, multiple steps may be combined into one step, and / or one step may be broken down into multiple steps.

[0323] Those skilled in the art will understand that embodiments of this application can be provided as methods, systems, or computer program products. Therefore, this application can take the form of a completely hardware embodiment, a completely software embodiment, or an embodiment combining software and hardware aspects. Furthermore, this application can take the form of a computer program product embodied on one or more computer-usable storage media (including, but not limited to, disk storage, CD-ROM, optical storage, etc.) containing a computer-usable computer program.

[0324] This application is described with reference to flowchart illustrations and / or block diagrams of methods, apparatus (systems), and computer program products according to embodiments of this application. It will be understood that each block of the flowchart illustrations and / or block diagrams, and combinations of blocks in the flowchart illustrations and / or block diagrams, can be implemented by computer program instructions. These computer program instructions can be provided to a processor of a general-purpose computer, special-purpose computer, embedded processor, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing apparatus, produce a device for implementing the functions specified in one or more processes of the flowchart and / or one or more boxes of the block diagram.

[0325] These computer program commands may also be stored in a computer-readable storage medium that can direct a computer or other programmable data processing device to function in a particular manner, such that the commands stored in the computer-readable storage medium produce an article of manufacture including command means that implement the functions specified in one or more processes of the flowchart and / or one or more boxes of the block diagram.

[0326] These computer program commands can also be loaded onto a computer or other programmable data processing equipment to cause a series of operational steps to be performed on the computer or other programmable equipment to produce a computer-implemented process, such that the commands executed on the computer or other programmable equipment provide steps for implementing the functions specified in one or more processes of the flowchart and / or one or more boxes of the block diagram.

[0327] Although preferred embodiments of this application have been described, those skilled in the art, upon learning the basic inventive concept, can make other changes and modifications to these embodiments. Therefore, the appended claims are intended to be interpreted as including the preferred embodiments as well as all changes and modifications falling within the scope of this application.

[0328] Obviously, those skilled in the art can make various modifications and variations to this application without departing from the spirit and scope of this application. Therefore, if such modifications and variations fall within the scope of the claims of this application and their equivalents, this application also intends to include such modifications and variations.

Claims

1. A memory vulnerability detection method, characterized in that the method includes: converting the source code of a target program into a target syntax tree, wherein each node in the target syntax tree represents a code element in the source code, and each node records node attributes containing the syntactic function of the corresponding code element; traversing the target syntax tree, wherein, for each node traversed, when the node is determined to meet the set conditions based on the corresponding node attributes, at least one other node that has a connection relationship with the node is searched for in the target syntax tree to obtain a node set containing the connection relationship; selecting at least one target node set from the obtained multiple node sets, each target node set containing at least one target node representing the execution of a memory operation; and, for each target node in each target node set, performing the following operations: analyzing the node attributes of each node in the node connection path to which the target node belongs to obtain the memory state information of the memory region operated on by the target node, and performing memory vulnerability detection on the node connection path based on the memory state information and the set vulnerability detection rules to obtain the corresponding memory vulnerability detection result.

2. The method according to claim 1, characterized in that, when a node is determined to meet the set conditions based on the corresponding node attributes, searching the target syntax tree for at least one other node that has a connection relationship with the node to obtain a node set containing the connection relationship includes: when the node is determined, based on the corresponding node attributes, to be a node that performs a memory operation, searching the target syntax tree for at least one other node that has a connection relationship with the node to obtain a node set containing the connection relationship; when the node is determined, based on the corresponding node attributes, to represent a key code element of a specified syntax structure, searching the target syntax tree for at least one other node contained in the specified syntax structure to obtain the corresponding node set.

3. The method according to claim 2, characterized in that the specified syntax structure includes any of the following: a judgment (conditional) syntax structure; a selection syntax structure; a loop syntax structure.

4. The method according to claim 2, characterized in that, after searching the target syntax tree for at least one other node that has a connection relationship with the node to obtain a node set containing the connection relationship, the method further includes: recording the subset of target nodes contained in each node set; and the step of selecting at least one target node set from the obtained multiple node sets includes: for each of the multiple node sets, performing the following operation: if the node set contains a non-empty subset of target nodes, determining that node set to be a target node set.

5. The method according to claim 1, characterized in that the step of analyzing the node attributes of each node in the node connection path to which the target node belongs, and obtaining the memory state information of the memory region operated on by the target node, includes: obtaining the node connection path to which the target node belongs from the target node set to which the target node belongs; determining, based on the node attributes of each node in the node connection path, the pointer variable information pointing to the memory region operated on by the target node and the pointer variable release information of the memory region; and obtaining, based on the pointer variable information pointing to the memory region and the pointer variable release information of the memory region, the memory state information of the memory region operated on by the target node.

6. The method according to any one of claims 1 to 5, characterized in that the step of performing memory vulnerability detection on the node connection path based on the memory state information and the set vulnerability detection rules, and obtaining the corresponding memory vulnerability detection results, includes: based on the memory state information, detecting that, after a pointer variable is allocated in the memory region operated on by the target node, no memory release operation is performed on the pointer variable, and determining that the code segment corresponding to the node connection path has a memory allocation failure vulnerability; generating a first error message for the memory allocation failure vulnerability; and obtaining the memory vulnerability detection result based on the first error message and the location information of the code segment corresponding to the node connection path in the source code.

7. The method according to any one of claims 1 to 5, characterized in that the step of performing memory vulnerability detection on the node connection path based on the memory state information and the set vulnerability detection rules, and obtaining the corresponding memory vulnerability detection results, includes: based on the memory state information, detecting that, after a pointer variable is allocated in the memory region operated on by the target node, no pointer variable points to the memory region, and determining that the code segment corresponding to the node connection path has a pointer relocation vulnerability; generating a second error message for the pointer relocation vulnerability; and obtaining the memory vulnerability detection result based on the second error message and the location information of the code segment corresponding to the node connection path in the source code.

8. The method according to any one of claims 1 to 5, characterized in that the step of performing memory vulnerability detection on the node connection path based on the memory state information and the set vulnerability detection rules, and obtaining the corresponding memory vulnerability detection results, includes: based on the memory state information, detecting that, after a pointer variable is allocated in the memory region operated on by the target node, multiple memory release operations are performed on the pointer variable, and determining that the code segment corresponding to the node connection path has a memory double-release vulnerability; generating a third error message corresponding to the memory double-release vulnerability; and obtaining the memory vulnerability detection result based on the third error message and the location information of the code segment corresponding to the node connection path in the source code.

9. The method according to any one of claims 1 to 5, characterized in that the step of performing memory vulnerability detection on the node connection path based on the memory state information and the set vulnerability detection rules, and obtaining the corresponding memory vulnerability detection results, includes: based on the memory state information, detecting that a pointer variable is allocated in the memory region operated on by the target node and that, after a memory release operation is performed on the pointer variable, a call operation continues to be performed on the pointer variable, and determining that the code segment corresponding to the node connection path has a vulnerability in which the pointer is not set to null after memory release; generating a fourth error message corresponding to the vulnerability of not setting the pointer to null after memory release; and obtaining the memory vulnerability detection result based on the fourth error message and the location information of the code segment corresponding to the node connection path in the source code.

10. The method according to any one of claims 1 to 5, characterized in that the step of performing memory vulnerability detection on the node connection path based on the memory state information and the set vulnerability detection rules, and obtaining the corresponding memory vulnerability detection results, includes: based on the memory state information, detecting that multiple pointer variables are allocated in the memory region operated on by the target node and that, after a memory release operation is performed on one of the multiple pointer variables, a call operation is performed on another of the multiple pointer variables, and determining that the code segment corresponding to the node connection path has a vulnerability in which multiple pointers point to the same memory region; generating a fifth error message indicating the vulnerability in which multiple pointers point to the same memory region; and obtaining the memory vulnerability detection result based on the fifth error message and the location information of the pointer variable in the source code.

11. The method according to any one of claims 1 to 5, characterized in that the source code includes custom code elements, and after converting the source code of the target program into a target syntax tree, the method further includes: for each of the multiple nodes in the target syntax tree, performing the following operations: if the node represents a custom code element, converting the first character format of the custom code element into the second character format corresponding to the element type of the custom code element; and recording the syntactic function of the custom code element based on the syntactic meaning of the custom code element and the second character format.

12. The method according to any one of claims 1 to 5, characterized in that the method further includes: after obtaining the memory vulnerability detection results of at least one node connection path in each target node set, generating a detection report based on at least one memory vulnerability detection result.

13. A memory vulnerability detection device, characterized in that the device includes: a conversion module, used to convert the source code of a target program into a target syntax tree, wherein each node in the target syntax tree represents a code element in the source code, and each node records node attributes containing the syntactic function of the corresponding code element; a traversal module, used to traverse the target syntax tree, wherein, for each node traversed, when the node is determined to meet the set conditions based on the corresponding node attributes, at least one other node that has a connection relationship with the node is searched for in the target syntax tree to obtain a node set containing the connection relationship; a filtering module, used to select at least one target node set from the obtained multiple node sets, each target node set containing at least one target node representing the execution of a memory operation; and a detection module, used to perform the following operations for each target node in each target node set: analyzing the node attributes of each node in the node connection path to which the target node belongs to obtain the memory state information of the memory region operated on by the target node, and performing memory vulnerability detection on the node connection path based on the memory state information and the set vulnerability detection rules to obtain the corresponding memory vulnerability detection result.

14. An electronic device, characterized in that it includes a processor and a memory, wherein the memory stores a computer program that, when executed by the processor, causes the processor to perform the steps of the method according to any one of claims 1 to 12.

15. A computer-readable storage medium, characterized in that it includes a computer program that, when run on an electronic device, causes the electronic device to perform the steps of the method according to any one of claims 1 to 12.