A multi-level feature enhanced smart contract vulnerability detection method and system
A multi-level feature enhancement method constructs an abstract syntax tree for a smart contract, analyzes the contract's multi-dimensional semantic information, and combines static taint analysis with neural networks. This addresses the high false negative and false positive rates of existing smart contract detection techniques and achieves more efficient vulnerability detection.
Patent Information
- Authority / Receiving Office
- CN · China
- Patent Type
- Patents(China)
- Current Assignee / Owner
- ZHEJIANG UNIV
- Filing Date
- 2023-03-28
- Publication Date
- 2026-06-16
AI Technical Summary
Existing static analysis techniques for smart contracts suffer from incomplete semantic information modeling and incomplete code feature extraction, resulting in high false negative and false positive rates and failing to effectively detect smart contract vulnerabilities.
A multi-level feature enhancement method is adopted. By constructing an abstract syntax tree of smart contracts, the external package call relationship, function call and dependency relationship and program information flow within functions are analyzed. Static taint analysis and graph feature network are used to extract features, and a feedforward neural network is combined for vulnerability detection.
It achieves more accurate and comprehensive smart contract vulnerability detection, improves detection efficiency and accuracy, and can effectively identify potential security vulnerabilities in contracts.
Figure CN116361808B_ABST
Abstract
Description
Technical Field
[0001] This invention relates to the field of blockchain smart contract security, specifically to a multi-level feature-enhanced smart contract vulnerability detection method and system.
Background Technology
[0002] Smart contracts, often referred to as the execution engine of decentralized applications (DApps), have driven the rapid development of DApps. A smart contract is essentially a piece of program code and, like a traditional computer program, is susceptible to security vulnerabilities. Unlike traditional programs, however, a smart contract cannot be updated or modified once it is deployed on a blockchain, so losses from vulnerabilities are irreversible. Ethereum was the first blockchain platform to introduce smart contracts. Smart contract security vulnerabilities have not only caused significant economic losses on Ethereum but also posed substantial security risks to the trusted transaction environment of the entire blockchain ecosystem, hindering blockchain development.
[0003] Smart contract code vulnerabilities have become a major issue in blockchain security. Current static analysis techniques for smart contracts mainly rely on methods such as static taint analysis, formal verification, and symbolic execution. These methods suffer from drawbacks such as incomplete modeling of contract semantic information, incomplete extraction of code features, and time-consuming contract analysis, leading to high false positive and false negative rates and failing to achieve satisfactory vulnerability detection results in practical applications. Therefore, it is necessary to research a reliable smart contract vulnerability detection method and system to help developers promptly discover potential security vulnerabilities in contracts, further improving the reliability and security of smart contracts deployed on the blockchain.
[0004] This invention proposes a multi-level feature enhancement method that can effectively detect smart contract vulnerabilities. By converting smart contract code into an Abstract Syntax Tree (AST), feature enhancement is performed on the AST at different levels, analyzing potential vulnerabilities in the contract code from multiple perspectives and angles, thereby improving the scope and accuracy of smart contract vulnerability detection.
Summary of the Invention
[0005] To address the problems of incomplete semantic information modeling and incomplete code feature extraction in current static analysis methods for smart contracts, this invention proposes a multi-level feature-enhanced smart contract vulnerability detection method. By integrating multi-dimensional contract semantic information such as external package call relationships, inter-function call and dependency relationships, and intra-function program information flow, this method accurately models contract code features, thereby achieving more accurate anomaly analysis and vulnerability detection.
[0006] To achieve the above objectives, the present invention adopts the following technical solution:
[0007] The first aspect is a multi-layered feature-enhanced smart contract vulnerability detection method, comprising the following steps:
[0008] Step 1: Given a smart contract to be tested, take the contract source code as input and construct the abstract syntax tree (AST) of the smart contract through lexical and syntactic analysis.
[0009] Step 2: Analyze the contract's external package call relationship: Traverse the contract's AST, record all state variables in the contract that have data flow dependency and control flow dependency with the return value of the external package call, use static taint analysis technology to trace the propagation path of the return value, generate a directed graph G1 of the contract's external package call relationship, and use a graph feature network based on time-series information transmission to extract the feature X1 of G1;
[0010] Step 3: Analyze inter-function calls and dependencies: Based on the contract AST, starting from the constructor, search and traverse each node and edge of the AST, analyze the inter-function calls and state variable data dependencies within the contract, cut out nodes and edges with redundant information, generate a directed graph of inter-function dependencies G2, and perform information enhancement on the nodes and edges in G2 respectively. Use a graph feature network based on temporal information transmission to extract features X2 of G2.
[0011] Step 4: Analyze the program information flow within the function: The contract AST is split into sub-ASTs at the function level, and the information flow of nodes and edges is enhanced for each sub-AST. Furthermore, all the sub-ASTs after information enhancement are concatenated, and the feature X3 of the concatenated contract AST is extracted using a graph feature network based on temporal information transmission.
[0012] Step 5: Fuse the features X1, X2, and X3 extracted in steps 2 to 4 respectively, and input the fused feature X into a feedforward neural network multi-layer perceptron (MLP) to output the final predicted contract code anomaly score.
[0013] In a second aspect, the present invention also provides a multi-level feature-enhanced smart contract vulnerability detection system for implementing the above-mentioned smart contract vulnerability detection method.
[0014] This invention provides a multi-layered feature enhancement method and system for smart contract vulnerability detection, offering a new approach to smart contract code anomaly analysis and vulnerability detection. Compared to traditional methods, this multi-layered feature enhancement and fusion approach incorporates richer contract semantic information, achieving more accurate and comprehensive detection results. The specific beneficial technical effects and innovations are mainly reflected in the following three aspects:
[0015] (1) This invention proposes a smart contract anomaly analysis method based on multi-source information fusion, which analyzes the semantic features of the contract from multiple dimensions, including external package call relationship, function call and dependency relationship, and program information flow within the function. It comprehensively models the code features of smart contracts, enabling more effective code anomaly analysis and vulnerability detection of smart contracts.
[0016] (2) This invention proposes a static taint analysis method based on smart contracts, which can track the propagation path of state variables, thereby accurately determining whether the key transfer functions and related condition statements of the contract are tainted.
[0017] (3) This invention proposes an information flow enhancement method based on the abstract syntax tree of smart contracts. According to different analysis perspectives, the abstract syntax tree of the contract is pruned and information enhanced accordingly, so that the abstract syntax tree is more in line with the analysis requirements under different perspectives, redundant information is removed, important information flow is enhanced, and detection efficiency and detection accuracy are improved.
Attached Figure Description
[0018] Figure 1 is a schematic flow diagram of the multi-level feature-enhanced smart contract vulnerability detection method of the present invention.
[0019] Figure 2 is an architecture diagram of the multi-level feature-enhanced smart contract vulnerability detection system of the present invention.
[0020] Figure 3 shows the source code of the Bank smart contract and of the external-package timestamp contract in a specific embodiment of the present invention.
[0021] Figure 4 is the abstract syntax tree extracted from the Bank smart contract.
[0022] Figure 5 is a directed graph of the external package call relationships of the Bank smart contract.
[0023] Figure 6 is a directed graph of the calls and dependencies between the functions of the Bank smart contract.
[0024] Figure 7 is a schematic diagram of the information flow enhancement based on the abstract syntax tree.
Detailed Implementation
[0025] To clearly illustrate the present invention and make the objectives, technical solutions, and advantages of the embodiments of the present invention clearer, the technical solutions of the embodiments of the present invention will be clearly and completely described below with reference to the accompanying drawings, so that those skilled in the art can implement them based on the description. The technology of the present invention will be described in detail below with reference to the accompanying drawings and specific embodiments.
[0026] In one specific embodiment of the present invention, a smart contract vulnerability detection method with multi-level feature enhancement includes the following steps:
[0027] Step 1: Given a smart contract to be tested, take the contract source code as input and construct the abstract syntax tree (AST) of the smart contract through lexical and syntactic analysis.
[0028] Step 2: Analyze the contract's external package call relationship: Traverse the contract's AST, record all state variables in the contract that have data flow dependency and control flow dependency with the return value of the external package call, use static taint analysis technology to trace the propagation path of the return value, generate a directed graph G1 of the contract's external package call relationship, and use a graph feature network based on time-series information transmission to extract the feature X1 of G1;
[0029] Step 3: Analyze inter-function calls and dependencies: Based on the contract AST, starting from the constructor, search and traverse each node and edge of the AST, analyze the inter-function calls and state variable data dependencies within the contract, cut out nodes and edges with redundant information, generate a directed graph of inter-function dependencies G2, and perform information enhancement on the nodes and edges in G2 respectively. Use a graph feature network based on temporal information transmission to extract features X2 of G2.
[0030] Step 4: Analyze the program information flow within the function: The contract AST is split into sub-ASTs at the function level, and the information flow of nodes and edges is enhanced for each sub-AST. Furthermore, all the sub-ASTs after information enhancement are concatenated, and the feature X3 of the concatenated contract AST is extracted using a graph feature network based on temporal information transmission.
[0031] Step 5: Fuse the features X1, X2, and X3 extracted in steps 2 to 4 respectively, and input the fused feature X into a feedforward neural network multi-layer perceptron (MLP) to output the final predicted contract code anomaly score.
[0032] In this embodiment, the contract external package call relationship analysis process in step 2 is specifically as follows:
[0033] 2.1) Traverse the contract AST, extract the return value of the external package call, and record all state variables in the contract that have data dependencies and control dependencies with the return value;
[0034] 2.2) Mark the return values of external package calls and state variables that have data dependencies and control dependencies on the return values as taint sources. Use static taint analysis techniques to trace the propagation path of taint sources and perform taint convergence point checks in the following three cases:
[0035] (a) Whether the address parameters of the contract transfer function are affected by taint sources;
[0036] (b) Whether the transfer amount parameter of the contract transfer function is affected by taint sources;
[0037] (c) Whether the conditional statements on the contract execution path are affected by taint sources;
[0038] 2.3) Using the taint convergence point as the termination point, generate a directed graph G1 of the contract external package call relationship. Aggregate the edge features to their parent nodes, delete the edge features, and splice the child node features and the aggregated edge features on the parent nodes. Using a graph feature network based on time-series information transmission, extract the external package call relationship graph feature X1.
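Steps 2.1 to 2.3 can be illustrated with a minimal Python sketch. It is not the patented implementation: it assumes the contract has already been reduced to a list of assignments and a list of candidate convergence points (sinks), it follows data dependencies only (control dependencies are not modeled), and the names `propagate_taint`, `check_sinks`, and the sink labels are hypothetical. The sample data encodes the Bank example described later in this embodiment, plus one made-up transfer sink.

```python
from collections import defaultdict, deque


def propagate_taint(assignments, sources):
    """Return every name reachable from the taint sources via assignments.

    assignments: list of (target, [operands]) pairs, one per assignment.
    sources: list of initially tainted names (e.g. external call results).
    """
    # Map each operand to the variables it flows into.
    flows_to = defaultdict(set)
    for target, operands in assignments:
        for operand in operands:
            flows_to[operand].add(target)

    tainted = set(sources)
    queue = deque(sources)
    while queue:
        name = queue.popleft()
        for target in flows_to[name]:
            if target not in tainted:
                tainted.add(target)
                queue.append(target)
    return tainted


def check_sinks(sinks, tainted):
    """Return the sinks that use at least one tainted name.

    sinks: list of (kind, [names]) where kind is a hypothetical label:
    "transfer_address", "transfer_amount", or "condition".
    """
    return [(kind, used) for kind, used in sinks if tainted & set(used)]


# blocktime = time.getTime()
assignments = [("blocktime", ["time.getTime()"])]
sinks = [
    ("condition", ["guess_time", "blocktime"]),  # require(guess_time == blocktime)
    ("transfer_amount", ["balance"]),            # made-up sink for contrast
]
tainted = propagate_taint(assignments, ["time.getTime()"])
hits = check_sinks(sinks, tainted)
```

Here `hits` contains only the `require` condition, which corresponds to case (c); that sink would then serve as the termination point when G1 is built.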
[0039] In this embodiment, the function call and dependency analysis process in step 3 is specifically as follows:
[0040] 3.1) Starting from the constructor, traverse each edge of the contract AST in a depth-first manner, cut the original contract AST, remove irrelevant external package calls and local variables, and keep only function nodes, state variable nodes, and key internal contract calls. The internal contract calls include block state calls and built-in transfer function calls.
[0041] 3.2) Analyze whether there are function call relationships in the contract, analyze the read and write dependencies of state variables in the functions, and further generate a directed graph of function dependencies G2;
[0042] 3.3) For G2, information augmentation is performed on nodes and edges respectively: function nodes store function parameters, function return types, calling functions, and called functions; state variable nodes store data dependencies and control dependencies; edges are divided into three categories: read operation edges, write operation edges, and sequential edges; read operation edges indicate that function nodes perform read operations on state variable nodes, while write operation edges indicate that function nodes perform write operations on state variables, and sequential edges indicate the execution order between function nodes; the features of the edges are aggregated to their parent nodes, the edge features are deleted, and the node features and aggregated edge features are concatenated on the parent nodes. Using a graph feature network based on temporal information transmission, the directed graph features X2 of inter-function dependencies are extracted.
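As an illustration of steps 3.1 to 3.3, the following sketch builds a small G2-like graph from per-function summaries. It assumes the pruning of the AST has already produced, for each function, the state variables it reads and writes and the functions it calls; the dictionary layout and the function name `build_function_dependency_graph` are hypothetical. Call relationships are stored on the function nodes only, and a sequential edge is added from each writer of a state variable to each other function that reads it, which is one possible reading of the text.

```python
def build_function_dependency_graph(functions):
    """Build a G2-like graph from per-function read/write summaries.

    functions: list of dicts with "name" and optional "reads", "writes",
    and "calls" lists, given in source order.
    Returns (nodes, edges); edges are (source, target, kind) triples.
    """
    nodes = {
        fn["name"]: {
            "type": "function",
            "calls": list(fn.get("calls", [])),
            "called_by": [],
        }
        for fn in functions
    }
    edges = []
    for fn in functions:
        name = fn["name"]
        for callee in fn.get("calls", []):
            if callee in nodes:
                nodes[callee]["called_by"].append(name)
        for kind in ("write", "read"):
            for var in fn.get(kind + "s", []):
                nodes.setdefault(var, {"type": "state_variable"})
                edges.append((name, var, kind))

    # A reader of a state variable depends on every other function writing it.
    writers = [(src, var) for src, var, kind in edges if kind == "write"]
    readers = [(src, var) for src, var, kind in edges if kind == "read"]
    for writer, written in writers:
        for reader, read in readers:
            if written == read and writer != reader:
                sequence = (writer, reader, "sequence")
                if sequence not in edges:
                    edges.append(sequence)
    return nodes, edges


# setTime() writes blocktime; guess() reads it.
functions = [
    {"name": "setTime", "writes": ["blocktime"]},
    {"name": "guess", "reads": ["blocktime"]},
]
nodes, edges = build_function_dependency_graph(functions)
```

For this input the edge list holds a write edge from `setTime` to `blocktime`, a read edge from `guess` to `blocktime`, and a sequential edge from `setTime` to `guess`.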
[0043] In this embodiment, the specific process of analyzing the information flow of program statements within the function in step 4 is as follows:
[0044] 4.1) The contract AST is split into sub-ASTs at the function level, namely ST1, ST2, ..., STN, where N represents the number of functions. The nodes in each function-level STi (1 ≤ i ≤ N) are classified into core nodes, control nodes, and ordinary nodes: the state variables of the contract, built-in function calls, and block state calls are regarded as core nodes; conditional statements and loop statements are marked as control nodes; and other basic block nodes are marked as ordinary nodes;
[0045] 4.2) Perform information flow enhancement on STi, including adding control flow edges for if, while, and for statements:
[0046] (a) Each if node in STi includes an if condition node, a True block node executed when the condition is true, and a False block node executed when the condition is false. The information flow enhancement adds two control flow edges: a True Statement edge from the if condition node to the True block node, and a False Statement edge from the if condition node to the False block node;
[0047] (b) Each while or for node in STi includes two child nodes: a condition node and a body node. The information flow enhancement adds two control flow edges, one from the condition node to the body node and one from the body node back to the condition node, so as to simulate the execution of while and for loops;
[0048] (c) Add sequential directed edges for the execution order between statements in STi; that is, add a directed Next Statement edge between the root of each statement-level subtree and the root of the next statement-level subtree at the same level, indicating the execution order of statements within the function;
[0049] 4.3) Concatenate all STi with enhanced node and edge information, aggregate the edge features onto their parent nodes, delete the edge features, concatenate the node features and the aggregated edge features on the parent nodes, and use a graph feature network based on temporal information transmission to extract the features X3 of the entire concatenated contract AST.
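The edge additions of step 4.2 can be sketched as a recursive walk over one sub-AST. This is an illustrative sketch only: the node layout (dictionaries with `id`, `type`, and optional `condition`, `true_block`, `false_block`, `body`, and `statements` keys), the generic `Control Flow` label for loop edges, and the function name `enhance_information_flow` are all assumptions, and the False block is treated as optional even though the text describes it as always present.

```python
def enhance_information_flow(node, edges=None):
    """Collect the extra edges for one sub-AST, depth first.

    Returns a list of (source_id, target_id, label) triples.
    """
    if edges is None:
        edges = []
    kind = node["type"]
    if kind == "if":
        cond = node["condition"]["id"]
        edges.append((cond, node["true_block"]["id"], "True Statement"))
        if "false_block" in node:
            edges.append((cond, node["false_block"]["id"], "False Statement"))
    elif kind in ("while", "for"):
        cond, body = node["condition"]["id"], node["body"]["id"]
        edges.append((cond, body, "Control Flow"))  # enter the loop body
        edges.append((body, cond, "Control Flow"))  # re-check the condition

    # Next Statement edges between consecutive statements at the same level.
    statements = node.get("statements", [])
    for current, following in zip(statements, statements[1:]):
        edges.append((current["id"], following["id"], "Next Statement"))

    # Recurse into structural children, then into the statement list.
    for key in ("condition", "true_block", "false_block", "body"):
        if key in node:
            enhance_information_flow(node[key], edges)
    for statement in statements:
        enhance_information_flow(statement, edges)
    return edges


# Hypothetical sub-AST for a function with three statements, one of them an if.
profit = {
    "id": "profit", "type": "function",
    "statements": [
        {"id": "s1", "type": "assignment"},
        {"id": "s2", "type": "if",
         "condition": {"id": "c", "type": "condition"},
         "true_block": {"id": "t", "type": "block", "statements": []},
         "false_block": {"id": "f", "type": "block", "statements": []}},
        {"id": "s3", "type": "return"},
    ],
}
flow_edges = enhance_information_flow(profit)
```

The sample yields four added edges: two Next Statement edges along `s1`, `s2`, `s3`, and the True Statement and False Statement edges leaving the condition node `c`.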
[0050] In this embodiment, the specific implementation method of fusing multi-source information features of the contract and outputting the code anomaly score in step 5 is as follows:
[0051] 5.1) Assign corresponding weights W1, W2, W3 to the three features extracted in the previous steps, where W1 + W2 + W3 = 1 and W1 < W2 < W3, and obtain the fused feature value X = W1×X1 + W2×X2 + W3×X3;
[0052] 5.2) Input the fused feature value X into the feedforward neural network multilayer perceptron (MLP) and output the predicted contract anomaly feature score S. Set a threshold N as the judgment standard. If S≥N, the contract is abnormal; otherwise, there is no anomaly.
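Steps 5.1 and 5.2 amount to a constrained weighted sum followed by a forward pass and a threshold test. The sketch below shows this in plain Python. It assumes X1, X2, and X3 are vectors of equal length, uses a single hidden layer with ReLU and a sigmoid output (the text does not specify the MLP architecture), and the weights shown are untrained placeholders; all function names are hypothetical.

```python
import math


def fuse(features, weights):
    """Return X = W1*X1 + W2*X2 + W3*X3 after checking the weight constraints."""
    w1, w2, w3 = weights
    if abs(w1 + w2 + w3 - 1.0) > 1e-9 or not (w1 < w2 < w3):
        raise ValueError("weights must sum to 1 and satisfy W1 < W2 < W3")
    x1, x2, x3 = features
    return [w1 * a + w2 * b + w3 * c for a, b, c in zip(x1, x2, x3)]


def mlp_score(x, hidden_w, hidden_b, out_w, out_b):
    """One hidden ReLU layer and a sigmoid output; returns a score in (0, 1)."""
    hidden = [
        max(0.0, sum(w * v for w, v in zip(row, x)) + b)
        for row, b in zip(hidden_w, hidden_b)
    ]
    logit = sum(w * h for w, h in zip(out_w, hidden)) + out_b
    return 1.0 / (1.0 + math.exp(-logit))


def is_anomalous(score, threshold):
    """The contract is flagged when S >= threshold."""
    return score >= threshold


# Toy two-dimensional features and placeholder (untrained) parameters.
X = fuse(([1.0, 0.0], [0.0, 1.0], [1.0, 1.0]), (0.2, 0.3, 0.5))
S = mlp_score(X, hidden_w=[[1.0, -1.0], [0.5, 0.5]], hidden_b=[0.0, 0.0],
              out_w=[1.0, 1.0], out_b=0.0)
flagged = is_anomalous(S, threshold=0.5)
```

Rejecting weight triples that violate W1 + W2 + W3 = 1 or W1 < W2 < W3 keeps the fusion consistent with the constraint stated in step 5.1.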
[0053] This embodiment also provides a system for a multi-level feature-enhanced smart contract vulnerability detection method, including...
[0054] The Abstract Syntax Tree (AST) construction module is used to perform lexical and syntactic analysis on the source code of smart contracts and construct the AST of smart contracts from the bottom up.
[0055] The feature extraction module is used to perform feature analysis on the Abstract Syntax Tree (AST). It splits and enhances the information of three levels: external package calls, inter-function calls and dependencies, and program statements within functions. It calls a graph feature network based on temporal information transmission to extract features from the three different levels of the AST.
[0056] The feature fusion output module is used to fuse the three different levels of features obtained from the feature extraction module, input the fused features into the multilayer perceptron (MLP), and output the final predicted contract code anomaly score.
[0057] As shown in Figure 2, in this embodiment, the feature extraction module includes:
[0058] The external package call feature extraction unit uses static taint analysis technology to trace the propagation path of values related to external packages, generates a directed graph of contract external package call relationships, and extracts external package call relationship features;
[0059] The function call feature extraction unit is used to generate a directed graph of inter-function dependencies based on inter-function calls and state variable data dependencies within the contract, and to perform information augmentation on nodes and edges to extract inter-function dependency features.
[0060] The program statement feature extraction unit is used to split the AST into sub-ASTs based on functions, enhance the information flow of nodes and edges in each sub-AST, concatenate all the enhanced sub-ASTs, and extract the features of the concatenated entire contract AST.
[0061] As shown in Figure 1, the smart contract vulnerability detection process based on multi-level feature enhancement of this invention can be divided into the following four parts: 1) analyzing the external package call relationship of the contract, generating a directed graph of external package call relationships, and extracting graph features; 2) analyzing the call and dependency relationships between functions, generating a directed graph of inter-function dependencies, and extracting graph features; 3) enhancing the information flow of the AST and extracting contract AST features; 4) outputting the contract anomaly score based on feature fusion. As shown in Figure 2, the smart contract vulnerability detection system based on multi-level feature enhancement of the present invention includes an abstract syntax tree construction module, a feature extraction module, and a feature fusion output module.
[0062] Specifically, take the Bank contract shown in Figure 3 as an example. Figure 4 shows the Abstract Syntax Tree (AST) extracted from the Bank contract, which mainly includes the contract's ImportDirective, FunctionDeclaration, and StateVariableDeclaration nodes. Using the contract AST as input, the specific implementation process for smart contract code anomaly analysis based on multi-source information fusion is as follows:
[0063] Part 1: Analysis of external package call relationships based on contract AST.
[0064] The AST of the bank contract is traversed to extract the external package call information, i.e., the ImportDirective, which reveals that the external package is named timestamp.sol. Further analysis shows that the bank contract calls the getTime function of the external contract Timeset, i.e., time.getTime(). Next, to record all state variables in the contract that have data dependencies on the return values of the external package calls, a taint source label FLAG is designed for time. Static taint analysis techniques are used to track the propagation path of taint sources and perform taint convergence point checks in the following three cases:
[0065] (a) Whether the address parameters of the contract transfer function are affected by taint sources;
[0066] (b) Whether the transfer amount parameter of the contract transfer function is affected by taint sources;
[0067] (c) Whether the conditional statements on the contract execution path are affected by taint sources.
[0068] Specifically, in line 18 of the `bank` contract in Figure 3, the return value of `time.getTime()` is assigned to the state variable `blocktime`. Furthermore, `blocktime` serves as the condition of the critical conditional statement in line 21 of the `guess()` function, i.e., `require(guess_time == blocktime)`, which satisfies case (c) of the aforementioned taint convergence point check. Therefore, using this conditional statement as the taint convergence point, a directed graph G1 of external package call relationships is constructed, as shown in Figure 5. The external package call declarations (ImportDirective, FunctionDeclaration, and StateVariableDeclaration) are extracted as graph nodes. Edges are divided into three categories: forward edges, data flow edges, and control flow edges. Forward edges represent the sequential execution between nodes; data flow edges represent data dependencies between variables, such as the variable assignment `blocktime = time.getTime()`; and control flow edges represent conditional constraints, such as `require(guess_time == blocktime)`. Next, the edge features are aggregated onto their parent nodes, the edge features are deleted, and the node features and aggregated edge features are concatenated on the parent nodes. Using a graph feature network based on temporal information transmission, the graph features X1 of the directed graph G1 of external package call relationships are extracted.
[0069] Part Two: Analysis of Function Calls and Dependencies Based on AST.
[0070] Taking the AST of the bank contract in Figure 4 as an example, starting from the constructor, each node and edge of the contract AST is traversed depth first. The original AST is cut to remove irrelevant external package calls and local variables, and only key information is retained, such as the function nodes profit(), setTime(), guess(), and withdraw() and the state variable nodes balance, level, and blocktime.
[0071] Next, based on the cut AST, we analyze whether there are function call relationships in the contract. For example, if function node F1 calls function node F2, we construct a call relationship edge from F1 to F2. Furthermore, we analyze the read-write dependencies of state variables within the functions. A specific example is shown in Figure 6: the function node `setTime()` performs a write operation on the state variable node `blocktime`, and the function node `guess()` performs a read operation on the state variable node `blocktime`. This indicates that the value of `blocktime` read in `guess()` depends on the result of the write to `blocktime` in `setTime()`. Therefore, we first construct a write operation edge from `setTime()` to `blocktime`, then construct a read operation edge from `guess()` to `blocktime`, and then construct a sequence edge from `setTime()` to `guess()`, generating a complete directed graph G2 of function calls and dependencies. Further, we augment the information of the nodes and edges in G2. Function nodes store function parameters, function return types, calling functions, and called functions; state variable nodes store data dependencies and control dependencies. Edges are divided into three categories: read operation edges, write operation edges, and sequence edges. Read operation edges indicate that a function node performs a read operation on a state variable node, write operation edges indicate that a function node performs a write operation on a state variable node, and sequence edges indicate the execution order between function nodes.
[0072] Finally, the features of the edges are aggregated to their parent nodes, the edge features are deleted, and the node features and aggregated edge features are concatenated on the parent nodes. Using a graph feature network based on temporal information transmission, the features X2 of the directed graph G2 with inter-function dependencies are extracted.
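The aggregation step that precedes feature extraction (moving edge features onto parent nodes, deleting them from the edges, and concatenating them with the node features) can be sketched as follows. The text does not fix the details, so this sketch makes three labeled assumptions: the parent of an edge is its source node, aggregation is an element-wise sum, and nodes without outgoing edges receive a zero vector. The function name and the one-hot edge encoding are hypothetical.

```python
def aggregate_edges_onto_parents(node_features, edges, edge_dim):
    """Fold edge features into their source (parent) nodes.

    node_features: {node: [floats]}
    edges: list of (parent, child, [floats]) with edge_dim floats each.
    Returns (enriched_node_features, feature_free_edges).
    """
    aggregated = {node: [0.0] * edge_dim for node in node_features}
    for parent, _child, feature in edges:
        if len(feature) != edge_dim:
            raise ValueError("edge feature has the wrong length")
        # Assumption: element-wise sum over all edges leaving the parent.
        aggregated[parent] = [a + b for a, b in zip(aggregated[parent], feature)]

    # Concatenate node features with the aggregated edge features.
    enriched = {node: feats + aggregated[node] for node, feats in node_features.items()}
    # The edges keep only their endpoints; their features are dropped.
    plain_edges = [(parent, child) for parent, child, _ in edges]
    return enriched, plain_edges


# One-hot edge kinds in the order [read, write, sequence].
node_features = {"setTime": [1.0], "guess": [1.0], "blocktime": [0.0]}
g2_edges = [
    ("setTime", "blocktime", [0.0, 1.0, 0.0]),
    ("guess", "blocktime", [1.0, 0.0, 0.0]),
    ("setTime", "guess", [0.0, 0.0, 1.0]),
]
enriched, plain_edges = aggregate_edges_onto_parents(node_features, g2_edges, edge_dim=3)
```

After this step every node carries a fixed-length vector, so a graph feature network can consume node features and plain connectivity without separate edge attributes.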
[0073] Part Three: Analysis of In-Function Program Statements Based on Information Flow Enhancement.
[0074] The contract AST is split into sub-ASTs at the function level, namely ST1, ST2, ..., STN. Next, information flow enhancement is performed on each sub-AST. As shown in Figure 7, taking the function profit() as an example, the AST of profit() is first cut out and denoted ST1. Node information is then enhanced on it, with nodes divided into core nodes, control nodes, and ordinary nodes. Key information such as the contract's state variables (such as balance and level in Figure 7), built-in function calls (such as msg.sender in Figure 6), and block state calls (such as block.number, block.timestamp, block.blockhash) is marked as core nodes. Conditional statements and loop statements such as assert{X}, require{X}, if{...}else{X}, if{...}then{X}, while{X}do{...}, for{X}do{...} are marked as control nodes. Other basic block nodes are marked as ordinary nodes.
[0075] Next, perform edge information enhancement on ST1, adding control flow edges to if statements, while statements, and for statements respectively, and adding sequential edges to sequentially executed statements. As shown in Figure 7, two control flow edges are added to the if node: a True Statement edge from the if condition node to the True block node, and a False Statement edge from the if condition node to the False block node. Further, a sequential directed Next Statement edge is added between the root of each statement-level subtree in ST1 and the root of the next sibling statement-level subtree to represent the sequential execution of statements within the function.
[0076] Finally, concatenate all the sub-ASTs with enhanced node and edge information into a single graph, denoted G3, aggregate the edge features onto their parent nodes, delete the edge features, concatenate the node features and the aggregated edge features on the parent nodes, and use a graph feature network based on temporal information transmission to extract the features X3 of G3.
[0077] Part Four: Anomaly Analysis of Contract Code Based on Feature Fusion.
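The three-way node classification described above can be sketched as a small lookup. The node layout (a dictionary with `type` and optional `name`), the type labels in `CONTROL_TYPES`, and the function name `classify_node` are hypothetical; the example names (balance, level, msg.sender, and the block state calls) are the ones given in the text.

```python
CONTROL_TYPES = {"if", "while", "for", "require", "assert"}
BLOCK_STATE_CALLS = {"block.number", "block.timestamp", "block.blockhash"}


def classify_node(node, state_variables, builtin_calls):
    """Return "control", "core", or "ordinary" for one sub-AST node."""
    if node["type"] in CONTROL_TYPES:
        return "control"
    core_names = set(state_variables) | set(builtin_calls) | BLOCK_STATE_CALLS
    if node.get("name") in core_names:
        return "core"
    return "ordinary"


state_variables = {"balance", "level"}
builtin_calls = {"msg.sender"}
labels = [
    classify_node({"type": "identifier", "name": "balance"}, state_variables, builtin_calls),
    classify_node({"type": "member_access", "name": "block.timestamp"}, state_variables, builtin_calls),
    classify_node({"type": "require"}, state_variables, builtin_calls),
    classify_node({"type": "identifier", "name": "tmp"}, state_variables, builtin_calls),
]
```

Checking the control types first means a `require` whose condition mentions a state variable is still labeled as a control node, while the state variable itself is labeled separately as a core node.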
[0078] Assign corresponding weights W1, W2, and W3 to the three features extracted in the foregoing steps, where W1 + W2 + W3 = 1 and W1 < W2 < W3, to obtain the fused feature value X = W1 × X1 + W2 × X2 + W3 × X3. W1, W2, and W3 respectively represent the proportions of the three features in the prediction of contract code anomalies; then, input the fused feature value X into the multi-layer perceptron MLP of the feedforward neural network to output the predicted contract anomaly feature score S, and set a threshold N as the judgment criterion. If S ≥ N, the contract has an anomaly; otherwise, there is no anomaly.
[0079] The above description of the embodiments is to facilitate the understanding and application of the present invention by those of ordinary skill in the art. Those skilled in the art can obviously make various modifications to the above embodiments easily and apply the general principles described herein to other embodiments without creative efforts. Therefore, the present invention is not limited to the above embodiments, and the improvements and modifications made by those skilled in the art to the present invention should be within the protection scope of the present invention according to the disclosure of the present invention.
Claims
1. A multi-level feature-enhanced smart contract vulnerability detection method, characterized in that it includes the following steps:
Step 1: Given a smart contract to be tested, take the contract source code as input and construct the abstract syntax tree (AST) of the smart contract through lexical and syntactic analysis.
Step 2: Analyze the contract's external package call relationship: Traverse the contract's AST, record all state variables in the contract that have data flow dependency and control flow dependency with the return value of the external package call, use static taint analysis technology to trace the propagation path of the return value, generate a directed graph G1 of the contract's external package call relationship, and use a graph feature network based on time-series information transmission to extract the feature X1 of G1;
Step 3: Analyze inter-function calls and dependencies: Based on the contract AST, starting from the constructor, search and traverse each node and edge of the AST, analyze the inter-function calls and state variable data dependencies within the contract, cut out nodes and edges with redundant information, generate a directed graph of inter-function dependencies G2, and perform information enhancement on the nodes and edges in G2 respectively. Use a graph feature network based on temporal information transmission to extract features X2 of G2.
Step 4: Analyze the program information flow within the function: The contract AST is split into sub-ASTs at the function level, and the information flow of nodes and edges is enhanced for each sub-AST. Furthermore, all the sub-ASTs after information enhancement are concatenated, and the feature X3 of the concatenated contract AST is extracted using a graph feature network based on temporal information transmission.
The specific process of analyzing the information flow of program statements within the function in step 4 is as follows:
4-1) Decompose the contract AST into sub-ASTs at the function level, i.e., ST_1, ST_2, ..., ST_N, where N represents the number of functions; the nodes in each function-level ST_i (i = 1, ..., N) are categorized into core nodes, control flow nodes, and ordinary nodes. The contract's state variables, built-in function calls, and block state calls are designated as core nodes; conditional statements and loop statements are designated as control flow nodes; and other basic block nodes are designated as ordinary nodes.
4-2) Enhance the information flow of ST_i, including adding control flow edges for if statements and for while and for loops:
(a) An if node in ST_i includes the if condition node, the True block node taken when the condition is true, and the False block node taken when it is false; the information flow enhancement adds two control flow edges, one from the if condition node to the True statement of the True block node, and one from the if condition node to the False statement of the False block node;
(b) A while or for node in ST_i includes two child nodes, the condition node and the main body node; the information flow enhancement adds two control flow edges, one from the condition node to the main body node and the other from the main body node back to the condition node, thereby simulating the execution of while and for loops;
(c) Add directed edges for the execution order between statements in ST_i, i.e., add a directed Next Statement edge between the root of each statement-level subtree and the root of the next statement-level subtree at the same level, indicating the execution order of statements within the function;
4-3) Concatenate all ST_i that have undergone node and edge information enhancement; aggregate the edge features to their parent nodes, delete the edge features, concatenate the node features and the aggregated edge features on the parent nodes, and extract the feature X3 of the concatenated contract AST using a graph feature network based on temporal information transmission;
Step 5: Fuse the features X1, X2, and X3 extracted in steps 2-4, and input the fused feature X into a feedforward neural network, namely a multilayer perceptron (MLP), to output the final predicted contract code anomaly score;
The specific implementation of fusing the multi-source information features of the contract and outputting the code anomaly score in step 5 is as follows:
5-1) Assign corresponding weights W1, W2, and W3 to the three features extracted in the previous steps, where W1 + W2 + W3 = 1 and W1 < W2 < W3, and obtain the fused feature value X = W1·X1 + W2·X2 + W3·X3;
5-2) Input the fused feature value X into the multilayer perceptron (MLP) and output the predicted contract anomaly score S. Set a threshold N as the judgment standard: if S ≥ N, the contract is abnormal; otherwise, there is no anomaly.
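As an illustration only, the weighted fusion and thresholding of step 5 in claim 1 might be sketched in Python as follows. The function names, the single hidden layer, the ReLU and sigmoid activations, and all numeric values are assumptions made for this sketch; the claim fixes only the weight constraints, the use of an MLP, and the comparison S ≥ N.

```python
import math
from typing import List, Sequence


def fuse_features(x1: Sequence[float], x2: Sequence[float], x3: Sequence[float],
                  w1: float, w2: float, w3: float) -> List[float]:
    """Step 5-1: weighted fusion X = W1*X1 + W2*X2 + W3*X3."""
    # Constraints stated in the claim: W1 + W2 + W3 = 1 and W1 < W2 < W3.
    if not math.isclose(w1 + w2 + w3, 1.0):
        raise ValueError("weights must sum to 1")
    if not (w1 < w2 < w3):
        raise ValueError("weights must satisfy W1 < W2 < W3")
    if not (len(x1) == len(x2) == len(x3)):
        raise ValueError("feature vectors must have the same length")
    return [w1 * a + w2 * b + w3 * c for a, b, c in zip(x1, x2, x3)]


def mlp_score(x: Sequence[float],
              hidden_w: Sequence[Sequence[float]], hidden_b: Sequence[float],
              out_w: Sequence[float], out_b: float) -> float:
    """Hypothetical MLP: one ReLU hidden layer, sigmoid output in (0, 1)."""
    hidden = [max(0.0, sum(w * v for w, v in zip(row, x)) + b)
              for row, b in zip(hidden_w, hidden_b)]
    logit = sum(w * h for w, h in zip(out_w, hidden)) + out_b
    return 1.0 / (1.0 + math.exp(-logit))


def is_abnormal(score: float, threshold: float) -> bool:
    """Step 5-2: the contract is flagged as abnormal when S >= N."""
    return score >= threshold


# Usage with made-up feature vectors and untrained placeholder parameters.
x = fuse_features([1.0, 0.0], [0.0, 1.0], [1.0, 1.0], w1=0.2, w2=0.3, w3=0.5)
s = mlp_score(x, hidden_w=[[0.5, -0.5], [1.0, 1.0]], hidden_b=[0.0, 0.0],
              out_w=[1.0, 1.0], out_b=-1.0)
print(x, s, is_abnormal(s, threshold=0.5))
```

In a real system the MLP parameters would be learned from labeled contracts, and the threshold would be chosen on validation data; neither is specified here.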
2. The smart contract vulnerability detection method according to claim 1, characterized in that the specific process of analyzing the external package call relationships in step 2 is as follows:
2-1) Traverse the contract AST, extract the return values of external package calls, and record all state variables in the contract that have data dependencies and control dependencies on those return values;
2-2) Mark the return values of external package calls, together with the state variables that have data dependencies and control dependencies on them, as taint sources. Use static taint analysis to trace the propagation paths of the taint sources and check the taint convergence points in the following three cases: (a) whether the address parameter of the contract transfer function is affected by a taint source; (b) whether the transfer amount parameter of the contract transfer function is affected by a taint source; (c) whether a conditional statement on the contract execution path is affected by a taint source;
2-3) Using the taint convergence points as termination points, generate a directed graph G1 of the contract's external package call relationships. Aggregate the edge features to their parent nodes, delete the edge features, and concatenate the child node features and the aggregated edge features on the parent nodes. Using a graph feature network based on temporal information transmission, extract the feature X1 of the external package call relationship graph.
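The taint tracing of steps 2-2) and 2-3) in claim 2 can be illustrated with a minimal Python sketch. It assumes the data and control dependencies have already been extracted from the AST into a hypothetical `flows` mapping (value to the values it influences); the names `propagate_taint`, `oracle.getPrice()`, `transfer.to`, `transfer.amount`, and `branch.cond` are invented for the example and do not come from the patent.

```python
from collections import deque
from typing import Dict, List, Set, Tuple


def propagate_taint(flows: Dict[str, List[str]],
                    sources: Set[str],
                    sinks: Set[str]) -> Tuple[Set[str], List[Tuple[str, str]]]:
    """Breadth-first taint propagation that terminates at sinks.

    Returns the sinks reached by taint and the edges followed during
    propagation; those edges stand in for the directed graph G1.
    """
    tainted = set(sources)
    reached: Set[str] = set()
    edges: List[Tuple[str, str]] = []
    queue = deque(sorted(sources))
    while queue:
        node = queue.popleft()
        if node in sinks:
            reached.add(node)
            continue  # a taint convergence point ends the path
        for nxt in flows.get(node, []):
            edges.append((node, nxt))
            if nxt not in tainted:
                tainted.add(nxt)
                queue.append(nxt)
    return reached, edges


# Hypothetical contract: an external price feed influences a transfer amount.
flows = {
    "oracle.getPrice()": ["price"],
    "price": ["amount", "log"],
    "amount": ["transfer.amount"],
}
sinks = {"transfer.to", "transfer.amount", "branch.cond"}  # cases (a), (b), (c)
reached, g1_edges = propagate_taint(flows, {"oracle.getPrice()"}, sinks)
print(reached, g1_edges)
```

The sketch keeps every edge visited from a tainted node, including ones that never reach a sink; a stricter construction of G1 could keep only edges on source-to-sink paths.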
3. The smart contract vulnerability detection method according to claim 1, characterized in that the inter-function call and dependency analysis process in step 3 is as follows:
3-1) Starting from the constructor, traverse each edge of the contract AST in a depth-first manner, prune the original contract AST by removing irrelevant external package calls and local variables, and keep only function nodes, state variable nodes, and key internal contract calls, where the internal contract calls include block state calls and built-in transfer function calls;
3-2) Analyze whether function call relationships exist in the contract, analyze the read and write dependencies of the functions on state variables, and generate a directed graph G2 of inter-function dependencies;
3-3) Perform information enhancement on the nodes and edges of G2 respectively: function nodes store the function parameters, function return types, calling functions, and called functions; state variable nodes store data dependencies and control dependencies; edges are divided into three categories, namely read operation edges, write operation edges, and sequential edges, where a read operation edge indicates that a function node reads a state variable node, a write operation edge indicates that a function node writes a state variable node, and a sequential edge indicates the execution order between function nodes. The edge features are aggregated to their parent nodes, the edge features are deleted, and the node features and the aggregated edge features are concatenated on the parent nodes. Using a graph feature network based on temporal information transmission, the feature X2 of the directed graph of inter-function dependencies is extracted.
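A rough Python sketch of how the directed graph G2 of claim 3 might be assembled is given below. It assumes each function has already been summarized from the AST as a hypothetical record of the state variables it reads and writes and the functions it calls; treating a caller-to-callee call as a sequential edge is also an assumption of this sketch, as are all names used.

```python
from typing import Dict, List, Set, Tuple

Edge = Tuple[str, str, str]  # (source node, target node, edge kind)


def build_g2(functions: Dict[str, Dict[str, List[str]]],
             entry: str = "constructor") -> List[Edge]:
    """Depth-first construction of read, write, and sequence edges."""
    edges: List[Edge] = []
    visited: Set[str] = set()

    def visit(name: str) -> None:
        if name in visited or name not in functions:
            return
        visited.add(name)
        info = functions[name]
        for var in info.get("reads", []):
            edges.append((name, var, "read"))
        for var in info.get("writes", []):
            edges.append((name, var, "write"))
        for callee in info.get("calls", []):
            # Calls to names outside the contract are pruned (step 3-1).
            if callee in functions:
                edges.append((name, callee, "sequence"))
                visit(callee)

    # Start from the constructor, then cover the remaining functions
    # in declaration order.
    visit(entry)
    for name in functions:
        visit(name)
    return edges


# Hypothetical contract summary.
functions = {
    "constructor": {"writes": ["owner"]},
    "deposit": {"writes": ["balances"]},
    "withdraw": {"reads": ["balances"], "writes": ["balances"],
                 "calls": ["_send", "oracle.getPrice"]},
    "_send": {"reads": ["owner"]},
}
for edge in build_g2(functions):
    print(edge)
```

The node attributes listed in step 3-3) (parameters, return types, callers, callees) are omitted here to keep the example short; they would be attached to the function nodes before feature extraction.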
4. A system for the multi-level feature-enhanced smart contract vulnerability detection method as described in claim 1, characterized in that it includes:
an abstract syntax tree (AST) construction module, used to perform lexical and syntactic analysis on the source code of a smart contract and construct the AST of the smart contract from the bottom up;
a feature extraction module, used to perform feature analysis on the AST, which splits and enhances the information at three levels, namely external package calls, inter-function calls and dependencies, and program statements within functions, and calls a graph feature network based on temporal information transmission to extract features from the three levels of the AST;
a feature fusion output module, used to fuse the features of the three levels obtained from the feature extraction module, input the fused feature into a multilayer perceptron (MLP), and output the final predicted contract code anomaly score.
5. The system according to claim 4, characterized in that the feature extraction module includes:
an external package call feature extraction unit, used to trace the propagation paths of values related to external packages by static taint analysis, generate a directed graph of the contract's external package call relationships, and extract the external package call relationship features;
a function call feature extraction unit, used to generate a directed graph of inter-function dependencies based on the inter-function calls and state variable data dependencies within the contract, and to perform information enhancement on its nodes and edges to extract the inter-function dependency features;
a program statement feature extraction unit, used to split the AST into sub-ASTs by function, enhance the information flow of the nodes and edges in each sub-AST, concatenate all the enhanced sub-ASTs, and extract the features of the concatenated contract AST.
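The information flow enhancement carried out by the program statement feature extraction unit (step 4-2 of claim 1) can be illustrated with a small Python sketch. The `Node` class, the node kinds, and the labels are hypothetical, and for brevity the control flow edges point at the block nodes themselves rather than at the first statement inside each block.

```python
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass
class Node:
    kind: str   # hypothetical kinds: "Block", "If", "While", "For", "Expr"
    label: str  # unique name used to identify the node in an edge
    children: List["Node"] = field(default_factory=list)


Edge = Tuple[str, str, str]  # (source label, target label, edge kind)


def enhance(node: Node, edges: List[Edge]) -> List[Edge]:
    """Append control flow and Next Statement edges for one sub-AST."""
    if node.kind == "If":
        # (a) condition -> True block, condition -> False block (if present)
        cond = node.children[0]
        for branch in node.children[1:3]:
            edges.append((cond.label, branch.label, "ControlFlow"))
    elif node.kind in ("While", "For"):
        # (b) condition -> body and body -> condition model the loop
        cond, body = node.children[0], node.children[1]
        edges.append((cond.label, body.label, "ControlFlow"))
        edges.append((body.label, cond.label, "ControlFlow"))
    elif node.kind == "Block":
        # (c) link each statement to the next statement at the same level
        for cur, nxt in zip(node.children, node.children[1:]):
            edges.append((cur.label, nxt.label, "NextStatement"))
    for child in node.children:
        enhance(child, edges)
    return edges


# Hypothetical function body: s1; if (c1) { s2 } else { s3 }; while (c2) { s4 }
body = Node("Block", "fn_body", [
    Node("Expr", "s1"),
    Node("If", "if1", [Node("Expr", "c1"),
                       Node("Block", "t1", [Node("Expr", "s2")]),
                       Node("Block", "f1", [Node("Expr", "s3")])]),
    Node("While", "w1", [Node("Expr", "c2"),
                         Node("Block", "b1", [Node("Expr", "s4")])]),
])
for edge in enhance(body, []):
    print(edge)
```

Running the same routine over every function-level sub-AST and collecting the edges gives the enhanced structure that is then concatenated and passed to feature extraction.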