Method, device, equipment and medium for gate level failure rate prediction based on graph neural network

By abstracting the gate-level netlist into a directed acyclic graph and using the embedding representation, ranking, and calibration modules of a graph neural network, the problem of balancing evaluation efficiency against accurate identification in existing technologies is solved. This enables fast and accurate identification and quantification of risk gates, guiding selective hardening and improving the reliability and economy of integrated circuit design.

CN122241428A · Pending · Publication Date: 2026-06-19 · BEIJING UNIV OF POSTS & TELECOMM +1

Patent Information

Authority / Receiving Office
CN · China
Patent Type
Applications(China)
Current Assignee / Owner
BEIJING UNIV OF POSTS & TELECOMM
Filing Date
2026-03-10
Publication Date
2026-06-19

AI Technical Summary

Technical Problem

Existing technologies struggle to accurately identify and quantify key risk gates while maintaining rapid assessment efficiency, so they cannot provide an accurate decision-making basis for selective hardening.

Method used

A graph neural network-based approach is adopted: the gate-level netlist to be evaluated is abstracted into a directed acyclic graph, gate-level feature representations are constructed for its nodes, and high-risk gates are rapidly identified and accurately quantified through the decoupled design of an embedding representation module, a ranking module, and a calibration module.

Benefits of technology

It enables rapid and accurate identification and quantification of key risk gates, provides a basis for selective hardening decisions, reduces computational overhead, and improves the reliability and economy of integrated circuit design.



Abstract

This application provides a method, apparatus, device, and medium for predicting gate-level failure rates based on graph neural networks. The method includes: acquiring a directed acyclic graph (DAG) of a gate-level netlist to be evaluated, and constructing a gate-level feature representation for each node in the DAG; processing the DAG and the gate-level feature representation of each node through the embedding representation module of a pre-trained graph neural network for gate-level failure rate prediction to obtain an embedding representation of each node in the DAG; processing the embedding representation of each node through a ranking module to obtain a risk ranking score for each node; selecting the top k high-risk gates or high-risk flip-flops based on the risk ranking scores; and processing the embedding representations of the top k nodes corresponding to the top k high-risk gates or high-risk flip-flops through a calibration module to obtain the gate-level failure rate of each high-risk gate or high-risk flip-flop.

Description

Technical Field

[0001] This application relates to the field of gate-level failure rate prediction technology, and in particular to a method, apparatus, device and medium for gate-level failure rate prediction based on graph neural networks.

Background Technology

[0002] As CMOS (Complementary Metal-Oxide-Semiconductor) technology enters the deep submicron stage, soft errors caused by high-energy particle bombardment in integrated circuits have become a critical factor affecting functional safety in fields such as automotive electronics and aerospace. Gate-level Failure-In-Time (FIT) is used to quantify the contribution of each logic gate or flip-flop in a circuit to the overall system failure risk. Under stringent standards such as ISO 26262, it is necessary to quickly and accurately locate a few high-risk gates during the design phase and selectively harden them to balance reliability, area, power consumption, and timing overhead.

[0003] Related technologies mainly fall into two categories. The first is physical simulation methods (such as BFIT), which obtain high-precision gate-level FIT through detailed modeling and numerical integration, but their computational complexity is high and they run slowly, making it difficult to support rapid iterative evaluation and optimization of large-scale circuits. The second is deep learning-based methods (such as DeepGate2), which abstract the gate-level netlist into a graph structure and use graph neural networks for regression prediction. Although their efficiency is relatively high, the model is prone to "prediction collapse" because of the extreme long-tail distribution of gate-level FIT; that is, it struggles to distinguish and rank the few key high-risk gates, resulting in unreliable ranking, low hardening efficiency, and an inability to effectively guide low-overhead reliability design.

[0004] Therefore, related technologies struggle to accurately identify and quantify key risk gates while maintaining rapid assessment efficiency, and so cannot provide a decision-making basis for selective hardening that is both fast and accurate in its prioritization.

Summary of the Invention

[0005] The purpose of this application is to provide a method, apparatus, device, and medium for predicting gate-level failure rates based on graph neural networks. This can solve the technical problem in related technologies where it is difficult to balance assessment efficiency and prediction accuracy, so as to achieve rapid and accurate identification and quantification of key risk gates, thereby providing a decision-making basis for efficient and reliable selective hardening.

[0006] To solve the above technical problem, this application is implemented as follows. A first aspect of this application discloses a method for predicting gate-level failure rates based on graph neural networks, the method comprising:
obtaining a directed acyclic graph to be evaluated of a gate-level netlist to be evaluated, and constructing a gate-level feature representation for each node in the directed acyclic graph to be evaluated;
processing the directed acyclic graph to be evaluated and the gate-level feature representation of each node in it through the embedding representation module in a pre-trained graph neural network for gate-level failure rate prediction, to obtain an embedding representation of each node in the directed acyclic graph to be evaluated;
processing the embedding representation of each node in the directed acyclic graph to be evaluated through the ranking module in the graph neural network, to obtain a risk ranking score of each node, where the higher the risk ranking score of a node, the higher the hardening priority of the logic gate or flip-flop corresponding to that node;
selecting, based on the risk ranking score of each node, the top k high-risk gates or high-risk flip-flops from the logic gates or flip-flops corresponding to the nodes in the directed acyclic graph to be evaluated, where k is an integer greater than zero; and
processing the embedding representations of the top k nodes corresponding to the top k high-risk gates or high-risk flip-flops through the calibration module in the graph neural network, to obtain the gate-level failure rate of each of the top k high-risk gates or high-risk flip-flops.

[0007] Optionally, processing the directed acyclic graph to be evaluated and the gate-level feature representation of each node in it to obtain the embedding representation of each node includes: for a directed edge from a first node to a second node in the directed acyclic graph to be evaluated, determining that the first node is a predecessor node of the second node in the forward propagation phase, and that the second node is a predecessor node of the first node in the backward propagation phase; for each node in the directed acyclic graph to be evaluated, with the goal of simulating the forward propagation of a fault from a logic gate to each flip-flop, initially updating the embedding representation of the node based on the embedding representations of its predecessor nodes in the forward propagation phase; and for each node in the directed acyclic graph to be evaluated, with the goal of simulating the backward tracing of a fault from a flip-flop to each logic gate, updating the initially updated embedding representation of the node again based on the initially updated embedding representations of its predecessor nodes in the backward propagation phase, to obtain the embedding representation of the node.

[0008] Optionally, initially updating the embedding representation of the node based on the embedding representations of its predecessor nodes in the forward propagation phase includes: determining, based on the embedding representations of the node's multiple predecessor nodes in the forward propagation phase, the attention weight of each of those predecessor nodes in the forward propagation phase; and, starting from the embedding representation of the node, accumulating the embedding representations of those predecessor nodes weighted by their respective attention weights, to obtain the initially updated embedding representation of the node. Updating the initially updated embedding representation of the node again based on the initially updated embedding representations of its predecessor nodes in the backward propagation phase includes: determining, based on the initially updated embedding representations of the node's multiple predecessor nodes in the backward propagation phase, the attention weight of each of those predecessor nodes in the backward propagation phase; and, starting from the initially updated embedding representation of the node, accumulating the initially updated embedding representations of those predecessor nodes weighted by their respective attention weights in the backward propagation phase, to obtain the embedding representation of the node.
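
The attention-weighted accumulation described in [0008] can be sketched as follows. This is a minimal illustration under stated assumptions, not the formulation of this application: the scoring function (a dot product between the node and each predecessor), the softmax normalization, and the additive combination with the node's own embedding are all assumed, and the function names `dot` and `attention_aggregate` are hypothetical.

```python
import math

def dot(u, v):
    """Inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))

def attention_aggregate(h_node, h_preds):
    """Weight each predecessor embedding by a softmax attention weight and
    add the weighted sum to the node's own embedding (assumed combination)."""
    if not h_preds:
        return list(h_node), []
    # Assumed scoring: similarity between the node and each predecessor.
    scores = [dot(h_node, h_p) for h_p in h_preds]
    peak = max(scores)  # subtract the maximum for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    weights = [e / total for e in exps]
    message = [
        sum(w * h_p[i] for w, h_p in zip(weights, h_preds))
        for i in range(len(h_node))
    ]
    updated = [a + m for a, m in zip(h_node, message)]
    return updated, weights

# Two identical predecessors receive equal attention weights.
updated, weights = attention_aggregate([1.0, 0.0], [[1.0, 0.0], [1.0, 0.0]])

# A predecessor that is more similar to the node receives a larger weight.
_, skewed = attention_aggregate([1.0, 0.0], [[2.0, 0.0], [0.0, 0.0]])
```

The same function would serve both phases: in the forward phase `h_preds` holds the embeddings of the driving nodes, and in the backward phase it holds the initially updated embeddings of the driven nodes.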

[0009] Optionally, constructing a gate-level feature representation for each node in the directed acyclic graph to be evaluated includes: for each node, determining the type feature representation of the corresponding logic gate or flip-flop according to the type of the node; for each node, determining the topological feature representation of the corresponding logic gate or flip-flop based on the topological information of the node; and for each node, determining the structural feature representation of the corresponding logic gate or flip-flop based on the structural information of the node. For each node in the directed acyclic graph to be evaluated, each update performed to obtain the embedding representation of that node proceeds as follows: if the node is of the NOT gate type, the update is performed by the first updater in the embedding representation module, whose update strategy is adapted to the propagation characteristics of inverting logic; if the node is of the AND-LIKE gate type, the update is performed by the second updater in the embedding representation module, whose update strategy is adapted to the propagation characteristics of non-inverting logic.

[0010] Optionally, the method further includes: obtaining a first sample directed acyclic graph of a first sample gate-level netlist, and constructing a gate-level feature representation for each node in the first sample directed acyclic graph; generating a ranking label for each pair of nodes in the first sample directed acyclic graph based on the gate-level failure rate labels of the logic gates or flip-flops corresponding to the nodes; processing the first sample directed acyclic graph and the gate-level feature representation of each of its nodes through the embedding representation module in the graph neural network to be trained, to obtain the embedding representation of each node in the first sample directed acyclic graph; processing the embedding representations of each pair of nodes in the first sample directed acyclic graph through the ranking module in the graph neural network to be trained, to obtain the ranking prediction result of the pair of nodes; determining a pairwise ranking loss value based on the ranking prediction result and the ranking label of the pair of nodes; and updating the model parameters of the embedding representation module and the ranking module to be trained based on the pairwise ranking loss value, to obtain the trained embedding representation module and the trained ranking module.
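
The pairwise label generation and loss in [0010] can be illustrated with a short sketch. The text does not give the exact loss, so a RankNet-style logistic loss on the score difference is assumed here; skipping tied pairs is likewise an assumption, and `pairwise_labels` and `pairwise_ranking_loss` are hypothetical names.

```python
import math
from itertools import combinations

def pairwise_labels(fit):
    """fit: {node: FIT label}. Label 1.0 means the first node of the pair is
    riskier, 0.0 means the second is. Tied pairs are skipped (assumption)."""
    labels = {}
    for i, j in combinations(sorted(fit), 2):
        if fit[i] != fit[j]:
            labels[(i, j)] = 1.0 if fit[i] > fit[j] else 0.0
    return labels

def pairwise_ranking_loss(scores, labels):
    """Mean binary cross-entropy on sigmoid(score_i - score_j)."""
    if not labels:
        return 0.0
    total = 0.0
    for (i, j), y in labels.items():
        diff = scores[i] - scores[j]
        # Stable log(1 + exp(-diff)); adding (1 - y) * diff covers y = 0.
        softplus = max(-diff, 0.0) + math.log1p(math.exp(-abs(diff)))
        total += softplus + (1.0 - y) * diff
    return total / len(labels)

fit_labels = {"g1": 5.0, "g2": 0.1, "g3": 0.1}  # long-tailed toy labels
labels = pairwise_labels(fit_labels)

good = pairwise_ranking_loss({"g1": 3.0, "g2": 0.0, "g3": 0.0}, labels)
bad = pairwise_ranking_loss({"g1": -3.0, "g2": 0.0, "g3": 0.0}, labels)
flat = pairwise_ranking_loss({"g1": 0.0, "g2": 0.0, "g3": 0.0}, labels)
```

Because the loss depends only on score differences, a model that outputs the same score for every gate is penalized on every labeled pair, which is one way to see why relative-order training resists collapse to a constant.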

[0011] Optionally, after obtaining the trained embedding representation module and the trained ranking module, the method further includes: obtaining a second sample directed acyclic graph of a second sample gate-level netlist, and constructing a gate-level feature representation for each node in the second sample directed acyclic graph; processing the second sample directed acyclic graph and the gate-level feature representation of each of its nodes through the trained embedding representation module, to obtain the embedding representation of each node in the second sample directed acyclic graph; processing the embedding representation of each node in the second sample directed acyclic graph through the trained ranking module, to obtain the risk ranking score of each node; selecting, based on the risk ranking scores, the top k high-risk gates or high-risk flip-flops from the logic gates or flip-flops corresponding to the nodes in the second sample directed acyclic graph; processing the embedding representations of the top k nodes corresponding to the top k high-risk gates or high-risk flip-flops through the calibration module to be trained in the graph neural network, to obtain the gate-level failure rate of each of the top k high-risk gates or high-risk flip-flops; and updating the model parameters of the trained embedding representation module and the calibration module to be trained, based on the obtained gate-level failure rates and the gate-level failure rate labels of the top k high-risk gates or high-risk flip-flops in the second sample directed acyclic graph, to obtain the trained embedding representation module and the trained calibration module. The pre-trained graph neural network for gate-level failure rate prediction includes the trained embedding representation module, the trained ranking module, and the trained calibration module.

[0012] Optionally, after obtaining the gate-level failure rate of each of the top k high-risk gates or high-risk flip-flops, the method further includes: hardening the top k high-risk gates or high-risk flip-flops one by one in descending order of gate-level failure rate; after each hardening operation, checking whether the total failure rate of the circuit to be evaluated, which corresponds to the gate-level netlist to be evaluated, is lower than the target threshold; and, if the total failure rate is not lower than the target threshold, continuing to harden the next high-risk gate or high-risk flip-flop until the total failure rate of the circuit to be evaluated is lower than the target threshold.
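
The hardening loop in [0012] can be sketched as below. How the total failure rate is re-evaluated after each hardening operation is not specified in the text; this sketch assumes the total simply drops by the hardened gate's predicted FIT, scaled by a hypothetical `residual` fraction that remains after hardening. The function name `selective_hardening` is likewise illustrative.

```python
def selective_hardening(predicted_fit, total_fit, target, residual=0.0):
    """Harden gates in descending order of predicted FIT until the total
    FIT falls below the target or the candidate list is exhausted.

    predicted_fit: {gate: calibrated FIT} for the top k candidates.
    residual: assumed fraction of a gate's FIT left after hardening.
    """
    hardened = []
    ranked = sorted(predicted_fit.items(), key=lambda kv: kv[1], reverse=True)
    for gate, fit in ranked:
        if total_fit < target:
            break  # target reached, stop hardening
        total_fit -= fit * (1.0 - residual)  # assumed additive FIT model
        hardened.append(gate)
    return hardened, total_fit

candidates = {"g1": 40.0, "g2": 25.0, "g3": 5.0}
hardened, remaining = selective_hardening(candidates, total_fit=100.0, target=50.0)
```

If the candidates run out before the target is met, the function returns with `remaining` still at or above `target`, so the caller can detect that a larger k is needed.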

[0013] A second aspect of this application discloses an apparatus for predicting gate-level failure rates based on graph neural networks, the apparatus comprising: a feature extraction module, used to obtain the directed acyclic graph to be evaluated of the gate-level netlist to be evaluated, and to construct a gate-level feature representation for each node in the directed acyclic graph to be evaluated; an embedding representation module, used to process the directed acyclic graph to be evaluated and the gate-level feature representation of each of its nodes, to obtain the embedding representation of each node in the directed acyclic graph to be evaluated; a ranking module, used to process the embedding representation of each node in the directed acyclic graph to be evaluated to obtain the risk ranking score of each node, where the higher the risk ranking score of a node, the higher the hardening priority of the logic gate or flip-flop corresponding to the node, and, based on the risk ranking score of each node, to select the top k high-risk gates or high-risk flip-flops from the logic gates or flip-flops corresponding to the nodes, where k is an integer greater than zero; and a calibration module, used to process the embedding representations of the top k nodes corresponding to the top k high-risk gates or high-risk flip-flops, to obtain the gate-level failure rate of each of the top k high-risk gates or high-risk flip-flops.

[0014] A third aspect of this application discloses an electronic device, including a memory, a processor, and a computer program stored in the memory and executable on the processor. When the processor executes the computer program, it implements the steps of the method for predicting gate-level failure rates based on graph neural networks described in the first aspect of this application.

[0015] A fourth aspect of this application discloses a readable storage medium storing a computer program thereon, which, when executed by a processor, implements the steps of the method for predicting gate-level failure rates based on graph neural networks as described in the first aspect of this application.

[0016] A fifth aspect of this application discloses a computer program product, including a computer program that, when executed by a processor, implements the steps of the method for predicting gate-level failure rates based on graph neural networks as described in the first aspect of this application.

[0017] The embodiments of this application have the following advantages: In this embodiment, by abstracting the gate-level netlist to be evaluated into a directed acyclic graph and constructing the gate-level feature representation of the nodes, the topological and electrical properties of the circuit itself can be fully utilized for characterization, providing a structured input basis for subsequent reliability analysis based on graph neural networks, and enhancing the physical interpretability and information integrity of the feature representation.

[0018] By processing the directed acyclic graph to be evaluated and the gate-level feature representation of each of its nodes through the embedding representation module in the pre-trained graph neural network for gate-level failure rate prediction, complex dependencies and propagation effects between gate-level units can be modeled automatically and efficiently. This avoids the tedious manual modeling and numerical integration of traditional physical simulation, improves evaluation efficiency, and preserves the model's ability to capture the functional and structural semantics of the circuit.

[0019] Furthermore, the ranking and calibration modules are decoupled: the risk ranking scores of the nodes in the directed acyclic graph to be evaluated are obtained first to identify the few key high-risk units, and only then are the failure rates of the selected high-risk gates or high-risk flip-flops finely calibrated. This alleviates the training and prediction bias caused by the extreme long-tail distribution of gate-level failure rates and reduces unnecessary computational overhead, so that key risk gates are accurately identified and quantified while the overall evaluation speed is maintained.

[0020] The gate-level failure rate of each high-risk gate or high-risk flip-flop output by this method can be used directly to guide selective hardening decisions. Provided the target reliability index is met, it reduces the number of units that need to be hardened, thereby saving area, power consumption and timing overhead, and improving the reliability and economy of integrated circuit design.

Brief Description of the Drawings

[0021] To more clearly illustrate the technical solutions of the embodiments of this application, the drawings used in the description of the embodiments of this application will be briefly introduced below. Obviously, the drawings described below are only some embodiments of this application. For those skilled in the art, other drawings can be obtained based on these drawings without creative effort.

[0022] Figure 1 is a flowchart of the steps of a method for predicting gate-level failure rates based on graph neural networks provided in an embodiment of this application;
Figure 2 is an overall architecture diagram of a method for predicting gate-level failure rates based on graph neural networks provided in an embodiment of this application;
Figure 3 is a flowchart of the training process of a graph neural network for gate-level failure rate prediction provided in an embodiment of this application;
Figure 4 is an overall flowchart of a method for predicting gate-level failure rates based on graph neural networks provided in an embodiment of this application;
Figure 5 is a schematic structural diagram of an apparatus for predicting gate-level failure rates based on graph neural networks provided in an embodiment of this application;
Figure 6 is a schematic structural diagram of an electronic device provided in an embodiment of this application.

Detailed Description

[0023] To make the above-mentioned objectives, features, and advantages of this application more apparent and understandable, the technical solutions in the embodiments of this application will be clearly and completely described below with reference to the accompanying drawings. Obviously, the described embodiments are only some embodiments of this application, not all embodiments. Based on the embodiments in this application, all other embodiments obtained by those skilled in the art without creative effort are within the scope of protection of this application.

[0024] To assess the gate-level failure rate (FIT) in order to guide reliability design, related technologies mainly fall into two categories: physical simulation and deep learning. Physical simulation methods (such as BFIT) model the charge and timing of particle bombardment in detail, simulate pulse propagation and masking effects within the circuit topology, and finally calculate FIT through numerical integration. While this approach can achieve high-precision evaluation results, its calculation involves multiple nested path enumerations and numerical integrations, so its time complexity grows non-linearly with circuit size. This makes it difficult to complete an evaluation within seconds on large-scale gate-level netlists, and therefore it cannot support frequent reliability iterations and optimizations during the design process.

[0025] Deep learning methods (such as DeepGate2) abstract gate-level netlists into graph structures and utilize graph neural networks (GNNs) for end-to-end regression prediction, significantly improving evaluation efficiency. However, gate-level FIT data generally exhibits an extreme long-tail distribution, meaning that the vast majority of gates have extremely low risk, with only a few key gates contributing the main risk. Traditional regression training objectives are easily dominated by a large number of low-risk samples, leading to systematic bias and "prediction collapse" in the model's predictions of high-risk gates. This makes it difficult to generate reliable risk rankings, resulting in inefficient and costly subsequent hardening decisions.

[0026] In summary, the relevant technologies struggle to strike a good balance between assessment efficiency and the accuracy of key risk identification, limiting the practicality and cost-effectiveness of selective hardening technologies.

[0027] To overcome the limitations of related technologies, this application proposes a method, apparatus, device, and medium for gate-level failure rate prediction based on graph neural networks. The technical concept combines the way physical simulation models fault propagation (forward propagation and backward tracing) with the graph structure learning capability of graph neural networks. By designing a bidirectional message passing mechanism that follows the circuit topological order, the GNN can learn the complex semantics of gate-level fault propagation. Furthermore, a decoupled "ranking-calibration" prediction framework is adopted. First, to meet the need for critical gate screening under a long-tail distribution, the ranking module learns a relative risk ranking of all gates so as to stably and accurately identify the top k high-risk gates or high-risk flip-flops. Then, only for the selected high-risk gates or high-risk flip-flops, the calibration module is invoked to perform fine-grained numerical regression of the failure rate, yielding accurate FIT values that can be used for quantitative assessment and hardening decisions while keeping computational overhead under control. This concept integrates the interpretability priors of physical methods with the efficiency of data-driven methods, follows the decision logic from "identifying key targets" to "precise quantitative assessment" across the overall process, and ultimately achieves gate-level failure rate prediction that is fast, accurate, and able to guide low-overhead hardening.

[0028] Referring to Figure 1, which is a flowchart of the steps of a method for predicting gate-level failure rates based on graph neural networks provided in an embodiment of this application, the method may include steps S110 to S150:
Step S110: Obtain the directed acyclic graph to be evaluated from the gate-level netlist to be evaluated, and construct a gate-level feature representation for each node in the directed acyclic graph to be evaluated.

[0029] The gate-level netlist to be evaluated is a low-level description file of the circuit to be evaluated, consisting of logic gates (such as AND gates, OR gates, NOT gates) or flip-flops and their interconnections; it is usually generated by logic synthesis tools from Register Transfer Level (RTL) designs.

[0030] The directed acyclic graph (DAG) to be evaluated is a circuit graph data structure obtained by parsing and abstracting the gate-level netlist to be evaluated. In this DAG, each node corresponds to a logic gate or a flip-flop in the gate-level netlist (the circuit to be evaluated). A directed edge from a first node to a second node in the DAG represents either that a first logic gate in the gate-level netlist provides a driving signal for a second logic gate, or that a logic gate in the gate-level netlist provides a driving signal for a flip-flop.
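
The abstraction in [0030] can be illustrated with a small sketch that turns a toy netlist into directed edges and checks that the result is acyclic. The netlist format (a dictionary mapping each cell to its type and its drivers), the "PI" primary-input pseudo-cells, and the function names are assumptions made for illustration; a real flow would parse the output of a synthesis tool.

```python
from collections import defaultdict, deque

# Hypothetical toy netlist: cell -> (cell type, list of driving cells).
# "PI" marks a primary input and is an assumption of this sketch.
NETLIST = {
    "a": ("PI", []),
    "b": ("PI", []),
    "g1": ("AND", ["a", "b"]),
    "g2": ("NOT", ["g1"]),
    "ff1": ("DFF", ["g2"]),
}

def build_dag(netlist):
    """Return (edges, successors); an edge (u, v) means u drives v."""
    edges = [(src, dst) for dst, (_, fanin) in netlist.items() for src in fanin]
    successors = defaultdict(list)
    for src, dst in edges:
        successors[src].append(dst)
    return edges, successors

def topological_order(netlist, successors):
    """Kahn's algorithm; raises ValueError if the graph has a cycle."""
    indegree = {node: len(fanin) for node, (_, fanin) in netlist.items()}
    ready = deque(sorted(n for n, d in indegree.items() if d == 0))
    order = []
    while ready:
        node = ready.popleft()
        order.append(node)
        for nxt in successors[node]:
            indegree[nxt] -= 1
            if indegree[nxt] == 0:
                ready.append(nxt)
    if len(order) != len(netlist):
        raise ValueError("netlist graph contains a cycle")
    return order

edges, successors = build_dag(NETLIST)
order = topological_order(NETLIST, successors)
```

The topological order produced here is the order in which a forward message-passing sweep would visit the nodes.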

[0031] A node's gate-level feature representation includes at least: the type feature representation, the topological feature representation, and the structural feature representation of the logic gate or flip-flop corresponding to the node. It may further include timing, electrical, activity, and physical implementation feature representations related to failure rate prediction. Specifically, the type feature representation can identify the specific gate type through one-hot encoding; the topological feature representation can be the node's fan-in count (number of inputs), fan-out count (number of downstream nodes driven by the output), and the node's logic depth in the circuit; the structural feature representation includes other attributes related to the circuit structure; the timing feature representation can be arrival time and timing slack; the electrical feature representation can cover load and wiring as well as drive strength and cell electrical characteristics; the activity feature representation can be signal probability and vector correlation statistics; and the physical implementation feature representation can cover congestion and wiring as well as geometry and process proxies.
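
A minimal sketch of the type and topological features named in [0031] follows. The gate-type vocabulary, the feature ordering, and the definition of logic depth (longest path from a node with no drivers) are assumptions; the structural, timing, electrical, activity, and physical features are omitted because the text does not define them precisely.

```python
GATE_TYPES = ["AND", "OR", "NOT", "DFF"]  # assumed vocabulary

def one_hot(gate_type):
    """One-hot encoding of the gate type over GATE_TYPES."""
    return [1.0 if gate_type == t else 0.0 for t in GATE_TYPES]

def node_features(types, edges):
    """types: {node: gate type}; edges: list of (driver, driven) pairs.
    Returns {node: one-hot type + [fan-in, fan-out, logic depth]}."""
    fan_in = {n: 0 for n in types}
    fan_out = {n: 0 for n in types}
    preds = {n: [] for n in types}
    for src, dst in edges:
        fan_out[src] += 1
        fan_in[dst] += 1
        preds[dst].append(src)

    depth = {}

    def logic_depth(n):
        # Longest path from any node without drivers (memoized recursion;
        # a large circuit would need an iterative topological sweep instead).
        if n not in depth:
            depth[n] = 0 if not preds[n] else 1 + max(logic_depth(p) for p in preds[n])
        return depth[n]

    return {
        n: one_hot(types[n])
        + [float(fan_in[n]), float(fan_out[n]), float(logic_depth(n))]
        for n in types
    }

types = {"g1": "AND", "g2": "OR", "g3": "NOT", "ff1": "DFF"}
edges = [("g1", "g3"), ("g2", "g3"), ("g3", "ff1")]
features = node_features(types, edges)
```

Each feature vector here has seven entries: four for the one-hot type followed by fan-in, fan-out, and logic depth.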

[0032] Specifically, after obtaining the gate-level netlist to be evaluated, it is parsed and abstracted into a circuit graph data structure, namely the directed acyclic graph (DAG) to be evaluated. Subsequently, a gate-level feature representation is constructed for each node in the DAG. This transforms the physical and logical properties of the circuit into a numerical representation that can be processed by graph neural networks, providing information-rich and structured input for subsequent deep feature learning.

[0033] Step S120: The directed acyclic graph to be evaluated and the gate-level feature representations of each node in the directed acyclic graph to be evaluated are processed by the embedding representation module in the graph neural network for gate-level failure rate prediction, so as to obtain the embedding representations of each node in the directed acyclic graph to be evaluated.

[0034] The embedding representation module is a bidirectional asynchronous topological GNN backbone whose message propagation order strictly follows the circuit topology and is aligned with the physical propagation semantics of circuit faults. Processing the directed acyclic graph to be evaluated and the gate-level feature representations of its nodes through the embedding representation module is a bidirectional asynchronous topological message passing process that learns by simulating the propagation semantics of soft faults in the circuit. On the one hand, messages propagate forward along the circuit signal flow direction (i.e., the direction of the graph's directed edges) to simulate a single-event transient (SET) pulse spreading from the bombarded logic gate to downstream flip-flops. On the other hand, messages propagate backward, against the signal flow direction, to simulate tracing the sensitization path from the flip-flop end back to the source, in order to capture the dependency conditions for fault capture.
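
The two sweeps described in [0034] can be sketched as follows, assuming a topological order is already available. Scalar node states and a fixed averaging update stand in for the learned embeddings and learned update function; both are placeholders, and all function names are hypothetical. What the sketch does preserve is the ordering: each node is updated from predecessors that have already been updated in the same sweep.

```python
def phase_predecessors(nodes, edges):
    """For an edge (u, v): u feeds v in the forward phase,
    and v feeds u in the backward phase."""
    fwd = {n: [] for n in nodes}
    bwd = {n: [] for n in nodes}
    for src, dst in edges:
        fwd[dst].append(src)
        bwd[src].append(dst)
    return fwd, bwd

def sweep(order, preds, state):
    """Visit nodes in the given order; each node blends its own state with
    the mean of its already-updated predecessors (placeholder update)."""
    new_state = dict(state)
    for n in order:
        if preds[n]:
            incoming = sum(new_state[p] for p in preds[n]) / len(preds[n])
            new_state[n] = 0.5 * new_state[n] + 0.5 * incoming
    return new_state

def bidirectional_pass(topo_order, edges, init_state):
    fwd, bwd = phase_predecessors(topo_order, edges)
    after_forward = sweep(topo_order, fwd, init_state)            # gate -> flip-flop
    return sweep(list(reversed(topo_order)), bwd, after_forward)  # flip-flop -> gate

topo = ["g1", "g2", "ff1"]
chain = [("g1", "g2"), ("g2", "ff1")]
final = bidirectional_pass(topo, chain, {"g1": 1.0, "g2": 0.0, "ff1": 0.0})
```

In this three-node chain the forward sweep carries information from `g1` down to `ff1`, and the backward sweep then carries the flip-flop's updated state back toward `g1`.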

[0035] Furthermore, in this process, the embedding representation module can use a configurable family of message aggregators to fuse neighbor messages from different fan-in paths and assign differentiated weights to those paths, simulating the differing impact of each fan-in path on faults in the circuit. The message aggregators can be selected or combined by an aggregation function factory, and the selected aggregators can include one of the following: MLPAggr based on a multilayer perceptron, AttnMLP with attention weighting, TFMLP based on a Transformer structure, AGNNConv based on graph attention convolution, AggConv based on graph convolution summation, and DeepSetConv, which satisfies set permutation invariance. Among them, aggregators using attention or learnable weighting highlight the fan-in paths that are more sensitive to fault propagation, while convolutional or set-based aggregators strengthen the characterization of local structural statistics and neighborhood distribution features. Further, for gates with different logical functions, gated recurrent units with differentiated parameters (such as GRU_NOT and GRU_AND) are used to update the node state, characterizing the impact of logical function on fault propagation at a finer granularity. Finally, the embedding representation module outputs an embedding representation for each node in the directed acyclic graph to be evaluated. This embedding representation deeply integrates the node's own characteristics with its semantic information in the context of fault propagation across the entire circuit.
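
The factory-plus-typed-updater structure in [0035] can be illustrated with a deliberately simplified sketch. The `sum` and `mean` aggregators are stand-ins and do not implement MLPAggr, AttnMLP, or the other named aggregators; the per-type blend factors are a hypothetical substitute for the separately parameterized GRU_NOT and GRU_AND units. Only the dispatch pattern is the point.

```python
def sum_aggr(messages):
    """Element-wise sum over fan-in messages."""
    return [sum(col) for col in zip(*messages)]

def mean_aggr(messages):
    """Element-wise mean over fan-in messages."""
    return [sum(col) / len(messages) for col in zip(*messages)]

# Registry of stand-in aggregators, keyed by a configuration name.
AGGREGATORS = {"sum": sum_aggr, "mean": mean_aggr}

def make_aggregator(name):
    """Aggregation function factory: look up an aggregator by name."""
    try:
        return AGGREGATORS[name]
    except KeyError:
        raise ValueError(f"unknown aggregator: {name}") from None

# Hypothetical per-gate-type parameters; in a learned model each gate type
# would own a full set of recurrent-unit weights instead of one scalar.
UPDATE_GATE = {"NOT": 0.9, "AND": 0.5}

def update_state(gate_type, state, message):
    """Blend the old state with the aggregated message using the
    parameter that belongs to this gate type."""
    z = UPDATE_GATE[gate_type]
    return [(1.0 - z) * s + z * m for s, m in zip(state, message)]

aggregate = make_aggregator("mean")
message = aggregate([[1.0, 2.0], [3.0, 4.0]])
new_state = update_state("AND", [0.0, 1.0], message)
```

Keeping the aggregator choice behind a factory lets a configuration file switch between aggregator families without touching the message-passing loop.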

[0036] Step S130: The embedding representation of each node in the directed acyclic graph to be evaluated is processed by the ranking module in the graph neural network used for gate-level failure rate prediction to obtain the risk ranking score of each node in the directed acyclic graph to be evaluated; the higher the risk ranking score of a node, the higher the priority of hardening the logic gate or flip-flop corresponding to the node.

[0037] The ranking module can be a ranking network (RankNet) composed of a multilayer perceptron (MLP), and a learnable scaling factor can be introduced to scale the output score (with an upper limit) to reduce score ties and stabilize Top-K selection. The task of the ranking module is not to directly predict the precise FIT value, but to learn to output a risk ranking score for each node.

[0038] The risk ranking score of a node reflects the relative failure risk level of the corresponding logic gate or flip-flop compared with the other nodes in the directed acyclic graph to be evaluated. A higher score indicates a greater potential contribution of the node to the overall failure probability of the system, and therefore a higher priority for the corresponding logic gate or flip-flop in subsequent selective hardening.

[0039] This step addresses the problem of identifying critical gates under the long-tailed distribution of gate-level FIT. Specifically, the embedding representations of each node in the directed acyclic graph to be evaluated obtained in step S120 are input into the ranking module of the graph neural network used for gate-level failure rate prediction, which outputs the risk ranking score of each node based on its embedding representation. This design shifts the optimization focus of the network from absolute value regression, which is difficult to fit, to relative order learning, which is more stable. The network is forced to learn to distinguish high-risk gates from low-risk gates, avoiding the "prediction collapse" problem (i.e., all predicted values converging to a constant) that occurs when regressing long-tailed data directly.
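The relative order learning described above can be illustrated with a pairwise logistic loss in the style of RankNet. This is only a minimal sketch under stated assumptions, not the training code of this application: the function name `pairwise_rank_loss` and the specific logistic form are hypothetical choices made for illustration.

```python
import math

def pairwise_rank_loss(scores, fit_labels):
    """Mean logistic loss over all node pairs whose FIT labels differ.

    For a pair (i, j) with fit_labels[i] > fit_labels[j], the loss term is
    log(1 + exp(-(scores[i] - scores[j]))), which is small when node i is
    scored above node j. (Illustrative form; an assumption of this sketch.)
    """
    total, pairs = 0.0, 0
    n = len(scores)
    for i in range(n):
        for j in range(n):
            if fit_labels[i] > fit_labels[j]:
                margin = scores[i] - scores[j]
                # Numerically stable softplus(-margin).
                total += max(-margin, 0.0) + math.log1p(math.exp(-abs(margin)))
                pairs += 1
    return total / pairs if pairs else 0.0
```

Because only score differences enter the loss, labels that span several orders of magnitude influence training through their order alone, which is the property the paragraph above relies on.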

[0040] Step S140: Based on the risk ranking scores of each node in the directed acyclic graph to be evaluated, select the top k high-risk gates or high-risk flip-flops from the logic gates or flip-flops corresponding to the nodes in the directed acyclic graph to be evaluated, where k is an integer greater than zero.

[0041] Specifically, based on the risk ranking scores of each node in the directed acyclic graph to be evaluated, all nodes are sorted from highest to lowest risk ranking score. Then, according to a preset parameter k (k can be a fixed number or a percentage of the total number of nodes, such as Top-k%), the top k nodes are selected. The logic gates or flip-flops corresponding to these k nodes are determined to be the high-risk gates or high-risk flip-flops most important to consider in the circuit to be evaluated. In this way, the screening operation in this step focuses subsequent computational resources on the few key units (logic gates or flip-flops) that have the greatest impact on the system reliability.
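The screening in step S140 can be sketched as follows. The function name `select_top_k`, the dictionary input, and the convention that a float k denotes a fraction of all nodes are assumptions made for illustration; the text only states that k may be a fixed number or a percentage.

```python
import math

def select_top_k(scores, k):
    """Return the ids of the k highest-scoring nodes, highest first.

    scores: dict mapping node id -> risk ranking score.
    k: an int (absolute count) or a float in (0, 1] (fraction of all nodes).
    """
    if isinstance(k, float):
        if not 0.0 < k <= 1.0:
            raise ValueError("fractional k must be in (0, 1]")
        count = max(1, math.ceil(k * len(scores)))
    else:
        if k <= 0:
            raise ValueError("k must be greater than zero")
        count = min(k, len(scores))
    # Sort by descending score; break ties by node id so the result is deterministic.
    ranked = sorted(scores, key=lambda node: (-scores[node], node))
    return ranked[:count]
```

For example, `select_top_k(scores, 0.05)` would keep the top 5% of nodes, while `select_top_k(scores, 200)` would keep a fixed 200.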

[0042] Step S150: The calibration module in the graph neural network used for gate-level failure rate prediction processes the embedding representations of the top k nodes corresponding to the top k high-risk gates or high-risk flip-flops to obtain the gate-level failure rate of each of the top k high-risk gates or high-risk flip-flops.

[0043] The calibration module is typically a calibration network (CalibNet) composed of another multilayer perceptron (MLP). Specifically, the embedding representations of the top k nodes selected in step S140 are input into the calibration module in the graph neural network used for gate-level failure rate prediction. This module outputs a physically meaningful gate-level failure rate prediction value (e.g., in FIT, i.e., the number of failures per billion device-hours) for each high-risk gate or high-risk flip-flop.

[0044] Since the calibration module only needs to process the small number of key nodes that remain after ranking and screening, it can use a more complex structure or a targeted loss function (such as a dual-domain loss combining the linear and logarithmic domains) to ensure accurate prediction of FIT values that may span multiple orders of magnitude, without incurring the unnecessary computational overhead or accuracy loss that would come from processing a massive number of low-risk nodes.
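One possible reading of a dual-domain loss is sketched below. The text does not give its exact form, so the absolute-error terms, the use of `log1p` for the logarithmic domain, and the `log_weight` parameter are all assumptions of this sketch.

```python
import math

def dual_domain_loss(pred_fit, true_fit, log_weight=1.0):
    """Mean of a linear-domain error plus a weighted log-domain error.

    pred_fit, true_fit: equal-length sequences of non-negative FIT values.
    The linear term keeps large FIT values accurate in absolute terms; the
    log1p term keeps small FIT values from being ignored.
    """
    if len(pred_fit) != len(true_fit) or not pred_fit:
        raise ValueError("inputs must be non-empty and of equal length")
    total = 0.0
    for p, t in zip(pred_fit, true_fit):
        linear_err = abs(p - t)
        log_err = abs(math.log1p(p) - math.log1p(t))
        total += linear_err + log_weight * log_err
    return total / len(pred_fit)
```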

[0045] The technical solution adopted in this application abstracts the gate-level netlist to be evaluated into a directed acyclic graph and constructs gate-level feature representations of the nodes. This makes full use of the circuit's own topological and electrical properties for characterization, provides a structured input foundation for subsequent reliability analysis based on graph neural networks, and enhances the physical interpretability and information integrity of the feature representation. The embedding representation module in the graph neural network used for gate-level failure rate prediction processes the directed acyclic graph to be evaluated together with the gate-level feature representations of its nodes. This achieves automated and efficient modeling of the complex dependencies and propagation effects between gate-level units, avoids the tedious manual modeling and numerical integration of traditional physical simulation, improves evaluation efficiency, and ensures that the model can capture the functional and structural semantics of the circuit. Furthermore, by decoupling the ranking and calibration modules, the risk ranking scores of each node in the directed acyclic graph to be evaluated are first obtained to identify a few key high-risk units, and the selected high-risk gates or high-risk flip-flops are then subjected to refined failure rate calibration. This not only alleviates the training and prediction bias caused by the extreme long-tailed distribution of gate-level failure rates, but also reduces unnecessary computational overhead, achieving accurate identification and quantification of key risk gates while maintaining overall evaluation speed. The gate-level failure rate of each high-risk gate or high-risk flip-flop output by this method can be used directly to guide selective hardening decisions. Provided that the target reliability indicators are met, this reduces the number of units that need to be hardened, thereby saving area, power consumption, and timing overhead, and improving the reliability and economy of integrated circuit design.

[0046] In an optional embodiment, after obtaining the gate-level failure rate of each of the top k high-risk gates or high-risk flip-flops, the method further includes steps S160 to S180: Step S160: In descending order of gate-level failure rate, perform hardening operations on the top k high-risk gates or high-risk flip-flops one by one.

[0047] Hardening refers to applying specific fault-tolerant designs to high-risk gates or flip-flops to significantly reduce their soft-error sensitivity. Common measures include triple modular redundancy (TMR), which uses three identical units to perform computations in parallel and outputs the result via a voter; hardened latches (such as the DICE structure); and inserting error detection and correction logic (such as parity checks or checking flip-flops) into the critical path. Each hardening operation aims to eliminate or significantly reduce the contribution of that specific unit to the overall system failure rate.

[0048] Step S170: After each hardening operation on a high-risk gate or high-risk flip-flop, check whether the total failure rate of the circuit to be evaluated corresponding to the gate-level netlist to be evaluated is lower than the target threshold.

[0049] The total failure rate can be determined from the updated circuit netlist. Specifically, hardened high-risk gates or high-risk flip-flops are removed from the prediction list or their contribution is set to zero. Then, the predicted failure rates of the remaining high-risk gates or high-risk flip-flops are re-accumulated to obtain the total failure rate after hardening. Finally, this total failure rate is compared with a preset target threshold, which is derived from the functional safety requirements of the specific application scenario. For example, for ASIL-D level components conforming to the ISO 26262 standard, the target threshold may be FIT<10 (i.e., fewer than 10 failures per billion device-hours).

[0050] Step S180: If the total failure rate of the circuit to be evaluated is not lower than the target threshold, continue to harden the next high-risk gate or high-risk flip-flop, one at a time, until the total failure rate of the circuit to be evaluated is lower than the target threshold.

[0051] If the total failure rate is not lower than the target threshold, the current hardening is insufficient. The process then returns to step S160 and hardens the next high-risk gate or high-risk flip-flop in the predetermined order, before re-entering step S170 for evaluation. This hardening-evaluation loop continues until an evaluation finds that the total failure rate of the circuit is lower than the target threshold. At that point, the loop terminates immediately, and the hardening process is complete.
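The loop of steps S160 to S180 can be sketched as follows. This is an illustrative simplification, not the implementation of this application: the name `harden_until_target` is hypothetical, `baseline_fit` (the aggregate FIT of all cells outside the top-k set) is an assumed input, and hardening is idealized as setting a cell's contribution to zero.

```python
def harden_until_target(predicted_fit, baseline_fit, target_fit):
    """Harden cells in descending FIT order until the total FIT is below target.

    predicted_fit: dict mapping cell id -> calibrated FIT of each top-k cell.
    baseline_fit: assumed aggregate FIT of all cells outside the top-k set.
    Returns (list of hardened cell ids in order, final total FIT).
    """
    order = sorted(predicted_fit, key=lambda cell: (-predicted_fit[cell], cell))
    remaining = dict(predicted_fit)
    hardened = []
    total = baseline_fit + sum(remaining.values())
    for cell in order:
        # S160: harden the next cell (idealized as removing its contribution).
        remaining[cell] = 0.0
        hardened.append(cell)
        # S170: re-accumulate the total failure rate and compare with the target.
        total = baseline_fit + sum(remaining.values())
        if total < target_fit:
            break  # target met, loop terminates
        # S180: otherwise continue with the next cell.
    return hardened, total
```

If every top-k cell has been hardened and the returned total is still not below the target, the caller would need to enlarge k or revisit the target; the sketch leaves that decision to the caller.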

[0052] The technical solution adopted in this application, based on the accurate and quantitative risk ranking and prediction provided by graph neural networks, ensures that the hardening operation strictly follows the principle of prioritizing the highest risk, guaranteeing that each hardening investment achieves the maximum reliability benefit (i.e., the highest risk reduction), thus solving the problem of random hardening or over-hardening caused by inaccurate ranking in related methods. Furthermore, the automated "prediction-screening-hardening-verification" cycle replaces the traditional manual iterative mode that relies on engineer experience and repeated time-consuming simulations, improving the efficiency and operability of reliability design optimization. In addition, the entire process terminates with the achievement of objective and quantitative target thresholds, providing a clear and reliable decision-making basis for the design and ensuring that the design meets stringent industry safety standards.

[0053] As shown in Figure 2, which is an overall architecture diagram of the gate-level failure rate prediction method based on a graph neural network provided in this embodiment of the application, the method uses a graph neural network for gate-level failure rate prediction (including an embedding representation module, a ranking module, and a calibration module) to achieve a complete closed loop from gate-level netlist parsing to hardening decision-making. The specific implementation process is as follows. First, the gate-level netlist to be evaluated (containing combinational logic gates and sequential units) is parsed and abstracted into a directed acyclic graph to be evaluated, in which each node represents a logic gate or a flip-flop and each edge represents a signal-driving relationship. Gate-level feature representations are then constructed for each node in the directed acyclic graph to be evaluated, so as to encode the logic, structure, and topology attributes of the circuit into a numerical representation that can be processed by machine learning and to provide complete input information for subsequent graph learning.

[0054] Secondly, the embedding representation module processes the gate-level feature representations of each node in the directed acyclic graph to be evaluated and the directed acyclic graph to be evaluated based on the bidirectional asynchronous topology message passing process, so as to obtain the embedding representation of each node in the directed acyclic graph to be evaluated.

[0055] Then, the embedding representation of each node in the directed acyclic graph to be evaluated is processed by the ranking module to obtain the risk ranking score of each node in the directed acyclic graph to be evaluated.

[0056] Next, the top k high-risk gates or high-risk flip-flops are selected based on the risk ranking scores of each node. The embedding representations of the top k nodes corresponding to the top k high-risk gates or high-risk flip-flops are processed by the calibration module to obtain the gate-level failure rate of each of the top k high-risk gates or high-risk flip-flops.

[0057] Finally, in descending order of gate-level failure rate, the top k high-risk gates or high-risk flip-flops are hardened one by one. After each high-risk gate or high-risk flip-flop is hardened, it is checked whether the total failure rate of the circuit to be evaluated corresponding to the gate-level netlist is lower than the target threshold. If the total failure rate of the circuit to be evaluated is not lower than the target threshold, the next high-risk gate or high-risk flip-flop is hardened, and so on, until the total failure rate of the circuit to be evaluated is lower than the target threshold.

[0058] Thus, the method of this application achieves a complete design closed loop of "gate-level risk ranking - quantifiable hardening decision - effect prediction", which can effectively guide precise hardening with the goal of minimizing area, power consumption, and timing overhead while meeting the target failure rate threshold.

[0059] In an optional embodiment, step S110 above, "constructing gate-level feature representations for each node in the directed acyclic graph to be evaluated," may include steps S110-1 to S110-3: Step S110-1: For each node in the directed acyclic graph to be evaluated, determine the type feature representation of the logic gate or flip-flop corresponding to the node according to the type of the node.

[0060] This step is used to encode the basic functional identity of the nodes. For each node in the directed acyclic graph to be evaluated, a type feature representation is generated based on the specific type of its corresponding circuit unit (e.g., a two-input NAND gate, an inverter, or a D flip-flop). This is typically achieved through one-hot encoding or similar methods, forming a type feature representation of the logic gate or flip-flop corresponding to the node. This representation indicates the node's basic logic or storage function, providing the model with the most fundamental functional category information.

[0061] Step S110-2: For each node in the directed acyclic graph to be evaluated, determine the topological feature representation of the logic gate or flip-flop corresponding to the node based on the topological information of the node.

[0062] This step is used to capture the location and connectivity attributes of nodes within the global circuit structure. Based on the node's topology information, a topology feature representation is calculated and generated. Topology feature representations typically include, but are not limited to: the node's fan-in number (the number of inputs), fan-out number (the number of successor nodes driven by the output), and logic depth (the number of gate levels on the longest path from the original input to the node). In addition, it may include the node's relative position in the topology sequence, whether it is located in a feedback loop, and its position in the clock domain or hierarchical design. These features characterize the node's coordinates and connectivity within the signal flow network, which is crucial for assessing the extent and likelihood of fault propagation.
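The three basic topological features named above (fan-in, fan-out, and logic depth) can be computed in a single topological sweep. The sketch below is illustrative only; the function name `topological_features` and the convention that source nodes (nodes without predecessors) have depth 0 are assumptions.

```python
from collections import defaultdict, deque

def topological_features(nodes, edges):
    """Return {node: (fan_in, fan_out, logic_depth)} for a DAG.

    edges: iterable of (driver, load) pairs.
    logic_depth: number of edges on the longest path from any source node.
    """
    preds, succs = defaultdict(list), defaultdict(list)
    for u, v in edges:
        succs[u].append(v)
        preds[v].append(u)
    indegree = {n: len(preds[n]) for n in nodes}
    depth = {n: 0 for n in nodes}
    queue = deque(n for n in nodes if indegree[n] == 0)
    visited = 0
    while queue:
        u = queue.popleft()
        visited += 1
        for v in succs[u]:
            # The longest path to v goes through its deepest predecessor.
            depth[v] = max(depth[v], depth[u] + 1)
            indegree[v] -= 1
            if indegree[v] == 0:
                queue.append(v)
    if visited != len(nodes):
        raise ValueError("graph contains a cycle")
    return {n: (len(preds[n]), len(succs[n]), depth[n]) for n in nodes}
```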

[0063] Step S110-3: For each node in the directed acyclic graph to be evaluated, determine the structural feature representation of the logic gate or flip-flop corresponding to the node based on the structural information of the node.

[0064] This step incorporates other physical or design attributes relevant to circuit performance or reliability analysis. Based on the node's structural information, structural feature representations are extracted. These representations may include cell drive strength, load capacitance, cell area, equivalent output resistance and input pin capacitance, threshold voltage levels, or timing arc characteristics associated with the node. These features introduce more detailed electrical and timing attributes into the model, aiding in the simulation of the electrical attenuation of pulses (electrical masking) and of timing window effects (timing masking).

[0065] In this way, by integrating three types of features (type, topology, and structure), an information-rich initial representation is constructed for each node, ensuring that the graph neural network can acquire the diverse information foundation required for reliability assessment.
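The integration of the three feature types can be sketched as a simple concatenation. The cell-type vocabulary `CELL_TYPES`, the helper names, and the ordering of the concatenated vector are hypothetical; the text only states that the type feature is typically a one-hot encoding or similar.

```python
# Hypothetical cell-type vocabulary; a real library would define its own.
CELL_TYPES = ["INPUT", "NOT", "AND", "NAND", "OR", "NOR", "DFF"]

def one_hot(cell_type, vocabulary=CELL_TYPES):
    """Type feature: 1.0 at the position of cell_type, 0.0 elsewhere."""
    if cell_type not in vocabulary:
        raise ValueError(f"unknown cell type: {cell_type}")
    return [1.0 if cell_type == t else 0.0 for t in vocabulary]

def node_features(cell_type, topo, struct):
    """Concatenate type, topological, and structural features into one vector.

    topo: e.g. (fan_in, fan_out, logic_depth).
    struct: e.g. (drive_strength, load_capacitance, ...), units left to the caller.
    """
    return one_hot(cell_type) + [float(x) for x in topo] + [float(x) for x in struct]
```

In practice the topological and structural values would likely be normalized before concatenation so that no single feature dominates; that step is omitted here.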

[0066] In an optional embodiment, step S120 above, "processing the directed acyclic graph to be evaluated and the gate-level feature representations of each node in the directed acyclic graph to be evaluated to obtain the embedded representations of each node in the directed acyclic graph to be evaluated," may include steps S120-1 to S120-3: Step S120-1: Based on the directed edges from the first node to the second node in the directed acyclic graph to be evaluated, determine that the predecessor node of the second node is the first node in the forward propagation phase, and determine that the second node is the predecessor node of the first node in the backward propagation phase.

[0067] Specifically, during the forward propagation phase, the normal flow of circuit signals is followed. For a directed edge (representing a signal driving relationship) from the first node (driving unit) to the second node (load unit), the first node is defined as the predecessor node of the second node when simulating fault forward propagation. This establishes the basic path for information to flow from the driving unit to the load unit, corresponding to the forward propagation direction of the transient pulse generated by particle bombardment along the logic depth of the circuit.

[0068] During the backward propagation phase, the direction opposite to the signal flow is followed, i.e., the logic backtracking direction. For the same directed edge, when simulating fault backtracking, the second node is defined as the predecessor node of the first node. This establishes an information feedback path from the downstream unit to the source unit, corresponding to the analysis direction in which the sensitization path is traced back from the flip-flop (the error capture point) to the upstream logic gate that caused the fault.

[0069] Thus, this step transforms the inherently unidirectional signal flow of the circuit into two directed adjacency relationships that the GNN uses to learn bidirectional dependencies. Defining two different topological orders for message passing lays the foundation for subsequently simulating the bidirectional semantics of fault propagation in the circuit.
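The two adjacency relationships and their topological orders can be sketched as follows. The function names are hypothetical, and Kahn's algorithm is one standard choice for the ordering rather than a method named by this application.

```python
from collections import deque

def predecessor_maps(nodes, edges):
    """Build the forward-phase and backward-phase predecessor lists.

    For an edge (u, v): u precedes v in the forward phase,
    and v precedes u in the backward phase.
    """
    fwd = {n: [] for n in nodes}
    bwd = {n: [] for n in nodes}
    for u, v in edges:
        fwd[v].append(u)
        bwd[u].append(v)
    return fwd, bwd

def topological_order(nodes, preds):
    """Kahn's algorithm: every node appears after all of its predecessors."""
    remaining = {n: len(preds[n]) for n in nodes}
    succs = {n: [] for n in nodes}
    for n in nodes:
        for p in preds[n]:
            succs[p].append(n)
    queue = deque(n for n in nodes if remaining[n] == 0)
    order = []
    while queue:
        u = queue.popleft()
        order.append(u)
        for v in succs[u]:
            remaining[v] -= 1
            if remaining[v] == 0:
                queue.append(v)
    if len(order) != len(nodes):
        raise ValueError("graph contains a cycle")
    return order
```

Calling `topological_order` once with each predecessor map yields the forward order (inputs toward flip-flops) and the backward order (flip-flops toward inputs).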

[0070] Step S120-2: For each node in the directed acyclic graph to be evaluated, with the goal of simulating the forward propagation process of the fault from the logic gate to each flip-flop, the embedding representation of the node is initially updated based on the embedding representation of the predecessor node of the node in the forward propagation stage.

[0071] This step simulates the propagation of fault pulses from their source into the depths of the circuit (especially the sequential units). Specifically, each node in the directed acyclic graph to be evaluated is visited in its topological order for the forward propagation phase (i.e., after all of its input signals have been processed), the embedding representations of all its predecessor nodes are aggregated, and the aggregated result is combined with the node's own embedding representation to update it. The updated embedding representation includes the fault propagation characteristics of its upstream logic, propagated along the signal flow direction.

[0072] Step S120-3: For each node in the directed acyclic graph to be evaluated, with the goal of simulating the backward tracing of the fault from the flip-flops back to each logic gate, the embedding representation of the node is updated again based on the initially updated embedding representations of the predecessor nodes of the node in the backward propagation phase, so as to obtain the embedding representation of the node.

[0073] This step simulates tracing the scope of a fault's impact backward, starting from the flip-flops. Specifically, after the forward propagation is complete, each node in the directed acyclic graph to be evaluated is visited in its topological order for the backward propagation phase, and the initially updated embedding representations of all its predecessor nodes in this phase (the nodes that point to it in this phase) are aggregated. The embedding representation of the node is then updated again based on the aggregation result. This update incorporates sensitization path information from the flip-flops to the node, obtained through the backward tracing (e.g., whether the node is located on a sensitization path of a flip-flop).

[0074] The technical solution of this application structurally encodes the inherent two-stage analysis logic of forward pulse diffusion and backward path tracing in physical tools into the learning framework of a graph neural network. This enables the node embedding representation learned by the model to incorporate key dynamic features of fault propagation and capture in the circuit, enhancing the model's physical interpretability and ability to characterize reliability semantics, thus overcoming the shortcomings of traditional regression-based GNNs in this regard. Furthermore, the asynchronous and topologically ordered update method ensures that each node is updated only after the information of all its predecessor nodes has been calculated. This is completely consistent with the stable logical timing of circuit signals, avoiding information circular dependencies or transmission chaos, and guaranteeing the correctness and efficiency of the calculation process.
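The asynchronous, topologically ordered two-pass update can be sketched as follows. Mean aggregation and a fixed blending factor `mix` stand in for the learned aggregators and updaters and are assumptions of this sketch; the orders and predecessor maps are assumed to be precomputed.

```python
def propagate(order, preds, embeddings, mix=0.5):
    """One asynchronous sweep in topological order.

    Each node is updated from the already-updated embeddings of its
    predecessors, so information can travel the full depth in one sweep.
    """
    state = {n: list(vec) for n, vec in embeddings.items()}
    for node in order:
        sources = preds[node]
        if not sources:
            continue  # no incoming message in this phase
        dim = len(state[node])
        mean = [sum(state[p][d] for p in sources) / len(sources) for d in range(dim)]
        state[node] = [(1 - mix) * h + mix * m for h, m in zip(state[node], mean)]
    return state

def bidirectional_embeddings(fwd_order, fwd_preds, bwd_order, bwd_preds, features):
    """Forward sweep (S120-2) followed by a backward sweep (S120-3)."""
    first = propagate(fwd_order, fwd_preds, features)
    return propagate(bwd_order, bwd_preds, first)
```

Because a node is visited only after all of its predecessors in the current phase, no node ever reads a stale or not-yet-computed neighbor state.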

[0075] In one alternative embodiment, an aggregation strategy with an attention weighting mechanism is introduced to update the embedded representation of nodes. This strategy enables the graph neural network used for gate-level failure rate prediction to learn and distinguish the relative importance of different input paths to the impact of fault propagation, thereby simulating circuit behavior more precisely.

[0076] Specifically, step S120-2 above, "updating the embedding representation of the node for the first time based on the embedding representation of the predecessor node of the node in the forward propagation phase," may include steps S120-2-1 and S120-2-2: Step S120-2-1: Based on the embedding representations of the multiple predecessor nodes of the node during the forward propagation phase, determine the attention weights of the multiple predecessor nodes of the node during the forward propagation phase.

[0077] For the node to be updated, obtain the embedding representations of all its predecessor nodes during the forward propagation phase. Using an attention-based computation mechanism (the query-key mechanism in the TFMLP aggregator), the embedding representation of the current node (as the query) is interactively computed with the embedding representations of each predecessor node (as the key), dynamically calculating a set of attention weights. These attention weights represent the individual attention weights of the multiple predecessor nodes for this node during the forward propagation phase. Each attention weight quantifies the importance of the failure impact of the corresponding predecessor node at the current node, i.e., quantifies the importance of the embedding representation of the corresponding predecessor node for updating the state of this node.

[0078] Step S120-2-2: Based on the embedding representation of the node, according to the attention weights of the multiple predecessor nodes of the node in the forward propagation stage, the embedding representations of the multiple predecessor nodes of the node in the forward propagation stage are weighted and accumulated to obtain the initial updated embedding representation of the node.

[0079] The embedded representations of each predecessor node are weighted and accumulated according to their attention weights to generate a message vector that aggregates weighted upstream information. This allows for selective information fusion, with predecessor nodes having higher weights contributing more, simulating situations where certain input paths in a circuit play a dominant role in signal (or fault) propagation. Subsequently, this message vector is combined with the node's current embedded representation and input into an update function. The update function calculates the node's initial updated embedded representation based on the aggregated message vector and the node's original embedded representation. This representation now incorporates the attention-weighted fault propagation features transmitted along the signal flow direction.
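The attention-weighted accumulation of steps S120-2-1 and S120-2-2 can be sketched with a scaled dot-product softmax. This is a simplification: the learned query and key projections of an aggregator such as TFMLP are omitted, and the function names are hypothetical.

```python
import math

def attention_weights(query, keys):
    """Softmax over scaled dot products between the node (query) and each
    predecessor embedding (key). Learned projections are omitted here."""
    if not keys:
        raise ValueError("node has no predecessors to attend over")
    scale = math.sqrt(len(query))
    logits = [sum(q * k for q, k in zip(query, key)) / scale for key in keys]
    top = max(logits)  # subtract the maximum for numerical stability
    exps = [math.exp(value - top) for value in logits]
    norm = sum(exps)
    return [e / norm for e in exps]

def attention_message(query, keys):
    """Weighted sum of predecessor embeddings, one weight per predecessor."""
    weights = attention_weights(query, keys)
    return [sum(w * key[d] for w, key in zip(weights, keys)) for d in range(len(query))]
```

The resulting message vector would then be passed, together with the node's current embedding, to the update function described above.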

[0080] Thus, by introducing an aggregation strategy with an attention weight mechanism, the graph neural network used for gate-level failure rate prediction can dynamically and discriminatively integrate information from multiple fan-in paths when simulating fault forward propagation. This more accurately reflects the physical fact that in actual circuits, different inputs have different suppression or enhancement effects on pulse propagation due to differences in logic function, electrical characteristics, and topology. This fine-grained modeling capability improves the discriminativeness of node embedding representations, laying a more reliable representational foundation for subsequent accurate risk ranking.

[0081] Furthermore, step S120-3 above, "updating the initial updated embedding representation of the node again based on the initial updated embedding representation of the node's predecessor node during the backpropagation phase to obtain the node's embedding representation," may include steps S120-3-1 and S120-3-2: Step S120-3-1: Based on the initial updated embedding representations of the multiple predecessor nodes of this node during the backpropagation phase, determine the attention weights of the multiple predecessor nodes of this node during the backpropagation phase.

[0082] For the current node, obtain the initial updated embedding representations of all its predecessor nodes (i.e., its downstream nodes) during the backpropagation phase. Then, using the attention mechanism (query-key mechanism in the TFMLP aggregator), the embedding representation of the current node (as the backtracking source) is interactively computed with the embedding representations of each predecessor node (as the key), dynamically calculating a set of attention weights, i.e., the attention weights of each of the multiple predecessor nodes of this node during the backpropagation phase.

[0083] Each attention weight reflects the relative contribution of the corresponding downstream path to the event of a potential fault being propagated to and captured by a flip-flop. For example, a path that ends at a flip-flop that is highly sensitive to faults may receive a higher attention weight.

[0084] Step S120-3-2: Based on the initial updated embedding representation of the node, according to the attention weights of the multiple predecessor nodes of the node in the backpropagation stage, the initial updated embedding representations of the multiple predecessor nodes of the node in the backpropagation stage are weighted and accumulated to obtain the embedding representation of the node.

[0085] The embedding representations of each downstream node (the predecessor nodes in the backward propagation phase) are weighted and accumulated according to their corresponding backward attention weights to form a message vector that aggregates downstream capture-sensitivity information, i.e., weighted information about the probability of fault capture from the various downstream directions. This aggregated message vector is then combined with the node's initially updated embedding representation (which already contains forward information), and the combined information is input into the update function (updater) to calculate the node's final embedding representation. This final embedding representation contains both the potential for a fault to propagate forward from this node and sensitivity information, traced back from its downstream circuits, about how easily a fault at this node can be captured.

[0086] Thus, by introducing attention-weighted aggregation in the backward propagation phase, the graph neural network used for gate-level failure rate prediction can finely simulate the capture side of circuit fault propagation. It enables the final embedding representation of each node to distinguish the relative importance of its different output paths in fault capture. This bidirectional, attention-weighted information fusion mechanism constitutes an efficient data-driven approximation of the path enumeration and aggregation process in traditional physical simulation, providing node features with both physical semantics and discriminative power for subsequent high-precision risk ranking.

[0087] In an optional embodiment, in step S120-2 or S120-3 described above, any update to the embedding representation of a node in the directed acyclic graph to be evaluated is implemented according to the following steps: Step A1: For each node in the directed acyclic graph to be evaluated, if the node is of the NOT gate type, the update is performed using the first updater in the embedding representation module. The update strategy of the first updater is adapted to the propagation characteristics of inverting logic.

[0088] Specifically, when the node to be updated is a NOT gate (inverter), a separate, specially configured first updater (e.g., GRU_NOT) with an enlarged (e.g., 2x) hidden layer dimension is invoked within the embedding representation module. The network structure and parameters of this updater are specifically designed and trained so that its update strategy better captures the unique propagation characteristics of inverting logic. For example, it needs to model the polarity reversal of signals passing through the inverter, which has a significant impact on whether fault pulses can be masked by subsequent logic.

[0089] Step A2: For each node in the directed acyclic graph to be evaluated, if the node is of type AND-LIKE gate, perform any update using the second updater in the embedded representation module. The update strategy of the second updater is adapted to the propagation characteristics of non-inverting logic.

[0090] Specifically, when the node to be updated is an AND-LIKE gate (referring to gates with AND-type logic characteristics, such as AND, NAND, OR, and NOR), a second updater (such as GRU_AND) is invoked. This updater is shared across all AND-LIKE gate types, and its update strategy is designed to adapt to the commonalities of propagation in non-inverting logic (or gates with complex Boolean functions). For example, it focuses on modeling the combination and masking effects of multiple input signals, which is crucial for evaluating logic masking.

[0091] Thus, by equipping NOT gates and AND-LIKE gates with different updaters, differentiated modeling of the basic logic propagation characteristics of the circuit is achieved. This design enables neural networks to learn more accurately and efficiently the essential differences in fault propagation behavior between inverted and non-inverted logic, avoiding the problems of feature confusion or insufficient modeling ability that may result from using a single updater.
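The dispatch between the two updaters can be sketched as follows. `GatedUpdater` is a hypothetical stand-in for a GRU cell (such as GRU_NOT or GRU_AND) with a fixed gate instead of learned parameters, and the gate values are arbitrary. The handling of other node types, such as flip-flops, is not specified in this passage, so the sketch rejects them.

```python
class GatedUpdater:
    """Hypothetical stand-in for a GRU cell: blends the old state and the
    incoming message with a fixed gate value instead of learned gates."""

    def __init__(self, gate):
        self.gate = gate

    def __call__(self, state, message):
        z = self.gate
        return [(1 - z) * h + z * m for h, m in zip(state, message)]

# One updater per logic family, with separate (here: fixed) parameters.
UPDATERS = {
    "NOT": GatedUpdater(gate=0.75),       # plays the role of GRU_NOT
    "AND_LIKE": GatedUpdater(gate=0.5),   # plays the role of GRU_AND
}
AND_LIKE_TYPES = {"AND", "NAND", "OR", "NOR"}

def update_node(cell_type, state, message):
    """Step A1/A2: route the update to the updater matching the gate type."""
    if cell_type == "NOT":
        key = "NOT"
    elif cell_type in AND_LIKE_TYPES:
        key = "AND_LIKE"
    else:
        raise ValueError(f"no updater defined in this sketch for {cell_type}")
    return UPDATERS[key](state, message)
```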

[0092] In one optional embodiment, a pairwise ranking learning strategy is employed to train the embedding representation module and the ranking module, improving the ability of the graph neural network for gate-level failure rate prediction to discriminate the relative order of risk among nodes, and thereby addressing the challenges posed by the long-tailed distribution of gate-level failure rates. Specifically, the method includes the following steps B1 to B5: Step B1: Obtain the first sample directed acyclic graph of the first sample gate-level netlist, and construct gate-level feature representations for each node in the first sample directed acyclic graph.

[0093] Specifically, a sample circuit can be taken from the training circuit library, and its corresponding gate-level netlist (the first sample gate-level netlist) can be parsed and abstracted into a first sample directed acyclic graph (DAG). A node in this first sample DAG corresponds to a logic gate or a flip-flop in the first sample gate-level netlist (sample circuit). The directed edges from the first node to the second node in the first sample DAG represent: the first logic gate in the first sample gate-level netlist provides a driving signal for the second logic gate, or a logic gate in the first sample gate-level netlist provides a driving signal for a flip-flop. Furthermore, each node (logic gate or flip-flop) in the first sample DAG has a corresponding gate-level failure rate label as supervisory information; this gate-level failure rate label is usually pre-calculated and generated by physical simulation tools such as BFIT.

[0094] Subsequently, a gate-level feature representation is constructed for each node in the first sample directed acyclic graph. The gate-level feature representation of a node includes at least: the type feature representation of the logic gate or flip-flop corresponding to the node, the topological feature representation of the logic gate or flip-flop corresponding to the node, and the structural feature representation of the logic gate or flip-flop corresponding to the node.
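The graph abstraction and feature construction of step B1 can be sketched as below. The concrete features are assumptions chosen for illustration: a one-hot gate type as the type feature, the topological level as the topological feature, and fan-in and fan-out counts as the structural feature. The toy netlist, `GATE_TYPES`, and all function names are hypothetical, and the netlist is assumed to be acyclic already.

```python
from collections import deque
from typing import Dict, List, Tuple

# Hypothetical toy netlist: node name -> (gate type, names of driving nodes).
NETLIST: Dict[str, Tuple[str, List[str]]] = {
    "g1": ("AND", []),
    "g2": ("NOT", ["g1"]),
    "g3": ("OR", ["g1", "g2"]),
    "ff1": ("DFF", ["g3"]),
}
GATE_TYPES = ["AND", "OR", "NOT", "DFF"]

def build_dag(netlist: Dict[str, Tuple[str, List[str]]]) -> Dict[str, List[str]]:
    """Return successor lists; an edge driver -> driven means 'provides a driving signal'."""
    succ: Dict[str, List[str]] = {name: [] for name in netlist}
    for name, (_, drivers) in netlist.items():
        for driver in drivers:
            succ[driver].append(name)
    return succ

def topo_levels(netlist, succ) -> Dict[str, int]:
    """Longest-path level of every node, computed with Kahn's algorithm."""
    indeg = {name: len(drivers) for name, (_, drivers) in netlist.items()}
    level = {name: 0 for name in netlist}
    queue = deque(name for name, d in indeg.items() if d == 0)
    visited = 0
    while queue:
        u = queue.popleft()
        visited += 1
        for v in succ[u]:
            level[v] = max(level[v], level[u] + 1)
            indeg[v] -= 1
            if indeg[v] == 0:
                queue.append(v)
    if visited != len(netlist):
        raise ValueError("netlist graph contains a cycle")
    return level

def node_features(netlist) -> Dict[str, List[float]]:
    """Concatenate type (one-hot), topological (level) and structural (fan-in, fan-out) features."""
    succ = build_dag(netlist)
    level = topo_levels(netlist, succ)
    feats = {}
    for name, (gate_type, drivers) in netlist.items():
        one_hot = [1.0 if gate_type == t else 0.0 for t in GATE_TYPES]
        feats[name] = one_hot + [float(level[name]), float(len(drivers)), float(len(succ[name]))]
    return feats
```

For the toy netlist, `node_features(NETLIST)["g3"]` is the OR one-hot followed by level 2, fan-in 2, and fan-out 1.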

[0095] Step B2: Based on the gate-level failure rate labels of the logic gates or flip-flops corresponding to the nodes in the first sample directed acyclic graph, generate ranking labels for each pair of nodes in the first sample directed acyclic graph.

[0096] Specifically, for any pair of nodes (node i, node j) in the first sample directed acyclic graph, the ranking label is determined by the magnitude relationship between the gate-level failure rate labels of the two nodes. Node pairs are sampled only within the same circuit, without cross-circuit node combination. For example, for node i and node j: if the gate-level failure rate label of node i is greater than that of node j, the ranking label indicates that "node i has a higher risk than node j"; if it is smaller, the ranking label indicates that "node i has a lower risk than node j". Pairs of nodes with equal failure rates can be specially handled or ignored. Furthermore, when calculating the pairwise ranking loss, a weighting strategy is used to amplify the effect of the failure rate difference between node pairs; that is, node pairs with greater differences in failure rates receive higher weights in the loss calculation, strengthening the model's attention to and learning of significant differences in risk level. Through this labeling format and weighting mechanism, the model's optimization objective is shifted from predicting specific numerical values to judging the relative level of risk.
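As a minimal sketch of step B2, the function below builds ranking labels for all node pairs of a single circuit. Two points are assumptions rather than details from the text: tied pairs are skipped (the text allows either ignoring or special handling), and the pair weight is taken to be the absolute difference of the two failure rate labels. The function name and the tuple layout are hypothetical.

```python
from itertools import combinations
from typing import Dict, List, Tuple

def pairwise_labels(fit_labels: Dict[str, float]) -> List[Tuple[str, str, int, float]]:
    """Return (node_i, node_j, label, weight) for every untied pair within one circuit.

    label is 1 when node_i has the higher failure rate label, else 0.
    weight grows with the failure rate gap (assumed form: absolute difference).
    """
    pairs = []
    for i, j in combinations(sorted(fit_labels), 2):
        diff = fit_labels[i] - fit_labels[j]
        if diff == 0.0:
            continue  # tied pairs are ignored in this sketch
        label = 1 if diff > 0 else 0
        pairs.append((i, j, label, abs(diff)))
    return pairs
```

Because the function only sees the labels of one circuit at a time, pairs are never formed across circuits, which matches the sampling rule above.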

[0097] Step B3: Process the first sample directed acyclic graph and the gate-level feature representations of each node in the first sample directed acyclic graph using the embedding representation module in the graph neural network to be trained, and obtain the embedding representations of each node in the first sample directed acyclic graph.

[0098] Specifically, the first sample directed acyclic graph and the gate-level feature representations of each node obtained in step B1 are input into the embedding representation module to be trained. This module processes the gate-level feature representations of the first sample directed acyclic graph and each node based on a bidirectional asynchronous topological message passing mechanism, and outputs the embedding representations of each node in the first sample directed acyclic graph.

[0099] Step B4: The embedding representations of each pair of nodes in the first sample directed acyclic graph are processed by the ranking module in the graph neural network to be trained to obtain the ranking prediction result of the pair of nodes. Based on the ranking prediction result of the pair of nodes and the ranking label of the pair of nodes, the pairwise ranking loss value is determined.

[0100] Specifically, the embedding representations of each pair of nodes obtained in step B3 are input into the ranking module to be trained. This module (such as a shallow MLP) processes these embedding representations and outputs a ranking score for each node. This score reflects the relative ranking position of a node among the nodes in the directed acyclic graph. For example, if the ranking score of node i is higher than that of node j, node i is predicted to be more risky than node j. In this way, the model can learn and predict the relative order of risk among all nodes.

[0101] The ranking prediction result of the node pair is compared with the ranking label of that node pair generated in step B2. A pairwise ranking loss value is calculated using a ranking task loss function (e.g., a pairwise ranking loss based on cross-entropy, a list network loss, etc.) to constrain the relative positional relationship of high-risk gates in the ranking result. This loss value quantifies the degree of error of the graph neural network in predicting the risk order on the current node pair.
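One common cross-entropy pairwise loss is the RankNet-style form, sketched below as an illustration of step B4. It is an assumed instance, not necessarily the exact loss of this application: the predicted probability that node i outranks node j is the sigmoid of the score difference, and the optional `weight` argument is a hypothetical hook for the failure rate gap weighting described in step B2.

```python
import math

def _softplus(x: float) -> float:
    """Numerically stable log(1 + exp(x))."""
    return max(x, 0.0) + math.log1p(math.exp(-abs(x)))

def pairwise_ranking_loss(score_i: float, score_j: float, label: int, weight: float = 1.0) -> float:
    """Weighted binary cross-entropy on sigmoid(score_i - score_j).

    label = 1 means node i is truly riskier than node j, label = 0 the opposite.
    """
    margin = score_i - score_j
    # -log(sigmoid(margin)) = softplus(-margin); -log(1 - sigmoid(margin)) = softplus(margin)
    loss = _softplus(-margin) if label == 1 else _softplus(margin)
    return weight * loss
```

The loss is small when the score gap agrees with the label and grows roughly linearly when the order is confidently wrong, which is what pushes high-risk gates toward the top of the ranking.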

[0102] Step B5: Based on the pairwise ranking loss value, update the model parameters of the embedding representation module and the ranking module to be trained to obtain the trained embedding representation module and the trained ranking module.

[0103] Specifically, the pairwise ranking loss value calculated in step B4 is backpropagated, with the gradient flowing simultaneously to both the ranking module and the embedding representation module to be trained. With minimizing the pairwise ranking loss as the optimization objective, the network parameters of these two modules are updated using an optimization algorithm (such as Adam), enabling the node embedding representations and their ranking functions learned by the model to increasingly accurately reflect the relative magnitudes of the true failure rates between nodes.

[0104] By repeating steps B1 to B5, the trained embedding representation module and the trained ranking module are finally obtained. At this point, the ranking module can generate reliable risk ranking scores based on the node embeddings.

[0105] The technical solution adopted in this application introduces specialized pairwise ranking learning training, taking the identification of key high-risk gates as the optimization objective to obtain the correct relative risk ranking, thus ensuring the graph neural network's attention to high-risk gates. Furthermore, by enabling the graph neural network to learn to compare the risk levels of node pairs, the prediction collapse problem caused by the dominance of massive low-risk samples is alleviated. The graph neural network is forced to distinguish between high-risk and low-risk samples, thereby forming more significant discriminative features for key gates in the embedding space.

[0106] In an optional embodiment, building on the reliable risk ranking capability that the graph neural network has already acquired, precise numerical regression training is further performed on the selected key high-risk gates to obtain gate-level failure rate prediction values that can be used for quantitative evaluation. Specifically, after obtaining the trained embedding representation module and the trained ranking module, the method further includes the following steps B6 to B11: Step B6: Obtain the second sample directed acyclic graph of the second sample gate-level netlist, and construct gate-level feature representations for each node in the second sample directed acyclic graph.

[0107] Specifically, another sample circuit (or another one from the same batch) is selected from the training circuit library, and its corresponding gate-level netlist (second sample gate-level netlist) is parsed and abstracted into a second sample directed acyclic graph (DAG). A node in this second sample DAG corresponds to a logic gate or a flip-flop in the second sample gate-level netlist. The directed edges from the first node to the second node in the second sample DAG represent: the first logic gate in the second sample gate-level netlist provides the driving signal for the second logic gate, or a logic gate in the second sample gate-level netlist provides the driving signal for a flip-flop. Furthermore, each node (logic gate or flip-flop) in the second sample DAG has a corresponding gate-level failure rate label as supervisory information; this gate-level failure rate label is typically pre-calculated and generated by physical simulation tools such as BFIT.

[0108] Subsequently, a gate-level feature representation is constructed for each node in the second sample directed acyclic graph. The gate-level feature representation of a node includes at least: the type feature representation of the logic gate or flip-flop corresponding to the node, the topological feature representation of the logic gate or flip-flop corresponding to the node, and the structural feature representation of the logic gate or flip-flop corresponding to the node.

[0109] Step B7: Using the trained embedding representation module, process the second sample directed acyclic graph and the gate-level feature representation of each node in the second sample directed acyclic graph to obtain the embedding representation of each node in the second sample directed acyclic graph.

[0110] The parameters of the trained embedding representation module have been initially optimized during the ranking training (steps B1 to B5). The second sample directed acyclic graph and the gate-level feature representations of each node obtained in step B6 are input into the trained embedding representation module. This module processes the gate-level feature representations of the second sample directed acyclic graph and its nodes based on a bidirectional asynchronous topological message passing mechanism, and outputs the embedding representations of each node in the second sample directed acyclic graph.

[0111] Step B8: Using the trained ranking module, process the embedding representation of each node in the second sample directed acyclic graph to obtain the risk ranking score of each node in the second sample directed acyclic graph.

[0112] The trained ranking module can generate reliable risk ranking scores based on node embeddings, providing a relatively reliable risk order. The embedding representations of each node obtained in step B7 are input into the trained ranking module to obtain the risk ranking score for each node.

[0113] Step B9: Based on the risk ranking scores of each node in the second sample directed acyclic graph, select the top k high-risk gates or high-risk flip-flops from the logic gates or flip-flops corresponding to the nodes in the second sample directed acyclic graph.

[0114] Specifically, based on the risk ranking scores output in step B8, all nodes in the second sample directed acyclic graph are sorted in descending order. According to a preset threshold k (a fixed number or a percentage), the top k nodes are selected. These k nodes are defined as the top k high-risk gates or high-risk flip-flops in the current sample. This screening operation determines the set of targets for calibration training, allowing training to focus on the parts that contribute most to the system risk.
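The screening in step B9 can be sketched as below, handling both readings of the threshold k. The conventions are assumptions made for the example: an `int` is a fixed count, a `float` in (0, 1] is a fraction of all nodes rounded up, and ties are broken by node name so that the result is deterministic. The function name is hypothetical.

```python
import math
from typing import Dict, List, Union

def select_top_k(scores: Dict[str, float], k: Union[int, float]) -> List[str]:
    """Return the names of the highest-scoring nodes, riskiest first."""
    n = len(scores)
    if isinstance(k, float):
        if not 0.0 < k <= 1.0:
            raise ValueError("fractional k must lie in (0, 1]")
        count = max(1, math.ceil(k * n))  # percentage threshold
    else:
        if k <= 0:
            raise ValueError("integer k must be positive")
        count = min(k, n)  # fixed-count threshold
    # Sort by descending score, then by name to break ties deterministically.
    ranked = sorted(scores, key=lambda name: (-scores[name], name))
    return ranked[:count]
```

Only the embeddings of the returned nodes would then be passed on to the calibration module.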

[0115] Step B10: The embedding representations of the top k nodes corresponding to the top k high-risk gates or high-risk flip-flops are processed by the calibration module in the graph neural network to be trained for gate-level failure rate prediction, so as to obtain the gate-level failure rate of each of the top k high-risk gates or high-risk flip-flops.

[0116] Specifically, only the embedding representations corresponding to the top k high-risk nodes selected in step B9 are input into the calibration module to be trained. This calibration module (usually an MLP, i.e., CalibNet) outputs a predicted gate-level failure rate value for each input high-risk node.

[0117] Step B11: Based on the predicted gate-level failure rates of the top k high-risk gates or high-risk flip-flops and the gate-level failure rate labels of those top k high-risk gates or high-risk flip-flops in the second sample directed acyclic graph, update the model parameters of the trained embedding representation module and the calibration module to be trained to obtain the trained embedding representation module and the trained calibration module.

[0118] Specifically, the gate-level failure rates predicted by the calibration module for the top k high-risk nodes are compared with the actual gate-level failure rate labels corresponding to these nodes. A loss function suitable for regression tasks (such as a linear-logarithmic dual-domain loss) is used to calculate the loss value. This loss design can simultaneously constrain the absolute accuracy of the numerical value and the consistency across orders of magnitude, making it suitable for handling FIT values spanning multiple orders of magnitude.
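The text names the linear-logarithmic dual-domain loss but does not give its formula. One plausible form, shown here purely as an assumption, adds an absolute error in the linear domain to an absolute error in the log10 domain. The parameters `log_weight` and `eps` are hypothetical, and predictions are assumed to be non-negative.

```python
import math
from typing import Sequence

def dual_domain_loss(pred: Sequence[float], target: Sequence[float],
                     log_weight: float = 1.0, eps: float = 1e-12) -> float:
    """Mean of |p - y| + log_weight * |log10(p + eps) - log10(y + eps)|.

    The linear term constrains absolute accuracy; the log term penalizes
    order-of-magnitude errors even when the values themselves are tiny.
    """
    if len(pred) != len(target) or len(pred) == 0:
        raise ValueError("pred and target must be non-empty and of equal length")
    total = 0.0
    for p, y in zip(pred, target):
        linear_term = abs(p - y)
        log_term = abs(math.log10(p + eps) - math.log10(y + eps))
        total += linear_term + log_weight * log_term
    return total / len(pred)
```

For a prediction of 1e-5 against a label of 1e-3, the linear term is below 0.001 while the log term is about 2, so a two-decade miss on a small FIT value still produces a strong training signal.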

[0119] The calculated loss is backpropagated to correct the parameters of the calibration module and the embedding representation module to be trained. Based on the ranking learning, the parameters of the embedding representation module are further fine-tuned from the numerical accuracy target, thereby generating node embeddings that are beneficial to both ranking and accurate regression.

[0120] The final pre-trained graph neural network for gate-level failure rate prediction is an ensemble of three cooperating sub-modules: a trained embedding representation module (whose parameters have been jointly optimized by the ranking and calibration losses), a trained ranking module (focusing on relative risk ranking), and a trained calibration module (focusing on numerical regression for high-risk gates).

[0121] The technical solution adopted in this application allows the calibration module, after training, to output calibrated failure rate values close to the accuracy of physical simulation for the selected key high-risk gates, providing a direct basis for quantitative reliability assessment and hardening budget calculation. Furthermore, regression training is performed only on a subset of high-risk gates, which prevents massive low-risk or zero-risk samples from dominating the loss function and gradient, significantly improves training efficiency, and lets the model focus on the numerical prediction capabilities most important for engineering decisions, resulting in a more stable training process. In addition, the ranking and calibration stages are not isolated; the gradient from calibration training further optimizes the embedding representation module, ensuring that the finally learned node embeddings serve both accurate ranking and precise calibration.

[0122] Figure 3 is a flowchart illustrating the training process of a graph neural network for gate-level failure rate prediction, as provided in an embodiment of this application. As shown in Figure 3, the graph neural network for gate-level failure rate prediction includes an embedding representation module, a ranking module, and a calibration module. The training process includes a ranking training process and a calibration training process.

[0123] First, perform the ranking training process (steps 1 to 5): Step 1: Obtain training samples. Specifically, obtain the first sample directed acyclic graph of the first sample gate-level netlist, and construct gate-level feature representations for each node in the first sample directed acyclic graph; based on the gate-level failure rate labels of the logic gates or flip-flops corresponding to the nodes in the first sample directed acyclic graph, generate ranking labels for each pair of nodes in the first sample directed acyclic graph.

[0124] Step 2: Obtain the graph neural network to be trained, including the embedding representation module, the ranking module, and the calibration module.

[0125] Step 3: Using the embedding representation module to be trained, process the gate-level feature representations of each node in the first sample directed acyclic graph and the first sample directed acyclic graph to obtain the embedding representations of each node in the first sample directed acyclic graph.

[0126] Step 4: Process the embedding representations of each pair of nodes in the first sample directed acyclic graph using the ranking module in the graph neural network to be trained, and obtain the ranking prediction result of the pair of nodes. Determine the pairwise ranking loss value based on the ranking prediction result of the pair of nodes and the ranking label of the pair of nodes.

[0127] Step 5: Update the model parameters of the embedding representation module and the ranking module to be trained based on the pairwise ranking loss value. Repeat steps 3 to 5 above to train the model. Once training is completed, the trained embedding representation module and the trained ranking module are obtained.

[0128] Then, perform the calibration training process (steps 6 to 11): Step 6: Obtain the second sample directed acyclic graph of the second sample gate-level netlist, and construct gate-level feature representations for each node in the second sample directed acyclic graph.

[0129] Step 7: Using the trained embedding representation module, process the second sample directed acyclic graph and the gate-level feature representation of each node in the second sample directed acyclic graph to obtain the embedding representation of each node in the second sample directed acyclic graph.

[0130] Step 8: Using the trained ranking module, process the embedding representation of each node in the second sample directed acyclic graph to obtain the risk ranking score of each node in the second sample directed acyclic graph.

[0131] Step 9: Based on the risk ranking scores of each node in the second sample directed acyclic graph, select the top k high-risk gates or high-risk flip-flops from the logic gates or flip-flops corresponding to the nodes in the second sample directed acyclic graph.

[0132] Step 10: Using the calibration module in the graph neural network to be trained for gate-level failure rate prediction, process the embedding representations of the top k nodes corresponding to the top k high-risk gates or high-risk flip-flops, and obtain the gate-level failure rate of each of the top k high-risk gates or high-risk flip-flops.

[0133] Step 11: Calculate the linear-logarithmic dual-domain loss based on the predicted gate-level failure rate of each of the top k high-risk gates or high-risk flip-flops and the gate-level failure rate labels of those top k high-risk gates or high-risk flip-flops in the second sample directed acyclic graph, and update the model parameters of the trained embedding representation module and the calibration module to be trained.

[0134] Repeat steps 7 to 11 above to train the model. Once training is completed, the trained embedding representation module and the trained calibration module are obtained.

[0135] Thus, through a two-stage training paradigm decoupled by "ranking-calibration," ranking training enables the model to acquire reliable risk screening capabilities, while calibration training focuses on numerical refinement of a small number of key targets. Both stages share the same backbone network and are jointly optimized by different loss functions, ultimately resulting in a unified model that can accurately identify high-risk gates and output quantified failure rate estimates, providing a complete model foundation for efficient and reliable selective hardening.

[0136] Figure 4 is an overall flowchart of a method for predicting gate-level failure rates based on graph neural networks provided in an embodiment of this application. As shown in Figure 4, the method includes the following steps: Step 1, Model and Data Preparation: Load a pre-trained graph neural network for gate-level failure rate prediction. This network includes an embedding representation module, a ranking module, and a calibration module. At the same time, read in the gate-level netlist to be evaluated, parse it, convert it into a directed acyclic graph to be evaluated, and construct gate-level feature representations for each node in the directed acyclic graph to be evaluated, thus completing the transformation from the original netlist to normalized input data.

[0137] Step 2, Graph Feature Extraction and Embedding Representation: The embedding representation module processes the directed acyclic graph to be evaluated and the gate-level feature representation of each node to obtain the embedding representation of that node. Specifically, for each directed edge from a first node to a second node in the directed acyclic graph to be evaluated, the first node is taken as a predecessor node of the second node in the forward propagation phase, and the second node is taken as a predecessor node of the first node in the backward propagation phase. In the forward propagation phase, the attention weights of a node's multiple predecessor nodes are determined based on the embedding representations of those predecessor nodes; then, based on the embedding representation of the node, the embedding representations of its predecessor nodes are weighted and accumulated according to these attention weights to obtain the initial updated embedding representation of the node. In the backward propagation phase, the attention weights of the node's multiple predecessor nodes are determined based on their initial updated embedding representations; then, based on the initial updated embedding representation of the node, the initial updated embedding representations of its predecessor nodes are weighted and accumulated according to these attention weights to obtain the embedding representation of the node.

[0138] In this process, different updaters are used to update the embedding representation of the node depending on its type. For example, when the node is a NOT gate, the first updater in the embedding representation module is used to perform the update, and the update strategy of the first updater is adapted to the propagation characteristics of inverting logic; when the node is an AND-LIKE gate, the second updater in the embedding representation module is used to perform the update, and the update strategy of the second updater is adapted to the propagation characteristics of non-inverting logic.
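The two-phase aggregation of Step 2 can be sketched as one forward sweep in topological order followed by one backward sweep in reverse order. Several details are assumptions made for the example: attention scores are dot products between the node state and each neighbour state, the update is a plain `tanh` of the state plus the aggregated message (the gate-type-specific updaters are omitted), and the names `aggregate` and `two_pass_embedding` are hypothetical.

```python
import math
from typing import Dict, List

Vector = List[float]

def softmax(xs: List[float]) -> List[float]:
    """Softmax with max-subtraction for numerical stability."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

def dot(a: Vector, b: Vector) -> float:
    return sum(x * y for x, y in zip(a, b))

def aggregate(h_node: Vector, neighbour_states: List[Vector]) -> Vector:
    """Attention-weighted sum of neighbour states (zero vector if there are none)."""
    if not neighbour_states:
        return [0.0] * len(h_node)
    weights = softmax([dot(h_node, h_n) for h_n in neighbour_states])
    return [sum(w * h_n[k] for w, h_n in zip(weights, neighbour_states))
            for k in range(len(h_node))]

def two_pass_embedding(order: List[str], preds: Dict[str, List[str]],
                       init: Dict[str, Vector]) -> Dict[str, Vector]:
    """order is a topological order; preds maps each node to its driving nodes."""
    succs: Dict[str, List[str]] = {n: [] for n in order}
    for n in order:
        for p in preds[n]:
            succs[p].append(n)
    h = {n: list(init[n]) for n in order}
    # Forward phase: drivers are already updated when a node is visited.
    for n in order:
        msg = aggregate(h[n], [h[p] for p in preds[n]])
        h[n] = [math.tanh(a + b) for a, b in zip(h[n], msg)]
    # Backward phase: fan-out nodes act as the "predecessors" and are visited first.
    for n in reversed(order):
        msg = aggregate(h[n], [h[s] for s in succs[n]])
        h[n] = [math.tanh(a + b) for a, b in zip(h[n], msg)]
    return h
```

Visiting nodes in topological order is what makes the propagation asynchronous in this sketch: each node reads neighbour states that were already updated in the same sweep, so information crosses the whole graph in a single pass per direction.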

[0139] Step 3: The embedding representation of each node in the directed acyclic graph to be evaluated is processed by the ranking module to obtain the risk ranking score of each node in the directed acyclic graph to be evaluated, where the higher the risk ranking score of a node, the higher the priority of hardening the logic gate or flip-flop corresponding to the node.

[0140] Step 4, Risk Ranking and Key Gate Screening: Based on the risk ranking scores of each node in the directed acyclic graph to be evaluated, select the top k high-risk gates or high-risk flip-flops from the logic gates or flip-flops corresponding to the nodes in the directed acyclic graph to be evaluated.

[0141] Step 5: Using the calibration module, the embedding representations of the top k nodes corresponding to the top k high-risk gates or high-risk flip-flops are processed to obtain the gate-level failure rate of each of the top k high-risk gates or high-risk flip-flops. This gate-level failure rate can be used to guide subsequent selective hardening decisions and reliability assessments.

[0142] In this way, the decoupled ranking-screening-calibration design ensures accurate identification and quantification of key risk units, forming a complete closed loop from netlist input to hardening decision support.

[0143] The following section uses specific experimental data to illustrate the gate-level failure rate prediction method based on graph neural networks proposed in this application.

[0144] First, a systematic experiment was conducted on standard logic circuit benchmarks (b07, b11, b12, b15, b17) and compared with existing methods DeepGate and BFIT to verify the accuracy of the proposed method in critical gate identification, FIT numerical prediction, and selective hardening efficiency, as well as the significant speed improvement over standard methods without sacrificing accuracy.

[0145] As shown in Table 1, on standard logic circuit benchmarks, the proposed method significantly outperforms the deep learning baseline method DeepGate in both gate-level risk identification and FIT prediction accuracy. Even on the smaller b07 circuit (575 gates), the proposed method achieves an accuracy of 0.72, far exceeding DeepGate's 0.21, indicating that it can stably model soft error propagation and masking effects even in compact circuit topologies. For the large-scale b17 circuit (25415 gates), the proposed method maintains an accuracy of 0.83, better than DeepGate's 0.34. This demonstrates that the bidirectional asynchronous topology GNN, the attention aggregation mechanism, the gate-type-differentiated update strategy, and the framework that combines RankNet for selecting the Top-K% critical gates with CalibNet for output calibration can maintain high prediction accuracy on large-scale netlists. This allows more accurate identification of high-risk gates, avoids over-hardening, helps reduce area and power consumption overhead, and improves the reliability of functional safety assessment and hardening decisions.

[0146] Table 1. Accuracy comparison results of this method with other baseline methods.

[0147] In terms of hardening efficiency, this method demonstrates a significant advantage on the key performance indicator MHF. As shown in Tables 2 and 3, taking the large-scale circuit b15 as an example, to achieve a 50% reduction in total FIT, DeepGate needs to harden approximately 95% of the gates, which is close to random selection, while this method only needs to harden 17%, reducing hardening overhead by more than 80%. Under the more stringent target of a 90% FIT reduction, this method only needs to harden approximately 3% of the gates, demonstrating a stable ability to identify high-risk gates in extreme long-tail distributions. This indicates that this method has significant advantages throughout the entire process of "critical gate identification - accurate FIT estimation - guided hardening," enabling a substantial improvement in system reliability with less hardware overhead.
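The hardening-cost metric can be illustrated with a short sketch. It reads MHF as the fraction of gates that must be hardened, in predicted-risk order, to remove at least a share ρ of the total FIT; that reading, the assumption that hardening a gate removes its entire FIT contribution, and the name `hardening_fraction` are all illustrative choices, not definitions taken from the text.

```python
from typing import Dict, List

def hardening_fraction(true_fit: Dict[str, float], predicted_order: List[str], rho: float) -> float:
    """Fraction of gates hardened, following predicted_order, until rho of total FIT is removed."""
    total = sum(true_fit.values())
    if total <= 0.0 or not 0.0 < rho <= 1.0:
        raise ValueError("need positive total FIT and rho in (0, 1]")
    removed = 0.0
    for count, gate in enumerate(predicted_order, start=1):
        removed += true_fit[gate]  # assumption: a hardened gate contributes no FIT
        if removed >= rho * total - 1e-12:  # small tolerance for float rounding
            return count / len(predicted_order)
    return 1.0
```

With a long-tailed toy distribution such as FIT values of 80, 10, 5, and 5, a ranking that puts the dominant gate first reaches a 50% reduction after hardening a quarter of the gates, while the reversed ranking has to harden all of them. This is the kind of gap the metric is meant to expose.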

[0148] Table 2. Comparison of hardening costs between this method and other baseline methods (MHF, ρ=50%)

[0149] Table 3. Comparison of hardening costs between this method and other baseline methods (MHF, ρ=90%)

[0150] In terms of evaluation speed, this method offers an order-of-magnitude improvement over the traditional physical simulation method BFIT. As shown in Table 4, for the medium-sized circuit b11, BFIT takes 21.4 seconds, while this method takes only 0.29 seconds, a speedup of 73.8 times; for the large-scale circuit b15, BFIT takes 315.6 seconds, while this method takes only 1.42 seconds, a speedup of over 222 times. The results demonstrate that this method completes evaluation within seconds while maintaining high accuracy, significantly improving the engineering feasibility of reliability analysis and iterative optimization for large-scale circuits.
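The quoted speedups follow directly from the runtimes reported above, as this small check shows. The dictionary layout and function name are illustrative only; the four runtimes are the ones stated in the paragraph.

```python
# Runtimes in seconds as quoted in the text: circuit -> (BFIT, this method).
RUNTIMES_S = {"b11": (21.4, 0.29), "b15": (315.6, 1.42)}

def speedup(baseline_s: float, method_s: float) -> float:
    """Ratio of baseline runtime to the runtime of the faster method."""
    return baseline_s / method_s

for circuit, (bfit_s, ours_s) in RUNTIMES_S.items():
    print(f"{circuit}: {speedup(bfit_s, ours_s):.1f}x")
# prints:
# b11: 73.8x
# b15: 222.3x
```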

[0151] Table 4. Speed comparison between this method and the BFIT method

[0152] In summary, this method achieves more reliable critical gate ranking and more accurate failure rate calibration in long-tailed gate-level FIT prediction tasks. It significantly reduces the proportion of hardened gates and saves hardware costs while achieving the same reliability target. Furthermore, its faster evaluation significantly improves processing efficiency and deployment practicality, making selective hardening decisions more accurate, stable, and easier to implement.

[0153] This application also provides an apparatus for predicting gate-level failure rates based on graph neural networks. Figure 5 is a schematic diagram of a device for predicting gate-level failure rates based on graph neural networks, provided in an embodiment of this application. As shown in Figure 5, the device includes: a feature extraction module 510, used to obtain the directed acyclic graph to be evaluated of the gate-level netlist to be evaluated, and to construct a gate-level feature representation for each node in the directed acyclic graph to be evaluated; an embedding representation module 520, used to process the directed acyclic graph to be evaluated and the gate-level feature representation of each node in the directed acyclic graph to be evaluated, so as to obtain the embedding representation of each node in the directed acyclic graph to be evaluated; a ranking module 530, used to process the embedding representation of each node in the directed acyclic graph to be evaluated to obtain the risk ranking score of each node, where the higher the risk ranking score of a node, the higher the priority of hardening the logic gate or flip-flop corresponding to the node, and, based on the risk ranking score of each node, to select the top k high-risk gates or high-risk flip-flops from the logic gates or flip-flops corresponding to the nodes in the directed acyclic graph to be evaluated, where k is an integer greater than zero; and a calibration module 540, used to process the embedding representations of the top k nodes corresponding to the top k high-risk gates or high-risk flip-flops to obtain the gate-level failure rate of each of the top k high-risk gates or high-risk flip-flops.

[0154] It is understood that the gate-level failure rate prediction device based on graph neural networks in the embodiments of this application can implement the gate-level failure rate prediction method based on graph neural networks in the above embodiments. The device has the same advantages over the prior art as the method, which will not be repeated here.

[0155] This application also provides an electronic device. Figure 6 is a schematic diagram of the structure of an electronic device provided in an embodiment of this application. As shown in Figure 6, the electronic device 600 includes a memory 610 and a processor 620. The memory 610 and the processor 620 are connected via a bus for communication. The memory 610 stores a computer program that can run on the processor 620 to implement the steps of the gate-level failure rate prediction method based on graph neural networks described in the embodiments of this application.

[0156] This application also provides a computer-readable storage medium storing a computer program thereon, which, when executed by a processor, implements the steps of the gate-level failure rate prediction method based on graph neural networks described in this application.

[0157] This application also provides a computer program product, including a computer program that, when executed by a processor, implements the steps of the gate-level failure rate prediction method based on graph neural networks described in this application.

[0158] The various embodiments in this specification are described in a progressive manner, with each embodiment focusing on the differences from other embodiments. The same or similar parts between the various embodiments can be referred to each other.

[0159] This application is described with reference to flowcharts and/or block diagrams of methods and apparatus according to its embodiments. It should be understood that each process and/or block in the flowcharts and/or block diagrams, and combinations of processes and/or blocks therein, can be implemented by computer program instructions. These computer program instructions can be provided to a processor of a general-purpose computer, special-purpose computer, embedded processor, or other programmable data processing terminal device to produce a machine, such that the instructions, when executed by the processor of the computer or other programmable data processing terminal device, produce means for implementing the functions specified in one or more processes of the flowcharts and/or one or more blocks of the block diagrams.

[0160] These computer program instructions may also be stored in a computer-readable storage medium that can direct a computer or other programmable data processing terminal device to operate in a particular manner, such that the instructions stored in the computer-readable storage medium produce an article of manufacture including instruction means that implement the functions specified in one or more processes of the flowcharts and/or one or more blocks of the block diagrams.

[0161] These computer program instructions can also be loaded onto a computer or other programmable data processing terminal device, causing a series of operational steps to be performed on the computer or other programmable terminal device to produce a computer-implemented process, such that the instructions executed on the computer or other programmable terminal device provide steps for implementing the functions specified in one or more processes of the flowcharts and/or one or more blocks of the block diagrams.

[0162] Although preferred embodiments of the present application have been described, those skilled in the art, upon learning the basic inventive concept, can make other changes and modifications to these embodiments. Therefore, the appended claims are intended to be interpreted as including the preferred embodiments as well as all changes and modifications falling within the scope of the embodiments of the present application.

[0163] Finally, it should be noted that in this document, relational terms such as "first" and "second" are used only to distinguish one entity or operation from another, and do not necessarily require or imply any such actual relationship or order between these entities or operations. Furthermore, the terms "comprising," "including," or any other variations thereof are intended to cover non-exclusive inclusion, such that a process, method, article, or terminal device that comprises a list of elements includes not only those elements but also other elements not expressly listed, or elements inherent to such a process, method, article, or terminal device. Without further limitations, an element defined by the phrase "comprising one..." does not exclude the presence of other identical elements in the process, method, article, or terminal device that includes said element.

[0164] The above provides a detailed description of the method, apparatus, device, and medium for predicting gate-level failure rate based on graph neural networks provided in this application. Specific examples have been used to illustrate the principles and implementation methods of this application. The descriptions of the above embodiments are only for the purpose of helping to understand the method and its core ideas. At the same time, for those skilled in the art, there will be changes in the specific implementation methods and application scope based on the ideas of this application. Therefore, the content of this specification should not be construed as a limitation of this application.

Claims

1. A method for predicting gate-level failure rates based on graph neural networks, characterized in that the method comprises: obtaining a directed acyclic graph to be evaluated of a gate-level netlist to be evaluated, and constructing a gate-level feature representation for each node in the directed acyclic graph to be evaluated; processing, by an embedding representation module in a pre-trained graph neural network for gate-level failure rate prediction, the directed acyclic graph to be evaluated and the gate-level feature representation of each node therein, to obtain an embedding representation of each node in the directed acyclic graph to be evaluated; processing, by a ranking module in the graph neural network for gate-level failure rate prediction, the embedding representation of each node in the directed acyclic graph to be evaluated, to obtain a risk ranking score of each node, wherein the higher the risk ranking score of a node, the higher the hardening priority of the logic gate or flip-flop corresponding to that node; selecting, based on the risk ranking score of each node, the top k high-risk gates or high-risk flip-flops from the logic gates or flip-flops corresponding to the nodes in the directed acyclic graph to be evaluated, where k is an integer greater than zero; and processing, by a calibration module in the graph neural network for gate-level failure rate prediction, the embedding representations of the top k nodes corresponding to the top k high-risk gates or high-risk flip-flops, to obtain the gate-level failure rate of each of the top k high-risk gates or high-risk flip-flops.

2. The method according to claim 1, characterized in that processing the directed acyclic graph to be evaluated and the gate-level feature representation of each node therein to obtain the embedding representation of each node comprises: based on a directed edge from a first node to a second node in the directed acyclic graph to be evaluated, determining that the predecessor node of the second node in the forward propagation phase is the first node, and that the predecessor node of the first node in the backward propagation phase is the second node; for each node in the directed acyclic graph to be evaluated, with the goal of simulating the forward propagation of a fault from a logic gate to each flip-flop, initially updating the embedding representation of the node based on the embedding representations of the predecessor nodes of the node in the forward propagation phase; and for each node in the directed acyclic graph to be evaluated, with the goal of simulating the backward propagation of a fault from a flip-flop back to each logic gate, updating the initially updated embedding representation of the node again based on the initially updated embedding representations of the predecessor nodes of the node in the backward propagation phase, to obtain the embedding representation of the node.
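
The predecessor relations and the two update phases of claim 2 can be illustrated with the following Python sketch. It is not the claimed implementation: `predecessor_maps`, `two_pass_update`, and `add_mean` are hypothetical names, embeddings are reduced to single numbers for readability, and the sketch assumes that nodes are processed sequentially in topological order, so that a node reads predecessors already updated in the same phase. A variant that reads only values from the end of the previous phase would be an equally plausible reading.

```python
from collections import defaultdict
from graphlib import TopologicalSorter  # standard library, Python 3.9+


def predecessor_maps(edges):
    """For each directed edge u -> v: u precedes v in the forward phase,
    and v precedes u in the backward phase."""
    fwd, bwd = defaultdict(list), defaultdict(list)
    for u, v in edges:
        fwd[v].append(u)
        bwd[u].append(v)
    return fwd, bwd


def two_pass_update(nodes, edges, h0, combine):
    """Forward phase in topological order, then backward phase in reverse order."""
    fwd, bwd = predecessor_maps(edges)
    order = list(TopologicalSorter({n: fwd.get(n, []) for n in nodes}).static_order())
    h = dict(h0)
    for n in order:                      # forward: fault travels gate -> flip-flop
        if fwd.get(n):
            h[n] = combine(h[n], [h[p] for p in fwd[n]])
    for n in reversed(order):            # backward: fault travels flip-flop -> gate
        if bwd.get(n):
            h[n] = combine(h[n], [h[p] for p in bwd[n]])
    return h


def add_mean(own, preds):
    # Toy update rule (an assumption): own value plus the mean of the predecessors.
    return own + sum(preds) / len(preds)


nodes = ["a", "b", "c", "ff"]
edges = [("a", "c"), ("b", "c"), ("c", "ff")]
h = two_pass_update(nodes, edges, {"a": 1.0, "b": 3.0, "c": 0.0, "ff": 0.0}, add_mean)
```

In this toy graph, gate `c` first absorbs information from `a` and `b`, and the flip-flop `ff` then feeds information back to `c`, `a`, and `b` in the second phase.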

3. The method according to claim 2, characterized in that initially updating the embedding representation of the node based on the embedding representations of the predecessor nodes of the node in the forward propagation phase comprises: determining, based on the embedding representations of the multiple predecessor nodes of the node in the forward propagation phase, the attention weight of each of those predecessor nodes in the forward propagation phase; and, on the basis of the embedding representation of the node, accumulating the embedding representations of the multiple predecessor nodes of the node in the forward propagation phase, weighted by their respective attention weights, to obtain the initially updated embedding representation of the node; and in that updating the initially updated embedding representation of the node again based on the initially updated embedding representations of the predecessor nodes of the node in the backward propagation phase comprises: determining, based on the initially updated embedding representations of the multiple predecessor nodes of the node in the backward propagation phase, the attention weight of each of those predecessor nodes in the backward propagation phase; and, on the basis of the initially updated embedding representation of the node, accumulating the initially updated embedding representations of the multiple predecessor nodes of the node in the backward propagation phase, weighted by their respective attention weights in the backward propagation phase, to obtain the embedding representation of the node.
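
One attention-weighted accumulation step of the kind recited in claim 3 might look like the following Python sketch. The claim does not fix how the attention weights are computed or how the accumulated message is merged with the node's own embedding; the dot-product scoring, the softmax normalization, and the additive merge used here are assumptions, and `softmax` and `attention_update` are hypothetical names.

```python
import math
from typing import List, Sequence, Tuple


def softmax(xs: Sequence[float]) -> List[float]:
    """Numerically stable softmax over a non-empty sequence."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def attention_update(
    node: Sequence[float], preds: Sequence[Sequence[float]]
) -> Tuple[List[float], List[float]]:
    """Return (updated embedding, attention weights) for one node.

    Assumptions: weights are a softmax of dot products between the node and
    each predecessor; the weighted sum is added to the node's own embedding.
    """
    if not preds:
        return list(node), []
    scores = [sum(a * b for a, b in zip(node, p)) for p in preds]
    weights = softmax(scores)
    message = [sum(w * p[i] for w, p in zip(weights, preds)) for i in range(len(node))]
    return [x + m for x, m in zip(node, message)], weights


updated, weights = attention_update([1.0, 0.0], [[1.0, 0.0], [0.0, 1.0]])
```

The same function could serve both phases: in the forward phase `preds` holds the embeddings of the forward predecessors, and in the backward phase it holds the initially updated embeddings of the backward predecessors.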

4. The method according to claim 2, characterized in that constructing a gate-level feature representation for each node in the directed acyclic graph to be evaluated comprises: for each node in the directed acyclic graph to be evaluated, determining a type feature representation of the logic gate or flip-flop corresponding to the node according to the type of the node; determining a topological feature representation of the logic gate or flip-flop corresponding to the node based on the topological information of the node; and determining a structural feature representation of the logic gate or flip-flop corresponding to the node based on the structural information of the node; and in that, for each node in the directed acyclic graph to be evaluated, each update performed to obtain the embedding representation of the node is carried out as follows: if the node is a NOT gate, the update is performed by a first updater in the embedding representation module, the update strategy of the first updater being adapted to the propagation characteristics of inverting logic; and if the node is an AND-like gate, the update is performed by a second updater in the embedding representation module, the update strategy of the second updater being adapted to the propagation characteristics of non-inverting logic.
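
The feature construction and type-specific dispatch of claim 4 can be sketched in Python as below. The claim does not enumerate the concrete features or the internals of the two updaters, so everything specific here is an assumption: the type list `GATE_TYPES`, the use of logic level as topological information, the use of fan-in and fan-out as structural information, and the two placeholder update rules. In a trained network each updater would be a separately parameterized learned function.

```python
from typing import Callable, Dict, List

GATE_TYPES = ("NOT", "AND_LIKE", "FLIP_FLOP")  # hypothetical type vocabulary


def node_features(gate_type: str, level: int, fan_in: int, fan_out: int) -> List[float]:
    """Concatenate type (one-hot), topological (level), and structural (fan-in/out) features."""
    if gate_type not in GATE_TYPES:
        raise ValueError(f"unknown gate type: {gate_type}")
    one_hot = [1.0 if t == gate_type else 0.0 for t in GATE_TYPES]
    return one_hot + [float(level), float(fan_in), float(fan_out)]


def first_updater(own: List[float], message: List[float]) -> List[float]:
    # Placeholder for the updater adapted to inverting logic (assumption).
    return [o - m for o, m in zip(own, message)]


def second_updater(own: List[float], message: List[float]) -> List[float]:
    # Placeholder for the updater adapted to non-inverting logic (assumption).
    return [o + m for o, m in zip(own, message)]


UPDATERS: Dict[str, Callable[[List[float], List[float]], List[float]]] = {
    "NOT": first_updater,
    "AND_LIKE": second_updater,
}


def update(gate_type: str, own: List[float], message: List[float]) -> List[float]:
    """Route the update to the updater that matches the node type."""
    try:
        return UPDATERS[gate_type](own, message)
    except KeyError:
        raise ValueError(f"no updater sketched for type: {gate_type}") from None


features = node_features("NOT", level=2, fan_in=1, fan_out=3)
```

Keeping the dispatch in a table makes it clear that the node type, not the propagation phase, decides which updater handles a given update.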

5. The method according to claim 1, characterized in that the method further comprises: obtaining a first sample directed acyclic graph of a first sample gate-level netlist, and constructing a gate-level feature representation for each node in the first sample directed acyclic graph; generating a ranking label for each pair of nodes in the first sample directed acyclic graph based on the gate-level failure rate labels of the logic gates or flip-flops corresponding to the nodes in the first sample directed acyclic graph; processing, by the embedding representation module in the graph neural network to be trained, the first sample directed acyclic graph and the gate-level feature representation of each node therein, to obtain the embedding representation of each node in the first sample directed acyclic graph; processing, by the ranking module in the graph neural network to be trained, the embedding representations of each pair of nodes in the first sample directed acyclic graph, to obtain a ranking prediction result for the pair of nodes; determining a pairwise ranking loss value based on the ranking prediction result of the pair of nodes and the ranking label of the pair of nodes; and updating, based on the pairwise ranking loss value, the model parameters of the embedding representation module to be trained and the ranking module to be trained, to obtain the trained embedding representation module and the trained ranking module.
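
The generation of pairwise ranking labels and the pairwise ranking loss of claim 5 can be illustrated as follows. The claim does not give the loss formula; this sketch substitutes a RankNet-style logistic loss on score differences and skips pairs whose failure rate labels are equal, both of which are assumptions. `pair_labels` and `pairwise_ranking_loss` are hypothetical names, and the numbers are invented.

```python
import math
from itertools import combinations
from typing import Dict, Tuple


def pair_labels(rates: Dict[str, float]) -> Dict[Tuple[str, str], float]:
    """Label 1.0 if the first node has the higher failure rate label, else 0.0.
    Pairs with equal labels carry no ordering information and are skipped."""
    labels = {}
    for i, j in combinations(sorted(rates), 2):
        if rates[i] != rates[j]:
            labels[(i, j)] = 1.0 if rates[i] > rates[j] else 0.0
    return labels


def pairwise_ranking_loss(scores: Dict[str, float],
                          labels: Dict[Tuple[str, str], float]) -> float:
    """Mean logistic loss log(1 + exp(-z)), where z is the score margin
    in the direction demanded by the label."""
    if not labels:
        return 0.0
    total = 0.0
    for (i, j), y in labels.items():
        z = scores[i] - scores[j] if y == 1.0 else scores[j] - scores[i]
        # Stable evaluation of log(1 + exp(-z)) for either sign of z.
        total += math.log1p(math.exp(-z)) if z >= 0 else -z + math.log1p(math.exp(z))
    return total / len(labels)


rate_labels = {"g1": 0.3, "g2": 0.1, "g3": 0.3}
labels = pair_labels(rate_labels)
loss_good = pairwise_ranking_loss({"g1": 2.0, "g2": -2.0, "g3": 2.0}, labels)
loss_bad = pairwise_ranking_loss({"g1": -2.0, "g2": 2.0, "g3": -2.0}, labels)
```

Scores that order the nodes as the labels demand give a small loss, while reversed scores give a large one, which is the signal used to update the embedding representation module and the ranking module.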

6. The method according to claim 5, characterized in that, after obtaining the trained embedding representation module and the trained ranking module, the method further comprises: obtaining a second sample directed acyclic graph of a second sample gate-level netlist, and constructing a gate-level feature representation for each node in the second sample directed acyclic graph; processing, by the trained embedding representation module, the second sample directed acyclic graph and the gate-level feature representation of each node therein, to obtain the embedding representation of each node in the second sample directed acyclic graph; processing, by the trained ranking module, the embedding representation of each node in the second sample directed acyclic graph, to obtain the risk ranking score of each node in the second sample directed acyclic graph; selecting, based on the risk ranking scores of the nodes in the second sample directed acyclic graph, the top k high-risk gates or high-risk flip-flops from the logic gates or flip-flops corresponding to the nodes in the second sample directed acyclic graph; processing, by the calibration module to be trained in the graph neural network for gate-level failure rate prediction, the embedding representations of the top k nodes corresponding to the top k high-risk gates or high-risk flip-flops, to obtain the gate-level failure rate of each of the top k high-risk gates or high-risk flip-flops; and updating, based on the gate-level failure rate of each of the top k high-risk gates or high-risk flip-flops and the gate-level failure rate labels of the top k high-risk gates or high-risk flip-flops in the second sample directed acyclic graph, the model parameters of the trained embedding representation module and the calibration module to be trained, to obtain the trained embedding representation module and the trained calibration module; wherein the pre-trained graph neural network for gate-level failure rate prediction includes the trained embedding representation module, the trained ranking module, and the trained calibration module.

7. The method according to any one of claims 1-6, characterized in that, after obtaining the gate-level failure rate of each of the top k high-risk gates or high-risk flip-flops, the method further comprises: hardening the top k high-risk gates or high-risk flip-flops one by one in descending order of gate-level failure rate; after each hardening operation on a high-risk gate or high-risk flip-flop, checking whether the total failure rate of the circuit to be evaluated corresponding to the gate-level netlist to be evaluated is lower than a target threshold; and, if the total failure rate of the circuit to be evaluated is not lower than the target threshold, continuing to harden the next high-risk gate or high-risk flip-flop until the total failure rate of the circuit to be evaluated is lower than the target threshold.
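
The iterative selective hardening of claim 7 amounts to a greedy loop, sketched below in Python. The claim does not say how the total failure rate of the circuit is re-evaluated after each hardening operation, so the evaluator is passed in as a callable; `harden_until_target` and `toy_total_rate` are hypothetical names, and the toy evaluator (the sum of the rates of all gates not yet hardened) is an assumption made only so that the example runs.

```python
from typing import Callable, Dict, List


def harden_until_target(
    rates: Dict[str, float],
    total_rate: Callable[[List[str]], float],
    target: float,
) -> List[str]:
    """Harden gates in descending order of predicted failure rate and stop as
    soon as the re-evaluated total failure rate drops below the target."""
    hardened: List[str] = []
    for gate, _ in sorted(rates.items(), key=lambda kv: kv[1], reverse=True):
        hardened.append(gate)                  # one hardening operation
        if total_rate(hardened) < target:      # check after each operation
            break
    return hardened


# Invented per-gate failure rates for the top k = 3 high-risk gates or flip-flops.
top_k_rates = {"g1": 0.05, "g2": 0.20, "ff1": 0.10}


def toy_total_rate(hardened: List[str]) -> float:
    # Assumption: a hardened gate contributes nothing; the rest simply add up.
    return sum(r for g, r in top_k_rates.items() if g not in hardened)


plan = harden_until_target(top_k_rates, toy_total_rate, target=0.12)
```

In this example hardening `g2` alone leaves the toy total at about 0.15, so `ff1` is hardened as well, after which the total falls to 0.05 and the loop stops without touching `g1`.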

8. An apparatus for predicting gate-level failure rates based on graph neural networks, characterized in that the apparatus comprises: a feature extraction module, configured to obtain the directed acyclic graph to be evaluated of the gate-level netlist to be evaluated, and to construct a gate-level feature representation for each node in the directed acyclic graph to be evaluated; an embedding representation module, configured to process the directed acyclic graph to be evaluated and the gate-level feature representation of each node therein, so as to obtain the embedding representation of each node in the directed acyclic graph to be evaluated; a ranking module, configured to process the embedding representation of each node in the directed acyclic graph to be evaluated to obtain the risk ranking score of each node, wherein the higher the risk ranking score of a node, the higher the hardening priority of the logic gate or flip-flop corresponding to that node, and further configured to select, based on the risk ranking scores, the top k high-risk gates or high-risk flip-flops from the logic gates or flip-flops corresponding to the nodes in the directed acyclic graph to be evaluated, where k is an integer greater than zero; and a calibration module, configured to process the embedding representations of the top k nodes corresponding to the top k high-risk gates or high-risk flip-flops, so as to obtain the gate-level failure rate of each of the top k high-risk gates or high-risk flip-flops.

9. An electronic device comprising a memory, a processor, and a computer program stored in the memory and executable on the processor, characterized in that, When the processor executes the computer program, it implements the steps of the method for predicting gate-level failure rates based on graph neural networks as described in any one of claims 1-7.

10. A readable storage medium having a computer program stored thereon, characterized in that, When executed by a processor, the computer program implements the steps of the method for predicting gate-level failure rates based on graph neural networks as described in any one of claims 1-7.