A method and apparatus for transient modeling of SiC MOSFET module switching based on graph structure embedding
By combining graph structure embedding and graph attention network with multi-kernel convolutional fusion network, the problem of complex parameter extraction in the transient modeling of SiC MOSFET switching is solved. It achieves efficient and accurate estimation under different operating conditions and temperature conditions, reduces testing costs and complexity, and improves modeling efficiency and applicability.
Patent Information
- Authority / Receiving Office
- CN · China
- Patent Type
- Applications(China)
- Current Assignee / Owner
- HARBIN ENG UNIV
- Filing Date
- 2026-03-14
- Publication Date
- 2026-06-19
AI Technical Summary
Existing transient modeling methods for SiC MOSFET switching rely on complex physical models or equivalent circuit models, so the dynamic characteristics of devices under different operating conditions cannot be obtained quickly and accurately in practical engineering. Furthermore, the modeling process is complex and parameter extraction is difficult, so these methods cannot meet the needs of rapid engineering applications.
By employing a method based on graph structure embedding and graph attention network (GAT) combined with multi-kernel convolutional fusion network (MKCFN), we can achieve efficient estimation and completion of transient waveforms of SiC MOSFET switching under different operating conditions using limited dual-pulse test data, thereby reducing modeling difficulty and improving the applicability of the model.
The method significantly reduces the number of experimental tests and the associated testing costs, accurately estimates switching transient waveforms under different operating conditions and temperatures, and improves the efficiency and engineering applicability of SiC MOSFET switching transient characteristic modeling and evaluation.
Figure CN122241177A_ABST
Description
Technical Field
[0001] This invention belongs to the field of power electronics technology, and particularly relates to a method and apparatus for transient modeling of SiC MOSFET module switching based on graph structure embedding.
Background Technology
[0002] Silicon carbide (SiC) metal-oxide-semiconductor field-effect transistors (MOSFETs) are widely used in high-power power electronics (PE) applications such as electric vehicles (EVs) and railway trains due to their superior characteristics, such as lower on-resistance, higher switching speed, higher switching frequency, and higher operating temperature. Since SiC MOSFETs play a crucial role in PE systems, system designers and application researchers need a comprehensive understanding of their dynamic characteristics to assess their power dissipation and determine whether they meet relevant requirements. However, the parameters given in specifications are measured under specific conditions. External parameters in real-world applications vary significantly and frequently, making some parameters unusable directly. Using suboptimal or improper parameters can lead to SiC MOSFET failure and potentially serious consequences. The most effective method for evaluating the switching characteristics of SiC MOSFETs and observing their parameters is the dual-pulse test (DPT).
[0003] The dual-pulse test (DPT) is a standard method for evaluating and characterizing the dynamic switching and reverse recovery characteristics of SiC MOSFETs. It is based on a half-bridge circuit and evaluates the device's performance at a specific voltage and current through two switching operations. The load inductor is connected in parallel with the upper MOSFET, so that the switching characteristics of the lower MOSFET and the reverse diode characteristics of the upper MOSFET can be tested. During the test, a wide drive pulse is first applied to the gate of the SiC MOSFET to turn it on; the inductor current then rises linearly until it reaches the value required for the test. Engineers can analyze the waveforms from the dual-pulse test and use the obtained data to build surrogate models for analyzing the dynamic performance of SiC MOSFETs in power electronic systems. Key parameters describing the switching behavior of SiC MOSFETs include the turn-on time $t_{on}$, turn-off time $t_{off}$, reverse recovery time $t_{rr}$, turn-on delay time $t_{d(on)}$, turn-off delay time $t_{d(off)}$, turn-on loss $E_{on}$, and turn-off loss $E_{off}$. However, due to the limitations of DPT in practical power electronics applications, several estimation methods have been proposed to analyze these electrical characteristics of SiC MOSFETs.
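The linear rise of the inductor current during the first pulse follows from $V = L\,di/dt$, so the pulse width needed to reach a target test current can be estimated directly. The following is a minimal sketch that assumes an ideal inductor and a constant bus voltage; the function name and the numeric values are illustrative and are not taken from the source.

```python
def first_pulse_width(v_dc: float, l_load: float, i_target: float) -> float:
    """Width (s) of the first DPT pulse needed to ramp the load inductor
    current from 0 A to i_target, assuming i(t) = v_dc * t / l_load."""
    if v_dc <= 0 or l_load <= 0:
        raise ValueError("v_dc and l_load must be positive")
    return l_load * i_target / v_dc


# Hypothetical example: 600 V bus, 100 uH load inductor, 60 A test current.
width_s = first_pulse_width(600.0, 100e-6, 60.0)
print(f"first pulse width: {width_s * 1e6:.1f} us")
```

In practice the achievable current also depends on circuit resistance and capacitor droop, which this sketch ignores.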
[0004] With the widespread application of silicon carbide (SiC) power devices in high-voltage, high-frequency, and high-power-density power electronic systems, accurate modeling and calculation of their dynamic switching characteristics have become a crucial foundation for device design, drive circuit optimization, and system-level simulation. Currently, the technical solutions used to calculate and analyze the dynamic switching characteristics of SiC MOSFETs fall into three main categories: behavioral models, physics-based models, and numerical models. Behavioral models typically describe the device switching process using equivalent circuits or empirical formulas. They have a simple structure, few parameters, and fast calculation speed, and are therefore widely used in transient circuit simulation. Existing technologies have improved the traditional Level 1 MOSFET models in commercial circuit simulation software (such as SPICE) to suit the dynamic characteristic analysis of high-voltage SiC MOSFETs. However, these models usually require key parameters to be extracted from device test curves, especially when modeling nonlinear capacitances such as the gate-drain capacitance. The parameter extraction process is complex, and consistent accuracy across different operating ranges is difficult to guarantee. To improve the fitting accuracy of the model in the high-voltage operating range, existing technologies have proposed improved behavioral models based on the device's I-V characteristics, combined with nonlinear optimization algorithms to extract parameters automatically. Although this approach improves the model's versatility to some extent, in practical applications the accuracy of the automatically extracted parameters is still limited by the quality of the test data and the choice of initial values for the optimization. Manual adjustment is often required, which increases the modeling cost and relies heavily on the professional experience of the modelers, making it difficult to meet the needs of rapid engineering applications.
[0005] Furthermore, physics-based models, by incorporating internal physical mechanisms of the device, describe carrier transport and resistance variations, and can reflect the dynamic behavior of the device under different bias conditions to a certain extent. Existing research has proposed compact modeling methods for the bias-dependent characteristics of the drift region resistance in SiC MOSFETs, providing a new approach to improving the physical consistency of the model. However, these models typically focus on describing the forward conduction and switching processes, insufficiently considering the third-quadrant operating characteristics and their impact on dynamic switching behavior, thus limiting their applicability in full-condition simulations. Moreover, traditional simulation methods not only require detailed physical parameters of the SiC MOSFET but also rely on complex calculations, leading to deployment difficulties.
[0006] In summary, existing transient switching modeling methods mostly rely on physical device models or equivalent circuit models. These methods are highly dependent on the internal structural parameters and parasitic parameters of the device, resulting in complex modeling processes and difficulties in accurately obtaining model parameters, thus limiting their widespread application in practical engineering. There is an urgent need for a technical solution that can reduce modeling difficulty and expand the applicability of the model while ensuring computational efficiency.
Summary of the Invention
[0007] This invention addresses the challenge of rapidly and comprehensively acquiring the switching transient dynamic characteristics of silicon carbide metal-oxide-semiconductor field-effect transistors (SiC MOSFETs) in high-power power electronic systems. It proposes a method and apparatus for modeling SiC MOSFET switching transients based on graph-structured embedding. By introducing a Graph Attention Network (GAT) and a Multi-Kernel Convolution Fusion Network (MKCFN), efficient estimation and completion of SiC MOSFET switching transient waveforms under different operating conditions are achieved based on limited double-pulse test data. This significantly reduces experimental testing workload and improves modeling efficiency.
[0008] To achieve the above objectives, the present invention provides the following technical solution:
[0009] This invention provides a method for transient switching modeling of SiC MOSFET modules based on graph structure embedding. The method uses dual-pulse test data as a basis to model the dynamic behavior of switching transients under untested conditions with limited measured data, obtaining a multi-kernel convolutional graph attention model. Specifically, it includes the following steps:
performing a dual-pulse test on the SiC MOSFET module under different operating conditions to obtain the switching transient waveform data corresponding to each operating condition, and training a model on the switching transient waveform data to obtain a complete switching transient estimation result for a single operating condition, the switching transient waveform data including waveforms related to gate-source voltage, drain-source voltage, drain current, and losses;
mapping each dual-pulse test operating condition to a node in a graph structure, the correlation of waveform data between different operating conditions being represented by the edges between nodes, forming a graph structure containing multiple nodes;
using a multi-kernel convolutional fusion network to perform multi-scale feature extraction and weighted fusion on the switching transient waveform data corresponding to the tested operating condition nodes, obtaining the feature representation of each node in the graph structure;
inputting the feature representation of the nodes into the graph attention network, dynamically calculating the weights between nodes in the adjacency matrix of the graph structure through the attention mechanism in the graph attention network, and updating the node features through cross-node feature propagation and fusion, a residual connection structure being introduced between the input and output of the graph attention network to avoid vanishing gradients during training;
outputting, based on the updated node features, the switching transient waveform estimation results of the SiC MOSFET under the target operating conditions.
[0010] In one implementation, the graph structure is represented as $G=(P,E)$, where $P=\{p_1,p_2,\dots,p_N\}$ is the set of N nodes, corresponding to N double-pulse test operating conditions, and E is the set of edges between nodes. $A\in\mathbb{R}^{N\times N}$ is the adjacency matrix of the graph structure; each element $a_{ij}$ of $A$ represents the connection weight between node $p_i$ and node $p_j$ and is assigned a default value of 1. Here $p$ denotes a node variable, and $\mathbb{R}^{N\times N}$ denotes the space of real matrices with N rows and N columns.
[0011] In one embodiment, the operating conditions of the dual-pulse test consist of one or more parameters selected from DC bus voltage, load current, gate resistance, gate drive voltage, DC link stray inductance, and SiC MOSFET module operating temperature.
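The mapping from operating conditions to graph nodes, with every connection weight defaulting to 1, can be sketched as follows. This is a minimal illustration: the condition grid, the dictionary keys, and the numeric values are hypothetical and are not taken from the source.

```python
from itertools import product

# Hypothetical operating-condition grid (values are illustrative only).
bus_voltages_v = [400.0, 600.0]
load_currents_a = [20.0, 40.0, 60.0]
temperatures_c = [25.0]

# Each dual-pulse test operating condition becomes one node of the graph.
nodes = [
    {"v_dc": v, "i_load": i, "temp_c": t}
    for v, i, t in product(bus_voltages_v, load_currents_a, temperatures_c)
]
n = len(nodes)

# Adjacency matrix with N rows and N columns; each weight defaults to 1.
adjacency = [[1.0 for _ in range(n)] for _ in range(n)]

print(n, "nodes;", "adjacency is", len(adjacency), "x", len(adjacency[0]))
```

The attention mechanism described later replaces these uniform default weights with learned, condition-dependent ones.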
[0012] In one implementation, the multi-kernel convolutional fusion network includes three one-dimensional convolutional networks with kernel sizes of 3, 7, and 11. The small-scale convolutional kernels capture the local features of the rising and falling edges of the switching transient waveform, the medium-scale convolutional kernels capture the medium-scale features of the waveform shape, and the large-scale convolutional kernels capture the global features of the overall switching transient signal's trend and structure.
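The effect of the three kernel sizes can be illustrated on a step edge. The sketch below is an assumption-laden stand-in: it uses fixed moving-average kernels instead of learned weights, stride 1, and zero padding, and it only shows how a larger kernel spreads a fast edge over more samples.

```python
def conv1d_same(signal, kernel):
    """1D cross-correlation with stride 1 and zero padding, so the output
    has the same length as the input (odd kernel length assumed)."""
    k = len(kernel)
    pad = k // 2
    padded = [0.0] * pad + list(signal) + [0.0] * pad
    return [
        sum(kernel[s] * padded[t + s] for s in range(k))
        for t in range(len(signal))
    ]


def transition_width(y, lo=10, hi=30, eps=1e-9):
    """Count interior samples that lie strictly between the two levels."""
    return sum(1 for v in y[lo:hi] if eps < v < 1.0 - eps)


# Idealised rising edge: 20 samples at 0 followed by 20 samples at 1.
signal = [0.0] * 20 + [1.0] * 20

# Moving-average kernels stand in for learned convolution weights.
outputs = {k: conv1d_same(signal, [1.0 / k] * k) for k in (3, 7, 11)}
widths = {k: transition_width(y) for k, y in outputs.items()}
print(widths)
```

The small kernel keeps the edge sharp, while the large kernel responds to a wider neighbourhood, which is the intuition behind using the three scales together.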
[0013] In one implementation, the step of using a multi-kernel convolutional fusion network to perform multi-scale feature extraction and weighted fusion on the switching transient waveform data corresponding to the tested operating condition nodes to obtain the feature representation of each node in the graph structure includes: the input switching transient waveform data are processed by position encoding and then input into the one-dimensional convolutional networks, and the convolutional output features are calculated by the convolutional layers; the output features of the convolutional layer of each one-dimensional convolutional network are input into a feedforward network to obtain the final output feature representation of the corresponding one-dimensional convolutional network, the feedforward network containing two fully connected layers and using ReLU as the activation function; the feature representations of the one-dimensional convolutional networks are weighted, summed, and fused to obtain the fused node feature representation.
[0014] In one implementation, the output $C_{i,k}$ of the convolutional layer of each one-dimensional convolutional network is calculated in the following way:
$$C_{i,k}(t)=\sum_{s=0}^{k-1} W_k(s)\,X_i(t\cdot\mathrm{stride}+s-\mathrm{pad})+b_k,$$
where $k$ is the kernel size of the one-dimensional convolutional layer, taking the values 3, 7, and 11 for the three one-dimensional convolutional networks; $t$ is the time step index of the temporal signal; $s$ is the sliding index of the convolutional kernel; $X_i$ is the $i$-th input switching transient waveform data; $\mathrm{stride}$ and $\mathrm{pad}$ are the stride and padding of the one-dimensional convolutional layer, respectively; $W_k$ represents the weights of each convolutional layer; and $b_k$ represents the bias. The weight parameters corresponding to the classification heads are used to control the output of the convolutional layer.
The output feature representation of the corresponding one-dimensional convolutional network is obtained by processing through a feedforward network containing two fully connected (FC) layers:
$$H_{i,j}^{CNN}=W_2^{FFN}\,\mathrm{ReLU}\!\left(W_1^{FFN}\,C_{i,k_j}+b_1^{FFN}\right)+b_2^{FFN},$$
where $\mathrm{ReLU}$ is the rectified linear activation function, $W_1^{FFN}$ and $W_2^{FFN}$ are the weight matrices of the fully connected layers in the feedforward network, and $b_1^{FFN}$ and $b_2^{FFN}$ are the bias terms of the fully connected layers in the feedforward network.
The outputs of the three convolutional networks are merged according to the following strategy:
$$X_i^{Fused}=\sum_{j=1}^{3} w_j^{Fusion}\,H_{i,j}^{CNN},$$
where $w_j^{Fusion}$ is the weight of the $j$-th convolutional network module, and $j=1,2,3$ correspond to the one-dimensional convolutional networks with kernel sizes $k_j$ of 3, 7, and 11, respectively. The fused feature representation $X_i^{Fused}$ of the double-pulse test sequence is used as the input for the subsequent node feature update. Here $i$ is the index of the input sample, the superscript Fusion denotes the weight coefficients for fusing the output features of the different convolutional modules, the superscript Fused denotes the fused feature, and the superscript CNN denotes the convolutional neural network.
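The two-layer feedforward network and the weighted fusion of the three branches can be sketched in a few lines. This is a minimal sketch under stated assumptions: the branch features, the identity FFN weights, and the fusion weights are hypothetical placeholders (in the described method they would be learned), and all function names are illustrative.

```python
def relu(vec):
    return [max(0.0, x) for x in vec]


def linear(weight, bias, vec):
    """Fully connected layer: weight @ vec + bias."""
    return [sum(w * x for w, x in zip(row, vec)) + b
            for row, b in zip(weight, bias)]


def ffn(vec, w1, b1, w2, b2):
    """Two fully connected layers with ReLU in between."""
    return linear(w2, b2, relu(linear(w1, b1, vec)))


def fuse(branch_features, fusion_weights):
    """Weighted sum of the per-branch feature vectors."""
    dim = len(branch_features[0])
    return [sum(w * f[d] for w, f in zip(fusion_weights, branch_features))
            for d in range(dim)]


identity = [[1.0, 0.0], [0.0, 1.0]]
zeros = [0.0, 0.0]

# Hypothetical convolution outputs of the k=3, k=7 and k=11 branches.
conv_outputs = [[1.0, -2.0], [0.5, 0.5], [2.0, 0.0]]
branch_features = [ffn(c, identity, zeros, identity, zeros)
                   for c in conv_outputs]

# Hypothetical fusion weights, one per branch.
fused = fuse(branch_features, [0.5, 0.25, 0.25])
print(fused)
```

Whether the fusion weights are normalized or constrained is not stated in the source; here they are simply fixed numbers.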
[0015] In one implementation, dynamically calculating the weights between nodes in the adjacency matrix of the graph structure using the attention mechanism in the graph attention network, and updating the node features through cross-node feature propagation and fusion, includes: the dimension of the node features is expanded by a shared parameter matrix, and the raw attention coefficients between nodes are calculated using a single-layer feedforward neural network; the raw attention coefficients are normalized using the softmax function to obtain the normalized attention coefficients between nodes; a multi-head attention mechanism is adopted, which aggregates the features of neighboring nodes based on the normalized attention coefficients to obtain the updated feature representation of the nodes, thereby improving the stability of the model learning process.
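[0016] In one implementation, in the graph attention layer, the correlation between node $i$ and its neighboring node $j$ is represented by the attention coefficient between them:
$$e_{ij}=a\left(Wh_i,\,Wh_j\right),$$
where $e_{ij}$ is the attention coefficient between node $i$ and its neighboring node $j$; $W\in\mathbb{R}^{F'\times F}$ is a shared parameter matrix used to expand the dimension of the node features; $\mathbb{R}^{F'\times F}$ denotes the space of real matrices with $F'$ rows and $F$ columns; $F$ is the number of features of each node; $h'=\{h'_1,h'_2,\dots,h'_N\}$ is the new set of node features; $h=\{h_1,h_2,\dots,h_N\}$, $h_i\in\mathbb{R}^{F}$, is the set of node features; $\mathbb{R}^{F}$ denotes the $F$-dimensional real vector space; the attention mechanism $a$ is implemented by a single-layer feedforward neural network with weight vector $a\in\mathbb{R}^{2F'}$, where $\mathbb{R}^{2F'}$ denotes the $2F'$-dimensional real vector space, with LeakyReLU as the activation function;
[0017] The attention coefficients are normalized over all neighboring nodes $j$ using the softmax function:
$$\alpha_{ij}=\mathrm{softmax}_j\left(e_{ij}\right)=\frac{\exp\left(\mathrm{LeakyReLU}\left(a^{T}\left[Wh_i\,\Vert\,Wh_j\right]\right)\right)}{\sum_{m\in\mathcal{N}_i}\exp\left(\mathrm{LeakyReLU}\left(a^{T}\left[Wh_i\,\Vert\,Wh_m\right]\right)\right)},$$
where $a^{T}$ is the transpose of the vector $a$; $\Vert$ denotes the concatenation operation; $\mathcal{N}_i$ is the set of neighboring nodes of node $i$, and $m$ is one of those nodes; $h_i$ is the original feature vector of the $i$-th double-pulse test condition node; and $h_j$ is the original feature vector of the $j$-th double-pulse test condition node. Based on the normalized attention coefficients $\alpha_{ij}$, the features are aggregated:
$$h'_i=\sigma\left(\sum_{j\in\mathcal{N}_i}\alpha_{ij}Wh_j\right),$$
where $\sigma$ is the activation function applied. A multi-head attention mechanism is used to compute the output feature representation:
$$h'_i=\big\Vert_{k=1}^{K}\,\sigma\left(\sum_{j\in\mathcal{N}_i}\alpha_{ij}^{k}W^{k}h_j\right),$$
where $K$ is the number of independent attention mechanisms, $\alpha_{ij}^{k}$ is the normalized attention coefficient of the $k$-th attention mechanism, and $W^{k}$ is the weight matrix of the corresponding input linear transformation.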
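The attention computation described above (shared linear transform, LeakyReLU score on concatenated features, softmax over neighbors, weighted aggregation, and concatenation of heads) can be sketched in plain Python. This is a minimal sketch, not the patented implementation: the output activation is taken as the identity for simplicity, the LeakyReLU slope of 0.2 is an assumption, and all weights and features are hypothetical.

```python
import math


def leaky_relu(x, slope=0.2):
    return x if x > 0 else slope * x


def matvec(w, h):
    return [sum(wi * hi for wi, hi in zip(row, h)) for row in w]


def gat_layer(features, neighbors, w, a):
    """Single attention head. Returns, per node, a pair
    (normalized attention coefficients, aggregated feature vector)."""
    wh = [matvec(w, h) for h in features]
    out = []
    for i, nbrs in enumerate(neighbors):
        # Raw score: LeakyReLU(a^T [W h_i || W h_j]) for each neighbor j.
        scores = [leaky_relu(sum(ak * x for ak, x in zip(a, wh[i] + wh[j])))
                  for j in nbrs]
        top = max(scores)  # subtract the max for numerical stability
        exps = [math.exp(s - top) for s in scores]
        total = sum(exps)
        alphas = [e / total for e in exps]
        agg = [sum(al * wh[j][d] for al, j in zip(alphas, nbrs))
               for d in range(len(wh[i]))]
        out.append((alphas, agg))
    return out


def multi_head(features, neighbors, heads):
    """Concatenate the aggregated features of K independent heads."""
    per_head = [gat_layer(features, neighbors, w, a) for w, a in heads]
    return [sum((head[i][1] for head in per_head), [])
            for i in range(len(features))]


# Three hypothetical operating-condition nodes, fully connected.
features = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
neighbors = [[0, 1, 2], [0, 1, 2], [0, 1, 2]]
identity = [[1.0, 0.0], [0.0, 1.0]]

single = gat_layer(features, neighbors, identity, [0.3, -0.1, 0.2, 0.5])
multi = multi_head(features, neighbors,
                   [(identity, [0.3, -0.1, 0.2, 0.5]),
                    (identity, [0.0, 0.0, 0.0, 0.0])])
print(single[0][0], multi[0])
```

With an all-zero attention vector every neighbor receives the same weight, so the head reduces to a plain average of the transformed neighbor features.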
[0018] In one implementation, the method further includes a model generalization step: double-pulse test data acquired at room temperature are selected to train the model, and experimental data under other temperature conditions are used as supervision signals to apply generalization constraints. This enables the trained multi-kernel convolutional graph attention model to estimate the switching transient dynamic characteristics of SiC MOSFET modules under different temperature conditions without retraining. For different SiC MOSFET devices of the same type, or with similar rated parameters and package types, the graph structure remains unchanged. The trained model is fine-tuned using the room-temperature double-pulse test dataset of the target SiC MOSFET device in order to estimate the switching transient characteristics of the target device, where the graph structure corresponding to the target SiC MOSFET device is defined by the set of nodes and the set of edges of the target device graph.
[0019] The present invention also provides a transient modeling device for switching of a SiC MOSFET module based on graph structure embedding, the device comprising:
a data acquisition module, used to perform dual-pulse testing on the SiC MOSFET module under different operating conditions, acquire the switching transient waveform data corresponding to each operating condition, and perform model training on the switching transient waveform data to obtain complete switching transient estimation results under a single operating condition, the switching transient waveform data including waveforms related to gate-source voltage, drain-source voltage, drain current, and losses;
a graph structure construction module, used to map each double-pulse test operating condition to a node in the graph structure, the correlation of waveform data between different operating conditions being represented by the edges between nodes, forming a graph structure containing multiple nodes;
a feature extraction and fusion module, used to perform multi-scale feature extraction and weighted fusion on the switching transient waveform data corresponding to the tested operating condition nodes using a multi-kernel convolutional fusion network, so as to obtain the feature representation of each node in the graph structure;
a feature aggregation module, used to input the feature representation of the nodes into the graph attention network, dynamically calculate the weights between nodes in the adjacency matrix of the graph structure through the attention mechanism in the graph attention network, and update the node features through cross-node feature propagation and fusion, a residual connection structure being introduced between the input and output of the graph attention network to avoid vanishing gradients during training;
a transient waveform estimation module, used to output the switching transient waveform estimation results of the SiC MOSFET under the target operating conditions based on the fused and updated node features.
[0020] Beneficial effects of this invention:
[0021] This invention addresses the challenge of efficiently acquiring the transient dynamic characteristics of SiC MOSFETs under various operating conditions using a limited number of double-pulse tests. It introduces graph embedding and graph attention mechanisms to model the inherent relationships between different operating conditions, achieving accurate estimation of the transient waveform under untested conditions with only a small amount of measured data. Simultaneously, it combines a multi-kernel convolutional fusion network to extract and fuse multi-scale features of the transient waveform, effectively improving the model's ability to express key features such as rapid transitions and high-frequency oscillations. Furthermore, a model generalization strategy enables the trained model to be applied to different devices and operating temperatures, significantly reducing the number of experimental tests, lowering testing costs, and improving the efficiency and engineering applicability of SiC MOSFET switching transient characteristic modeling and evaluation.
Attached Figure Description
[0022] The accompanying drawings, as part of this invention, are provided to further illustrate the invention. The illustrative embodiments and descriptions of the invention are used to explain the invention, but do not constitute an undue limitation thereof. Clearly, the drawings described below are merely some embodiments, and those skilled in the art can obtain other drawings based on these drawings without any creative effort.
[0023] Figure 1 is a flowchart of a method for transient modeling of SiC MOSFET module switching based on graph structure embedding, provided in one embodiment of the present invention;
[0024] Figure 2 is a schematic diagram of a switching transient modeling method combining a multi-kernel convolutional fusion network and a graph attention network, provided in one embodiment of the present invention;
[0025] Figure 3 is a schematic diagram of the MKCFN module of the main network module MKCGAM, provided in one embodiment of the present invention;
[0026] Figure 4 is a schematic diagram of model generalization learning, provided in one embodiment of the present invention;
[0027] Figure 5 shows an experimental result for the turn-on transient under untested operating conditions, provided in one embodiment of the present invention;
[0028] Figure 6 shows an experimental result for the turn-off transient under untested operating conditions, provided in one embodiment of the present invention;
[0029] Figure 7 is a schematic diagram of a SiC MOSFET module switching transient modeling device based on graph structure embedding, provided in one embodiment of the present invention.
[0030] It should be noted that these accompanying drawings and textual descriptions are not intended to limit the scope of the invention in any way, but rather to illustrate the concept of the invention to those skilled in the art by referring to specific embodiments.
Detailed Implementation
[0031] To make the objectives and technical solutions of this invention clearer, the invention will be described in detail below with reference to the accompanying drawings and embodiments.
[0032] To make the objectives, technical solutions, and advantages of the embodiments of the present invention clearer, the technical solutions in the embodiments will be clearly and completely described below with reference to the accompanying drawings. The following embodiments are used to illustrate the present invention, but are not intended to limit the scope of the present invention.
[0033] In high-power power electronic systems, SiC MOSFETs are widely used due to their high switching speed, high withstand voltage, and low loss characteristics. Their switching transient dynamic characteristics have a significant impact on system efficiency, reliability, and electromagnetic compatibility. In engineering practice, a dual-pulse test method is typically used to obtain the switching transient waveforms of the device under different operating conditions. However, the switching transient behavior of SiC MOSFETs is simultaneously affected by multiple factors, including DC bus voltage, load current, gate drive parameters, DC link stray inductance, and operating temperature. Completely covering all operating condition combinations requires numerous repetitive experiments, making the testing process complex, time-consuming, and costly, which is difficult to meet the efficiency and flexibility requirements of the engineering design stage.
[0034] With the development of deep learning technology, industry is increasingly benefiting from data-driven algorithms. The DPT waveforms under different DC bus voltages, temperatures, and load currents can be regarded as time-series signals. Deep learning methods can therefore be used to predict these signals and, from them, derive the dynamic parameters of SiC MOSFETs.
[0035] Based on this, the present invention proposes a modeling scheme for the switching transient dynamic characteristics of SiC MOSFETs that is based on experimental data and combines graph structure modeling with deep learning methods, so as to achieve rapid estimation of DPT waveforms under unmeasured conditions, thereby significantly reducing the number of experiments and improving testing efficiency.
[0036] This invention addresses the challenge of fully capturing the switching transient dynamic characteristics of silicon carbide metal-oxide-semiconductor field-effect transistor (SiC MOSFET) modules under different operating conditions using limited double-pulse test data. Considering that traditional modeling methods based on physical parameters or equivalent circuits rely on complex parameter extraction and cumbersome calculations, failing to meet the demands for rapid estimation and high generalization in engineering applications, this invention proposes a method for modeling and improving the efficiency of SiC MOSFET module switching transient dynamic behavior based on graph structure embedding. This method maps double-pulse test data under different operating conditions to nodes in a graph structure and utilizes a graph attention network to model the correlation between operating conditions. Simultaneously, it combines a multi-kernel convolutional fusion network to extract and fuse multi-scale features of the switching transient waveform. This allows for accurate estimation of SiC MOSFET switching transient waveforms under other untested operating conditions, relying only on limited measured data. This method significantly reduces the number of double-pulse tests, lowering testing time and labor costs, while also improving the efficiency and applicability of switching transient characteristic modeling, providing a reliable basis for the design and performance evaluation of SiC MOSFETs in practical power electronic systems.
[0037] The technical solution of the present invention will be further described in detail below with reference to the accompanying drawings.
[0038] Referring to Figure 1, one embodiment provides a method for modeling the switching transient behavior of a SiC MOSFET module based on graph structure embedding. Using dual-pulse test data as a basis, this method models the switching transient dynamic behavior under untested operating conditions with limited measured data. The method includes the following steps:
[0039] Step S100: Perform double-pulse testing on the SiC MOSFET module under different operating conditions, obtain the switching transient waveform data corresponding to each operating condition, and complete model training on the switching transient waveform data to obtain the complete switching transient estimation results under a single operating condition.
[0040] Specifically, the switching transient waveform data of the SiC MOSFET under different operating conditions are acquired using a dual-pulse test platform. The waveform data include the gate-source voltage $V_{GS}$, the drain-source voltage $V_{DS}$, the drain current $I_D$, and loss-related signals. Data preprocessing operations are performed on the acquired raw waveform data, including time alignment, outlier removal, and normalization. A single-condition training dataset is then constructed in time series format.
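Two of the preprocessing operations named above, outlier removal and normalization, can be sketched as follows. The source does not specify which algorithms are used, so this sketch assumes a Hampel-style median filter for outliers and min-max scaling for normalization; time alignment is omitted, and the sample values are invented for illustration.

```python
import statistics


def remove_outliers(samples, half_window=2, n_sigmas=3.0):
    """Replace samples that deviate strongly from the local median
    (Hampel-style filter; an assumed choice, not from the source)."""
    cleaned = list(samples)
    for i in range(len(samples)):
        lo = max(0, i - half_window)
        hi = min(len(samples), i + half_window + 1)
        window = samples[lo:hi]
        med = statistics.median(window)
        mad = statistics.median(abs(x - med) for x in window)
        if abs(samples[i] - med) > n_sigmas * 1.4826 * mad:
            cleaned[i] = med
    return cleaned


def min_max_normalize(samples):
    """Scale samples to [0, 1]; a constant signal maps to all zeros."""
    lo, hi = min(samples), max(samples)
    if hi == lo:
        return [0.0 for _ in samples]
    return [(x - lo) / (hi - lo) for x in samples]


# Hypothetical drain-current samples with one measurement spike.
raw = [0.0, 0.1, 0.2, 50.0, 0.4, 0.5, 0.6]
cleaned = remove_outliers(raw)
normalized = min_max_normalize(cleaned)
print(cleaned)
print(normalized)
```

A filter like this must be tuned with care for switching waveforms, since genuine overshoot and ringing are fast events that should not be treated as outliers.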
[0041] Subsequently, the transient model is trained using the preprocessed switching transient waveform data. By minimizing the error between the model's predicted waveform and the measured waveform, the model parameters are iteratively optimized, enabling the model to learn the switching dynamic characteristics of the SiC MOSFET under this specific operating condition. After training, the model can continuously reconstruct the changes in the electrical quantities throughout the entire switching cycle under this operating condition, thereby obtaining complete switching transient estimation results for a single operating condition and providing the basic data for subsequent cross-condition graph structure modeling.
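The idea of iteratively reducing the error between a predicted and a measured waveform can be illustrated with a deliberately tiny example. The source does not name the loss function or the optimizer; the sketch below assumes mean squared error and plain gradient descent, and it fits a single hypothetical gain parameter rather than a neural network.

```python
def mse(predicted, measured):
    """Mean squared error between two equally long waveforms."""
    return sum((p - m) ** 2 for p, m in zip(predicted, measured)) / len(measured)


def fit_gain(inputs, measured, learning_rate=0.1, steps=50):
    """Fit y = gain * x by gradient descent on the MSE."""
    gain = 0.0
    n = len(inputs)
    for _ in range(steps):
        # d(MSE)/d(gain) = (2 / n) * sum((gain * x - y) * x)
        grad = 2.0 / n * sum((gain * x - y) * x for x, y in zip(inputs, measured))
        gain -= learning_rate * grad
    return gain


# Hypothetical data: the "measured" waveform is exactly twice the input.
inputs = [0.0, 1.0, 2.0, 3.0]
measured = [0.0, 2.0, 4.0, 6.0]

gain = fit_gain(inputs, measured)
final_error = mse([gain * x for x in inputs], measured)
print(gain, final_error)
```

The same loop structure applies when the single gain is replaced by the parameters of the convolutional and attention layers, with gradients supplied by an automatic differentiation framework.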
[0042] The graph structure is represented as $G=(P,E)$, where $P=\{p_1,p_2,\dots,p_N\}$ is the set of N nodes, corresponding to N double-pulse test operating conditions, and E is the set of edges between nodes. $A\in\mathbb{R}^{N\times N}$ is the adjacency matrix of the graph structure; each element $a_{ij}$ of $A$ represents the connection weight between node $p_i$ and node $p_j$ and is assigned a default value of 1. Here $p$ denotes a node variable, and $\mathbb{R}^{N\times N}$ denotes the space of real matrices with N rows and N columns.
[0043] To ensure the reliability and accuracy of the DPT waveform estimation method, appropriate input variables and their forms must be selected. The complete input data acquired from the SiC MOSFET DPT platform can be represented as:
[0044]
[0045] These input variables include the core signal waveforms $V_{GS}$, $V_{DS}$, and $I_D$. Analyzing the switching transient dynamic behavior of SiC MOSFETs requires these waveforms to be measured, and the method ultimately needs to estimate them. Only one of the data types is used as the input data for a given training pass. Therefore, the method requires four complete training passes, one for each of the four variables, to obtain a complete switching transient estimate under an operating condition.
[0046] Step S200: Map each double-pulse test working condition to a node in a graph structure. The correlation of waveform data between different working conditions is represented by the edges between nodes, forming a graph structure containing multiple nodes.
[0047] Furthermore, the operating conditions for the dual-pulse test consist of one or more parameters, including DC bus voltage, load current, gate resistance, gate drive voltage, DC link stray inductance, and SiC MOSFET module operating temperature.
[0048] This invention represents DPT data under different operating conditions in a structured form and uses graph embedding to characterize the intrinsic relationships between different operating conditions, thereby modeling the switching transient dynamic behavior over the complete operating space based on limited experimental data. Figure 2 illustrates the principle of the DPT estimation problem based on graph structure embedding proposed in this invention. Each DPT operating condition is mapped to a node in the graph structure, and the correlation between different operating conditions is represented by edges. When constructing the graph structure, in addition to the feature nodes representing the switching transient data under different operating conditions, at least one label node is set to represent the target output information under the corresponding operating condition. The label node is used to store supervisory information related to the switching transient characteristics, including target features such as switching losses, turn-on time, turn-off time, or the complete switching transient waveform.
[0049] The transient switching data corresponding to each set of operating conditions is mapped to feature nodes in a graph structure, and label nodes are constructed based on their corresponding target output results. Edge connections are then used to associate the feature nodes with their corresponding label nodes, thus forming labeled graph structure data. In this way, the supervision information provided by the label nodes can be used during the training of the graph neural network to guide the model to learn the changing patterns of switching transient characteristics under different operating conditions, achieving cross-condition feature propagation and modeling. By propagating and fusing node features in the graph structure, it is possible to infer the switching transient waveforms under operating conditions that have not been actually tested.
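The graph construction described above can be sketched as follows. This is a minimal illustration rather than the patented implementation: the function name `build_condition_graph`, the dictionary keys, and the toy operating-condition grid are hypothetical, and the label nodes are simplified to a `has_label` flag on each feature node instead of being separate nodes joined by edges. The adjacency matrix is filled with ones, following the default connection weight of 1 stated in this document.

```python
from itertools import product


def build_condition_graph(conditions, tested):
    """Map each DPT operating condition to a node and connect all node pairs.

    conditions: list of dicts of operating-condition parameters (hypothetical keys).
    tested: set of node indices for which measured waveforms (labels) exist.
    """
    n = len(conditions)
    # Every adjacency element is 1: a fully connected graph with self-loops.
    adjacency = [[1] * n for _ in range(n)]
    nodes = [
        {"id": i, "condition": c, "has_label": i in tested}
        for i, c in enumerate(conditions)
    ]
    return nodes, adjacency


# Hypothetical grid: DC bus voltage (V) x load current (A).
grid = [{"v_dc": v, "i_load": i} for v, i in product([400, 600, 800], [100, 200])]
nodes, adjacency = build_condition_graph(grid, tested={0, 2, 5})

# Nodes without labels are the untested conditions to be estimated.
untested = [node["id"] for node in nodes if not node["has_label"]]
print(len(nodes), untested)
```

Keeping the untested conditions as ordinary nodes in the same graph is what allows features from tested conditions to propagate to them in the later attention step.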
[0050] Step S300: Using a multi-kernel convolutional fusion network, perform multi-scale feature extraction and weighted fusion on the switching transient waveform data corresponding to the tested working condition nodes to obtain the feature representation of each node in the graph structure.
[0051] Furthermore, the multi-kernel convolutional fusion network includes three one-dimensional convolutional networks with kernel sizes of 3, 7, and 11. The small-scale convolutional kernel captures the local features of the rising and falling edges of the switching transient waveform, the medium-scale convolutional kernel captures the medium-scale features of the waveform shape, and the large-scale convolutional kernel captures the global features of the overall switching transient signal's trend and structure.
[0052] In this embodiment, a multi-kernel convolutional fusion network is used to perform multi-scale feature extraction and weighted fusion on the switching transient waveform data corresponding to the tested operating condition nodes, resulting in feature representations of each node in the graph structure, including:
[0053] Step S310: After performing position encoding processing on the input switch transient waveform data, input it into each one-dimensional convolutional network, and obtain the convolutional output features through the convolutional layers.
[0054] A one-dimensional convolutional neural network (1DCNN) is a type of convolutional neural network (CNN) that is well suited to processing one-dimensional time series data. As shown in Figure 5, the designed MKCFN, as a multi-kernel feature fusion processor, consists of three one-dimensional convolutional networks with different kernel sizes to achieve comprehensive feature fusion: small-scale kernels capture local features, such as the rising and falling edges of switching transient waveforms; medium-scale kernels capture medium-scale features, such as the shape of the waveform; and large-scale kernels capture global features, such as the trend or overall structure of the switching transient signal. First, the selected dataset is input into the MKCFN module of MKCGAM to obtain an expressive feature representation, which serves as the input to the GAT module.
[0055] Step S320: Input the output features of the convolutional layers of each one-dimensional convolutional network into the feedforward network for processing to obtain the final output feature representation of the corresponding one-dimensional convolutional neural network; the feedforward network contains two fully connected layers and uses ReLU as the activation function.
[0056] Step S330: Perform weighted summation and fusion of the feature representations of each one-dimensional convolutional network to obtain the fused node feature representation.
[0057] Three one-dimensional convolutional network modules are used, with kernel sizes $k$ of 3, 7, and 11, respectively. The input $X_i$ must first be position-encoded. For a kernel size $k$, the output $Y_i^{(k)}$ of each one-dimensional convolutional layer can be calculated as follows:
[0058] $$Y_i^{(k)}(t) = \sum_{s=0}^{k-1} W^{(k)}(s)\, X_i\!\left(t \cdot S + s - P\right) + b^{(k)}$$
[0059] where $k$ is the kernel size of the one-dimensional convolutional layer, taking the values 3, 7, and 11 for the three one-dimensional convolutional networks; $t$ is the time step index of the temporal signal; $s$ is the sliding index of the convolutional kernel; $X_i$ is the $i$-th input switching transient waveform data; $S$ and $P$ represent the stride and padding of the one-dimensional convolutional layer, respectively; $W^{(k)}$ represents the weights of each convolutional layer; and $b^{(k)}$ represents the bias. A separate set of weight parameters corresponds to the classification head.
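[0060] The output of the convolutional layer, $Y_i^{(k)}$, is passed through a feedforward network (FFN) to obtain the final representation. The FFN contains two fully connected (FC) layers:
[0061] $$H_i^{\mathrm{CNN}(k)} = W_2^{\mathrm{FFN}}\, \mathrm{ReLU}\!\left(W_1^{\mathrm{FFN}}\, Y_i^{(k)} + b_1^{\mathrm{FFN}}\right) + b_2^{\mathrm{FFN}}$$
[0062] where $\mathrm{ReLU}$ is the rectified linear activation function, $W_1^{\mathrm{FFN}}$ and $W_2^{\mathrm{FFN}}$ are the weight matrices of the fully connected layers in the feedforward network, and $b_1^{\mathrm{FFN}}$ and $b_2^{\mathrm{FFN}}$ are the bias terms of the fully connected layers in the feedforward network.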
[0063] The outputs $H_i^{\mathrm{CNN}(k_j)}$ of the three convolutional neural network modules are merged according to the following strategy:
[0064] $$H_i^{\mathrm{Fused}} = \sum_{j=1}^{3} \omega_j^{\mathrm{Fusion}}\, H_i^{\mathrm{CNN}(k_j)}$$
[0065] where $\omega_j^{\mathrm{Fusion}}$ represents the weight of the $j$-th convolutional network module, and $j = 1, 2, 3$ correspond to the one-dimensional convolutional networks with kernel sizes $k_j$ of 3, 7, and 11, respectively. The fused feature representation $H_i^{\mathrm{Fused}}$ of the double-pulse test sequence is used as the input to the GAT module. Here $i$ is the index of the input sample, the label Fusion marks the weight coefficients used to fuse the output features of the different convolutional modules, Fused marks the fused feature, and CNN denotes the convolutional neural network.
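The convolution, feedforward, and fusion steps of the multi-kernel convolutional fusion network can be sketched in plain Python as below. This is an illustrative forward pass under stated assumptions, not the trained network of the invention: all weights are random, the stride is 1, the padding is `k // 2` (chosen here so that each branch keeps the input length), the hidden size, output size, and fusion weights `[0.5, 0.3, 0.2]` are hypothetical, and position encoding is omitted.

```python
import random


def conv1d(x, weights, bias, stride=1, padding=0):
    """y[t] = sum_s weights[s] * x_padded[t * stride + s] + bias (zero padding)."""
    k = len(weights)
    xp = [0.0] * padding + list(x) + [0.0] * padding
    out_len = (len(xp) - k) // stride + 1
    return [
        sum(weights[s] * xp[t * stride + s] for s in range(k)) + bias
        for t in range(out_len)
    ]


def relu(v):
    return [max(0.0, a) for a in v]


def linear(v, w, b):
    """Fully connected layer: w is a list of rows, b a list of biases."""
    return [sum(wi * vi for wi, vi in zip(row, v)) + bi for row, bi in zip(w, b)]


def ffn(v, w1, b1, w2, b2):
    """Two fully connected layers with a ReLU activation in between."""
    return linear(relu(linear(v, w1, b1)), w2, b2)


def mkcfn(x, branches, fusion_weights):
    """Weighted sum of the FFN outputs of the convolution branches."""
    outs = []
    for br in branches:
        y = conv1d(x, br["w"], br["b"], stride=1, padding=len(br["w"]) // 2)
        outs.append(ffn(y, br["w1"], br["b1"], br["w2"], br["b2"]))
    dim = len(outs[0])
    return [sum(fw * o[d] for fw, o in zip(fusion_weights, outs)) for d in range(dim)]


def random_branch(rng, k, seq_len, hidden, out_dim):
    """Hypothetical random initialisation for one branch with kernel size k."""
    def rand(rows, cols):
        return [[rng.uniform(-0.1, 0.1) for _ in range(cols)] for _ in range(rows)]
    return {"w": [rng.uniform(-0.1, 0.1) for _ in range(k)], "b": 0.0,
            "w1": rand(hidden, seq_len), "b1": [0.0] * hidden,
            "w2": rand(out_dim, hidden), "b2": [0.0] * out_dim}


rng = random.Random(0)
waveform = [float(t >= 16) for t in range(32)]  # toy step edge, not measured data
branches = [random_branch(rng, k, 32, 8, 4) for k in (3, 7, 11)]
features = mkcfn(waveform, branches, fusion_weights=[0.5, 0.3, 0.2])
print(len(features))
```

Running the three kernel sizes on the same input and summing their outputs with per-branch weights is what lets a single node feature vector carry edge-level, shape-level, and trend-level information at once.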
[0066] To address the significant non-stationarity, multi-timescale characteristics, and local high-frequency oscillations in the transient waveforms of SiC MOSFET switching, this invention introduces a multi-kernel convolutional fusion network into the main network for transient feature extraction. Its specific structure is shown in Figure 3. This network performs parallel convolution processing on the input DPT waveform using convolution kernels of several sizes, thereby simultaneously capturing the rapid voltage change features, high-frequency oscillation components, and overall waveform change trends during the switching process. The features processed by multi-scale convolution are fused and used as the feature representation of each node in the graph structure, providing high-quality input for the subsequent association modeling of the graph attention network.
[0067] Step S400: Input the feature representation of the node into the graph attention network, dynamically calculate the weights between nodes in the graph structure adjacency matrix through the attention mechanism in the graph attention network, update the node features through cross-node feature propagation and fusion, and introduce a residual connection structure between the input and output of the graph attention network to avoid gradient vanishing during training.
[0068] In this embodiment of the application, the weights between nodes in the adjacency matrix of the graph structure are dynamically calculated through the attention mechanism in the graph attention network, and the node features are updated through cross-node feature propagation and fusion, including:
[0069] Step S410: Expand the dimension of node features by sharing the parameter matrix, and calculate the original attention coefficients between nodes using a single-layer feedforward neural network.
[0070] Step S420: Normalize the original attention coefficients using the softmax function to obtain the normalized attention coefficients between nodes.
[0071] Step S430: Employ a multi-head attention mechanism to aggregate the features of neighboring nodes based on normalized attention coefficients, thereby obtaining updated feature representations of the nodes and improving the stability of the model learning process.
[0072] A complete switching transient dataset can be embedded into a graph structure with multiple nodes. Therefore, in the graph attention layer (GAL), the relevance of node $i$ to its neighboring nodes is represented by the attention coefficients between them:
[0073] $$e_{ij} = \mathrm{AM}\!\left(W h_i,\; W h_j\right)$$
[0074] where $e_{ij}$ is the attention coefficient between node $i$ and its neighboring node $j$; $W \in \mathbb{R}^{F' \times F}$ is a shared parameter matrix used to expand the dimension of the node features; $\mathbb{R}^{F' \times F}$ denotes the space of real-valued matrices with $F'$ rows and $F$ columns; $F$ is the number of features of each node; $h' = \{h'_1, \dots, h'_N\}$ with $h'_i \in \mathbb{R}^{F'}$ is the new set of node features; $h = \{h_1, \dots, h_N\}$ with $h_i \in \mathbb{R}^{F}$ is the set of node features; $\mathbb{R}^{F'}$ denotes the $F'$-dimensional real vector space; and the attention mechanism $\mathrm{AM}$ is implemented by a single-layer feedforward neural network with weight vector $a \in \mathbb{R}^{2F'}$, where $\mathbb{R}^{2F'}$ denotes the $2F'$-dimensional real vector space, with LeakyReLU as the activation function;
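[0075] The attention coefficients are normalized over all neighboring nodes $j$ using the softmax function:
[0076] $$\alpha_{ij} = \mathrm{softmax}_j\!\left(e_{ij}\right) = \frac{\exp\!\left(\mathrm{LeakyReLU}\!\left(a^{\top}\left[W h_i \,\Vert\, W h_j\right]\right)\right)}{\sum_{m \in \mathcal{N}_i} \exp\!\left(\mathrm{LeakyReLU}\!\left(a^{\top}\left[W h_i \,\Vert\, W h_m\right]\right)\right)}$$
[0077] where $^{\top}$ denotes the vector transpose; $\Vert$ denotes the concatenation operation; $\mathcal{N}_i$ is the set of neighboring nodes of node $i$, and $m$ is one of its nodes; $\alpha_{ij}$ is the normalized attention coefficient; $h_i$ is the original feature vector of the $i$-th double-pulse test condition node; $h_m$ is the original feature vector of node $m$; and $h_j$ is the original feature vector of the $j$-th double-pulse test condition node;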
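[0078] Feature aggregation:
[0079] $$h'_i = \sigma\!\left(\sum_{j \in \mathcal{N}_i} \alpha_{ij}\, W h_j\right)$$
[0080] where $\sigma$ represents the activation function applied, $\mathcal{N}_i$ is the set of neighboring nodes of node $i$, and $\alpha_{ij}$ is the normalized attention coefficient. To stabilize the learning process, a multi-head attention (MHA) mechanism is used to compute the output feature representation:
[0081] $$h'_i = \Big\Vert_{k=1}^{K}\, \sigma\!\left(\sum_{j \in \mathcal{N}_i} \alpha_{ij}^{k}\, W^{k} h_j\right)$$
[0082] where $\Vert$ denotes concatenation, $K$ is the number of independent attention mechanisms, $\alpha_{ij}^{k}$ is the normalized attention coefficient of the $k$-th attention mechanism, and $W^{k}$ is the weight matrix of the corresponding input linear transformation.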
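The attention steps S410 to S430 can be illustrated with a small plain-Python sketch. It follows the standard graph attention formulation (shared linear map, single-layer scoring with LeakyReLU, softmax over neighbors, aggregation, and concatenation of heads) and is offered only as an assumption-laden example: the LeakyReLU slope of 0.2, the use of `tanh` as the output activation, the concatenation of heads, and all weight values are choices made here for illustration and are not specified in this document.

```python
import math


def matvec(w, h):
    return [sum(wi * hi for wi, hi in zip(row, h)) for row in w]


def leaky_relu(x, slope=0.2):  # slope value is an assumption
    return x if x > 0 else slope * x


def gat_head(features, adjacency, w, a):
    """One attention head: returns updated node features and attention rows."""
    wh = [matvec(w, h) for h in features]  # shared linear map applied to every node
    n = len(features)
    new_features, alphas = [], []
    for i in range(n):
        neighbors = [j for j in range(n) if adjacency[i][j]]
        # Raw coefficient: LeakyReLU(a^T [W h_i || W h_j]) for each neighbor j.
        e = [leaky_relu(sum(ak * v for ak, v in zip(a, wh[i] + wh[j])))
             for j in neighbors]
        m = max(e)  # subtract the maximum for a numerically stable softmax
        exp_e = [math.exp(v - m) for v in e]
        total = sum(exp_e)
        alpha = [v / total for v in exp_e]
        agg = [sum(al * wh[j][d] for al, j in zip(alpha, neighbors))
               for d in range(len(wh[0]))]
        new_features.append([math.tanh(v) for v in agg])  # tanh stands in for sigma
        alphas.append(alpha)
    return new_features, alphas


def multi_head_gat(features, adjacency, heads):
    """Concatenate the outputs of independent attention heads for each node."""
    outs = [gat_head(features, adjacency, w, a)[0] for w, a in heads]
    return [sum((o[i] for o in outs), []) for i in range(len(features))]


# Toy example: three operating-condition nodes, fully connected with self-loops.
features = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
adjacency = [[1] * 3 for _ in range(3)]
heads = [
    ([[1.0, 0.0], [0.0, 1.0]], [0.5, -0.5, 0.25, 0.25]),
    ([[0.5, 0.5], [-0.5, 0.5]], [0.1, 0.2, -0.3, 0.4]),
]
updated = multi_head_gat(features, adjacency, heads)
_, alphas = gat_head(features, adjacency, *heads[0])
print(len(updated[0]), [round(x, 3) for x in alphas[0]])
```

Because the adjacency matrix only decides which nodes count as neighbors, the learned attention rows in `alphas` are what actually express how strongly one operating condition informs another.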
[0083] In graph attention networks, feature vectors from different operating conditions are embedded into the same graph structure for processing. Through adaptive allocation of attention weights, the network can automatically learn the correlation between nodes under different operating conditions, achieving effective fusion of cross-condition information. During the dual-pulse test, the operating conditions involved include parameters such as DC bus voltage, load current, gate resistance, gate drive voltage, and DC link stray inductance. Combinations of different parameter values constitute a complete set of operating conditions. By conducting DPT experiments and collecting transient waveform data under some typical operating conditions, the model can estimate the switching transient waveforms under other untested operating conditions after training.
[0084] Step S500: Based on the fused and updated node characteristics, output the estimated switching transient waveform of the SiC MOSFET under the target operating conditions.
[0085] To enable the trained MKCGAM model to estimate the behavior of other SiC MOSFETs using only switching transient data collected at room temperature, this invention proposes a model generalization strategy. To achieve optimal transfer performance, this model generalization method is applied to different SiC MOSFETs of the same type, or to different types of SiC MOSFETs with similar rated parameters and package types.
[0086] In an optional embodiment, the method further includes a model generalization step:
[0087] The model is trained using double-pulse test data collected at room temperature, and experimental data under other temperature conditions are used as supervision signals to impose generalization constraints. This enables the trained multi-kernel convolutional graph attention model to estimate the switching transient dynamic characteristics of SiC MOSFET modules under different temperature conditions without the need for retraining.
[0088] For different SiC MOSFET devices of the same type or with similar rated parameters and package types, the graph structure remains unchanged. The trained model is fine-tuned using the room-temperature double-pulse test dataset of the target SiC MOSFET device to estimate the switching transient characteristics of the target device. The graph structure corresponding to the target SiC MOSFET device is defined by the set of nodes and the set of edges of the target device graph.
[0089] To improve the model's applicability under different operating temperature conditions, this invention introduces a model generalization learning mechanism, the process of which is shown in Figure 4. In this process, DPT data collected at room temperature are selected as model training data, and experimental data collected under other case temperature conditions are used as supervision signals for generalization constraints. Since SiC MOSFET devices are relatively insensitive to temperature changes, the model after generalization learning can reliably estimate the switching transient dynamic characteristics under different temperature conditions without retraining.
[0090] To achieve generalization learning of the model, DPT data collected at room temperature are used as the primary training samples for initial training of the model parameters. On this basis, switching transient experimental data collected under other case temperature conditions are introduced into the model training process as supervisory signals. Specifically, the switching transient waveform predicted by the model under a given temperature condition is compared with the waveform measured at that temperature. A cross-temperature error loss function measures the difference between the model prediction and the measured data, and this error is backpropagated to the model parameters as a supervisory signal, thereby constraining the model's generalization.
[0091] Furthermore, by minimizing the error between the model-predicted waveform and the measured waveform under different temperature conditions, the model can learn the influence of temperature changes on the switching transient characteristics while maintaining modeling accuracy at room temperature. Since SiC MOSFET devices are relatively insensitive to temperature changes, after the above generalization learning, the model can reliably estimate the switching transient dynamic characteristics under different temperature conditions without retraining.
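One possible form of the cross-temperature error loss is sketched below. The document does not give the loss expression, so everything here is an assumption made for illustration: the mean squared error, the additive combination of a room-temperature term and a cross-temperature term, the `weight` coefficient, the `toy_model` stand-in for the trained network, and the numeric conditions and waveforms.

```python
def mse(pred, target):
    """Mean squared error between a predicted and a measured waveform."""
    return sum((p - t) ** 2 for p, t in zip(pred, target)) / len(pred)


def cross_temperature_loss(model, room_batch, other_temp_batch, weight=1.0):
    """Room-temperature fitting error plus a weighted cross-temperature error.

    Each batch is a list of (condition, measured_waveform) pairs.
    `weight` is a hypothetical balancing coefficient.
    """
    room = sum(mse(model(c), w) for c, w in room_batch) / len(room_batch)
    other = sum(mse(model(c), w) for c, w in other_temp_batch) / len(other_temp_batch)
    return room + weight * other


def toy_model(condition):
    # Hypothetical stand-in for the trained network: scales a fixed template
    # by the bus voltage and ignores temperature entirely.
    template = [0.0, 0.5, 1.0]
    return [condition["v_dc"] * s for s in template]


# Hypothetical data: one room-temperature sample and one elevated-temperature sample.
room = [({"v_dc": 2.0, "t_c": 25}, [0.0, 1.0, 2.0])]
hot = [({"v_dc": 2.0, "t_c": 110}, [0.0, 1.0, 1.7])]
loss = cross_temperature_loss(toy_model, room, hot, weight=0.5)
print(loss)
```

In this toy case the room-temperature term is zero and the whole loss comes from the elevated-temperature mismatch, which is the signal that would be backpropagated to make the model account for temperature.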
[0092] Since all elements of the adjacency matrix A are set to 1, the graph structure of the target SiC MOSFET does not need to be adjusted during model generalization; that is, it is the same as the graph structure used for training. Furthermore, the GAL in the GAT module automatically and dynamically adjusts the attention between nodes during training, so the generalization step reduces to fine-tuning the model with the DPT dataset of the target SiC MOSFET. Note that a limited amount of data in this dataset is acceptable, because switching transient data at other temperatures are not required, only data at room temperature.
[0093] In real-world testing environments, external factors such as measurement errors, DC bus stray parameter fluctuations, and parasitic parameters of freewheeling devices can significantly impact the DPT waveform. This invention introduces various typical operating condition data during the training phase, enabling the model to comprehensively learn the influence of these factors on the transient behavior of the switch. When the test platform structure and device configuration remain consistent, this method demonstrates good stability and applicability in engineering applications.
[0094] This invention embeds SiC MOSFET dual-pulse test (DPT) switching transient data under different operating conditions into a unified graph structure and uses a graph attention network to model the correlation between operating conditions, achieving efficient estimation of switching transient behavior under untested operating conditions. In Figure 5 and Figure 6, $V_{gs}$ represents the measured gate-source voltage, $I_d$ the measured drain current, and $E_{on}$ the measured turn-on loss; $V_{ds\_pred}$ represents the predicted drain-source voltage, $V_{gs\_pred}$ the predicted gate-source voltage, $I_{d\_pred}$ the predicted drain current, and $E_{on\_pred}$ the predicted turn-on loss. The figure labels Drain current spike, Rate of change of drain current, Turn-on delay, Rise time, Rate of change of drain voltage, High Frequency Oscillation Frequency, and Turn-on loss denote the drain current spike, the drain current change rate, the turn-on delay time, the rise time, the drain-source voltage change rate, the high-frequency oscillation frequency, and the turn-on loss, respectively. Figure 5 shows the experimental results for the turn-on transient under an untested operating condition, in which the case temperature, DC bus voltage, load current, gate resistance, gate drive voltage, and DC link stray inductance are $T_c = 110\,^{\circ}\mathrm{C}$, $V_{DC} = 700\,\mathrm{V}$, $I_L = 300\,\mathrm{A}$, $R_{g(on)} / R_{g(off)} = 11\,\Omega / 7.5\,\Omega$, $V_{CC} = 18\,\mathrm{V}$, and $L_p = 78\,\mathrm{nH}$. Figure 6 shows the results for the turn-off transient under an untested operating condition with the same parameters: $T_c = 110\,^{\circ}\mathrm{C}$, $V_{DC} = 700\,\mathrm{V}$, $I_L = 300\,\mathrm{A}$, $R_{g(on)} / R_{g(off)} = 11\,\Omega / 7.5\,\Omega$, $V_{CC} = 18\,\mathrm{V}$, and $L_p = 78\,\mathrm{nH}$.
Figure 5 and Figure 6 show that the proposed method estimates the switching transient waveforms under unknown operating conditions with high accuracy. Consequently, the dynamic characteristic parameters derived from the estimates, namely the voltage change rate, the current change rate, the loss $E$, the voltage spike $V_{pk}$, the high-frequency oscillation frequency $f_{os}$, the turn-on delay time $t_{d(on)}$, the turn-off delay time $t_{d(off)}$, the fall time $t_f$, and the rise time $t_r$, are very close to the measured values.
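As an illustration of how a derived parameter such as the switching loss can be compared between predicted and measured waveforms, the sketch below integrates the instantaneous power $v_{ds}(t)\, i_d(t)$ with the trapezoidal rule and reports a relative error. The function names, the choice of integration window (the whole supplied record), and the toy samples are assumptions made here; the document does not state how the parameters in the figures were computed.

```python
def trapezoid(y, t):
    """Trapezoidal-rule integral of samples y over time points t."""
    return sum(0.5 * (y[i] + y[i + 1]) * (t[i + 1] - t[i]) for i in range(len(t) - 1))


def switching_energy(v_ds, i_d, t):
    """Energy dissipated over the record: integral of v_ds(t) * i_d(t) dt."""
    power = [v * i for v, i in zip(v_ds, i_d)]
    return trapezoid(power, t)


def relative_error(predicted, measured):
    return abs(predicted - measured) / abs(measured)


# Toy waveforms in arbitrary units, not measured data.
t = [0.0, 1.0, 2.0]
e_measured = switching_energy([2.0, 1.0, 0.0], [0.0, 1.0, 2.0], t)
e_predicted = switching_energy([2.0, 1.1, 0.0], [0.0, 1.0, 2.0], t)
print(e_measured, relative_error(e_predicted, e_measured))
```

The same pattern (compute the parameter from each waveform, then compare) applies to the other quantities listed above once a definition for each is fixed.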
[0095] This invention introduces a multi-kernel convolutional fusion network to extract multi-scale features from DPT transient waveforms in parallel and fuse them by weighting, thereby enhancing the expressive power of switching transient features and improving modeling accuracy. This invention also proposes a model generalization strategy that enables models trained on limited DPT data to be applied, with the graph structure unchanged, to the estimation of switching transient characteristics of different SiC MOSFET devices and under different operating temperature conditions.
[0096] The following is an embodiment of a graph-structure-embedded transient modeling device for SiC MOSFET module switching, which can be used to execute an embodiment of a graph-structure-embedded transient modeling method for SiC MOSFET module switching, according to the present invention. For details not disclosed in the embodiment of the graph-structure-embedded transient modeling device for SiC MOSFET module switching, please refer to the embodiment of the graph-structure-embedded transient modeling method for SiC MOSFET module switching, according to the present invention.
[0097] As shown in Figure 5, in one embodiment, a transient modeling device for switching of a SiC MOSFET module based on graph structure embedding is provided. The device includes:
The data acquisition module is used to perform dual-pulse testing on the SiC MOSFET module under different operating conditions, acquire switching transient waveform data corresponding to each operating condition, and perform model training on the switching transient waveform data to obtain complete switching transient estimation results under a single operating condition; the switching transient waveform data includes waveforms related to gate-source voltage, drain-source voltage, drain current, and losses;
The graph structure construction module is used to map each double-pulse test operating condition to a node in the graph structure. The correlation of waveform data between different operating conditions is represented by the edges between nodes, forming a graph structure containing multiple nodes.
The feature extraction and fusion module is used to perform multi-scale feature extraction and weighted fusion on the switching transient waveform data corresponding to the tested operating condition nodes using a multi-kernel convolutional fusion network, so as to obtain the feature representation of each node in the graph structure.
The feature aggregation module is used to input the feature representation of the nodes into the graph attention network. It dynamically calculates the weights between nodes in the graph structure adjacency matrix through the attention mechanism in the graph attention network, updates the node features through cross-node feature propagation and fusion, and introduces a residual connection structure between the input and output of the graph attention network to avoid gradient vanishing during training.
The transient waveform estimation module is used to output the switching transient waveform estimation results of the SiC MOSFET under the target operating conditions based on the fused and updated node features.
[0098] In an alternative embodiment, the device further includes a generalization module configured with the following functions:
[0099] The model is trained using double-pulse test data collected at room temperature, and experimental data under other temperature conditions are used as supervision signals to impose generalization constraints. This enables the trained multi-kernel convolutional graph attention model to estimate the switching transient dynamic characteristics of SiC MOSFET modules under different temperature conditions without the need for retraining.
[0100] For different SiC MOSFET devices of the same type or with similar rated parameters and package types, the graph structure remains unchanged. The trained model is fine-tuned using the room-temperature double-pulse test dataset of the target SiC MOSFET device to estimate the switching transient characteristics of the target device. The graph structure corresponding to the target SiC MOSFET device is defined by the set of nodes and the set of edges of the target device graph.
[0101] It should be noted that the various functional modules in the embodiments of the present invention can be integrated into one processing module, or each unit can exist as a separate physical entity, or two or more units can be integrated into one module. The integrated module can be implemented in hardware or as a software functional module.
[0102] Finally, it should be noted that the above preferred embodiments are only used to illustrate the technical solutions of the present invention and are not intended to limit it. Although the present invention has been described in detail through the above preferred embodiments, those skilled in the art should understand that various changes can be made to it in form and detail without departing from the scope defined by the claims of the present invention.
Claims
1. A SiC MOSFET module switching transient modeling method based on graph structure embedding, characterized in that, The method, based on dual-pulse test data, models the transient dynamic behavior of switching under untested conditions using limited measured data, and obtains a multi-kernel convolutional graph attention model. Specifically, it includes the following steps: A dual-pulse test was performed on the SiC MOSFET module under different operating conditions to obtain the switching transient waveform data corresponding to each operating condition. The switching transient waveform data were then used to train a model to obtain a complete switching transient estimation result for a single operating condition. The switching transient waveform data included waveforms related to gate-source voltage, drain-source voltage, drain current, and losses. Each double-pulse test working condition is mapped to a node in a graph structure. The correlation of waveform data between different working conditions is represented by the edges between nodes, forming a graph structure containing multiple nodes. Using a multi-kernel convolutional fusion network, multi-scale feature extraction and weighted fusion are performed on the switching transient waveform data corresponding to the tested working condition nodes to obtain the feature representation of each node in the graph structure. The feature representation of the node is input into the graph attention network. The weights between nodes in the graph structure adjacency matrix are dynamically calculated through the attention mechanism in the graph attention network. The node features are updated through cross-node feature propagation and fusion. Furthermore, a residual connection structure is introduced between the input and output of the graph attention network to avoid gradient vanishing during training. 
Based on the updated node characteristics, the switching transient waveform estimation results of the SiC MOSFET under the target operating conditions are output.
2. The method for transient switching modeling of SiC MOSFET modules based on graph structure embedding according to claim 1, characterized in that, the graph structure is represented as $G=(P,E)$, where $P=\{p_1,\dots,p_N\}$ is a set of $N$ nodes corresponding to $N$ double-pulse test operating conditions, and $E$ is the set of edges between nodes; $A\in\mathbb{R}^{N\times N}$ is the adjacency matrix of the graph structure, in which each element $a_{ij}$ represents the connection weight between node $p_i$ and node $p_j$ and is assigned a default value of 1; $p$ represents a node variable, and $\mathbb{R}^{N\times N}$ represents the space of real-valued matrices with $N$ rows and $N$ columns.
3. The method for transient switching modeling of SiC MOSFET modules based on graph structure embedding according to claim 2, characterized in that, The operating conditions for the dual-pulse test consist of one or more parameters, including DC bus voltage, load current, gate resistance, gate drive voltage, DC link stray inductance, and SiC MOSFET module operating temperature.
4. The method for transient switching modeling of SiC MOSFET modules based on graph structure embedding according to claim 1, characterized in that, The multi-kernel convolutional fusion network includes three one-dimensional convolutional networks with kernel sizes of 3, 7, and 11. The small-scale convolutional kernel captures the local features of the rising and falling edges of the switching transient waveform, the medium-scale convolutional kernel captures the medium-scale features of the waveform shape, and the large-scale convolutional kernel captures the global features of the overall switching transient signal's trend and structure.
5. The method for transient switching modeling of SiC MOSFET modules based on graph structure embedding according to claim 4, characterized in that, The method utilizes a multi-kernel convolutional fusion network to perform multi-scale feature extraction and weighted fusion on the switching transient waveform data corresponding to the tested operating condition nodes, obtaining the feature representation of each node in the graph structure, including: The input transient waveform data of the switch is processed by position encoding and then input into one-dimensional convolutional networks. The convolutional output features are calculated through the convolutional layers. The output features of the convolutional layers of each one-dimensional convolutional network are input into the feedforward network for processing to obtain the final output feature representation of the corresponding one-dimensional convolutional neural network; the feedforward network contains two fully connected layers and uses ReLU as the activation function. The feature representations of each one-dimensional convolutional network are weighted, summed, and fused to obtain the fused node feature representation.
6. The method for transient switching modeling of SiC MOSFET modules based on graph structure embedding according to claim 5, characterized in that, the output $Y_i^{(k)}$ of each one-dimensional convolutional layer is calculated as follows:
$$Y_i^{(k)}(t) = \sum_{s=0}^{k-1} W^{(k)}(s)\, X_i\!\left(t \cdot S + s - P\right) + b^{(k)}$$
where $k$ is the kernel size of the one-dimensional convolutional layer, taking the values 3, 7, and 11 for the three one-dimensional convolutional networks; $t$ is the time step index of the temporal signal; $s$ is the sliding index of the convolutional kernel; $X_i$ is the $i$-th input switching transient waveform data; $S$ and $P$ represent the stride and padding of the one-dimensional convolutional layer, respectively; $W^{(k)}$ represents the weights of each convolutional layer; and $b^{(k)}$ represents the bias. A separate set of weight parameters corresponds to the classification heads.
The output $Y_i^{(k)}$ of the convolutional layer is processed through a feedforward network to obtain the output feature representation of the corresponding one-dimensional convolutional neural network. The feedforward network contains two fully connected (FC) layers:
$$H_i^{\mathrm{CNN}(k)} = W_2^{\mathrm{FFN}}\, \mathrm{ReLU}\!\left(W_1^{\mathrm{FFN}}\, Y_i^{(k)} + b_1^{\mathrm{FFN}}\right) + b_2^{\mathrm{FFN}}$$
where $\mathrm{ReLU}$ is the rectified linear activation function, $W_1^{\mathrm{FFN}}$ and $W_2^{\mathrm{FFN}}$ are the weight matrices of the fully connected layers in the feedforward network, and $b_1^{\mathrm{FFN}}$ and $b_2^{\mathrm{FFN}}$ are the bias terms of the fully connected layers in the feedforward network.
The outputs $H_i^{\mathrm{CNN}(k_j)}$ of the three convolutional neural networks are merged according to the following strategy:
$$H_i^{\mathrm{Fused}} = \sum_{j=1}^{3} \omega_j^{\mathrm{Fusion}}\, H_i^{\mathrm{CNN}(k_j)}$$
where $\omega_j^{\mathrm{Fusion}}$ represents the weight of the $j$-th convolutional network module, and $j = 1, 2, 3$ correspond to the one-dimensional convolutional networks with kernel sizes $k_j$ of 3, 7, and 11, respectively. The fused feature representation $H_i^{\mathrm{Fused}}$ of the double-pulse test sequence is used as the input to the graph attention network. Here $i$ is the index of the input sample, the label Fusion marks the weight coefficients used to fuse the output features of the different convolutional modules, Fused marks the fused feature, and CNN denotes the convolutional neural network.
7. The method for transient switching modeling of SiC MOSFET modules based on graph structure embedding according to claim 1, characterized in that, The weights between nodes in the adjacency matrix of a graph structure are dynamically calculated using the attention mechanism in a graph attention network. Node features are updated through cross-node feature propagation and fusion, including: The node features are expanded in dimension by sharing a parameter matrix, and the original attention coefficients between nodes are calculated using a single-layer feedforward neural network. The original attention coefficients are normalized using the softmax function to obtain the normalized attention coefficients between nodes; A multi-head attention mechanism is adopted, which aggregates the features of neighboring nodes based on normalized attention coefficients to obtain the updated feature representation of the nodes, thereby improving the stability of the model learning process.
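8. The method for transient switching modeling of SiC MOSFET modules based on graph structure embedding according to claim 7, characterized in that, in the graph attention layer, the relevance of node $i$ to its neighboring nodes is represented by the attention coefficients between them:
$$e_{ij} = \mathrm{AM}\!\left(W h_i,\; W h_j\right)$$
where $e_{ij}$ is the attention coefficient between node $i$ and its neighboring node $j$; $W \in \mathbb{R}^{F' \times F}$ is a shared parameter matrix used to expand the dimension of the node features; $\mathbb{R}^{F' \times F}$ denotes the space of real-valued matrices with $F'$ rows and $F$ columns; $F$ is the number of features of each node; $h' = \{h'_1, \dots, h'_N\}$ with $h'_i \in \mathbb{R}^{F'}$ is the new set of node features; $h = \{h_1, \dots, h_N\}$ with $h_i \in \mathbb{R}^{F}$ is the set of node features; $\mathbb{R}^{F'}$ denotes the $F'$-dimensional real vector space; and the attention mechanism $\mathrm{AM}$ is implemented by a single-layer feedforward neural network with weight vector $a \in \mathbb{R}^{2F'}$, where $\mathbb{R}^{2F'}$ denotes the $2F'$-dimensional real vector space, with LeakyReLU as the activation function;
The attention coefficients are normalized over all neighboring nodes $j$ using the softmax function:
$$\alpha_{ij} = \mathrm{softmax}_j\!\left(e_{ij}\right) = \frac{\exp\!\left(\mathrm{LeakyReLU}\!\left(a^{\top}\left[W h_i \,\Vert\, W h_j\right]\right)\right)}{\sum_{m \in \mathcal{N}_i} \exp\!\left(\mathrm{LeakyReLU}\!\left(a^{\top}\left[W h_i \,\Vert\, W h_m\right]\right)\right)}$$
where $^{\top}$ denotes the vector transpose; $\Vert$ denotes the concatenation operation; $\mathcal{N}_i$ is the set of neighboring nodes of node $i$, and $m$ is one of its nodes; $\alpha_{ij}$ is the normalized attention coefficient; $h_i$ is the original feature vector of the $i$-th double-pulse test condition node; $h_m$ is the original feature vector of node $m$; and $h_j$ is the original feature vector of the $j$-th double-pulse test condition node;
Feature aggregation:
$$h'_i = \sigma\!\left(\sum_{j \in \mathcal{N}_i} \alpha_{ij}\, W h_j\right)$$
where $\sigma$ is the activation function applied;
A multi-head attention mechanism is used to compute the output feature representation:
$$h'_i = \Big\Vert_{k=1}^{K}\, \sigma\!\left(\sum_{j \in \mathcal{N}_i} \alpha_{ij}^{k}\, W^{k} h_j\right)$$
where $K$ is the number of independent attention mechanisms, $\alpha_{ij}^{k}$ is the normalized attention coefficient of the $k$-th attention mechanism, and $W^{k}$ is the weight matrix of the corresponding input linear transformation.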
9. The method for transient switching modeling of SiC MOSFET modules based on graph structure embedding according to claim 1, characterized in that the method further includes a model generalization step: the model is trained using double-pulse test data acquired at room temperature, and experimental data under other temperature conditions are used as supervision signals to impose generalization constraints, so that the trained multi-kernel convolutional graph attention model can estimate the switching transient dynamic characteristics of SiC MOSFET modules under different temperature conditions without retraining; for different SiC MOSFET devices of the same type, or with similar rated parameters and package types, the graph structure remains unchanged, and the trained model is fine-tuned using the room-temperature double-pulse test dataset of the target SiC MOSFET device in order to estimate the switching transient characteristics of the target device, wherein the graph structure corresponding to the target SiC MOSFET device consists of the set of nodes and the set of edges of the target device graph structure.
10. A transient modeling device for SiC MOSFET module switching based on graph structure embedding, characterized in that the device includes:
a data acquisition module, used to perform double-pulse testing on the SiC MOSFET module under different operating conditions, acquire the switching transient waveform data corresponding to each operating condition, and perform model training on the switching transient waveform data to obtain complete switching transient estimation results under a single operating condition, the switching transient waveform data including waveforms related to gate-source voltage, drain-source voltage, drain current, and losses;
a graph structure construction module, used to map each double-pulse test operating condition to a node in the graph structure and to represent the correlation of waveform data between different operating conditions by the edges between nodes, forming a graph structure containing multiple nodes;
a feature extraction and fusion module, used to perform multi-scale feature extraction and weighted fusion on the switching transient waveform data corresponding to the tested operating condition nodes using a multi-kernel convolutional fusion network, so as to obtain the feature representation of each node in the graph structure;
a feature aggregation module, used to input the feature representations of the nodes into the graph attention network, dynamically calculate the weights between nodes in the adjacency matrix of the graph structure through the attention mechanism of the graph attention network, update the node features through cross-node feature propagation and fusion, and introduce a residual connection between the input and output of the graph attention network to mitigate vanishing gradients during training; and
a transient waveform estimation module, used to output the switching transient waveform estimation results of the SiC MOSFET under the target operating conditions based on the fused and updated node features.
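
For illustration only, the attention computation recited in claims 7 and 8 and the residual connection of the feature aggregation module in claim 10 can be sketched in NumPy as follows. This is a minimal sketch under stated assumptions, not the applicant's implementation: the function names (`gat_head`, `gat_layer`), the LeakyReLU slope of 0.2, the choice of ELU as the activation σ, the addition of self-loops to the adjacency matrix, head outputs combined by concatenation, and the requirement that the concatenated head width equals the input width (so that the residual sum is defined) are all hypothetical choices that the claims do not specify.

```python
import numpy as np

def leaky_relu(x, slope=0.2):
    # Slope 0.2 is an assumed value; the claims only name LeakyReLU.
    return np.where(x > 0, x, slope * x)

def elu(x):
    # ELU stands in for the unspecified activation sigma (assumption).
    return np.where(x > 0, x, np.expm1(np.minimum(x, 0.0)))

def gat_head(h, adj, W, a):
    """One attention head.
    h: (N, F) node features, adj: (N, N) boolean adjacency (each row needs
    at least one True entry), W: (F_out, F) shared matrix, a: (2*F_out,).
    Returns the normalized coefficients (N, N) and aggregated features (N, F_out).
    """
    f_out = W.shape[0]
    Wh = h @ W.T                                  # W h_i for every node
    # a^T [W h_i || W h_j] splits into a source term plus a neighbor term.
    src = Wh @ a[:f_out]
    dst = Wh @ a[f_out:]
    e = leaky_relu(src[:, None] + dst[None, :])   # raw coefficients e_ij
    e = np.where(adj, e, -np.inf)                 # keep neighbors only
    e = e - e.max(axis=1, keepdims=True)          # numerically stable softmax
    alpha = np.exp(e)
    alpha = alpha / alpha.sum(axis=1, keepdims=True)
    return alpha, alpha @ Wh                      # sum_j alpha_ij W h_j

def gat_layer(h, adj, Ws, As):
    """Multi-head layer: concatenate the heads, then add a residual connection."""
    adj = adj | np.eye(len(h), dtype=bool)        # self-loops (assumption)
    heads = [elu(gat_head(h, adj, W, a)[1]) for W, a in zip(Ws, As)]
    out = np.concatenate(heads, axis=1)
    if out.shape != h.shape:
        raise ValueError("residual connection needs K * F_out == F in this sketch")
    return h + out                                # residual connection

# Hypothetical example: 4 operating-condition nodes, 8 features, 2 heads.
rng = np.random.default_rng(0)
features = rng.normal(size=(4, 8))
adjacency = np.array([[0, 1, 0, 0],
                      [1, 0, 1, 0],
                      [0, 1, 0, 1],
                      [0, 0, 1, 0]], dtype=bool)
weights = [rng.normal(size=(4, 8)) for _ in range(2)]
att_vecs = [rng.normal(size=8) for _ in range(2)]
updated = gat_layer(features, adjacency, weights, att_vecs)
print(updated.shape)
```

In this sketch each row of the coefficient matrix sums to one over a node's neighbors, which corresponds to the softmax normalization of claim 8; a trained model would learn `W` and `a` by backpropagation rather than drawing them at random.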