A circuit breaker fault diagnosis method based on an adaptive HEBKA-optimized hybrid CNN-SE-BiLSTM network
An adaptive HEBKA-optimized hybrid CNN-SE-BiLSTM network, combined with multi-source time-series signals and Gramian angular difference field (GADF) coding, addresses the reliance on a single information source and the poor adaptability of feature extraction in circuit breaker fault diagnosis, achieving high-precision, high-reliability circuit breaker condition monitoring.
Patent Information
- Authority / Receiving Office
- CN · China
- Patent Type
- Applications(China)
- Current Assignee / Owner
- STATE GRID SHANGHAI MUNICIPAL ELECTRIC POWER CO
- Filing Date
- 2026-03-20
- Publication Date
- 2026-06-19
AI Technical Summary
Existing circuit breaker fault diagnosis technologies suffer from problems such as limited information, poor adaptability of feature extraction, simple model architecture, and low optimization efficiency. These technologies fail to fully reflect the complex fault characteristics of circuit breakers, leading to missed or incorrect diagnoses.
A hybrid CNN-SE-BiLSTM network based on adaptive HEBKA optimization is adopted. Multi-source time-series signals are acquired synchronously and converted into multi-channel feature maps using Gramian angular difference field (GADF) coding. The CNN-SE and BiLSTM networks are then combined for feature extraction and time-series analysis to achieve high-precision diagnosis of circuit breaker status.
It achieves high-precision and high-reliability diagnosis of circuit breaker faults, can dynamically enhance relevant signal characteristics, suppress interference, improve the accuracy and robustness of diagnosis, and adapt to different operating conditions.
Figure CN122241475A_ABST
Abstract
Description
Technical Field
[0001] This invention relates to the field of electrical equipment condition monitoring and intelligent fault diagnosis technology, and in particular to a fault diagnosis method for circuit breakers based on adaptive HEBKA optimization and a hybrid CNN-SE-BiLSTM network.
Background Technology
[0002] Circuit breakers are critical protective devices in power systems used for closing, carrying, and interrupting current; their reliability directly affects power grid safety and power supply quality. Currently, fault diagnosis of circuit breakers mainly relies on the following technical solutions: One common approach is the traditional method based on a single signal threshold alarm, which is currently the most prevalent solution in engineering. For example, monitoring the current waveform of the opening and closing coils and analyzing its peak value, time, and other characteristic parameters to determine if they exceed a preset threshold helps identify faults such as mechanical jamming. Another common approach is to monitor vibration signals and analyze vibration energy, peak frequency, and other characteristics for judgment. The disadvantages of these methods are: relying on only a single dimension of information, they cannot comprehensively reflect the complex fault characteristics of the circuit breaker across multiple physical fields, easily leading to missed or false alarms; threshold setting depends on expert experience, resulting in poor adaptability and insensitivity to gradual or novel faults.
[0003] There are also methods based on traditional machine learning: some studies attempt to extract multidimensional time-domain and frequency-domain features (such as mean, variance, wavelet packet energy, etc.) from vibration and current signals, and input them into classifiers such as support vector machines (SVM) and random forests for fault identification. The disadvantages of these methods are: feature extraction still relies on manual design, the process is cumbersome and requires a high level of expertise, and it is difficult to capture deep-seated, nonlinear fault features; the model's generalization ability decreases significantly under changing operating conditions or strong noise interference, and its robustness is insufficient.
[0004] In recent years, some research has introduced deep learning into equipment diagnostics. For example, one-dimensional vibration signals are directly input into convolutional neural networks (CNNs) for end-to-end fault classification. Other research uses long short-term memory networks (LSTMs) to process current time series. The drawbacks of these single-architecture deep learning methods are: the model architecture is singular; CNNs excel at spatial features but are weak in temporal modeling, while LSTMs excel at temporal features but have limited spatial feature extraction capabilities, making it difficult to simultaneously capture key spatiotemporal fault modes in circuit breaker operation; the input signals are usually of a single type, failing to effectively integrate multi-source heterogeneous information; and the network hyperparameters (such as the number of layers, filters, and learning rate) are highly dependent on the researcher's experience and extensive trial and error, resulting in low tuning efficiency and difficulty in obtaining a globally optimal solution, affecting the model's performance ceiling and stability.
[0005] In summary, existing technical solutions have significant shortcomings in terms of the comprehensiveness of diagnostic information, the adaptability of feature extraction, the synergy of model architecture, and the economy and convenience of engineering deployment.
Summary of the Invention
[0006] Based on the above analysis, the embodiments of the present invention aim to provide a circuit breaker fault diagnosis method based on adaptive HEBKA optimization and a hybrid CNN-SE-BiLSTM network, in order to solve the problems in the prior art of relying on a single source of diagnostic information and of poor feature extraction adaptability.
[0007] The objective of this invention is mainly achieved through the following technical solution. This invention provides a circuit breaker fault diagnosis method based on adaptive HEBKA optimization and a hybrid CNN-SE-BiLSTM network, comprising the following steps: synchronously acquiring multiple current signals and vibration signals during the opening and closing process of the circuit breaker under test to obtain multi-source time-series signals; performing Gramian angular difference field (GADF) coding on each time-series signal to convert it into a single-channel two-dimensional image, and stitching the images together to obtain a multi-channel feature map; and inputting the multi-channel feature map into a hybrid diagnostic network trained on a CNN-SE-BiLSTM network to predict the circuit breaker's state category, thereby obtaining the fault diagnosis result of the circuit breaker under test.
[0008] Furthermore, the hybrid diagnostic network includes a spatial feature extraction module, a spatiotemporal feature extraction module, and a prediction module. The spatial feature extraction module, trained based on a CNN-SE architecture, is used to capture enhanced local features of the multi-channel feature map. The spatiotemporal feature extraction module, trained based on a Bidirectional Long Short-Term Memory (BiLSTM) network, is used to perform opening and closing timing feature analysis on the enhanced local features to obtain spatiotemporal comprehensive features. The prediction module maps the spatiotemporal comprehensive features to a state category space, and uses an activation function to output the probability distribution of each state category. The state category with the highest probability is the fault diagnosis result of the circuit breaker under test.
[0009] Furthermore, the multi-channel feature map is obtained based on the following process: Each time-series signal is normalized to obtain normalized sampling point data. Taking each normalized sample as the cosine of an angle and the corresponding sampling timestamp as the radius, the samples are transformed into polar coordinates to obtain the polar angle of each sample. A Gramian angular difference field (GADF) matrix is then constructed from the sine of the difference between the polar angles of each pair of sampling points, yielding a single-channel two-dimensional image for each signal, in which the height dimension and width dimension together index the angular difference between sampling points at different times. All single-channel two-dimensional images are stitched together along the channel dimension to obtain the multi-channel feature map.
[0010] Furthermore, the spatial feature extraction module includes multiple cascaded convolutional processing modules, each of which includes a convolutional layer, an SE attention layer, and a max pooling layer. The modules use convolutional kernels of different sizes to extract features from the input feature map. Within each module, SE attention captures the dependencies between different channels to obtain channel weights that reflect the importance of each channel, and the channel weights are multiplied channel by channel with the feature map fed to the SE layer to obtain the local features output by the current convolutional processing module. The output of the last convolutional processing module is the enhanced local feature.
[0011] Furthermore, the spatiotemporal comprehensive features are obtained based on the following process: The enhanced local features are sliced along the height dimension to obtain the feature vector at each time step. The feature vectors at all time steps are concatenated and reconstructed to obtain the feature sequence of the circuit breaker under test during the opening and closing process. The spatiotemporal feature extraction module is used to analyze the dynamic temporal relationships within the feature sequence to obtain the spatiotemporal comprehensive features.
[0012] Furthermore, the hybrid diagnostic network is trained based on the following process: The training set is obtained by labeling historically collected multi-source time-series signals with their state types; Gramian angular difference field (GADF) coding is performed on each time-series signal in the training set to obtain the corresponding single-channel two-dimensional image, and the images are stitched together to obtain a multi-channel feature map; using the multi-channel feature map as input and the state type as output, a hybrid CNN-SE-BiLSTM network is trained with a hybrid loss function, and the network hyperparameters are optimized by adaptive HEBKA to obtain the hybrid diagnostic network.
[0013] Furthermore, the state categories include: normal state, trip coil jamming, closing coil jamming, energy storage motor failure, left mechanical transmission jamming, right mechanical transmission jamming, contact erosion, insulation medium deterioration, and control circuit abnormality.
[0014] Furthermore, the multi-channel current signals include the trip coil current, the closing coil current, the energy storage motor current, the lockout coil current, and the total current of the secondary control circuit.
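[0015] Furthermore, the hybrid loss function is: $L = \lambda_1 L_{\mathrm{WCE}} + \lambda_2 L_{\mathrm{reg}}$, where $L$ is the hybrid loss function, $\lambda_1$ and $\lambda_2$ are adjustable weighting coefficients, $L_{\mathrm{WCE}}$ is the weighted cross-entropy loss, and $L_{\mathrm{reg}}$ is the multi-source signal cooperative regularization loss.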
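As an illustration of how a weighted cross-entropy term and a regularization term might be combined into one hybrid loss, the following is a minimal NumPy sketch. The function names, the default coefficient values, and the normalization by the sum of class weights are assumptions made for this example. The description does not define the multi-source cooperative regularization term, so it is passed in here as a precomputed number (`reg_value`) rather than implemented.

```python
import numpy as np

def weighted_cross_entropy(probs, labels, class_weights):
    """Weighted cross-entropy over a batch.

    probs: (batch, classes) predicted probabilities.
    labels: (batch,) integer class indices.
    class_weights: (classes,) per-class weights (assumed, e.g. for imbalance).
    """
    probs = np.asarray(probs, dtype=float)
    labels = np.asarray(labels)
    w = np.asarray(class_weights, dtype=float)
    eps = 1e-12  # avoids log(0)
    picked = probs[np.arange(len(labels)), labels]
    per_sample = -w[labels] * np.log(picked + eps)
    # Assumption: normalize by the total weight of the samples in the batch.
    return per_sample.sum() / w[labels].sum()

def hybrid_loss(probs, labels, class_weights, reg_value, lam_ce=1.0, lam_reg=0.1):
    """Weighted sum of the cross-entropy term and a regularization term.

    reg_value is a placeholder for the regularization loss, whose exact
    form is not given in the description.
    """
    ce = weighted_cross_entropy(probs, labels, class_weights)
    return lam_ce * ce + lam_reg * reg_value

# Uniform predictions over 9 state categories, equal class weights.
probs = np.full((4, 9), 1.0 / 9.0)
labels = np.array([0, 3, 5, 8])
weights = np.ones(9)
ce_value = weighted_cross_entropy(probs, labels, weights)
total = hybrid_loss(probs, labels, weights, reg_value=2.0)
print(ce_value, total)
```

With uniform predictions the cross-entropy term equals ln 9, which gives a quick sanity check on the implementation.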
[0016] Furthermore, the network hyperparameters include the learning rate and the number of convolutional kernels.
[0017] Compared with the prior art, the present invention can achieve at least one of the following beneficial effects: 1. This invention proposes a circuit breaker fault diagnosis method based on adaptive HEBKA optimization using a hybrid CNN-SE-BiLSTM network. By using a hybrid architecture of CNN and BiLSTM to collaboratively capture typical patterns of circuit breaker faults, and combining the SE attention mechanism, adaptive recalibration of multi-source signal channels is achieved. It can dynamically enhance the features of the most relevant signal channels according to the specific fault type (such as "energy storage motor fault" or "single-sided mechanical jamming"), suppress irrelevant interference, and finally achieve high-precision and high-reliability diagnosis of circuit breaker status.
[0018] 2. The system integrates five key electrical circuit current signals (opening, closing, energy storage motor, interlocking coil, and main control circuit) with two symmetrically arranged mechanical vibration signals at low cost, achieving multi-physics synchronous sensing of the circuit breaker's "control-energy storage-operation-transmission" process. This fusion strategy enables the diagnostic system to comprehensively utilize electrical timing characteristics and mechanical impact characteristics, providing more comprehensive evidence to differentiate similar faults, such as mechanical jamming and insufficient control voltage, significantly improving the reliability of diagnostic conclusions.
[0019] 3. The Hybrid Enhanced Black-winged Kite Algorithm (HEBKA) is adopted for intelligent hyperparameter optimization. Its core advantage matches the needs of circuit breaker diagnosis: a balanced, directed search. The three-stage attack mechanism of the black-winged kite ensures that the optimization process can not only scan the hyperparameter space broadly to find promising architectures (to deal with complex fault modes), but also fine-tune key parameters to improve sensitivity to weak features and noise resistance.
[0020] In this invention, the above-described technical solutions can be combined with each other to achieve more preferred combinations. Other features and advantages of this invention will be set forth in the following description, and some advantages may become apparent from the description or be learned by practicing the invention. The objects and other advantages of this invention can be realized and obtained from what is particularly pointed out in the description and drawings.
Attached Figure Description
[0021] The accompanying drawings are for illustrative purposes only and are not intended to limit the invention. Throughout the drawings, the same reference numerals denote the same parts.
[0022] Figure 1 is a flowchart of a circuit breaker fault diagnosis method based on adaptive HEBKA optimization using a hybrid CNN-SE-BiLSTM network, as described in an embodiment of the present invention. Figure 2 is a flowchart illustrating the training process of the CNN-SE-BiLSTM hybrid diagnostic network in an embodiment of the present invention. Figure 3 is a schematic diagram of the CNN-SE-BiLSTM hybrid diagnostic network in an embodiment of the present invention. Figure 4 is a schematic diagram of the SE attention module in an embodiment of the present invention.
Detailed Implementation
[0023] Preferred embodiments of the present invention will now be described in detail with reference to the accompanying drawings, which form part of this application and are used together with the embodiments of the present invention to illustrate the principles of the present invention, but are not intended to limit the scope of the present invention.
[0024] Example 1
A specific embodiment of the present invention discloses a circuit breaker fault diagnosis method based on adaptive HEBKA optimization using a hybrid CNN-SE-BiLSTM network. As shown in Figure 1, it includes the following steps:
Step S1: Synchronously acquire multiple current signals and vibration signals during the opening and closing process of the circuit breaker under test to obtain multi-source time-series signals;
Step S2: Perform Gramian angular difference field (GADF) coding on each time-series signal to convert it into a single-channel two-dimensional image, and then stitch the images together to obtain a multi-channel feature map;
Step S3: Input the multi-channel feature map into the hybrid diagnostic network trained based on the CNN-SE-BiLSTM network to predict the circuit breaker state category and obtain the fault diagnosis result of the circuit breaker under test; wherein the hyperparameters of the hybrid diagnostic network are obtained by adaptive HEBKA optimization.
[0025] Using the above method, multi-source time-series signals are transformed into two-dimensional feature maps through Gramian angular difference field (GADF) coding, and a hybrid diagnostic network trained based on the CNN-SE-BiLSTM network is used to predict the circuit breaker state category. This not only effectively extracts local morphological anomalies in the signals, but also fully describes the complete staged process and dynamic correlations of the opening and closing actions, ultimately achieving high-precision and high-reliability diagnosis of circuit breaker status.
[0026] Specifically, in step S1, during the critical period before and after the opening and closing operation commands of the circuit breaker under test are issued, multiple signals, including the current signal and vibration signal of the circuit breaker, are simultaneously acquired by the synchronous acquisition device to obtain the multi-source timing signal of the circuit breaker.
[0027] The collected current signals include the circuit breaker trip coil current, circuit breaker closing coil current, circuit breaker energy storage motor current, circuit breaker locking coil current, and the total current of the circuit breaker secondary control circuit. The collected vibration signals include two vibration signals from both sides of the circuit breaker housing, collected using vibration sensors installed on both sides of the circuit breaker housing.
[0028] Specifically, in step S2, the time-series signals of the above 5 current channels and 2 vibration channels are encoded separately, and the multi-channel feature map is constructed. The specific steps are as follows:
S21. Perform numerical scaling on each time-series signal to obtain normalized sampling point data;
S22. Taking each normalized sample as the cosine of an angle and the corresponding sampling timestamp as the radius, transform the samples to the polar coordinate system to obtain the polar angle of each sample;
S23. Construct the Gramian angular difference field matrix from the trigonometric difference of the polar angles between the sampling points of the time-series signal, obtaining a single-channel two-dimensional image of each time-series signal, i.e., a GADF image. The height and width dimensions of the single-channel two-dimensional image together index the angular difference between sampling points at different times;
S24. Stitch the GADF images generated from the seven circuit breaker current and vibration signals together along the channel dimension C to form the final seven-channel two-dimensional multi-channel feature map.
[0029] For example, the time-series signal $x$ sampled from any one channel is normalized to [-1, 1], and the normalized sampling point data $\tilde{x}_i$ are obtained based on the following formula: $$\tilde{x}_i = \frac{(x_i - \max(x)) + (x_i - \min(x))}{\max(x) - \min(x)} \tag{1}$$ In the formula, $x_i$ ($i = 1, 2, \ldots, N$) represents the data of the i-th sampling point in the time-series signal $x$, and $N$ is the length of the time-domain signal, i.e., the number of sampling points.
[0030] The normalized data $\tilde{x}_i$ are then transformed to polar coordinates: each value is encoded as the cosine of an angle, and its timestamp $t_i$ is represented as the radius $r_i$: $$\phi_i = \arccos(\tilde{x}_i),\ -1 \le \tilde{x}_i \le 1; \qquad r_i = \frac{t_i}{N} \tag{2}$$ In the formula, $t_i$ is the sampling point number of the time-series signal, $r_i$ is the polar radius in polar coordinates, and $\phi_i$ is the polar angle in polar coordinates.
[0031] As the sampling index increases, the signal traces out different angles and radii in the polar coordinate system. As can be seen from equations (1) and (2), the inverse cosine is monotonic on the normalized interval [-1, 1], so each normalized value maps to a unique polar angle; at the same time, the radius encodes the sampling time, so the normalized time-domain data retain their absolute temporal relationships in the polar coordinate system, thus providing rich state parameter information about the monitored equipment.
[0032] The Gramian angular difference field is defined as follows: $$G = \begin{bmatrix} \sin(\phi_1 - \phi_1) & \cdots & \sin(\phi_1 - \phi_N) \\ \vdots & \ddots & \vdots \\ \sin(\phi_N - \phi_1) & \cdots & \sin(\phi_N - \phi_N) \end{bmatrix} \tag{3}$$ In the formula, the subscripts $i$ and $j$ of the general element $\sin(\phi_i - \phi_j)$ are the sequence numbers of different sampling points. As shown in equation (3), each element of the matrix is tied to a pair of sampling times and therefore remains strongly correlated with the time series, so the matrix forms a single-channel two-dimensional image with temporal texture characteristics. This encoding method not only represents the time-domain data as a two-dimensional image, but also preserves the complete parametric information of the original signal, including its temporal dynamic characteristics.
[0033] The generated GADF images are stitched together along the channel dimension C to form a two-dimensional multi-channel feature map tensor, which is represented as $I \in \mathbb{R}^{H \times W \times C}$, where C = 7 and H = W = N. This multi-channel feature map is used as the input to the subsequent hybrid diagnostic network, carrying the optimized spatiotemporal features of the circuit breaker's five key electrical circuits and two key mechanical measurement points.
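The encoding and stitching steps described above can be sketched in NumPy as follows. This is a minimal illustration, not the patent's implementation: the function names `gadf` and `build_feature_map` are hypothetical, the seven input signals are random stand-ins for the five current and two vibration channels, and the length N = 64 is chosen only to keep the example small.

```python
import numpy as np

def gadf(signal):
    """Encode a 1-D signal as a Gramian angular difference field image."""
    x = np.asarray(signal, dtype=float)
    lo, hi = x.min(), x.max()
    if hi == lo:
        raise ValueError("a constant signal cannot be rescaled to [-1, 1]")
    # Rescale to [-1, 1].
    x_tilde = ((x - hi) + (x - lo)) / (hi - lo)
    # Guard against rounding slightly outside [-1, 1] before arccos.
    x_tilde = np.clip(x_tilde, -1.0, 1.0)
    # Polar angle of each sample.
    phi = np.arccos(x_tilde)
    # G[i, j] = sin(phi_i - phi_j), computed by broadcasting.
    return np.sin(phi[:, None] - phi[None, :])

def build_feature_map(signals):
    """Stack one GADF image per signal along the last (channel) axis."""
    return np.stack([gadf(s) for s in signals], axis=-1)

# Synthetic stand-ins for the 5 current and 2 vibration channels (assumed data).
rng = np.random.default_rng(0)
N = 64
signals = [rng.normal(size=N) for _ in range(7)]
feature_map = build_feature_map(signals)
print(feature_map.shape)  # (64, 64, 7)
```

Because the field is built from the sine of an angle difference, each channel image is antisymmetric with a zero diagonal, which is a convenient property to check when validating an implementation.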
[0034] Specifically, in step S3, the hybrid diagnostic network of this invention, trained based on the CNN-SE-BiLSTM network, includes a spatial feature extraction module, a spatiotemporal feature extraction module, and a prediction module. As shown in Figure 3, the spatial feature extraction module is trained based on the CNN-SE architecture, and the spatiotemporal feature extraction module is trained based on the Bidirectional Long Short-Term Memory (BiLSTM) network.
[0035] First, the spatial feature extraction module consists of multiple convolutional processing modules connected in series. In response to the spatiotemporal correlation texture characteristics of GADF images and the need for circuit breaker fault diagnosis to capture features at multiple scales from local anomalies to global action patterns, a progressive design with wide to deep, multi-scale receptive fields is adopted to capture enhanced local features of the multi-channel feature map.
[0036] The spatial feature extraction module comprises at least three convolutional processing layers. The first layer is a primary feature extraction layer with a large receptive field, specifically designed to capture fault temporal patterns in the GADF image that may span multiple time points, such as slow mechanical jamming processes or prolonged arc currents. Its wide receptive field covers the entire time span of the fault event. The second layer is a medium-scale abstraction layer, employing medium-scale convolutional kernels to focus on medium-scale fault regions, such as the local response of vibration and shock. The third layer is a high-scale abstraction layer, using small convolutional kernels to finely extract detailed fault textures, such as subtle distortions in current waveforms or harmonic components of vibration signals. This multi-scale design ensures that the network possesses sensitive feature extraction capabilities for various types of circuit breaker faults (from gradual to abrupt changes, from global to local).
[0037] Furthermore, in order to be applicable to the multi-source signal fusion diagnosis scenario of 5 electrical + 2 vibration of circuit breakers, an SE attention module is introduced into the spatial feature extraction module. Therefore, each convolution processing module includes a convolutional layer, an SE attention module and a max pooling layer, forming an optimized process of primary feature extraction → first channel calibration → intermediate feature extraction → second channel calibration → deep feature extraction → final channel calibration. The SE attention module is used to screen key signal channels and focus on the most critical feature combinations to finely adjust the feature response.
[0038] For example, the structure of the first convolutional processing module is designed as follows: Convolutional layer (Conv2D(7x7) → BatchNorm → ReLU activation) → SE attention (i.e., SE Block) → Max pooling layer, with 32 convolutional kernels and a kernel size of 7x7. In GADF images, circuit breaker fault events manifest as a specific texture pattern extending over time. A large initial receptive field (7x7) can capture primary correlation patterns with a large time span in the first layer of the network. For circuit breaker diagnosis, this helps to identify gradual fault patterns involving multiple time points, such as "slow rising edge of tripping current," providing rich contextual information for subsequent layers. 32 filters are formed by 32 convolutional kernels to extract basic texture feature maps of GADF images, such as edges and corners, avoiding excessive computational overhead. The first SE attention module is introduced at the beginning of feature extraction, allowing the network to adaptively enhance or suppress the response of certain feature channels according to the potential fault type of the current sample. Initial downsampling is performed using max pooling (MaxPool(2x2)), which halves the feature map size. This reduces the data dimensionality while preserving the most significant features, and introduces translation invariance.
[0039] The second convolutional processing module has the following structure: Conv2D(5x5) → BatchNorm → ReLU → SE Block → MaxPool(2x2), where the number of convolutional kernels is 64, the kernel size is 5x5, and 2×2 max pooling is used. On the feature map after initial downsampling, medium-sized convolutional kernels are used to focus on more specific fault characterization areas, such as local bright areas corresponding to vibration impacts during circuit breaker opening and closing, or specific stripes during current stabilization. The number of filters is increased to 64 to learn more diverse and complex feature combinations, adapting to the feature diversity of various fault types such as mechanical, electrical, and insulation faults in circuit breakers. After one layer of abstraction, the feature map already contains preliminary information from different current and vibration signal channels. The embedded second SE attention module can automatically evaluate and enhance the feature responses of the signal channels most relevant to the current potential fault, achieving early information filtering. For example, if it is learned that the "mechanical jamming" fault is mainly related to vibration signals, the SE module will automatically increase the weights of the two vibration signal channels.
[0040] The third convolutional processing module has the following structure: Conv2D(3×3) → BatchNorm → ReLU → Conv2D(3×3) → BatchNorm → ReLU → SE Block → MaxPool(2×2), where there are 128 convolutional kernels and the kernel size is 3x3 (two cascaded 3x3 convolutions). Using two cascaded 3x3 convolutions provides an effective receptive field comparable to a single 5x5 convolution, but with fewer parameters and stronger nonlinearity (two ReLU activations), enabling more refined extraction of high-level features and better adaptation to complex nonlinear temporal correlations in GADF images. The number of filters increases to 128. At the deeper layers of the network, sufficient capacity is needed to encode high-level representations of various complex fault modes. 128 filters can form rich feature combinations, supporting multi-class fault classification tasks for circuit breakers. The embedded third SE attention module performs channel calibration on the highest-level features, allowing the network to refocus on the most critical feature channels before making the final decision, greatly improving the discriminative power of the features. Especially when fault characteristics are weak or interference is present, the SE module can effectively suppress noise channels. The final 2×2 max pooling compresses the feature map size to a dimension suitable for conversion into a sequence.
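The claim that two cascaded 3x3 convolutions match the receptive field of a single 5x5 convolution with fewer parameters can be checked with a little arithmetic. The sketch below assumes, for a like-for-like comparison only, that every layer has 128 input and 128 output channels and no bias terms; the helper names are hypothetical.

```python
def conv_params(kernel, c_in, c_out):
    """Weight count of a 2-D convolution with a square kernel (bias ignored)."""
    return kernel * kernel * c_in * c_out

def stacked_receptive_field(kernels):
    """Receptive field of stride-1 convolutions applied one after another."""
    rf = 1
    for k in kernels:
        rf += k - 1
    return rf

channels = 128  # assumed equal in and out, for comparison only
two_3x3 = 2 * conv_params(3, channels, channels)
one_5x5 = conv_params(5, channels, channels)

print(stacked_receptive_field([3, 3]), stacked_receptive_field([5]))  # 5 5
print(two_3x3, one_5x5)  # 294912 409600
```

Under these assumptions the stacked pair covers the same 5x5 window with 18/25 of the weights, and it applies two nonlinearities instead of one.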
[0041] It's important to note that all convolutional layers use a stride of 1 and "same padding" to ensure no edge information is lost during convolution. This is particularly crucial for GADF images, as image edges may contain critical fault information indicating the start or end of a signal. Pooling layers use a stride of 2 and "valid padding" to progressively compress the feature map size, improving computational efficiency while preserving the most salient features and preparing appropriate dimensions for subsequent serialization. The number of filters in the convolutional layers gradually increases from 32 to 128, following the principle of increasing feature abstraction level: shallow layers extract basic texture features (32 filters), mid-layers extract combined features (64 filters), and deep layers extract high-level semantic features (128 filters). This design ensures sufficient feature extraction while avoiding the risk of overfitting due to parameter redundancy.
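The effect of these stride and padding choices on the feature map size can be traced by hand. The following sketch assumes a 64x64 input (the input size is not fixed in the description) and uses hypothetical helper names; convolutions with stride 1 and "same" padding leave the spatial size unchanged, so only the pooling layers shrink it.

```python
def pool_out(size, window=2, stride=2):
    """Output length of a pooling layer with 'valid' padding."""
    return (size - window) // stride + 1

def trace_shapes(size, filters=(32, 64, 128)):
    """Feature map shape (H, W, C) after each convolutional processing module."""
    shapes = []
    for n_filters in filters:
        # Conv with stride 1 and 'same' padding keeps the spatial size,
        # then 2x2 max pooling with stride 2 halves it.
        size = pool_out(size)
        shapes.append((size, size, n_filters))
    return shapes

print(trace_shapes(64))  # [(32, 32, 32), (16, 16, 64), (8, 8, 128)]
```

After three modules the spatial size is reduced by a factor of eight, which determines the number of time steps available to the subsequent sequence model.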
[0042] Furthermore, to optimize the accuracy and robustness of circuit breaker fault diagnosis, an SE attention module is introduced to adaptively calibrate the importance of the feature channels. As shown in Figure 4, each SE attention module performs the following operations to calibrate the importance of the feature channels:
S311. Use SE attention to capture the dependencies between different channels and obtain channel weights that reflect the importance of each channel. Specifically, the input feature map is first subjected to global average pooling, which compresses the two-dimensional features of each channel into a scalar and thereby yields a global statistical description of each channel over the whole circuit breaker operation. This operation gives an overall measure of the activation level of each electrical circuit and mechanical measuring point channel, providing a basis for the subsequent channel importance assessment. Then, two fully connected layers (ReLU after the first layer and Sigmoid at the output layer) learn the nonlinear dependencies between channels and generate a weight value between 0 and 1 for each channel, i.e., the channel weight. In circuit breaker fault diagnosis, different fault types (such as mechanical jamming, contact erosion, insulation abnormalities, etc.) correlate very differently with the features of different signal channels. For example, mechanical faults depend more on the vibration signal channels, while coil abnormalities are more sensitive to the corresponding current channels. This step automatically learns these correlations and assigns appropriate importance weights to each channel.
[0043] S312. The learned channel weights are multiplied channel by channel with the input feature map to automatically enhance the feature channels most relevant to the current diagnostic task and suppress irrelevant or noise-affected channels, thus obtaining the local features output by the current convolutional processing module. The output of the last convolutional processing module is the enhanced local feature.
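The squeeze, excitation, and rescaling operations of steps S311 and S312 can be sketched in a few lines of NumPy. This is an untrained forward pass with random weights, intended only to show the data flow; the reduction ratio `r = 4`, the weight initialization, and the function name `se_block` are assumptions not taken from the description.

```python
import numpy as np

def se_block(feature_map, w1, b1, w2, b2):
    """Squeeze-and-excitation forward pass on one (H, W, C) feature map."""
    # Squeeze: global average pooling gives one scalar per channel.
    squeezed = feature_map.mean(axis=(0, 1))                  # (C,)
    # Excitation: FC + ReLU, then FC + Sigmoid.
    hidden = np.maximum(0.0, squeezed @ w1 + b1)              # (C // r,)
    weights = 1.0 / (1.0 + np.exp(-(hidden @ w2 + b2)))       # (C,), in (0, 1)
    # Scale: multiply every channel by its weight (broadcast over H and W).
    return feature_map * weights, weights

rng = np.random.default_rng(0)
H, W, C, r = 8, 8, 32, 4  # r is an assumed reduction ratio
x = rng.normal(size=(H, W, C))
w1 = rng.normal(scale=0.1, size=(C, C // r))
b1 = np.zeros(C // r)
w2 = rng.normal(scale=0.1, size=(C // r, C))
b2 = np.zeros(C)

y, channel_weights = se_block(x, w1, b1, w2, b2)
print(y.shape, channel_weights.shape)  # (8, 8, 32) (32,)
```

The output keeps the shape of the input; only the relative scale of the channels changes, which is what allows the block to be dropped between a convolution and a pooling layer.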
[0044] The SE (Squeeze-and-Excitation) mechanism enables the network to dynamically focus on key fault information: for example, when diagnosing "trip coil jamming," it automatically enhances the feature response of the trip coil current channel; when diagnosing "mechanical transmission imbalance," it may simultaneously enhance the vibration signal channels on both sides and compare their differences. Through this adaptive feature optimization, the network can more accurately capture the subtle features of specific faults in the circuit breaker, improving the sensitivity and specificity of the diagnosis.
[0045] Then, the spatiotemporal feature extraction module performs opening and closing timing feature analysis on the enhanced local features to obtain the spatiotemporal comprehensive features. The specific process is as follows:
S321. Slice the enhanced local features along the height dimension to obtain the feature vector at each time step. Concatenate and reconstruct the feature vectors at all time steps to obtain the feature sequence of the circuit breaker under test during the opening and closing process.
S322. Analyze the dynamic temporal relationships within the feature sequence using the spatiotemporal feature extraction module to obtain the spatiotemporal comprehensive features.
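The slicing step can be illustrated with a reshape. The sketch below assumes an 8x8x128 enhanced feature map (the size obtained from a 64x64 input after three 2x2 poolings, which is an assumption) and treats each row along the height axis as one time step; the function name `to_sequence` is hypothetical.

```python
import numpy as np

def to_sequence(features):
    """Turn an (H, W, C) feature map into a sequence of H feature vectors.

    Each slice along the height axis is flattened into one vector of
    length W * C, giving an array of shape (H, W * C).
    """
    h, w, c = features.shape
    return features.reshape(h, w * c)

rng = np.random.default_rng(0)
enhanced = rng.normal(size=(8, 8, 128))  # assumed size of the enhanced local features
sequence = to_sequence(enhanced)
print(sequence.shape)  # (8, 1024)
```

Each row of `sequence` is then one input step for the bidirectional LSTM.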
[0046] The trained network uses two independent LSTM layers, one forward and one backward, to learn sequence dependencies from both the start and the end of the action. This enables it to accurately model the dynamic relationships and causal logic among the strictly sequential stages of the opening and closing process of a circuit breaker: "coil excitation → core start-up → mechanism transmission → contact separation / closing → arc generation / extinguishing". For example, the network can learn to identify complex spatiotemporally coupled fault modes, such as an abnormal current duration caused by early mechanical jamming, or a premature current zero-crossing caused by contact pre-breakdown.
[0047] Finally, the prediction module maps the spatiotemporal comprehensive features output by the spatiotemporal feature extraction module at the last time step to the fault category space through a fully connected layer. These features deeply integrate the spatiotemporal context information of the entire circuit breaker operation. The Softmax function then outputs the probability distribution over the circuit breaker state categories, and the state category with the highest probability is the fault diagnosis result of the circuit breaker under test. The parameters of the hybrid diagnostic network structure of this invention are shown in Table 1.
[0048] Table 1
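The prediction step of paragraph [0047] (fully connected layer, Softmax, highest-probability category) can be sketched as below. The category names follow the state categories listed in this document; the function name `predict_state`, the feature dimension of 4, and the hand-set weights are hypothetical and serve only to make the example runnable.

```python
import numpy as np

STATE_CATEGORIES = [
    "normal state",
    "trip coil jamming",
    "closing coil jamming",
    "energy storage motor failure",
    "left mechanical transmission jamming",
    "right mechanical transmission jamming",
    "contact erosion",
    "insulation medium degradation",
    "control circuit abnormality",
]

def predict_state(features, weight, bias):
    """Map a feature vector to (predicted category, probability vector)."""
    logits = weight @ features + bias          # fully connected layer
    logits = logits - logits.max()             # shift for numerical stability
    probs = np.exp(logits) / np.exp(logits).sum()   # Softmax
    return STATE_CATEGORIES[int(np.argmax(probs))], probs

# Hypothetical layer: zero weights and a bias that favours category 6.
weight = np.zeros((len(STATE_CATEGORIES), 4))
bias = np.zeros(len(STATE_CATEGORIES))
bias[6] = 5.0
label, probs = predict_state(np.ones(4), weight, bias)
print(label, probs.round(3))
```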
[0049] As shown in Figure 2, the present invention trains the hybrid diagnostic network as follows: S331. The historically acquired multi-source time-series signals are labeled with state types to obtain a training set. The state categories are defined specifically for circuit breakers and include normal state, trip coil jamming, closing coil jamming, energy storage motor failure, mechanical transmission jamming (left / right side), contact erosion, insulation medium degradation, and control circuit abnormality, thereby enabling high-precision automated fault classification and state assessment.
[0050] S332. Gram angle difference field (GADF) coding is performed on each time-series signal in the training set to obtain the corresponding single-channel two-dimensional image, and the images are then spliced and fused to obtain a multi-channel feature map. S333. Using the multi-channel feature map as input and each state type as output, a hybrid CNN-SE-BiLSTM network is trained based on a hybrid loss function, and the network hyperparameters are optimized using adaptive HEBKA to obtain the hybrid diagnostic network.
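The encoding in S332 can be sketched as follows, assuming the commonly used definition of the Gram angle difference field: each signal is rescaled to [-1, 1], each sample is mapped to a polar angle by the arccosine, and entry (i, j) of the image is sin(φ_i − φ_j). The function names and the synthetic test signals are hypothetical; the seven channels mirror the five current and two vibration signals mentioned in this document.

```python
import numpy as np

def gadf(signal):
    """Gram angle difference field of a 1-D signal, shape (T, T)."""
    x = np.asarray(signal, dtype=float)
    lo, hi = x.min(), x.max()
    if hi == lo:
        scaled = np.zeros_like(x)              # constant signal: no variation
    else:
        scaled = 2.0 * (x - lo) / (hi - lo) - 1.0   # rescale to [-1, 1]
    # Each sample is treated as the cosine of a polar angle.
    phi = np.arccos(np.clip(scaled, -1.0, 1.0))
    # Entry (i, j) is the sine of the angle difference between times i and j.
    return np.sin(phi[:, None] - phi[None, :])

def multi_channel_feature_map(signals):
    """Stack one GADF image per signal along the channel axis: (C, T, T)."""
    return np.stack([gadf(s) for s in signals], axis=0)

# Synthetic stand-ins for 5 current and 2 vibration signals, 64 samples each.
t = np.linspace(0.0, 1.0, 64)
signals = [np.sin(2 * np.pi * f * t) for f in (1, 2, 3, 4, 5)]
signals += [np.cos(2 * np.pi * f * t) * np.exp(-3 * t) for f in (20, 30)]
feature_map = multi_channel_feature_map(signals)
print(feature_map.shape)
```

Because sin(φ_i − φ_j) = −sin(φ_j − φ_i), each single-channel image is antisymmetric with a zero diagonal, so the direction of change over time is preserved in the image.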
[0051] On the one hand, the hybrid loss function is composed of the following parts working together and is defined as:

$$L = \lambda_1 L_{WCE} + \lambda_2 L_{MSR}$$

where $L$ is the hybrid loss function, $\lambda_1$ and $\lambda_2$ are adjustable weighting coefficients used to balance the importance of the individual losses, $L_{WCE}$ is the weighted cross-entropy loss, and $L_{MSR}$ is the multi-source signal collaborative regularization loss.
[0052] The weighted cross-entropy loss, used to directly optimize the accuracy of fault classification, is expressed as:

$$L_{WCE} = -\frac{1}{N}\sum_{i=1}^{N}\sum_{k=1}^{K} w_k \, y_{i,k} \log \hat{y}_{i,k}$$

where $N$ is the number of samples in a batch, $K$ is the total number of fault categories, $y_{i,k}$ is the true label of the $i$-th sample (one-hot encoded), $\hat{y}_{i,k}$ is the output of the model's Softmax layer, i.e. the predicted probability that the $i$-th sample belongs to the $k$-th class, and $w_k$ is the weighting coefficient of the $k$-th fault class.
[0053] The number of samples in the normal state of a circuit breaker is typically far greater than the number of samples in the various fault categories, so directly using standard cross-entropy can bias the model towards the "normal" category. This is addressed by setting higher weighting coefficients for the minority fault categories, which forces the model to give equal importance to these high-risk but rare faults.
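A minimal NumPy sketch of a class-weighted cross-entropy is given below. It assumes integer class labels (equivalent to one-hot labels) and averaging over the batch. The helper `inverse_frequency_weights` is one common heuristic for choosing the weights and is an assumption of this example; the document does not state how the weighting coefficients are set.

```python
import numpy as np

def weighted_cross_entropy(probs, labels, class_weights, eps=1e-12):
    """Batch-averaged weighted cross-entropy.

    probs: (N, K) Softmax outputs, labels: (N,) integer class ids,
    class_weights: (K,) weighting coefficient per class.
    """
    n = probs.shape[0]
    true_class_prob = probs[np.arange(n), labels]   # probability of the true class
    per_sample = -class_weights[labels] * np.log(true_class_prob + eps)
    return float(per_sample.mean())

def inverse_frequency_weights(labels, num_classes):
    """Hypothetical heuristic: rarer classes receive larger weights."""
    counts = np.bincount(labels, minlength=num_classes)
    return labels.size / (num_classes * np.maximum(counts, 1))

# Imbalanced toy batch: class 0 ("normal") dominates.
labels = np.array([0, 0, 0, 0, 0, 0, 1, 2])
probs = np.full((8, 3), 0.1)
probs[np.arange(8), labels] = 0.8
weights = inverse_frequency_weights(labels, num_classes=3)
print(weights.round(3), weighted_cross_entropy(probs, labels, weights))
```

With these weights a misclassified rare fault contributes more to the loss than a misclassified normal sample, which counteracts the bias towards the majority class.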
[0054] The multi-source signal collaborative regularization loss is a regularization term based on cosine similarity. Using the channel weight information of the SE attention module, it explicitly guides the network, at the loss function level, to learn the physical correlation and collaborative mechanism between multiple source signals such as current and vibration, enhancing the fusion and robustness of diagnostic decisions. It is expressed as:

$$L_{MSR} = \frac{1}{N}\sum_{i=1}^{N}\left(1 - \cos\left(\mathbf{a}_i^{I}, \mathbf{a}_i^{V}\right)\right)$$

$$\cos\left(\mathbf{a}_i^{I}, \mathbf{a}_i^{V}\right) = \frac{\mathbf{a}_i^{I} \cdot \mathbf{a}_i^{V}}{\lVert \mathbf{a}_i^{I} \rVert \, \lVert \mathbf{a}_i^{V} \rVert}$$

where $\mathbf{a}_i^{I} \in \mathbb{R}^{T}$ is, for the $i$-th sample, the attention weight vector of the 5 current signal channels over all time steps, extracted from the output of the last SE attention module; $\mathbf{a}_i^{V} \in \mathbb{R}^{T}$ is, for the $i$-th sample, the attention weight vector of the 2 vibration signal channels; and $T$ is the dimension of the weight vector (e.g., the number of time steps or the number of spatial locations in the feature map).
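A cosine-similarity regularizer between the current and vibration attention vectors can be sketched as follows. The sketch assumes the penalty is one minus the cosine similarity, averaged over the batch, so that aligned attention patterns give a loss near 0; this exact form, the function name, and the array layout (one row per sample) are assumptions of the example.

```python
import numpy as np

def collaborative_regularization(att_current, att_vibration, eps=1e-12):
    """Mean (1 - cosine similarity) between two (N, T) attention arrays."""
    dot = np.sum(att_current * att_vibration, axis=1)
    norms = (np.linalg.norm(att_current, axis=1)
             * np.linalg.norm(att_vibration, axis=1))
    cosine = dot / (norms + eps)     # eps guards against zero-length vectors
    return float(np.mean(1.0 - cosine))

# Aligned vectors give a loss near 0, orthogonal vectors a loss of 1.
a = np.array([[1.0, 2.0, 3.0]])
print(collaborative_regularization(a, 2.0 * a))
print(collaborative_regularization(np.array([[1.0, 0.0]]),
                                   np.array([[0.0, 1.0]])))
```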
[0055] On the other hand, to efficiently optimize the hyperparameters (such as learning rate, batch size, number of filters in each convolutional layer, and number of LSTM units) of the complex CNN-SE-BiLSTM network for circuit breaker diagnosis, and to ensure that the model performs well and generalizes strongly under varying operating conditions, this invention adopts the Hybrid Enhanced Black-winged Kite Algorithm (HEBKA). This algorithm is a metaheuristic optimization method formed by three core improvements to the original Black-winged Kite Algorithm (BKA). Its improvement strategy is designed for complex optimization problems that are high-dimensional, multi-peaked, and nonlinear, and is therefore well suited to the hyperparameter space of circuit breaker diagnosis models. The optimization process is as follows: the hyperparameters to be optimized in the CNN-SE-BiLSTM network (learning rate, number of convolutional kernels, number of LSTM units, etc.) are defined as a D-dimensional optimization problem. The population size N and the maximum number of iterations T are set, and the population positions X_i (i = 1, 2, ..., N) are randomly initialized within a reasonable range for each hyperparameter.
[0056] For each individual X_i in the population (i.e., each hyperparameter configuration), a corresponding CNN-SE-BiLSTM network model is constructed, trained on the circuit breaker fault diagnosis training set, and evaluated on an independent validation set. The quality of this hyperparameter configuration is quantified using a fitness function. This invention uses the comprehensive diagnostic error rate on the validation set as the fitness value, and the optimization objective is to minimize the fitness value.
[0057] For example, first, an initial population is generated based on the hybrid loss function. During the population initialization phase, the Black-winged Kite Algorithm uses a stochastic strategy to generate the initial position solutions:

$$X_i = BK_{lb} + rand \cdot \left(BK_{ub} - BK_{lb}\right)$$

where $X_i$ is the position of the $i$-th black-winged kite, $BK_{lb}$ and $BK_{ub}$ are the lower and upper boundaries of the position of the $i$-th black-winged kite, and $rand$ is a random number in (0, 1).
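The random initialization and the fitness evaluation of paragraphs [0055] to [0057] can be sketched as below. The search ranges, the function names, and in particular `toy_error_rate` are hypothetical: in the method described here the fitness of an individual is the validation error rate of a CNN-SE-BiLSTM network trained with that hyperparameter set, which the stand-in function replaces with a cheap analytic expression so that the example runs on its own.

```python
import numpy as np

def init_population(lower, upper, pop_size, rng):
    """Draw pop_size positions uniformly between per-dimension bounds."""
    lower = np.asarray(lower, dtype=float)
    upper = np.asarray(upper, dtype=float)
    rand = rng.random((pop_size, lower.size))      # uniform in [0, 1)
    return lower + rand * (upper - lower)

def evaluate_population(population, error_rate_fn):
    """Fitness of each individual; lower values are better."""
    return np.array([error_rate_fn(x) for x in population])

# Hypothetical ranges: learning rate, number of kernels, number of LSTM units.
LOWER = [1e-4, 16, 32]
UPPER = [1e-2, 128, 256]

def toy_error_rate(x):
    """Stand-in for 'train the network and return its validation error'."""
    target = np.array([1e-3, 64.0, 128.0])
    scale = np.array([1e-2, 128.0, 256.0])
    return float(np.mean(((x - target) / scale) ** 2))

rng = np.random.default_rng(42)
population = init_population(LOWER, UPPER, pop_size=10, rng=rng)
fitness = evaluate_population(population, toy_error_rate)
best = population[np.argmin(fitness)]
print(best, fitness.min())
```

In practice integer-valued hyperparameters such as the number of kernels would be rounded before the network is built; the sketch leaves positions continuous.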
[0058] The attack conversion factor p is calculated based on the following formula: , where a is a control parameter with a value of 0.4 and r is a random number in (0, 1). If p ≤ 0.3, the black-winged kite is in the high-altitude soaring stage, and its position is updated according to formula (4); if 0.3 < p ≤ 0.6, the black-winged kite is in the low-altitude soaring stage, and its position is updated according to formula (5); if 0.6 < p < 1, the black-winged kite swoops down to catch its prey, and its position is updated according to formula (6).
[0059] (4) , where β is a preset constant, set to 1.5.
[0060] (5) , where x( ) and y( ) represent the direction coordinates.
[0061] (6) , where α and G represent the acceleration factor and the gravity factor, respectively.
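The stage selection described in paragraph [0058] can be written as a small dispatch function. Only the thresholds on p are taken from the text; the function name and the returned tuple are hypothetical, and the position updates of formulas (4) to (6) are deliberately not implemented here.

```python
def select_stage(p):
    """Return (stage name, formula number) for an attack conversion factor p."""
    if p <= 0.3:
        return "high-altitude soaring", 4
    if p <= 0.6:
        return "low-altitude soaring", 5
    if p < 1.0:
        return "dive", 6
    # The text does not define a stage for p >= 1.
    raise ValueError(f"attack conversion factor out of range: {p}")

for p in (0.1, 0.3, 0.45, 0.6, 0.9):
    print(p, select_stage(p))
```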
[0062] Then, the pheromone of each individual is calculated. If it is less than 0.3, the individual is considered a poor individual and its position is updated according to formula (7); other individuals continue to migrate.
[0063] , (7) in, These are different random individuals in the population. It is a random binary number.
[0064] Finally, the optimal individual is perturbed: if rand is greater than 0.5, the population is considered to be in a clustered state, and an orthogonal experiment-quasi-reflection strategy is used to perturb the optimal individual to prevent the algorithm from getting trapped in local optima. The better of the optimal individual and its reflected individual is selected as the final optimal individual. If the algorithm has reached the maximum number of iterations, the iteration terminates and the optimal solution is output; otherwise, the iteration continues.
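The quasi-reflection part of this perturbation, followed by the greedy choice between the optimal individual and its reflected individual, can be sketched as follows. The sketch assumes the usual definition of a quasi-reflected point (a uniformly random point between the centre of the search range and the current position, per dimension) and omits the orthogonal experiment component, whose details are not given in the text. All function names are hypothetical.

```python
import numpy as np

def quasi_reflect(x, lower, upper, rng):
    """Random point between the range centre and x, in each dimension."""
    x = np.asarray(x, dtype=float)
    center = (np.asarray(lower, dtype=float) + np.asarray(upper, dtype=float)) / 2.0
    return center + rng.random(x.shape) * (x - center)

def perturb_best(best, lower, upper, fitness_fn, rng):
    """Keep whichever of best and its quasi-reflected point has lower fitness."""
    candidate = quasi_reflect(best, lower, upper, rng)
    return candidate if fitness_fn(candidate) < fitness_fn(best) else best

# Toy fitness with its minimum at the centre of the range.
lower = np.array([0.0, 0.0])
upper = np.array([1.0, 100.0])
center = (lower + upper) / 2.0

def fitness(v):
    return float(np.sum(((v - center) / (upper - lower)) ** 2))

rng = np.random.default_rng(7)
best = np.array([0.9, 10.0])
new_best = perturb_best(best, lower, upper, fitness, rng)
print(new_best, fitness(new_best) <= fitness(best))
```

Because the selection is greedy, the perturbation can never make the stored optimal individual worse; it only replaces it when the reflected point is strictly better.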
[0065] Using the optimal hyperparameter combination X_best obtained by HEBKA optimization, the CNN-SE-BiLSTM network is retrained on the complete training set of multi-source circuit breaker fault samples to obtain the final diagnostic model with optimal performance. This model is then deployed on a substation edge computing device or a local server. During online diagnosis, the system synchronously collects multiple current and vibration signals from real-time circuit breaker operation. After the same preprocessing, GADF encoding, and fusion as in the training phase, these signals are input into the loaded diagnostic model. The model then automatically outputs the fault type identification result and the corresponding probability, realizing real-time intelligent monitoring of the circuit breaker status and early fault warning.
[0066] Through the aforementioned optimization process, premature convergence to a suboptimal solution that overfits specific data is effectively prevented, ensuring the final diagnostic model's superior generalization ability when facing equipment from different manufacturers, aging conditions, and complex electromagnetic and mechanical interference in the field. The orthogonal experiment-quasi-reflection perturbation strategy systematically explores the synergistic effects between hyperparameters, helping the algorithm escape local optima and find the globally optimal hyperparameter solution that achieves the best match and synergy between CNN spatial feature extraction, SE channel attention focusing, and BiLSTM temporal modeling capabilities. This ultimately improves the overall accuracy, reliability, and stability of the diagnostic system.
[0067] Compared with existing technologies, this embodiment provides a circuit breaker fault diagnosis method based on adaptive HEBKA optimization and a hybrid CNN-SE-BiLSTM network. The GADF image spatial features extracted by the CNN effectively characterize local morphological anomalies, current waveform distortion points, vibration and shock envelopes, and similar patterns, while the temporal features modeled by the BiLSTM describe the full sequence of stages and the dynamic correlations of the opening and closing actions. Thus, even against complex field noise, the method can still extract highly discriminative fault features, greatly enhancing the model's anti-interference ability and its sensitivity to early faults, and ultimately improving the accuracy, reliability, and engineering practicality of circuit breaker condition diagnosis.
[0068] Those skilled in the art will understand that all or part of the processes of the methods described in the above embodiments can be implemented by a computer program instructing related hardware, and the program can be stored in a computer-readable storage medium. The computer-readable storage medium may be a disk, optical disk, read-only memory, or random access memory, etc.
[0069] The above description is only a preferred embodiment of the present invention, but the scope of protection of the present invention is not limited thereto. Any changes or substitutions that can be easily conceived by those skilled in the art within the scope of the technology disclosed in the present invention should be included within the scope of protection of the present invention.
Claims
1. A circuit breaker fault diagnosis method based on adaptive HEBKA optimization using a hybrid CNN-SE-BiLSTM network, characterized in that the method includes the following steps: simultaneously acquiring multiple current signals and vibration signals during the opening and closing process of the circuit breaker under test to obtain multi-source time-series signals; performing Gram angle difference field coding on each time-series signal to convert it into a single-channel two-dimensional image, and splicing and fusing the images to obtain a multi-channel feature map; and inputting the multi-channel feature map into a hybrid diagnostic network trained on a CNN-SE-BiLSTM network to predict the circuit breaker's state category, thereby obtaining the fault diagnosis result of the circuit breaker under test.
2. The method according to claim 1, characterized in that, The hybrid diagnostic network includes a spatial feature extraction module, a spatiotemporal feature extraction module, and a prediction module. The spatial feature extraction module, trained on a CNN-SE architecture, is used to capture enhanced local features of the multi-channel feature map. The spatiotemporal feature extraction module, trained on a Bidirectional Long Short-Term Memory (BiLSTM) network, is used to perform opening and closing timing feature analysis on the enhanced local features to obtain spatiotemporal comprehensive features. The prediction module maps the spatiotemporal comprehensive features to a state category space, and uses an activation function to output the probability distribution of each state category. The state category with the highest probability is the fault diagnosis result of the circuit breaker under test.
3. The method according to claim 2, characterized in that, The multi-channel feature map is obtained based on the following process: The time-series signals are normalized to obtain several normalized sampling point data. Using the sampled data as the cosine angle value and the corresponding sampling timestamp as the radius, the sampled data is transformed into polar coordinates to obtain the polar angle corresponding to each sampled data. Based on the trigonometric difference function of the polar angle between each sampling point data, a Gram angle difference field matrix is constructed to obtain each single-channel two-dimensional image; wherein, the height dimension and width dimension of the single-channel two-dimensional image together represent the Gram angle difference between sampling point data at different times; All single-channel two-dimensional images are stitched together along the channel dimension to obtain the multi-channel feature map.
4. The method according to claim 3, characterized in that, The spatial feature extraction module includes multiple cascaded convolutional processing modules. Each convolutional processing module includes a convolutional layer, SE attention, and a max pooling layer. Each convolutional layer uses convolutional kernels of different sizes to extract features from the input feature map. SE attention is used to obtain the dependencies between different channels to obtain channel weights that reflect the importance of each channel. The channel weights are multiplied with the input feature map channel by channel to obtain the local features output by the current convolutional processing module. The output of the last convolutional processing module is the enhanced local feature.
5. The method according to claim 4, characterized in that, The spatiotemporal integrated features are obtained based on the following process: The enhanced local features are sliced along the height dimension to obtain the feature vector at each time. The feature vectors at all times are spliced and reconstructed to obtain the feature sequence of the circuit breaker under test during the reclosing process. The spatiotemporal feature extraction module is used to analyze the dynamic temporal relationship between the feature sequences to obtain the spatiotemporal comprehensive features.
6. The method according to any one of claims 1-5, characterized in that the hybrid diagnostic network is trained based on the following process: the training set is obtained by labeling the historically collected multi-source time-series signals with state types; Gram angle difference field coding is performed on each time-series signal in the training set to obtain the corresponding single-channel two-dimensional image, and the images are then spliced and fused to obtain a multi-channel feature map; using the multi-channel feature map as input and each state type as output, a hybrid CNN-SE-BiLSTM network is trained based on a hybrid loss function, and the network hyperparameters are optimized by adaptive HEBKA to obtain the hybrid diagnostic network.
7. The method according to claim 6, characterized in that, The status categories include: normal status, trip coil jamming, closing coil jamming, energy storage motor failure, left mechanical transmission jamming, right mechanical transmission jamming, contact erosion, insulation medium deterioration, and control circuit abnormality.
8. The method according to claim 6, characterized in that, The multi-channel current signals include the trip coil current, the closing coil current, the energy storage motor current, the lockout coil current, and the total current of the secondary control circuit.
9. The method according to claim 6, characterized in that the hybrid loss function is:

$$L = \lambda_1 L_{WCE} + \lambda_2 L_{MSR}$$

where $L$ is the hybrid loss function, $\lambda_1$ and $\lambda_2$ are adjustable weighting coefficients, $L_{WCE}$ is the weighted cross-entropy loss, and $L_{MSR}$ is the multi-source signal collaborative regularization loss.
10. The method according to claim 6, characterized in that, The network hyperparameters include the learning rate and the number of convolutional kernels.