Improved KAN-BiLSTM method for small-sample fault diagnosis with adaptive convolution and denoising

By using an improved conditional generative adversarial network and an adaptive feature enhancement model, the problems of sample scarcity and noise interference in the current fault diagnosis of permanent magnet synchronous motors in electric drive systems are solved, and efficient fault identification and diagnosis are achieved.

CN122241450A · Pending · Publication Date: 2026-06-19 · TIANJIN RES INST OF ELECTRIC SCI

Patent Information

Authority / Receiving Office: CN · China
Patent Type: Applications (China)
Current Assignee / Owner: TIANJIN RES INST OF ELECTRIC SCI
Filing Date: 2026-02-14
Publication Date: 2026-06-19

Abstract

This invention relates to an improved KAN-BiLSTM method for few-sample fault diagnosis based on adaptive convolution and denoising, belonging to the field of fault diagnosis. The invention collects real fault signal data from the target equipment; uses an improved conditional generative adversarial network to augment the real fault signal data, generating synthetic fault signal data, and constructs a joint dataset containing both real and synthetic data; preprocesses the signal data in the joint dataset and converts it into a time-spectrum graph; and inputs the time-spectrum graph into a fault diagnosis model to obtain the fault diagnosis result. The invention addresses the scarcity of fault sample data in current systems, introduces a new feature extraction method, and reduces the high confusion between different fault categories. The adaptive denoising module addresses the weak noise resistance of existing models under noise-heavy engineering field conditions.

Description

Technical Field

[0001] This invention belongs to the field of fault diagnosis technology, and in particular relates to an improved KAN-BiLSTM small-sample fault diagnosis method with adaptive convolution and noise reduction.

Background Technology

[0002] With the rapid development of modern industrialization, electric drive control systems play an increasingly prominent role in modern industry. Their structures and components are becoming more complex, and long operating times and harsh operating environments lead to frequent failures. When a system malfunctions, significant manpower and resources are required for maintenance. It is therefore crucial to establish an accurate fault diagnosis model that monitors the system's operating status in real time, so that faults are diagnosed quickly and corrected promptly, avoiding greater losses, reducing repair time, keeping production lines efficient, and lowering maintenance costs. How to diagnose faults in electric drive systems quickly and accurately, especially how to identify faults from current signals, has become a topic of common concern for both academia and industry.

Summary of the Invention

[0003] The purpose of this invention is to overcome the shortcomings of existing technologies and propose an improved KAN-BiLSTM small-sample fault diagnosis method with adaptive convolution and noise reduction. Addressing engineering bottlenecks in permanent magnet synchronous motor current fault diagnosis, such as sample scarcity, high inter-class similarity, and severe noise interference, this invention proposes a fault diagnosis framework that integrates an improved conditional generative adversarial network (CGAN) with an adaptive feature enhancement model. By improving the convolutional neural network architecture of the conditional generative adversarial network, high-quality current signal samples are generated. Validated by time-domain waveform, frequency-domain FFT, and t-SNE distribution consistency, the synthesized data achieves a similarity of over 98% with real data and possesses low-dimensional spatial clustering characteristics.

[0004] The technical problem solved by this invention is achieved through the following technical solution: An improved KAN-BiLSTM fault diagnosis method with adaptive convolution and noise reduction includes the following steps:
Step 1: Collect real fault signal data of the target device;
Step 2: Use an improved conditional generative adversarial network to augment the real fault signal data, generate synthetic fault signal data, and construct a joint dataset containing real and synthetic data;
Step 3: Preprocess the signal data in the joint dataset and convert it into a time-spectrum graph;
Step 4: Input the time-spectrum graph into the fault diagnosis model for processing to obtain the fault diagnosis result.

[0005] Moreover, the generator and discriminator of the improved conditional generative adversarial network both adopt a convolutional neural network structure. In step 3, the original signals in the joint dataset are transformed into a 2D time-spectrum graph by a short-time Fourier transform, and the output takes one of five labels: normal, overcurrent, undercurrent, current imbalance, and instantaneous current abnormality.

[0006] Moreover, the fault diagnosis model in step 4 includes an adaptive noise reduction module, an adaptive convolution module, an improved KAN-BiLSTM model, and a data output module. The adaptive noise reduction module is connected to the adaptive convolution module, the adaptive noise reduction module and the adaptive convolution module are respectively connected to the improved KAN-BiLSTM model, and the improved KAN-BiLSTM model is connected to the data output module. The adaptive noise reduction module is used to dynamically adjust the noise figure based on the local standard deviation of the input signal and output a multi-channel feature map after noise reduction. The adaptive convolution module is used to extract convolutional features from the denoised multi-channel feature map through four parallel convolution operations and an adaptive weighted fusion mechanism, and output the convolutional feature map. The improved KAN-BiLSTM model input feature map includes the joint output of the noise reduction and convolution modules. An improved KAN is introduced to enhance the nonlinear expressive power of the model, and BiLSTM is introduced to capture the bidirectional time dependence of the signal, thereby enhancing the model's feature extraction capability for time series data. The data output module uses the Softmax activation function to generate the classification probability distribution of five types of current signals based on the improved KAN-BiLSTM model.

[0007] Furthermore, the adaptive noise reduction module performs adaptive noise reduction filtering as follows: The local standard deviation of the input signal or feature map is calculated as a quantitative indicator of noise intensity; the local standard deviation is mapped to the noise figure through a preset nonlinear mapping function; and the intensity of noise reduction filtering is dynamically adjusted based on the noise figure and feature protection constraints.

[0008] Furthermore, the adaptive convolution module includes four parallel convolutional layers and an adaptive weighted fusion unit; the four parallel convolutional layers include: a cross-center differential convolutional layer, a pixel differential convolutional layer, a spectral modulation convolutional layer, and a standard convolutional layer; the adaptive weighted fusion unit dynamically generates weights based on the input feature map through a gating network, which are used to weight and fuse the output feature maps of the four parallel convolutional layers.

[0009] Furthermore, the improved KAN-BiLSTM model is implemented as follows: The activation function of KAN contains two learnable weight parameters, w1 and w2, which are optimized during training. So that the combined weights depend on the input, a gating network is introduced to generate w1 and w2 dynamically from the input x. To keep the function smooth and continuous and to simplify parameter setting, the constraint w1 + w2 = 1 is imposed on w1 and w2, and the dynamic generation is performed by a multilayer perceptron (MLP). A cross-entropy loss function is used, with a regularization term added to the cross-entropy loss to promote smoothness. A measure of inter-class separability is also introduced so that fault samples of the same class cluster in the feature space while fault samples of different classes are clearly separated, improving the ability to identify faults from small samples.
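The input-dependent weighting described above can be illustrated with a small NumPy sketch. It assumes that w1 and w2 play the role of the base-function weight and the spline weight in the standard KAN activation; the text does not state this explicitly. The Gaussian-bump `rbf_spline` is only a stand-in for the B-spline term, and the MLP size, `centers`, and `coeffs` are hypothetical.

```python
import numpy as np

def silu(x):
    """Base function used in the standard KAN activation."""
    return x / (1.0 + np.exp(-x))

def rbf_spline(x, centers, coeffs, width=0.5):
    """Stand-in for the spline term: a weighted sum of Gaussian bumps (assumption)."""
    basis = np.exp(-((x[..., None] - centers) / width) ** 2)
    return basis @ coeffs

def gate(x, W1, b1, W2, b2):
    """Tiny MLP mapping each scalar x to (w1, w2); softmax enforces w1 + w2 = 1."""
    h = np.tanh(x[..., None] * W1 + b1)            # shape (..., hidden)
    logits = h @ W2 + b2                           # shape (..., 2)
    logits = logits - logits.max(axis=-1, keepdims=True)
    e = np.exp(logits)
    return e / e.sum(axis=-1, keepdims=True)

def gated_kan_activation(x, p):
    """phi(x) = w1(x) * silu(x) + w2(x) * spline(x), with w1(x) + w2(x) = 1."""
    w = gate(x, p["W1"], p["b1"], p["W2"], p["b2"])
    return w[..., 0] * silu(x) + w[..., 1] * rbf_spline(x, p["centers"], p["coeffs"])

rng = np.random.default_rng(0)
hidden = 8                                          # hypothetical gate width
params = {
    "W1": rng.standard_normal(hidden), "b1": np.zeros(hidden),
    "W2": rng.standard_normal((hidden, 2)), "b2": np.zeros(2),
    "centers": np.linspace(-2.0, 2.0, 5), "coeffs": rng.standard_normal(5),
}
x = np.linspace(-3.0, 3.0, 13)
w = gate(x, params["W1"], params["b1"], params["W2"], params["b2"])
y = gated_kan_activation(x, params)
```

Because the two weights are non-negative and sum to one, each output is a convex combination of the base term and the spline term, which is one way to read the smoothness argument in the text.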

[0010] The advantages and positive effects of this invention are: This invention collects real fault signal data from the target equipment; uses an improved conditional generative adversarial network to augment the real fault signal data, generating synthetic fault signal data, and constructs a joint dataset containing both real and synthetic data; preprocesses the signal data in the joint dataset and converts it into a time-spectrum graph; and inputs the time-spectrum graph into a fault diagnosis model to obtain the fault diagnosis result. The invention addresses the scarcity of fault sample data in current systems, introduces a new feature extraction method, and reduces the high confusion between different fault categories. The adaptive noise reduction module addresses the weak noise resistance of existing models under noise-heavy engineering field conditions.

Attached Figure Description

[0011] Figure 1 is a diagram comparing the training results of the conditional generative adversarial network (using a fully connected neural network) and the improved network (using a CNN);
Figure 2 is a diagram of the overall design architecture of the present invention;
Figure 3 is an experimental verification diagram of the adaptive convolution module and the ADM module of the present invention;
Figure 4 is a cross-validation diagram of different models, including the KAN-BiLSTM, the improved KAN-BiLSTM, and the fusion model of this patent;
Figure 5 is the t-SNE feature distribution map of the proposed model.

Detailed Implementation

[0012] The present invention will be further described in detail below with reference to the accompanying drawings.

[0013] An improved KAN-BiLSTM method for few-shot fault diagnosis, incorporating adaptive convolution and noise reduction, as shown in Figure 2, includes the following steps: Step 1: Collect real fault signal data of the target device.

[0014] Step 2: Use an improved conditional generative adversarial network to augment real fault signal data, generate synthetic fault signal data, and construct a joint dataset containing real and synthetic data.

[0015] Step 3: Preprocess the signal data in the joint dataset and convert it into a time-spectrum graph.
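A minimal NumPy sketch of the conversion in Step 3, assuming a Hann window and a magnitude spectrogram. The frame length, hop size, sampling rate, and test tone are hypothetical values chosen for illustration, not parameters stated in this document.

```python
import numpy as np

def stft_spectrogram(signal, frame_len=128, hop=32):
    """Return a (freq_bins, frames) magnitude spectrogram of a 1-D signal."""
    signal = np.asarray(signal, dtype=float)
    if signal.size < frame_len:
        raise ValueError("signal is shorter than one frame")
    window = np.hanning(frame_len)
    n_frames = 1 + (signal.size - frame_len) // hop
    # Slice overlapping frames and apply the window to each one.
    frames = np.stack([signal[i * hop: i * hop + frame_len] * window
                       for i in range(n_frames)])
    # Real FFT along each frame; transpose so rows are frequency bins.
    return np.abs(np.fft.rfft(frames, axis=1)).T

fs = 1024                                   # hypothetical sampling rate in Hz
t = np.arange(1024) / fs
current = np.sin(2 * np.pi * 64 * t)        # hypothetical 64 Hz current component
spec = stft_spectrogram(current)
peak_bin = int(np.argmax(spec.mean(axis=1)))
peak_hz = peak_bin * fs / 128               # bin index times frequency resolution
```

With these values the frequency resolution is fs / frame_len = 8 Hz, so the 64 Hz tone falls exactly on bin 8 and the resulting 2D array can be treated as a single-channel image for the downstream model.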

[0016] First, to address the problem of insufficient current fault samples in permanent magnet synchronous motors (PMSMs), this invention employs a conditional generative adversarial network (CGAN) to generate synthetic current signals. Since the current signal is time-series data, and to enhance the network's temporal modeling capability, both the generator and the discriminator use convolutional neural networks (CNNs) in place of the original simple fully connected neural networks, with ReLU and LeakyReLU activation functions. The training results of the conditional generative adversarial network (using a fully connected neural network) and the improved network (using a CNN) are shown in Figure 1. As shown there, the improved conditional generative adversarial network reaches better final losses for both the discriminator and the generator than the original conditional generative adversarial network, and can generate higher-quality current signal samples.
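The sketch below does not train a GAN. It only illustrates, under stated assumptions, how class conditioning is commonly wired into a CGAN and how real and synthetic samples can be merged into a joint dataset. The one-hot conditioning scheme, the noise dimension, and all function names are hypothetical; the document does not specify them.

```python
import numpy as np

LABELS = ["normal", "overcurrent", "undercurrent",
          "current imbalance", "instantaneous current abnormality"]

def one_hot(label_ids, n_classes=len(LABELS)):
    return np.eye(n_classes)[np.asarray(label_ids)]

def generator_input(label_ids, noise_dim=16, rng=None):
    """Concatenate random noise with the one-hot class label (the condition)."""
    rng = np.random.default_rng() if rng is None else rng
    z = rng.standard_normal((len(label_ids), noise_dim))
    return np.concatenate([z, one_hot(label_ids)], axis=1)

def discriminator_input(signals, label_ids):
    """Stack each 1-D signal with label channels broadcast along the time axis."""
    signals = np.asarray(signals, dtype=float)
    cond = one_hot(label_ids)[:, :, None] * np.ones((1, 1, signals.shape[1]))
    return np.concatenate([signals[:, None, :], cond], axis=1)

def build_joint_dataset(real_x, real_y, synth_x, synth_y):
    """Merge real and synthetic samples, keeping a flag for their origin."""
    x = np.concatenate([real_x, synth_x], axis=0)
    y = np.concatenate([real_y, synth_y], axis=0)
    is_synth = np.concatenate([np.zeros(len(real_y), dtype=bool),
                               np.ones(len(synth_y), dtype=bool)])
    return x, y, is_synth

g_in = generator_input([0, 3], rng=np.random.default_rng(0))
d_in = discriminator_input(np.zeros((2, 64)), [1, 4])
x, y, is_synth = build_joint_dataset(np.zeros((3, 64)), np.array([0, 1, 2]),
                                     np.ones((2, 64)), np.array([3, 4]))
```

Keeping the `is_synth` flag is a design choice that lets later evaluation be restricted to real samples if desired.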

[0017] Step 4: Input the time spectrum diagram into the fault diagnosis model for processing to obtain the fault diagnosis result.

[0018] The fault diagnosis model includes an adaptive denoising module, an adaptive convolution module, an improved KAN-BiLSTM model, and a data output module. The adaptive denoising module is connected to the adaptive convolution module, and the adaptive denoising module and the adaptive convolution module are respectively connected to the improved KAN-BiLSTM model. The improved KAN-BiLSTM model is connected to the data output module.

[0019] The Adaptive Denoising Module (ADM) is used to dynamically adjust the noise figure based on the local standard deviation of the input signal and output a multi-channel feature map after noise reduction.

[0020] Because motor current signals are highly susceptible to operating conditions in industrial settings (such as inverter switching noise and load fluctuations), traditional fixed-threshold noise reduction methods struggle to adapt to dynamic conditions, leading to fault characteristics being masked by noise. Therefore, this invention designs an Adaptive Noise Reduction Module (ADM) to reduce noise impact and enhance the model's anti-interference capability. ADM utilizes a dynamic noise reduction intensity adjustment mechanism based on standard deviation to address the problems of excessive or insufficient noise reduction in traditional methods. First, the standard deviation is used as a real-time quantitative indicator of noise intensity. Then, a nonlinear mapping relationship between the noise figure and the standard deviation is established. Finally, feature protection constraints in the adaptive noise reduction filter prevent the loss of fault information due to excessive noise reduction, achieving effective noise suppression while preserving fault characteristics.
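The three stages of the ADM can be sketched in NumPy as follows. The specific choices here are assumptions made for illustration: a sliding-window standard deviation, a tanh mapping from standard deviation to filter strength, a moving-average filter, and a cap `alpha_max` acting as the feature protection constraint. The document does not give the actual mapping function or filter.

```python
import numpy as np

def moving_average(x, win):
    """Centered moving average with edge padding; win must be odd."""
    if win % 2 == 0:
        raise ValueError("win must be odd")
    xp = np.pad(x, win // 2, mode="edge")
    return np.convolve(xp, np.ones(win) / win, mode="valid")

def adaptive_denoise(x, win=9, sigma_ref=0.1, alpha_max=0.8):
    """Blend each sample with its local mean; the blend grows with local std."""
    x = np.asarray(x, dtype=float)
    mean = moving_average(x, win)
    # Local variance from E[x^2] - E[x]^2, clipped at zero for numerical safety.
    var = np.maximum(moving_average(x * x, win) - mean ** 2, 0.0)
    local_std = np.sqrt(var)
    # Nonlinear mapping: the strength saturates at alpha_max, so part of the
    # original sample always survives (a simple feature protection constraint).
    strength = alpha_max * np.tanh(local_std / sigma_ref)
    return (1.0 - strength) * x + strength * mean, strength

rng = np.random.default_rng(0)
t = np.arange(1024)
clean = np.sin(2 * np.pi * 4 * t / 1024)        # hypothetical slow current waveform
noisy = clean + 0.2 * rng.standard_normal(t.size)
denoised, strength = adaptive_denoise(noisy)
mse_before = float(np.mean((noisy - clean) ** 2))
mse_after = float(np.mean((denoised - clean) ** 2))
```

A constant input has zero local standard deviation, so the sketch leaves it unchanged, which matches the idea that filtering strength should follow the measured noise level.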

[0021] The Adaptive Convolutional Module (ACM) extracts convolutional features from the denoised multi-channel feature map using four parallel convolutional operations and an adaptive weighted fusion mechanism, and outputs the convolutional feature map. The advantage of ACM lies in its dynamic adjustment of the receptive field: through a gating mechanism, it dynamically adjusts the contribution of each convolution, allowing the model to focus adaptively on key regions. Combining four different convolutional operations enhances the model's ability to capture different fault features. Because fault data are scarce, the ACM reduces its dependence on training data volume through efficient feature extraction. The ACM includes the following four convolutional layers, each focusing on a different local feature pattern:

1. Cross-center difference convolution: Decouples the local region into two crossing direction groups (horizontal/vertical and diagonal), calculates the difference between the center pixel and the surrounding pixels separately for each, and highlights local contrast features.

[0022] 2. Pixel Differential Convolution: Calculates the difference between pairs of pixels instead of performing convolution operations directly on individual pixels.

[0023] 3. Spectrum Modulation Convolution: The Fourier transform is used to convert the signal from the time domain to the frequency domain, and the multiplication operation in the frequency domain is used instead of the convolution operation in the time domain.

[0024] 4. Standard convolution: Performs conventional local feature extraction.
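The four branches and the gated fusion can be sketched for a single-channel 2D map as below. This is an illustrative reading, not the patented layer definitions: the cross-center branch uses only the horizontal/vertical neighbours, the pixel-difference branch uses the central-difference variant, the spectral branch multiplies by a frequency-domain filter, and the gate is a single linear layer on two summary statistics. All kernels and gate parameters are hypothetical.

```python
import numpy as np

def conv2d_same(x, k):
    """3x3 cross-correlation with zero padding; output has the shape of x."""
    xp = np.pad(x, 1)
    out = np.zeros_like(x, dtype=float)
    for i in range(3):
        for j in range(3):
            out += k[i, j] * xp[i:i + x.shape[0], j:j + x.shape[1]]
    return out

CROSS = np.array([[0, 1, 0], [1, 1, 1], [0, 1, 0]], dtype=float)

def cross_center_diff_conv(x, k):
    """Sum of k_n * (x_n - x_center) over horizontal/vertical neighbours only."""
    km = k * CROSS
    return conv2d_same(x, km) - x * km.sum()

def pixel_diff_conv(x, k):
    """Sum of k_n * (x_n - x_center) over the full 3x3 neighbourhood."""
    return conv2d_same(x, k) - x * k.sum()

def spectral_mod_conv(x, freq_filter):
    """Multiply in the frequency domain instead of convolving in the original domain."""
    return np.real(np.fft.ifft2(np.fft.fft2(x) * freq_filter))

def softmax(z):
    e = np.exp(z - z.max())
    return e / e.sum()

def acm_forward(x, p):
    branches = np.stack([
        cross_center_diff_conv(x, p["k_cross"]),
        pixel_diff_conv(x, p["k_pixel"]),
        spectral_mod_conv(x, p["freq_filter"]),
        conv2d_same(x, p["k_std"]),
    ])
    # Gating network: input statistics -> four fusion weights that sum to 1.
    stats = np.array([x.mean(), x.std()])
    weights = softmax(p["gate_w"] @ stats + p["gate_b"])
    return np.tensordot(weights, branches, axes=1), weights

rng = np.random.default_rng(0)
params = {
    "k_cross": rng.standard_normal((3, 3)), "k_pixel": rng.standard_normal((3, 3)),
    "k_std": rng.standard_normal((3, 3)), "freq_filter": np.ones((8, 8)),
    "gate_w": rng.standard_normal((4, 2)), "gate_b": np.zeros(4),
}
img = rng.standard_normal((8, 8))               # stand-in for a denoised feature map
fused, weights = acm_forward(img, params)
```

On a constant image the two difference branches return zero in the interior, which is the property that makes them respond to local contrast rather than absolute level.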

[0025] The improved KAN-BiLSTM model input feature map includes the joint output of the noise reduction and convolution modules. An improved KAN is introduced to enhance the nonlinear expressive power of the model, and BiLSTM is introduced to capture the bidirectional time dependence of the signal, thereby enhancing the model's feature extraction capability for time series data. The data output module uses the Softmax activation function to generate the classification probability distribution of five types of current signals based on the improved KAN-BiLSTM model.
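To make the bidirectional time dependence concrete, here is a compact NumPy sketch of a BiLSTM pass followed by a Softmax head over the five current-signal classes. The weights are random and untrained, the KAN layers are omitted, and the mean-pooling before the output layer and all dimensions are hypothetical choices.

```python
import numpy as np

LABELS = ["normal", "overcurrent", "undercurrent",
          "current imbalance", "instantaneous current abnormality"]

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def lstm_pass(xs, W, U, b, hidden):
    """Run one LSTM direction over a (T, in_dim) sequence; returns (T, hidden)."""
    h, c, out = np.zeros(hidden), np.zeros(hidden), []
    for x in xs:
        i, f, o, g = np.split(W @ x + U @ h + b, 4)   # gate pre-activations
        c = sigmoid(f) * c + sigmoid(i) * np.tanh(g)   # cell state update
        h = sigmoid(o) * np.tanh(c)                    # hidden state
        out.append(h)
    return np.stack(out)

def bilstm(xs, fwd, bwd, hidden):
    """Concatenate a forward pass with a backward pass re-aligned in time."""
    hf = lstm_pass(xs, *fwd, hidden)
    hb = lstm_pass(xs[::-1], *bwd, hidden)[::-1]
    return np.concatenate([hf, hb], axis=1)

def softmax(z):
    e = np.exp(z - z.max())
    return e / e.sum()

def init(rng, in_dim, hidden):
    return (0.5 * rng.standard_normal((4 * hidden, in_dim)),
            0.5 * rng.standard_normal((4 * hidden, hidden)),
            np.zeros(4 * hidden))

rng = np.random.default_rng(0)
T, in_dim, hidden = 6, 3, 4
fwd, bwd = init(rng, in_dim, hidden), init(rng, in_dim, hidden)
W_out = rng.standard_normal((len(LABELS), 2 * hidden))
xs = rng.standard_normal((T, in_dim))           # stand-in feature sequence
h = bilstm(xs, fwd, bwd, hidden)
probs = softmax(W_out @ h.mean(axis=0))         # class probability distribution
predicted = LABELS[int(np.argmax(probs))]
```

The forward half of the state at the first time step depends only on the first input, while the backward half at the same step has already seen the whole sequence; that asymmetry is what the bidirectional layer contributes.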

[0026] The adaptive noise reduction module performs adaptive noise reduction filtering as follows: The local standard deviation of the input signal or feature map is calculated as a quantitative indicator of noise intensity; the local standard deviation is mapped to the noise figure through a preset nonlinear mapping function; and the intensity of noise reduction filtering is dynamically adjusted based on the noise figure and feature protection constraints.

[0027] The adaptive convolution module includes four parallel convolutional layers and one adaptive weighted fusion unit. The four parallel convolutional layers are: a cross-center differential convolutional layer, a pixel-differential convolutional layer, a spectral modulation convolutional layer, and a standard convolutional layer. The adaptive weighted fusion unit dynamically generates weights based on the input feature map through a gating network, which are used to weight and fuse the output feature maps of the four parallel convolutional layers. The improved KAN-BiLSTM model is implemented as follows: The activation function of KAN contains two learnable weight parameters, w1 and w2, which are optimized during training. So that the combined weights depend on the input, a gating network is introduced to generate w1 and w2 dynamically from the input x. To keep the function smooth and continuous and to simplify parameter setting, the constraint w1 + w2 = 1 is imposed on w1 and w2, and the dynamic generation is performed by a multilayer perceptron (MLP). A cross-entropy loss function is used, with a regularization term added to the cross-entropy loss to promote smoothness. A measure of inter-class separability is also introduced so that fault samples of the same class cluster in the feature space while fault samples of different classes are clearly separated, improving the ability to identify faults from small samples.
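The training objective described here combines a cross-entropy term, a smoothness regularizer, and an inter-class separability measure. The sketch below is one possible form under explicit assumptions: the separability measure is taken to be a Fisher-style ratio of within-class to between-class scatter, the regularizer is passed in as a plain number because its exact form is not given in the text, and the weights `lam_reg` and `lam_sep` are hypothetical.

```python
import numpy as np

def cross_entropy(probs, labels):
    """Mean negative log-probability of the true class."""
    probs = np.asarray(probs, dtype=float)
    picked = probs[np.arange(len(labels)), labels]
    return float(-np.mean(np.log(picked + 1e-12)))

def separability_penalty(features, labels):
    """Within-class scatter divided by between-class scatter (smaller is better)."""
    features = np.asarray(features, dtype=float)
    labels = np.asarray(labels)
    classes = np.unique(labels)
    centers = np.stack([features[labels == c].mean(axis=0) for c in classes])
    within = np.mean([np.sum((features[labels == c] - centers[i]) ** 2, axis=1).mean()
                      for i, c in enumerate(classes)])
    between = np.mean(np.sum((centers - features.mean(axis=0)) ** 2, axis=1))
    return float(within / (between + 1e-12))

def total_loss(probs, features, labels, reg=0.0, lam_reg=1e-3, lam_sep=0.1):
    """Cross-entropy plus a caller-supplied regularizer plus the separability term."""
    return (cross_entropy(probs, labels) + lam_reg * reg
            + lam_sep * separability_penalty(features, labels))

labels = np.array([0, 0, 1, 1])
tight = np.array([[0.0, 0.0], [0.1, 0.0], [5.0, 5.0], [5.1, 5.0]])
loose = np.array([[0.0, 0.0], [4.0, 4.0], [1.0, 1.0], [5.0, 5.0]])
probs = np.eye(2)[labels]                        # perfectly confident predictions
loss_tight = total_loss(probs, tight, labels)
loss_loose = total_loss(probs, loose, labels)
```

With identical predictions, the compact and well-separated features receive the lower loss, which is the behaviour the separability term is meant to encourage.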

[0028] For the improved KAN-BiLSTM small-sample fault diagnosis method with adaptive convolution and noise reduction described above, the ablation experiments on the ACM and ADM modules shown in Figure 3, the cross-validation shown in Figure 4, and the noise robustness verification shown in Figure 5 demonstrate the reliability of the dual-drive solution of data generation and feature enhancement constructed in this patent for the engineering challenges of PMSM current fault diagnosis, namely few samples, high confusion, and strong noise.

[0029] As shown in Figure 3, adding the ACM module alone improved the accuracy by 2.36% and the precision by 3.25%, indicating that the ACM module effectively enhances the model's feature extraction capability by adaptively adjusting the dynamic receptive field. Adding the ADM module alone improved accuracy by 1.85% and precision by 2.36%, indicating that the ADM module effectively suppresses noise interference through its adaptive noise reduction mechanism, thereby improving the robustness of the model. Adding both modules simultaneously (i.e., the complete model) achieved the best performance, with an accuracy of 98.35% and a loss of 0.0021. This represents a significant improvement over the single-module configurations, with accuracy increases of 2.13% and 1.97%, respectively. This demonstrates that the combined effect of the ACM and ADM modules improves the model's performance more comprehensively.

[0030] In summary, the analysis shows that ACM, by dynamically adjusting the receptive field, enables the model to adapt to feature changes at different scales, thereby enhancing the model's feature capture capability. Meanwhile, the ADM module, through its adaptive denoising mechanism, effectively filters noise interference in the input data, achieving both fault feature preservation and noise suppression, providing cleaner data input for subsequent processing. The combination of these two technologies enhances both the model's feature extraction capability and its noise resistance, thus improving the overall performance of the model.

[0031] As shown in Figure 4, under stratified 5-fold cross-validation, the stability measure of the improved KAN-BiLSTM model improved from 1.29 to 0.81 relative to the basic model, and the complete model achieved the best robustness. The mean accuracy of 98.03% and the confidence interval of [97.72%, 98.34%] in the cross-validation demonstrate the reliability of the proposed model.
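A standard-library sketch of class-balanced 5-fold splitting and of summarising fold accuracies as a mean with a 95% confidence interval. The interval uses the Student t critical value for four degrees of freedom (2.776); how the interval in this document was actually computed is not stated. The fold accuracies in the example are made-up numbers for illustration only, not results from Figure 4.

```python
import random
import statistics
from collections import defaultdict

def stratified_folds(labels, k=5, seed=0):
    """Deal the indices of each class round-robin into k folds."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for idx, y in enumerate(labels):
        by_class[y].append(idx)
    folds = [[] for _ in range(k)]
    for idxs in by_class.values():
        rng.shuffle(idxs)
        for pos, idx in enumerate(idxs):
            folds[pos % k].append(idx)
    return folds

def mean_and_ci(scores, t_crit=2.776):
    """Mean and 95% t-interval; t_crit = 2.776 is valid for exactly 5 scores."""
    m = statistics.mean(scores)
    half = t_crit * statistics.stdev(scores) / len(scores) ** 0.5
    return m, (m - half, m + half)

labels = [c for c in range(5) for _ in range(10)]     # 5 classes, 10 samples each
folds = stratified_folds(labels)
per_fold_class_counts = [[sum(1 for i in fold if labels[i] == c) for c in range(5)]
                         for fold in folds]

example_acc = [97.0, 98.0, 99.0, 98.0, 98.0]          # hypothetical fold accuracies
mean_acc, (ci_low, ci_high) = mean_and_ci(example_acc)
```

Each fold ends up with the same number of samples per class, so every validation split sees all five fault categories even when the dataset is small.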

[0032] As shown in Figure 5, the complete model exhibits the best clustering performance across all noise levels. Even at a high noise level (SNR = -30 dB), different fault categories are clearly separated, and samples within the same category at different noise levels cluster together (exhibiting invariance to noise).
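Noise robustness tests of this kind require signals corrupted to a chosen signal-to-noise ratio. The sketch below assumes additive white Gaussian noise scaled from the measured signal power; the document does not say which noise model was used, and the test waveform is hypothetical.

```python
import numpy as np

def add_noise_at_snr(signal, snr_db, rng=None):
    """Add white Gaussian noise so that the expected SNR equals snr_db."""
    rng = np.random.default_rng() if rng is None else rng
    signal = np.asarray(signal, dtype=float)
    p_signal = np.mean(signal ** 2)
    # SNR_dB = 10 * log10(P_signal / P_noise), solved for P_noise.
    p_noise = p_signal / (10.0 ** (snr_db / 10.0))
    noise = rng.standard_normal(signal.shape) * np.sqrt(p_noise)
    return signal + noise, noise

def measured_snr_db(signal, noise):
    return float(10.0 * np.log10(np.mean(np.square(signal)) / np.mean(np.square(noise))))

rng = np.random.default_rng(0)
t = np.arange(100_000) / 1000.0
current = np.sin(2 * np.pi * 50 * t)            # hypothetical current waveform
noisy, noise = add_noise_at_snr(current, snr_db=-30.0, rng=rng)
snr_check = measured_snr_db(current, noise)
```

At -30 dB the noise power is 1000 times the signal power, which gives a sense of how severe the highest noise level in the text is.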

[0033] The synergy between the ACM and ADM modules is crucial. The ADM module first denoises the input signal, then the ACM module further extracts robust features, and finally, BiLSTM performs temporal modeling. The combination of these two modules reduces noise interference while enhancing feature representation capabilities.

[0034] It should be emphasized that the embodiments described in this invention are illustrative rather than limiting. Therefore, this invention includes, but is not limited to, the embodiments described in the specific implementation. Any other implementations derived by those skilled in the art based on the technical solutions of this invention are also within the scope of protection of this invention.

Claims

1. An improved KAN-BiLSTM fault diagnosis method with adaptive convolution and noise reduction, characterized in that it includes the following steps:
Step 1: Collect real fault signal data of the target device;
Step 2: Use an improved conditional generative adversarial network to augment the real fault signal data, generate synthetic fault signal data, and construct a joint dataset containing real and synthetic data;
Step 3: Preprocess the signal data in the joint dataset and convert it into a time-spectrum graph;
Step 4: Input the time-spectrum graph into the fault diagnosis model for processing to obtain the fault diagnosis result.

2. The improved KAN-BiLSTM small-sample fault diagnosis method with adaptive convolution and noise reduction according to claim 1, characterized in that: In step 3, the improved conditional generative adversarial network uses a convolutional neural network structure for both the generator and the discriminator. The original signals in the joint dataset are transformed into a 2D time-spectrum graph through a short-time Fourier transform, and the output consists of five labels: normal, overcurrent, undercurrent, current imbalance, and instantaneous current anomaly.

3. The improved KAN-BiLSTM small-sample fault diagnosis method with adaptive convolution and denoising according to claim 1, characterized in that: The fault diagnosis model in step 4 includes an adaptive noise reduction module, an adaptive convolution module, an improved KAN-BiLSTM model, and a data output module. The adaptive noise reduction module is connected to the adaptive convolution module, the adaptive noise reduction module and the adaptive convolution module are respectively connected to the improved KAN-BiLSTM model, and the improved KAN-BiLSTM model is connected to the data output module. The adaptive noise reduction module is used to dynamically adjust the noise figure based on the local standard deviation of the input signal and output a multi-channel feature map after noise reduction. The adaptive convolution module is used to extract convolutional features from the denoised multi-channel feature map through four parallel convolution operations and an adaptive weighted fusion mechanism, and output the convolutional feature map. The improved KAN-BiLSTM model input feature map includes the joint output of the noise reduction and convolution modules. An improved KAN is introduced to enhance the nonlinear expressive power of the model, and BiLSTM is introduced to capture the bidirectional time dependence of the signal, thereby enhancing the model's feature extraction capability for time series data. The data output module uses the Softmax activation function to generate the classification probability distribution of five types of current signals based on the improved KAN-BiLSTM model.

4. The improved KAN-BiLSTM small-sample fault diagnosis method with adaptive convolution and noise reduction according to claim 1, characterized in that: The adaptive noise reduction module performs adaptive noise reduction filtering as follows: The local standard deviation of the input signal or feature map is calculated as a quantitative indicator of noise intensity; the local standard deviation is mapped to the noise figure through a preset nonlinear mapping function; and the intensity of noise reduction filtering is dynamically adjusted based on the noise figure and feature protection constraints.

5. The improved KAN-BiLSTM small-sample fault diagnosis method with adaptive convolution and noise reduction according to claim 2, characterized in that: The adaptive convolution module includes four parallel convolutional layers and one adaptive weighted fusion unit; The four parallel convolutional layers include: a cross-center differential convolutional layer, a pixel differential convolutional layer, a spectral modulation convolutional layer, and a standard convolutional layer; The adaptive weighted fusion unit uses a gating network to dynamically generate weights based on the input feature map, which are then used to weight and fuse the output feature maps of the four parallel convolutional layers.

6. The improved KAN-BiLSTM small-sample fault diagnosis method with adaptive convolution and denoising according to claim 3, characterized in that: The improved KAN-BiLSTM model is implemented as follows: The activation function of KAN contains two learnable weight parameters, w1 and w2, which are optimized during training. So that the combined weights depend on the input, a gating network is introduced to generate w1 and w2 dynamically from the input x. To keep the function smooth and continuous and to simplify parameter setting, the constraint w1 + w2 = 1 is imposed on w1 and w2, and the dynamic generation is performed by a multilayer perceptron (MLP). A cross-entropy loss function is used, with a regularization term added to the cross-entropy loss to promote smoothness. A measure of inter-class separability is also introduced so that fault samples of the same class cluster in the feature space while fault samples of different classes are clearly separated, improving the ability to identify faults from small samples.