A wind turbine fault diagnosis method based on TSA-xLSTM-CL deep learning neural network

By using the TSA-xLSTM-CL deep learning neural network, combined with TSA feature extraction and xLSTM long-term dependency modeling, the problem of complex pattern recognition in wind turbine fault diagnosis is solved, achieving efficient and accurate fault diagnosis, reducing false alarm rate and false negative rate, and improving the stability and operating efficiency of wind power system.

CN122196644A · Pending · Publication Date: 2026-06-12 · NORTH CHINA ELECTRIC POWER UNIV +1

Patent Information

Authority / Receiving Office
CN · China
Patent Type
Applications(China)
Current Assignee / Owner
NORTH CHINA ELECTRIC POWER UNIV
Filing Date
2024-12-05
Publication Date
2026-06-12

AI Technical Summary

Technical Problem

Existing fault diagnosis methods for wind turbine generators rely on human experience and traditional sensors, which struggle to cope with complex and ever-changing fault modes. They have high false alarm and missed alarm rates and lack vibration monitoring and fault diagnosis functions for key components such as gearbox bearings. Traditional statistical methods also adapt poorly to complex nonlinear relationships.

Method used

A fault diagnosis method based on TSA-xLSTM-CL deep learning neural network is adopted. Combining TSA feature extraction and xLSTM long-term dependency modeling capabilities, it automatically identifies complex fault modes through multi-dimensional feature extraction and continuous learning, thereby improving diagnostic accuracy and robustness.

Benefits of technology

It significantly improves the accuracy of wind turbine fault diagnosis, reduces false alarms and missed alarms, and ensures the stability and operating efficiency of wind power systems.

✦ Generated by Eureka AI based on patent content.

Patent Text Reader

Abstract

The application discloses a wind turbine fault diagnosis method based on a TSA-xLSTM-CL deep learning neural network, and belongs to the field of wind turbine fault diagnosis. The main steps of the fault diagnosis method are as follows: first, raw data on the wind turbine operating state are acquired, preprocessed, and feature-expanded to serve as pre-training data; second, a base TSA-xLSTM deep learning neural network model is established and pre-trained with the pre-training data; finally, after the actually collected data are processed, the TSA-xLSTM-CL deep learning model is trained multiple times using continuous learning and put into actual use. By using the TSA-xLSTM-CL deep learning neural network, dynamic features can be automatically extracted from time series data, the model is continuously optimized through continuous learning, and the ability to recognize new fault modes is improved.

Description

Technical Field

[0001] This invention relates to the field of wind turbine generator fault diagnosis technology, specifically to a wind turbine generator fault diagnosis method based on a TSA-xLSTM-CL deep learning neural network.

Background Technology

[0002] As a clean energy source, wind energy is becoming increasingly important in the context of the current global energy transition, leading to the rapid development of wind power generation. However, wind turbine generators operate in variable climates and harsh environments, especially critical components such as gearbox bearings, which are subject to high failure rates due to extreme weather conditions such as strong winds, snow, and lightning. These failures not only reduce power generation efficiency but also increase maintenance costs, severely impacting the economic benefits of wind farms. Therefore, effective fault monitoring and diagnosis of wind turbine generators are crucial for ensuring stable operation, improving power generation efficiency, and reducing maintenance costs.

[0003] Currently, the main challenges in fault diagnosis for wind turbine generators include: First, because wind turbine generators are mostly located in remote areas with harsh environments, traditional fault diagnosis methods rely on human experience and traditional sensor technology, making it difficult to cope with complex and ever-changing fault modes, and resulting in high false alarm and false negative rates. Second, while existing SCADA systems can monitor signals such as current, voltage, and power of wind turbine generators, they lack vibration monitoring and fault diagnosis capabilities for critical components such as gearbox bearings. Furthermore, there is a complex nonlinear relationship between the fault characteristics and faults in wind turbine generators, making it difficult for traditional statistical methods to adapt to the complex and ever-changing operating conditions.

[0004] To address these challenges, a fault diagnosis method based on the TSA-xLSTM-CL deep learning neural network demonstrates significant advantages. This method combines TSA feature extraction technology with the long-term modeling capabilities and continuous learning mechanism of xLSTM. Through multi-dimensional feature extraction, it can more accurately identify complex fault modes, improving diagnostic accuracy. Simultaneously, the TSA-xLSTM-CL deep learning neural network leverages its powerful memory and state tracking capabilities to effectively handle long-term dependencies and complex data patterns in wind turbine fault data, providing reliable support for real-time diagnosis. The introduction of continuous learning technology allows the model to continuously optimize with new data and environmental changes, maintaining efficient and accurate diagnostic capabilities. The data-driven nature of the TSA-xLSTM-CL deep learning neural network can efficiently analyze and extract information from massive, multi-source, and high-dimensional data, adapting to the dynamic uncertainties and coupling in complex systems, providing an efficient and accurate solution for intelligent fault diagnosis of wind turbines. Through the application of this method, the accuracy of wind turbine fault diagnosis has been significantly improved, and the false alarm and false negative rates have been effectively reduced, providing a strong guarantee for the stability and operational efficiency of wind power systems.

Summary of the Invention

[0005] This invention proposes a wind turbine fault diagnosis method based on a TSA-xLSTM-CL deep learning neural network. By combining SCADA data over a certain period with its sample entropy and power spectral entropy to form a multi-dimensional data sequence, and utilizing the TSA-xLSTM-CL deep learning neural network—combining the self-attention mechanism of the TSA module and the long-term dependency modeling capability of xLSTM—dynamic features can be automatically extracted from time-series data. Through continuous learning, the model is continuously optimized, improving the ability to identify new fault modes. This method not only increases diagnostic accuracy but also possesses strong robustness and scalability, enabling it to handle complex fault diagnosis tasks of wind turbines under different operating environments.

[0006] To achieve the above objectives, the main steps of the technical solution adopted in this invention are as follows:

[0007] Step (1): Obtain the raw SCADA data of the wind turbine's operating status, preprocess the data, and perform feature expansion to use it as pre-training data;

[0008] Step (2): Establish a wind turbine fault diagnosis model based on TSA-xLSTM deep learning neural network, and train it using pre-training data to obtain a pre-trained model for wind turbine fault diagnosis.

[0009] Step (3): After processing the actual collected data, input it into the pre-trained model for fault diagnosis analysis, output the wind turbine fault diagnosis results, use the continuous learning method to incrementally train the pre-trained model for wind turbine fault diagnosis, and use multiple performance indicators for comprehensive evaluation. Finally, put the TSA-xLSTM-CL deep learning neural network model with the best performance into use.

[0010] Furthermore, in step (1), the SCADA raw data mainly includes feature vector raw data and fault label raw data; the data preprocessing method is to normalize the feature vector raw data, scaling the values to the range [0,1]; feature expansion calculates the sample entropy and power spectrum entropy over a sliding window and uses them as two new feature vectors, which are combined with the preprocessed data to form a multidimensional time series, and each set of time series data corresponds to one fault label.

[0011] Furthermore, in step (2), the TSA-xLSTM deep learning neural network model mainly consists of a feature extraction module, a feature processing module, and a fault classification module; the feature extraction module is a TSA module, the feature processing module is an xLSTM module, and the fault classification module mainly uses fully connected layers and the softmax function; the pre-training operation is to initialize the model, set the hyperparameters reasonably, initialize the network weights by stacking, and train using the training set from step (1) to obtain the pre-trained model.

[0012] Furthermore, in step (3), the training data is input into the TSA-xLSTM-CL deep learning neural network. The operating data passes through the feature extraction module and the feature processing module of the TSA-xLSTM-CL deep learning neural network in sequence, and finally through the fault classification module, which outputs the predicted fault label. A cross-entropy loss function that measures the discrepancy between the true label y and the predicted label ŷ is used as the loss function for model training; the model training method is a continuous learning method. If a fault label that the model has not seen before appears during the training process, the fault data is saved and the model is retrained and optimized with the new data as in step (2).

[0013] Furthermore, in step (3), the TSA-xLSTM-CL deep learning neural network model is trained multiple times, and after each training session, multiple evaluation metrics, including accuracy, exact matching, F1 score, and Hamming loss, are used to comprehensively evaluate the model. The TSA-xLSTM-CL deep learning neural network model with the best performance metrics is then used as the fault diagnosis model for wind turbine units and put into practical application.

[0014] The beneficial effects achieved by this invention are as follows: It proposes a wind turbine fault diagnosis method using a TSA-xLSTM-CL deep learning neural network. By utilizing wind turbine operating status data, feature expansion is performed and training data is generated. The TSA-xLSTM-CL deep learning neural network model is used to efficiently extract features from the operating data and integrate the features, thereby improving the accuracy of wind turbine fault diagnosis and detection.

Attached Figure Description

[0015] Figure 1 is an overall flowchart of the method of the present invention;

[0016] Figure 2 is a structural diagram of the wind turbine fault diagnosis model of the present invention.

Detailed Implementation

[0017] The present invention will now be described in detail with reference to the accompanying drawings, and specific operating methods and implementation steps will be provided:

[0018] Step (1): The specific steps for obtaining the training data for the TSA-xLSTM-CL deep learning neural network are as follows:

[0019] Step (1.1): Obtain the original feature vector data and fault label data from the original SCADA data to establish the original dataset $D_o = \{(x_i, y_i) \mid i = 1, 2, \ldots, N\}$, where N is the size of the dataset, $x_i$ is the original multidimensional feature-vector time series of wind turbine operation, $y_i$ is the fault label vector corresponding to $x_i$, and $n_L$ is the length of the fault label vector;

[0020] Step (1.2): Normalize the original feature vector data using the following formula: $x'_i = \dfrac{x_i - x_{min}}{x_{max} - x_{min}}$

[0021] Where $x_i$ is an element in the original data of the feature vector, and $x_{min}$ and $x_{max}$ are the minimum and maximum values of the corresponding feature across all data, respectively.
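
The normalization in step (1.2) can be illustrated with a minimal Python sketch. The function name `min_max_normalize`, the sample values, and the handling of a constant feature are assumptions made for illustration; they do not come from the patent text.

```python
def min_max_normalize(columns):
    """Scale each feature column to [0, 1] using that column's own min and max."""
    normalized = []
    for col in columns:
        lo, hi = min(col), max(col)
        span = hi - lo
        # Assumption: a constant feature (span == 0) is mapped to 0.0
        # to avoid dividing by zero.
        normalized.append([(v - lo) / span if span else 0.0 for v in col])
    return normalized


# Hypothetical SCADA features: wind speed (m/s) and active power (kW).
features = [[3.0, 6.0, 9.0, 12.0], [100.0, 400.0, 900.0, 1500.0]]
scaled = min_max_normalize(features)
```

Each feature is scaled independently, so signals with very different physical units end up on a comparable range before the entropy features are added.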

[0022] Step (1.3): Perform feature expansion on the data by adding sample entropy and power spectrum entropy.

[0023] The steps for calculating sample entropy are as follows:

[0024] 1) Reconstruct the original time series into an m-dimensional vector sequence. Each vector is: $X_m(i) = \{x(i), x(i+1), \ldots, x(i+m-1)\}, \; 1 \le i \le N-m+1$

[0025] Where m is the embedding dimension and i is the starting position of the window.

[0026] 2) The similarity between two vectors is measured using the maximum difference between their corresponding elements. It is defined as: $d[X_m(i), X_m(j)] = \max_{k}\,|x(i+k) - x(j+k)|, \; k = 0, 1, 2, \ldots, m-1$

[0027] 3) For each reconstructed vector $X_m(i)$, calculate its distance to every other vector $X_m(j)$, count the number $B_i$ of vectors whose distance is less than or equal to a given threshold r, and normalize the count to obtain the similarity, defined as:

[0028] 4) Averaging over all $X_m(i)$ yields the average similarity of the entire time series at the given threshold r, defined as:

[0029] 5) Increase the sequence reconstruction dimension to m+1, and repeat the calculation according to the steps above:

[0030] 6) The formula for calculating sample entropy is:
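
The formulas for steps 3) to 6) are not reproduced in the text, so the following Python sketch falls back on the commonly used definition of sample entropy, SampEn = -ln(A/B), where B and A count template pairs that match at lengths m and m+1. The function name, the default values of `m` and `r`, the use of an absolute threshold, and the choice of N - m templates for both lengths are assumptions and may differ from the patent's exact indexing.

```python
import math


def sample_entropy(x, m=2, r=0.2):
    """Sample entropy of a 1-D series, using the Chebyshev distance between templates."""
    n = len(x)

    def count_matches(dim):
        # Use n - m templates for both dimensions so the two counts are comparable.
        templates = [x[i:i + dim] for i in range(n - m)]
        count = 0
        for i in range(len(templates)):
            for j in range(i + 1, len(templates)):  # self-matches are excluded
                dist = max(abs(a - b) for a, b in zip(templates[i], templates[j]))
                if dist <= r:
                    count += 1
        return count

    b = count_matches(m)      # matching pairs at length m
    a = count_matches(m + 1)  # matching pairs at length m + 1
    if a == 0 or b == 0:
        return float("inf")   # undefined when no matches are found
    return -math.log(a / b)


# A perfectly periodic series is fully predictable, so its sample entropy is zero.
periodic = [1.0, 2.0] * 10
periodic_entropy = sample_entropy(periodic, m=2, r=0.1)
```

In a sliding-window setting, this function would be called once per window, producing one sample entropy value per window as an extra feature.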

[0031] The steps for calculating the power spectral entropy are as follows:

[0032] 1) Perform a discrete Fourier transform on the time series x(n) to obtain the frequency domain representation X(f);

[0033] 2) Calculate the power spectrum $|X(f)|^2$ for each frequency component;

[0034] 3) Normalize the power spectrum and calculate the relative power at each frequency:

[0035] 4) The formula for calculating the power spectral entropy is:
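
The four steps can be sketched in Python as below. Because the normalization and entropy formulas are not reproduced in the text, the sketch assumes the usual definitions: relative power p(f) = |X(f)|² / Σ|X(f)|² and entropy H = -Σ p(f) ln p(f). The function name, the use of only the non-negative frequencies, and the natural logarithm are assumptions.

```python
import cmath
import math


def power_spectral_entropy(x):
    """Shannon entropy of the normalized power spectrum of a real-valued series."""
    n = len(x)
    power = []
    # Naive DFT over the non-negative frequency bins 0 .. n // 2.
    for k in range(n // 2 + 1):
        coeff = sum(x[t] * cmath.exp(-2j * cmath.pi * k * t / n) for t in range(n))
        power.append(abs(coeff) ** 2)
    total = sum(power)
    if total == 0:
        return 0.0  # an all-zero signal carries no spectral information
    rel = [p / total for p in power]
    return -sum(p * math.log(p) for p in rel if p > 0)


# A pure tone concentrates its power in one bin (entropy near 0);
# a single impulse spreads power evenly over all bins (maximum entropy).
tone = [math.sin(2 * math.pi * 3 * t / 32) for t in range(32)]
impulse = [1.0] + [0.0] * 31
tone_entropy = power_spectral_entropy(tone)
impulse_entropy = power_spectral_entropy(impulse)
```

The two test signals bracket the range of the feature: a narrow-band vibration gives a value near zero, while a broadband signal approaches the logarithm of the number of frequency bins.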

[0036] Step (1.4): Assemble the processed data into a training set $D = \{(x'_i, \mathrm{SampEn}_i, P(f)_i, y_i) \mid i = 1, 2, \ldots, N_T\}$

[0037] Where $x'_i$ is the normalized original data, $\mathrm{SampEn}_i$ is the sample entropy, $P(f)_i$ is the power spectral entropy, $y_i$ is the fault label corresponding to the operating data, and $N_T$ is the size of the training set.

[0038] Step (2): The specific steps for establishing the TSA-xLSTM deep learning neural network model are as follows:

[0039] Step (2.1): As shown in Figure 2, the model includes a feature extraction module, a feature processing module, and a fault classification module. The forward propagation calculation process of each module is as follows:

[0040] The feature extraction module consists of DSW embedding and a TSA layer. DSW embedding is the first step in feature extraction; it divides the multivariate time series data into multiple segments and embeds them into a two-dimensional vector array. The formula is: $x^{(s)}_{i,d} = \{x_{t,d} \mid (i-1) \times L_{seg} < t \le i \times L_{seg}\}$

[0041] Where $x^{(s)}_{i,d}$ represents the data of the i-th segment in dimension d, and $L_{seg}$ is the length of the segment.
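
The segmentation part of DSW embedding can be sketched in Python as follows. The function name and the toy data are assumptions, the sketch uses 0-based indices where the formula is 1-based, and the learned projection that turns each segment into an embedding vector is deliberately left out.

```python
def dsw_segments(series, seg_len):
    """Split a multivariate series into per-dimension segments.

    series: list of T time steps, each a list of D feature values.
    Returns segments[i][d], the values of dimension d in the i-th segment.
    """
    n_steps = len(series)
    n_dims = len(series[0])
    if n_steps % seg_len != 0:
        # Assumption: the series length is a multiple of the segment length.
        raise ValueError("series length must be divisible by seg_len")
    n_seg = n_steps // seg_len
    return [
        [[series[t][d] for t in range(i * seg_len, (i + 1) * seg_len)]
         for d in range(n_dims)]
        for i in range(n_seg)
    ]


# Toy series with T = 6 time steps and D = 2 dimensions.
toy_series = [[t, 10 * t] for t in range(6)]
segments = dsw_segments(toy_series, seg_len=3)
```

The result is a two-dimensional grid indexed by segment and by variable, which is the layout the cross-time and cross-dimension attention stages operate on.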

[0042] The TSA layer consists of two stages: cross-time attention and cross-dimensional attention.

[0043] In the cross-time stage, multi-head self-attention is applied to each dimension:

[0044] Where $Z_{:,d}$ represents the data at all time steps along dimension d, $\mathrm{MSA}^{time}$ represents the multi-head self-attention mechanism, and MLP represents a multilayer perceptron.

[0045] In the cross-dimension stage, a "router" mechanism is used to reduce computational complexity:

[0046] Where R represents the router vector, B represents the aggregated message, and $\mathrm{MSA}^{dim}_1$ and $\mathrm{MSA}^{dim}_2$ represent the two self-attention operations in the cross-dimension stage.

[0047] After these two stages, the TSA layer has attended across both time and dimension to obtain the final feature representation. The formula is as follows:

[0048] Where the output is the fused features, and MLP stands for multilayer perceptron, which is used for further processing and feature fusion.
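
The formulas of the two stages are not reproduced in the text. For orientation, the two-stage attention layer published with the Crossformer model, which also uses DSW embedding and a router mechanism, is usually written as below. Treating this as the form intended here is an assumption, as are the LayerNorm operations and the hat and bar notation for intermediate results.

```latex
\begin{aligned}
% Cross-time stage: self-attention along the time axis of each dimension d
\hat{Z}^{time}_{:,d} &= \mathrm{LayerNorm}\big(Z_{:,d} + \mathrm{MSA}^{time}(Z_{:,d}, Z_{:,d}, Z_{:,d})\big) \\
Z^{time} &= \mathrm{LayerNorm}\big(\hat{Z}^{time} + \mathrm{MLP}(\hat{Z}^{time})\big) \\
% Cross-dimension stage: routers R aggregate messages B, then redistribute them
B_{i,:} &= \mathrm{MSA}^{dim}_{1}\big(R_{i,:}, Z^{time}_{i,:}, Z^{time}_{i,:}\big) \\
\bar{Z}^{dim}_{i,:} &= \mathrm{MSA}^{dim}_{2}\big(Z^{time}_{i,:}, B_{i,:}, B_{i,:}\big) \\
% Fusion of both stages into the final representation
\hat{Z}^{dim} &= \mathrm{LayerNorm}\big(Z^{time} + \bar{Z}^{dim}\big) \\
Z^{dim} &= \mathrm{LayerNorm}\big(\hat{Z}^{dim} + \mathrm{MLP}(\hat{Z}^{dim})\big)
\end{aligned}
```

With D dimensions and a small fixed number c of routers per time segment, the cross-dimension stage costs on the order of D·c attention scores per segment instead of D², which is the complexity reduction the router mechanism refers to.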

[0049] The feature processing module is mainly composed of xLSTM. The feature processing of xLSTM is mainly accomplished through its internal sLSTM and mLSTM blocks.

[0050] The feature processing of sLSTM is accomplished using the following formula:

[0051] 1) Exponential input gate: $i_t = \exp(\tilde{i}_t)$, $\tilde{i}_t = w_i^{\top} x_t + r_i h_{t-1} + b_i$

[0052] Where $\tilde{i}_t$ is the pre-activation value of the input gate, $w_i$ is the input weight, $x_t$ is the current input, $r_i$ is the recurrent weight, $h_{t-1}$ is the hidden state of the previous time step, and $b_i$ is a bias term.

[0053] 2) Exponential forget gate: $f_t = \sigma(\tilde{f}_t)$ or $f_t = \exp(\tilde{f}_t)$

[0054] Where $\tilde{f}_t$ is the pre-activation value of the forget gate.

[0055] 3) Memory unit update: $c_t = f_t c_{t-1} + i_t z_t$

[0056] Where $z_t$ is the candidate memory value for the current time step, computed with the activation function φ, usually tanh.

[0057] 4) Normalized state update: $n_t = f_t n_{t-1} + i_t$

[0058] Where the normalized state $n_t$ is used to adjust the output in the hidden state.

[0059] 5) Hidden state update: $h_t = o_t \, c_t / n_t$

[0060] Where $o_t$ is the output gate, which controls the output of the memory cell to the hidden state.

[0061] To address the potential numerical instability caused by exponential gating, sLSTM introduces a state variable $m_t$. The formulas are: $m_t = \max(\log(f_t) + m_{t-1}, \log(i_t))$, $i'_t = \exp(\log(i_t) - m_t)$, $f'_t = \exp(\log(f_t) + m_{t-1} - m_t)$

[0062] Where $i'_t$ and $f'_t$ are the stabilized input gate and forget gate.
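
A scalar Python sketch of one sLSTM time step shows how the stabilizer state keeps the exponential gates from overflowing. It assumes a single memory cell, an exponential forget gate, a sigmoid output gate, and a tanh candidate; the function name `slstm_step`, the parameter dictionary, and its keys are hypothetical.

```python
import math


def sigmoid(a):
    """Numerically stable logistic function."""
    if a >= 0:
        return 1.0 / (1.0 + math.exp(-a))
    e = math.exp(a)
    return e / (1.0 + e)


def slstm_step(x, h_prev, c_prev, n_prev, m_prev, p):
    """One stabilized sLSTM step for a scalar input and a scalar cell."""
    # Gate pre-activations; with exponential gates, log(i_t) and log(f_t)
    # are simply these pre-activation values.
    log_i = p["w_i"] * x + p["r_i"] * h_prev + p["b_i"]
    log_f = p["w_f"] * x + p["r_f"] * h_prev + p["b_f"]
    z = math.tanh(p["w_z"] * x + p["r_z"] * h_prev + p["b_z"])
    o = sigmoid(p["w_o"] * x + p["r_o"] * h_prev + p["b_o"])
    # Stabilizer state: m_t = max(log f_t + m_{t-1}, log i_t).
    m = max(log_f + m_prev, log_i)
    i_stab = math.exp(log_i - m)           # stabilized input gate, at most 1
    f_stab = math.exp(log_f + m_prev - m)  # stabilized forget gate, at most 1
    c = f_stab * c_prev + i_stab * z       # memory cell update
    n = f_stab * n_prev + i_stab           # normalizer state update
    h = o * c / n                          # hidden state
    return h, c, n, m


# Hypothetical parameters; an input of 1000 would overflow a plain exp().
params = {key: 1.0 for key in
          ("w_i", "r_i", "b_i", "w_f", "r_f", "b_f",
           "w_z", "r_z", "b_z", "w_o", "r_o", "b_o")}
h1, c1, n1, m1 = slstm_step(1000.0, 0.0, 0.0, 0.0, 0.0, params)
```

Because both stabilized gates are scaled by the same factor, the ratio of memory cell to normalizer, and therefore the hidden state, is unchanged by the stabilization.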

[0063] The feature processing of mLSTM is accomplished using the following formula:

[0064] 1) The core feature of mLSTM is the use of a matrix memory unit, which can store and process more complex information. The update formulas for the matrix memory unit are: $C_t = f_t C_{t-1} + i_t v_t k_t^{\top}$, $v_t = W_v x_t + b_v$

[0065] Where $C_t$ is the matrix memory unit of the current time step, $f_t$ is the forget gate, $i_t$ is the input gate, $v_t$ is the value vector, $k_t$ is the key vector, and $W_v$ and $W_k$ are weight matrices.

[0066] 2) Exponential input gate: $i_t = \exp(\tilde{i}_t)$

[0067] 3) Exponential forget gate: $f_t = \sigma(\tilde{f}_t)$ or $f_t = \exp(\tilde{f}_t)$, where $\tilde{i}_t$ and $\tilde{f}_t$ are the pre-activation values of the input gate and the forget gate.

[0068] 4) Normalized state update: $n_t = f_t n_{t-1} + i_t k_t$

[0069] 5) Hidden state update: $q_t = W_q x_t + b_q$

[0070] Where $o_t$ is the output gate, $q_t$ is the query vector, and $W_q$ is a weight matrix.
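
The matrix memory can be illustrated with a small pure-Python sketch of one mLSTM step. The gate values and the query, key, and value vectors are passed in already computed, so the linear projections are omitted. The hidden-state readout h_t = o_t ⊙ C_t q_t / max(|n_tᵀ q_t|, 1) follows the published xLSTM formulation and is an assumption here, since that formula is not reproduced in the text; the function name is hypothetical.

```python
def mlstm_step(C_prev, n_prev, q, k, v, i_gate, f_gate, o_gate):
    """One mLSTM step with a matrix memory C (rows: value dims, columns: key dims)."""
    rows, cols = len(v), len(k)
    # Matrix memory update: C_t = f_t * C_{t-1} + i_t * v_t k_t^T
    C = [[f_gate * C_prev[a][b] + i_gate * v[a] * k[b] for b in range(cols)]
         for a in range(rows)]
    # Normalizer update: n_t = f_t * n_{t-1} + i_t * k_t
    n = [f_gate * n_prev[b] + i_gate * k[b] for b in range(cols)]
    # Readout: retrieve the value associated with query q, then normalize.
    retrieved = [sum(C[a][b] * q[b] for b in range(cols)) for a in range(rows)]
    denom = max(abs(sum(n[b] * q[b] for b in range(cols))), 1.0)
    h = [o_gate[a] * retrieved[a] / denom for a in range(rows)]
    return C, n, h


# Store the value [2, 3] under the key [1, 0], then read it back.
C0 = [[0.0, 0.0], [0.0, 0.0]]
n0 = [0.0, 0.0]
C1, n1, h_match = mlstm_step(C0, n0, q=[1.0, 0.0], k=[1.0, 0.0], v=[2.0, 3.0],
                             i_gate=1.0, f_gate=1.0, o_gate=[1.0, 1.0])
# A query orthogonal to the stored key retrieves nothing.
_, _, h_miss = mlstm_step(C0, n0, q=[0.0, 1.0], k=[1.0, 0.0], v=[2.0, 3.0],
                          i_gate=1.0, f_gate=1.0, o_gate=[1.0, 1.0])
```

The example behaves like a key-value store: the outer product writes the value under its key, and the query reads it back, which is why the matrix memory can hold more information than a scalar cell.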

[0071] xLSTM forms an xLSTM block by stacking sLSTM and mLSTM blocks together with residual connections. Through this stacking, xLSTM can efficiently process long sequence data.

[0072] The fault classification module mainly consists of a fully connected layer and a softmax layer. The sequence data processed by the xLSTM module is used as input, mapped to the sample space in the fully connected layer, and then processed by the subsequent softmax layer to obtain a classification probability result, thus identifying and classifying wind turbine faults.

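[0073] The formula for the softmax function is: $P(y = j \mid x) = \dfrac{e^{x^{\top} w_j}}{\sum_{k=1}^{K} e^{x^{\top} w_k}}$

[0074] This gives the probability that the sample vector x belongs to class j when there are K linear functions, where $w_k$ is the weight vector of the k-th linear function.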
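
The softmax step can be sketched in Python as follows. The function takes the outputs of the fully connected layer (the logits) as input; subtracting the maximum logit before exponentiating is a standard numerical safeguard added here as an assumption, and it does not change the result.

```python
import math


def softmax(logits):
    """Convert a list of class scores into probabilities that sum to 1."""
    peak = max(logits)
    # Shifting by the maximum avoids overflow for large scores.
    exps = [math.exp(z - peak) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


# Hypothetical scores for three fault classes.
probs = softmax([2.0, 1.0, 0.1])
predicted_class = probs.index(max(probs))
```

The predicted fault class is the index with the highest probability, here the first class.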

[0075] Step (2.2): Pre-train the model. Based on the size of the training data, set the hyperparameters such as the number of layers of xLSTM residual stacking, the dimension of input and output, the learning rate, and the batch size, and use the Adam optimizer to update the network weights of the model.

[0076] In step (3), sufficient training data is input into the TSA-xLSTM-CL deep learning neural network model for training. The main process is as follows:

[0077] Step (3.1): Input the training data into the TSA-xLSTM-CL deep learning neural network model, and update the network weights of the model using a continuous learning method. The calculation process is as follows:

[0078] Cross-entropy is used as the loss function to calculate the error between the model's output label vector and the true label vector. The calculation formula is as follows: $L = -\dfrac{1}{N} \sum_{i=1}^{N} \sum_{j=1}^{M} y_{ij} \log(\hat{y}_{ij})$

[0079] Where N is the number of samples, M is the number of classes, $y_{ij}$ indicates whether the i-th sample truly belongs to the j-th category, and $\hat{y}_{ij}$ is the probability with which the model diagnoses the i-th sample as belonging to the j-th category.
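
A minimal Python sketch of the cross-entropy loss over N samples and M classes is given below. Averaging over the samples and clipping probabilities with a small `eps` to avoid log(0) are assumptions, as are the function name and the example values.

```python
import math


def cross_entropy(y_true, y_prob, eps=1e-12):
    """Mean cross-entropy between true label indicators and predicted probabilities."""
    n_samples = len(y_true)
    total = 0.0
    for true_row, prob_row in zip(y_true, y_prob):
        for t, p in zip(true_row, prob_row):
            # Only classes with t == 1 contribute; eps guards against log(0).
            total += t * math.log(max(p, eps))
    return -total / n_samples


# One sample, two classes: a confident correct prediction versus a coin flip.
loss_confident = cross_entropy([[1, 0]], [[1.0, 0.0]])
loss_uncertain = cross_entropy([[1, 0]], [[0.5, 0.5]])
```

The loss is zero for a fully confident correct prediction and grows as probability mass moves away from the true class.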

[0080] Next, EWC regularization is added to the loss of the new task, resulting in the EWC loss function:

[0081] Where L is the loss for the new task and λ is a hyperparameter that balances the importance of the new and old tasks; the remaining terms are the partial derivative of the old task loss function with respect to the weights, the current weight $w_i$, and the corresponding old task weight.

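[0082] In incremental training with continuous learning, only the EWC loss function is used to update the model weights. The formula for updating the weights is: $w_i \leftarrow w_i - \alpha \dfrac{\partial L_{EWC}}{\partial w_i}$

[0083] Where α is the learning rate and $L_{EWC}$ is the EWC loss function.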
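
The EWC formula is not reproduced in the text, so the Python sketch below uses the commonly cited form: the new-task loss plus (λ/2) Σ F_i (w_i - w_i*)², where w_i* are the old-task weights and F_i is an importance weight, typically estimated from squared gradients of the old-task loss. That form, the function names, and the sample numbers are assumptions, not the patent's exact definition.

```python
def ewc_loss(new_task_loss, weights, old_weights, importance, lam):
    """New-task loss plus a quadratic penalty for moving important weights."""
    penalty = sum(f * (w - w_old) ** 2
                  for w, w_old, f in zip(weights, old_weights, importance))
    return new_task_loss + 0.5 * lam * penalty


def ewc_gradient_step(weights, new_task_grads, old_weights, importance, lam, alpha):
    """One gradient-descent step on the EWC loss with learning rate alpha."""
    return [w - alpha * (g + lam * f * (w - w_old))
            for w, g, w_old, f in zip(weights, new_task_grads, old_weights, importance)]


# Hypothetical values: the second weight has drifted from its old-task value.
weights = [1.0, 2.0]
old_weights = [1.0, 0.0]
importance = [1.0, 0.5]
total_loss = ewc_loss(0.3, weights, old_weights, importance, lam=2.0)
updated = ewc_gradient_step(weights, [0.0, 0.0], old_weights, importance,
                            lam=2.0, alpha=0.1)
```

With a zero new-task gradient, the update only pulls the drifted weight back toward its old-task value, which is how the penalty preserves previously learned fault patterns.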

[0084] Step (3.2): During the continuous learning training process, the model needs to retain knowledge from old tasks while learning new tasks. This is achieved by introducing a regularization term. When fault labels appear in the new task that did not exist in the old task, these new fault data are retained. The new fault data is added to the training dataset, and the TSA-xLSTM deep learning neural network model is retrained using the new training dataset.

[0085] Step (3.3): Train the TSA-xLSTM-CL deep learning neural network model multiple times, and use several performance evaluation metrics to comprehensively evaluate the fault diagnosis model. Select the model with the best performance evaluation metrics as the wind turbine fault diagnosis model. The performance evaluation metrics here mainly include accuracy, exact matching, F1 score and Hamming loss, and their calculation formulas are as follows:

[0086] Accuracy provides a quick and intuitive evaluation of the overall model performance, as shown in the following formula:

[0087] Exact matching is used to evaluate the proportion of samples whose true labels and model output labels are exactly the same as those of the total samples. The formula is as follows:

[0088] Where the indicator function takes the value 1 when the true label $y_i$ equals the model output label for the i-th sample, and 0 otherwise.

[0089] The F1 score is the harmonic mean of precision and recall. A higher precision and recall result in a higher F1 score. The formula is as follows:

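[0090] Where precision is the proportion of predicted faults that are actual faults, and recall is the proportion of actual faults that are correctly predicted.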

[0091] Hamming loss is used to assess the actual proportion of misclassifications on a single label, and its formula is:
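
Since the metric formulas are not reproduced in the text, the Python sketch below uses the standard multi-label definitions of exact match, Hamming loss, and F1 score. Micro-averaging for F1, the function names, and the example labels are assumptions; accuracy is left out because its multi-label definition cannot be recovered from the text.

```python
def exact_match(y_true, y_pred):
    """Share of samples whose whole label vector is predicted correctly."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def hamming_loss(y_true, y_pred):
    """Share of individual labels that are predicted incorrectly."""
    total = sum(len(t) for t in y_true)
    wrong = sum(a != b for t, p in zip(y_true, y_pred) for a, b in zip(t, p))
    return wrong / total


def micro_f1(y_true, y_pred):
    """Harmonic mean of precision and recall, pooled over all labels."""
    tp = fp = fn = 0
    for t, p in zip(y_true, y_pred):
        for a, b in zip(t, p):
            tp += int(a == 1 and b == 1)
            fp += int(a == 0 and b == 1)
            fn += int(a == 1 and b == 0)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Hypothetical labels for three samples and three fault types.
y_true = [[1, 0, 0], [0, 1, 1], [0, 0, 1]]
y_pred = [[1, 0, 0], [0, 1, 0], [0, 1, 1]]
em = exact_match(y_true, y_pred)
hl = hamming_loss(y_true, y_pred)
f1 = micro_f1(y_true, y_pred)
```

Exact match is the strictest of the three, since one wrong label fails the whole sample, while Hamming loss and F1 give partial credit for label vectors that are mostly correct.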

[0092] Finally, after each round of training, the model is comprehensively evaluated, and the model with the best performance metrics is put into practical use.

Claims

1. A method for fault diagnosis of wind turbine generators based on a TSA-xLSTM-CL deep learning neural network, characterized in that the fault diagnosis method includes the following steps: Step (1): Obtain the raw SCADA data of the wind turbine's operating status, preprocess the data, and perform feature expansion to use it as pre-training data; Step (2): Establish a wind turbine fault diagnosis model based on TSA-xLSTM deep learning neural network, and train it using pre-training data to obtain a pre-trained model for wind turbine fault diagnosis; Step (3): After processing the actual collected data, input it into the pre-trained model for fault diagnosis analysis and output the wind turbine fault diagnosis results. Then, use the continuous learning method to incrementally train the pre-trained model for wind turbine fault diagnosis, and use multiple performance indicators for comprehensive evaluation. Finally, put the TSA-xLSTM-CL deep learning neural network model with the best performance into use.

2. The wind turbine fault diagnosis method based on TSA-xLSTM-CL deep learning neural network according to claim 1, characterized in that: In step (1), the original SCADA data mainly includes original feature vector data and original fault label data. The data preprocessing method is to normalize the original feature vector data. The feature expansion is to calculate its sample entropy and power spectrum entropy through a sliding window. The sample entropy and power spectrum entropy are used as two new feature vectors and combined with the preprocessed data to form a multidimensional time series. A set of time series data corresponds to a fault label.

3. The wind turbine fault diagnosis method based on TSA-xLSTM-CL deep learning neural network according to claim 1, characterized in that: In step (2), the TSA-xLSTM deep learning neural network mainly consists of a feature extraction module, a feature processing module, and a fault classification module. The feature extraction module is a TSA module, the feature processing module is xLSTM, and the fault classification module consists of a fully connected layer and a softmax layer. The pre-training is to initialize the model, set the hyperparameters reasonably and initialize the network weights by stacking, and train the model using the training set from step (1) to obtain the pre-trained model.

4. The wind turbine fault diagnosis method based on TSA-xLSTM-CL deep learning neural network according to claim 1, characterized in that: In step (3), the TSA-xLSTM-CL deep learning neural network updates the network weight parameters of the model using a continuous learning method, mainly using incremental training, and using the EWC loss function as the model's loss function; the performance evaluation indicators of the TSA-xLSTM-CL deep learning neural network mainly include: accuracy, exact match, F1 score, and Hamming loss.

5. The wind turbine fault diagnosis method based on TSA-xLSTM-CL deep learning neural network according to claim 1, characterized in that: After the fault diagnosis model is put into practical use, it is further trained using continuous learning methods. If a previously unseen fault label appears, the current data is retained and the model is retrained.