A method, system and medium for wind turbine temperature anomaly early warning based on dimensionality reduction and autoencoding
By combining Dt-SNE with an importance-weighted autoencoder, the invention addresses the false alarms and missed alarms that traditional wind turbine temperature monitoring methods produce under varying operating conditions, enabling real-time, accurate early warning of wind turbine temperature anomalies and improving the adaptability and accuracy of the warning system.
Patent Information
- Authority / Receiving Office
- CN · China
- Patent Type
- Patents(China)
- Current Assignee / Owner
- ZHEJIANG ZHENENG JIAXING OFFSHORE WIND POWER CO LTD
- Filing Date
- 2025-06-16
- Publication Date
- 2026-06-26
AI Technical Summary
Traditional wind turbine temperature monitoring methods are difficult to adapt to changing operating conditions, have a high false alarm rate and a high risk of missed alarms, do not make full use of data correlation, and are difficult to process high-dimensional data.
An anomaly detection model is constructed using a dimensionality reduction and autoencoder approach. The Dt-SNE algorithm is used for feature dimensionality reduction, and the correlation of wind turbine components is combined with an importance-weighted autoencoder (IWAE) model to calculate anomaly scores and trigger early warnings.
It enables real-time and accurate early warning of abnormal wind turbine temperatures, reduces false alarm and missed alarm rates, and improves the adaptability and accuracy of the early warning system.
Figure CN120632735B_ABST
Abstract
Description
Technical Field
[0001] This invention relates to the field of wind turbine generator condition monitoring and fault diagnosis, and particularly to a method, system, and medium for early warning of abnormal wind turbine temperatures using dimensionality reduction and self-encoding. This method enables real-time monitoring of the temperature of key components of a wind turbine generator (such as the generator, gearbox, and pitch system) and provides early warnings of potential temperature anomalies, thereby improving the operational reliability and safety of the wind turbine generator and reducing maintenance costs.
Background Technology
[0002] Wind turbine generators typically operate in harsh environments, making them prone to various malfunctions. Abnormal temperature is one of the key signs of wind turbine failure. Traditional temperature monitoring methods usually trigger alarms by setting fixed thresholds, but this method has the following problems:
[0003] 1. Difficulty in setting thresholds: The temperature of wind turbines is affected by a variety of factors (such as ambient temperature, wind speed, power, etc.), and fixed thresholds are difficult to adapt to various operating conditions.
[0004] 2. High false alarm rate: Due to fluctuations in environmental factors, false alarms are easily triggered.
[0005] 3. Risk of missed alarms: Fixed thresholds may fail to detect slowly developing temperature anomalies in a timely manner.
[0006] 4. Failure to consider data correlation: Traditional methods do not fully utilize the correlation between temperature data of various components of the wind turbine.
[0007] 5. Difficulty in processing high-dimensional data: Wind turbine SCADA systems generate large volumes of high-dimensional data, and traditional dimensionality reduction methods do not incorporate wind power domain knowledge, so the reduced representation does not reflect the underlying physical mechanisms.
[0008] Therefore, a smarter and more accurate method for early warning of temperature anomalies is needed, one that adapts to different operating conditions, reduces false alarm and missed alarm rates, and makes full use of the correlation between data.
Summary of the Invention
[0009] The technical solution of this invention aims to solve the above problems and provides a method for early warning of abnormal wind turbine temperature using dimensionality reduction and autoencoding, comprising the following steps:
[0010] S1, Data Acquisition and Feature Construction: Data from a set time period that is free of downtime faults is acquired from the wind turbine SCADA system as the initial dataset. The initial data is cleaned, preprocessed, and feature-engineered. The data and derived features sharing the same timestamp form one feature vector, and the feature vectors of all timestamps form the sample feature matrix.
[0011] S2, Feature Dimensionality Reduction and Manifold Construction Based on Manifold Learning: The feature vectors obtained in S1 are reduced in dimension using the Dt-SNE (Density-t-distributed Stochastic Neighbor Embedding) algorithm, which combines the correlation of wind turbine components for weighting. The high-dimensional features are mapped to the low-dimensional manifold space to obtain the low-dimensional representation matrix.
[0012] S3, Anomaly Detection Model Construction Based on an Importance-Weighted Autoencoder: The importance-weighted autoencoder (IWAE) model is trained using the results of S2. The data is mapped to multiple latent variables, and the input data is reconstructed from the latent variables. The model is trained by minimizing the IWAE loss function, through which it learns the probability distribution of the data. The anomaly score is calculated using the trained model, and the threshold of the anomaly score is determined based on the test data.
[0013] S4, Real-Time Early Warning: Real-time operating data of the wind turbine is collected and input into the trained model to calculate the anomaly score. If the anomaly score exceeds the set threshold, an early warning is triggered.
[0014] Further, step S1 specifically includes:
[0015] S1.1, Collect temperature data of key components from the wind turbine SCADA system, including generator bearing temperature, gearbox oil temperature, pitch motor temperature, ambient temperature, wind speed, and power. Perform missing value processing and outlier processing on each type of data in sequence. Align the preprocessed monitoring data according to the collection timestamp (by minute) to form a sample matrix. When obtaining the initial dataset from the wind turbine SCADA system, select a unit of the same type that has not experienced any shutdown failures in the past three months, and select its normal power generation status data from its SCADA system as the initial data.
[0016] S1.2 adds three features: the rate of change of temperature of each component, the temperature difference between each component and the ambient temperature, and the moving average of each temperature variable. The sample matrix and the three features are then combined to obtain the sample feature matrix.
[0017] Furthermore, missing values are filled using linear interpolation, and box plots are used to detect and remove outliers from the initial dataset. Specifically, the monitoring data $x_{p,n}$ of each monitoring quantity is judged in turn against the box-plot fences (the conventional whisker factor of 1.5 is shown):
[0018] $$x'_{p,n} = \begin{cases} x_{p,n}, & Q_1 - 1.5\,(Q_3 - Q_1) \le x_{p,n} \le Q_3 + 1.5\,(Q_3 - Q_1) \\ \text{removed}, & \text{otherwise} \end{cases}$$
[0019] where $Q_1$ and $Q_3$ are the lower quartile and upper quartile, respectively; $x_{p,n}$ is an element of the monitoring data and represents its raw value; and $x'_{p,n}$ is the result of processing and judging the monitoring data. The monitoring data has P monitoring quantities, N is the time series length, p ranges from 1 to P, and n ranges from 1 to N. After processing, each monitoring quantity forms the preprocessed series $x'_p$;
[0020] The preprocessed monitoring data are aligned by collection timestamp (by minute) and then assembled into a sample matrix:
[0021] $$X' = [x'_1, x'_2, \ldots, x'_P]$$
[0022] where $x'_p$, with p taking values from 1 to P, represents the preprocessed result of one monitoring quantity (such as active power).
[0023] Further, step S2 includes:
[0024] S2.1, Each row of the sample feature matrix obtained in step S1 represents a sample point. A correlation matrix is constructed based on the physical connection and heat conduction relationship between the components of the wind turbine. Each element in the correlation matrix represents the correlation coefficient between any two components. The component weight is calculated based on the correlation matrix to obtain the overall correlation between the component represented by any sample point and other components.
[0025] S2.2, Incorporate component weights into conditional probability calculations to obtain the probability that any two sample points in the sample feature matrix are adjacent;
[0026] S2.3, initialize the low-dimensional matrix, where each row of the low-dimensional matrix represents a sample point, and use the t-distribution to calculate the similarity between any two sample points in the low-dimensional matrix to obtain the similarity probability;
[0027] S2.4, The adjacency probabilities of any two sample points in the sample feature matrix are symmetrized to obtain the joint probability. The low-dimensional matrix is optimized by minimizing the KL divergence between the joint probability distribution and the similarity probability distribution.
[0028] S2.5, Gradient descent is used to iteratively optimize the low-dimensional matrix. In each iteration, the position of each sample point in the low-dimensional matrix is updated according to the gradient of the KL divergence with respect to that position, until the iteration stops and the final low-dimensional matrix is obtained.
[0029] Further, step S3 includes:
[0030] S3.1 defines the structure of the importance-weighted autoencoder (IWAE), which maps a low-dimensional matrix to multiple latent variables, each of which follows a Gaussian distribution. The encoder outputs the mean and log-variance of the Gaussian distribution.
[0031] S3.2, Each latent variable is reconstructed into an output with the same dimensions as the low-dimensional matrix, and a loss function over the low-dimensional matrix and the network parameters is defined. The negative log-likelihood is used as the anomaly score. The model is trained, the anomaly score is calculated using the trained model, and the threshold of the anomaly score is determined based on the test data.
[0032] Furthermore, the expression for the anomaly score is:
[0033] $$A(y) = -\log\left(\frac{1}{K}\sum_{k=1}^{K}\frac{p_\theta(y, z_k)}{q_\phi(z_k \mid y)}\right)$$ In the formula, $A(y)$ denotes the anomaly score of the input $y$; $p_\theta(y, z_k)$ is the joint probability under the decoder and $q_\phi(z_k \mid y)$ is the posterior defined by the encoder; K is the number of latent variables obtained by mapping the dimensionality-reduced data; and $z_k$ is the k-th latent variable.
[0034] Further, step S4 includes: performing the same preprocessing and feature engineering on the new real-time SCADA data as on the training data, to obtain the real-time sample feature matrix.
[0035] This invention also provides a wind turbine temperature anomaly early warning system based on dimensionality reduction and autoencoding, comprising:
[0036] The monitoring data processing module is used to obtain, from the wind turbine SCADA system, data from a set time period that is free of downtime faults as the initial dataset, to clean, preprocess, and feature-engineer the initial data to obtain feature vectors, and to construct a sample feature matrix.
[0037] The manifold construction module, connected to the monitoring data processing module, is used to reduce the dimensionality of the feature vectors obtained in the monitoring data processing module by using the Dt-SNE algorithm, which combines the correlation of wind turbine components for weighting, and maps the high-dimensional features to the low-dimensional manifold space to obtain a low-dimensional representation matrix.
[0038] The autoencoder construction module, connected to the manifold construction module, is used to train an importance-weighted autoencoder model using dimensionality-reduced data under normal operating conditions, map the data to multiple latent variables, reconstruct the input data from the latent variables, learn the probability distribution of the data by minimizing the loss function of the importance-weighted autoencoder model, and then train the model. The trained model is then used to calculate the anomaly score and determine the threshold of the anomaly score.
[0039] The real-time early warning module, connected to the autoencoder construction module, is used to collect the operating data of the wind turbine in real time, input it into the trained model to calculate the anomaly score, and trigger an early warning if the anomaly score exceeds the set threshold.
[0040] A wind turbine temperature anomaly early warning system based on manifold learning and importance-weighted autoencoder includes a memory and one or more processors. The memory stores executable code, and when the one or more processors execute the executable code, they are used to implement the dimensionality reduction and autoencoder wind turbine temperature anomaly early warning method.
[0041] A computer-readable medium having a program stored thereon, which, when executed by a processor, implements the aforementioned dimensionality reduction and autoencoding method for early warning of abnormal wind turbine temperature.
[0042] Beneficial Effects: This application provides a method, system, and medium for dimensionality reduction and autoencoder-based early warning of wind turbine temperature anomalies. The method preprocesses the raw data and then performs Dt-SNE dimensionality reduction to achieve low-dimensional manifold mapping of high-dimensional temperature-related features. Based on data under normal operating conditions, it constructs an importance-weighted autoencoder model and calculates its anomaly score threshold, providing real-time and accurate criteria for wind turbine temperature status early warning. By combining manifold learning with an importance-weighted autoencoder, high-dimensional, nonlinear wind turbine temperature data is reduced to a low-dimensional manifold space, revealing the data's intrinsic structure. By introducing importance weighting, the probability distribution of the data can be estimated more accurately, enabling effective processing and modeling of complex wind turbine temperature data. Utilizing the probabilistic characteristics of the IWAE model, the anomaly score is calculated by comprehensively considering reconstruction error and latent spatial probability density, improving the adaptability and accuracy of the early warning system and reducing false alarm and false negative rates.
Attached Figure Description
[0043] Figure 1 is a flowchart of Example 1;
[0044] Figure 2 is a visualization of the results;
[0045] Figure 3 is a schematic diagram of an abnormal wind turbine temperature warning;
[0046] Figure 4 is a structural diagram of Example 2.
Detailed Implementation
[0047] The technical solutions in the embodiments of the present invention will be clearly and completely described below. Obviously, the described embodiments are only some embodiments of the present invention, and not all embodiments. Based on the embodiments of the present invention, all other embodiments obtained by those of ordinary skill in the art without creative effort are within the scope of protection of the present invention.
[0048] To more clearly illustrate the technical solutions in the embodiments of the present invention or the prior art, the accompanying drawings used in the description of the embodiments or the prior art will be briefly introduced below. Obviously, the drawings described below are merely some embodiments of the present invention. For those skilled in the art, other drawings can be obtained based on these drawings without any creative effort.
[0049] Example 1
[0050] A method for early warning of abnormal wind turbine temperature using dimensionality reduction and autoencoding, as shown in Figure 1, includes the following steps:
[0051] S1, Data Acquisition and Feature Construction: Data from a specified time period that is free of downtime faults is acquired from the wind turbine SCADA system as the initial dataset. The initial data is then cleaned, preprocessed, and feature-engineered to obtain feature vectors, and a sample feature matrix is constructed. Specifically, this includes:
[0052] S1.1, Temperature data of key components, including generator bearing temperature, gearbox oil temperature, pitch motor temperature, ambient temperature, wind speed, and power, are collected from the wind turbine SCADA system. Missing values and outliers are processed in sequence for each data category. The preprocessed monitoring data are aligned by collection timestamp (by minute) to form a sample matrix. When obtaining the initial dataset from the wind turbine SCADA system, one turbine of the same type that has not experienced any shutdown failures in the past three months is selected, and its data under normal power generation status is used as the initial data. Missing values are filled using linear interpolation; outliers are detected and removed using box plots. That is, the monitoring data $x_{p,n}$ of each monitoring quantity is judged in turn against the box-plot fences (the conventional whisker factor of 1.5 is shown):
[0053] $$x'_{p,n} = \begin{cases} x_{p,n}, & Q_1 - 1.5\,(Q_3 - Q_1) \le x_{p,n} \le Q_3 + 1.5\,(Q_3 - Q_1) \\ \text{removed}, & \text{otherwise} \end{cases}$$
[0054] where $Q_1$ and $Q_3$ are the lower quartile and upper quartile, respectively; $x_{p,n}$ is an element of the monitoring data and represents its raw value; and $x'_{p,n}$ is the result of processing and judging the monitoring data. The monitoring data has P monitoring quantities, N is the time series length, p ranges from 1 to P, and n ranges from 1 to N. After processing, each monitoring quantity forms the preprocessed series $x'_p$;
[0055] The preprocessed monitoring data are aligned by collection timestamp (by minute) and then assembled into a sample matrix:
[0056] $$X' = [x'_1, x'_2, \ldots, x'_P]$$
[0057] where $x'_p$, with p taking values from 1 to P, represents the preprocessed result of one monitoring quantity (such as active power).
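The preprocessing in S1.1 can be sketched in a few lines of standard-library Python. This is a minimal illustration under stated assumptions, not the patented implementation: the function names and channel names are hypothetical, the 1.5 whisker factor is the conventional box-plot default rather than a value confirmed by the source, and dropping the whole timestamp row when any channel is flagged is one possible reading of "detected and removed".

```python
import statistics

def interpolate_missing(series):
    """Fill None gaps by linear interpolation between the nearest known samples."""
    known = [i for i, v in enumerate(series) if v is not None]
    if not known:
        raise ValueError("series has no valid values")
    filled = list(series)
    for i, v in enumerate(series):
        if v is not None:
            continue
        left = max((k for k in known if k < i), default=None)
        right = min((k for k in known if k > i), default=None)
        if left is None:            # gap at the start: copy the first known value
            filled[i] = series[right]
        elif right is None:         # gap at the end: copy the last known value
            filled[i] = series[left]
        else:
            t = (i - left) / (right - left)
            filled[i] = series[left] + t * (series[right] - series[left])
    return filled

def boxplot_mask(series, whisker=1.5):
    """True where a value lies inside [Q1 - w*IQR, Q3 + w*IQR] (w assumed 1.5)."""
    q1, _, q3 = statistics.quantiles(series, n=4)
    iqr = q3 - q1
    low, high = q1 - whisker * iqr, q3 + whisker * iqr
    return [low <= v <= high for v in series]

def build_sample_matrix(channels):
    """channels: dict name -> minute-aligned list (None marks a missing value).

    Returns the channel names and the rows (one per kept timestamp).
    """
    names = list(channels)
    filled = {n: interpolate_missing(channels[n]) for n in names}
    masks = {n: boxplot_mask(filled[n]) for n in names}
    length = len(filled[names[0]])
    rows = [[filled[n][t] for n in names]
            for t in range(length)
            if all(masks[n][t] for n in names)]   # drop timestamps with any outlier
    return names, rows

names, rows = build_sample_matrix({
    "gearbox_oil_temp": [50.0, None, 52.0, 51.0, 50.5, 51.5, 120.0, 51.0],
    "ambient_temp": [10.0] * 8,
})
```

In this toy input the missing second sample is interpolated from its neighbours and the 120.0 reading falls outside the upper fence, so its timestamp is excluded from the sample matrix.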
[0058] S1.2, Three features are added: the temperature change rate of each component, the temperature difference between each component and the ambient temperature, and the moving average of each temperature variable. The sample matrix and these three features are then merged to obtain the sample feature matrix. The three features are:
[0059] Temperature change rate: the rate of temperature change of each component, i.e., the first-order difference:
[0060] $$\Delta T_{p,n} = x_{p,n} - x_{p,n-1}$$
[0061] Key temperature difference: the temperature difference between each component and the ambient temperature:
[0062] $$T^{\mathrm{diff}}_{p,n} = x_{p,n} - x_{\mathrm{env},n}$$
[0063] Temperature moving average: the moving average of each temperature variable over a window of size $w$:
[0064] $$\bar{T}_{p,n} = \frac{1}{w}\sum_{i=n-w+1}^{n} x_{p,i}$$
[0065] In the formulas, $\Delta T_{p,n}$ is the rate of temperature change; $x_{p,n}$ is the data of the p-th monitoring quantity at time n; $x_{p,n-1}$ is the data of the p-th monitoring quantity at time n-1; $T^{\mathrm{diff}}_{p,n}$ is the key temperature difference; $x_{\mathrm{env},n}$ is the value of the ambient temperature monitoring data at the n-th moment; $\bar{T}_{p,n}$ is the temperature moving average; $w$ is the window size; and $i$ indexes the samples inside the window;
[0066] The original sample matrix and the features obtained from feature engineering ($\Delta T$, $T^{\mathrm{diff}}$, $\bar{T}$) are merged to obtain the final sample feature matrix $X$.
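The three derived features of S1.2 can be illustrated as follows. This is a sketch under assumptions: the function name, the feature-name suffixes, and the default window size are hypothetical (the source does not preserve the window size), the first sample's change rate is set to 0.0 because it has no predecessor, and the moving average uses a shorter window at the start of the series.

```python
def temperature_features(temps, ambient, window=3):
    """temps: dict name -> list of floats; ambient: list of floats (same length)."""
    features = {}
    for name, x in temps.items():
        # Temperature change rate: first-order difference x[n] - x[n-1].
        features[name + "_rate"] = [0.0] + [x[n] - x[n - 1] for n in range(1, len(x))]
        # Key temperature difference: component temperature minus ambient temperature.
        features[name + "_minus_ambient"] = [a - b for a, b in zip(x, ambient)]
        # Trailing moving average over at most `window` samples.
        averages = []
        for n in range(len(x)):
            chunk = x[max(0, n - window + 1): n + 1]
            averages.append(sum(chunk) / len(chunk))
        features[name + "_moving_avg"] = averages
    return features

feats = temperature_features(
    {"gearbox_oil": [50.0, 52.0, 51.0, 55.0]},
    ambient=[10.0, 10.0, 11.0, 12.0],
    window=2,
)
```

Each returned list has one entry per timestamp, so the features can be appended column-wise to the sample matrix to form the sample feature matrix.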
[0067] S2, Feature Dimensionality Reduction and Manifold Construction Based on Manifold Learning: The Dt-SNE (Density-t-distributed Stochastic Neighbor Embedding) algorithm, which incorporates the correlation between wind turbine components, is used to reduce the dimensionality of the feature vectors obtained in S1, mapping the high-dimensional features to a low-dimensional manifold space to obtain a low-dimensional representation matrix; including:
[0068] S2.1, Each row of the sample feature matrix obtained in step S1 represents a sample point. A correlation matrix is constructed based on the physical connections and heat conduction relationships between the components of the wind turbine. Each element in the correlation matrix represents the correlation coefficient between any two components. The component weights are calculated based on the correlation matrix to obtain the overall correlation between the component represented by any sample point and other components. Specifically:
[0069] Each row $x_i$ of the sample feature matrix $X$ obtained in step S1 represents a sample point (all features at a given time point), which is an m-dimensional vector located in the high-dimensional feature space.
[0070] Based on the physical connections and heat conduction relationships of the various components of the wind turbine, a correlation matrix $C$ is constructed. It is composed of elements $c_{uv}$, each indicating the correlation coefficient between component $u$ and component $v$ (range 0 to 1, where 1 indicates perfect correlation and 0 indicates no correlation; u and v both take values from 1 to Q, where Q is the number of components). Each monitoring quantity of a sample point $x_i$ belongs to a specific component $f$, and the weight $w_f$ of component $f$ is:
[0071]
[0072] S2.2, Incorporate component weights into the conditional probability calculation to obtain the probability that any two sample points in the sample feature matrix are adjacent. First, the component weights are incorporated into the distance between sample points $x_i$ and $x_j$ (the i-th and j-th rows of the matrix $X$); the weighted distance $d_{ij}$ is calculated as:
[0073]
[0074] Then, the conditional probability $p_{j|i}$ is calculated using the weighted distance above. It represents the probability that $x_i$ chooses $x_j$ as its neighbor:
[0075] $$p_{j|i} = \frac{\exp\left(-d_{ij}^2 / 2\sigma_i^2\right)}{\sum_{k \ne i}\exp\left(-d_{ik}^2 / 2\sigma_i^2\right)}$$
[0076] where $\sigma_i$ is the standard deviation of the Gaussian distribution centered at $x_i$, and $\exp(-d_{ij}^2 / 2\sigma_i^2)$ is the value of the Gaussian kernel centered at $x_i$ with standard deviation $\sigma_i$, which measures proximity.
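A small sketch may help make the weighted neighbor probability concrete. Because the source does not preserve the weight and distance formulas, everything specific here is an assumption: the component weight is taken as the mean correlation of a component with the others, the weighted squared distance is a per-feature weighted sum of squared differences, and a single fixed `sigma` replaces any per-point bandwidth search. All function names are hypothetical.

```python
import math

def component_weights(corr):
    """Assumed weighting: mean correlation of each component with the other components."""
    q = len(corr)
    return [sum(corr[u][v] for v in range(q) if v != u) / (q - 1) for u in range(q)]

def weighted_sq_distance(xi, xj, feature_weights):
    """Assumed form: sum over features of weight * squared difference."""
    return sum(w * (a - b) ** 2 for w, a, b in zip(feature_weights, xi, xj))

def conditional_probabilities(X, feature_weights, sigma=1.0):
    """Return P with P[i][j] = p_{j|i}: Gaussian kernel on the weighted distance,
    normalized over all j != i."""
    n = len(X)
    P = []
    for i in range(n):
        kernel = [0.0 if j == i else
                  math.exp(-weighted_sq_distance(X[i], X[j], feature_weights)
                           / (2.0 * sigma ** 2))
                  for j in range(n)]
        total = sum(kernel)
        P.append([k / total for k in kernel])
    return P

corr = [[1.0, 0.8, 0.2],
        [0.8, 1.0, 0.4],
        [0.2, 0.4, 1.0]]
weights = component_weights(corr)           # one weight per component/feature
X = [[0.0, 0.0, 0.0], [1.0, 0.0, 0.0], [5.0, 5.0, 5.0]]
P = conditional_probabilities(X, weights)
```

Here each feature is assumed to belong to exactly one component, so the component weights double as feature weights; a real mapping from monitoring quantities to components would replace that shortcut.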
[0077] S2.3, initialize the low-dimensional matrix, where each row of the low-dimensional matrix represents a sample point, and use the t-distribution to calculate the similarity between any two sample points in the low-dimensional matrix to obtain the similarity probability;
[0078] The low-dimensional matrix $Y$ is randomly initialized. Each row $y_i$ represents the coordinates of a sample point in the low-dimensional space, whose dimension is typically 2 or 3. In this low-dimensional space, the t-distribution is used to compute the similarity $q_{ij}$ between two data points $y_i$ and $y_j$:
[0079] $$q_{ij} = \frac{\left(1+\lVert y_i - y_j\rVert^2\right)^{-1}}{\sum_{k\ne l}\left(1+\lVert y_k - y_l\rVert^2\right)^{-1}}$$
[0080] where $\lVert y_i - y_j\rVert^2$ is the square of the Euclidean distance between $y_i$ and $y_j$. The denominator sums the similarities over all distinct pairs of data points ($y_k$ and $y_l$, with $k \ne l$); it acts as a normalization constant so that the similarities of all point pairs in the low-dimensional space sum to 1.
[0081] Finally, the low-dimensional representation $Y$ is optimized by minimizing the KL divergence between the probability distributions in the high-dimensional and low-dimensional spaces:
[0082] $$KL(P\,\|\,Q) = \sum_{i}\sum_{j \ne i} p_{ij}\log\frac{p_{ij}}{q_{ij}}$$
[0083] where $p_{ij}$ is the joint probability of $x_i$ and $x_j$ in the high-dimensional space, and $q_{ij}$ is the similarity between data points $y_i$ and $y_j$ calculated earlier. $p_{ij}$ is obtained by symmetrizing the conditional probabilities:
[0084] $$p_{ij} = \frac{p_{j|i} + p_{i|j}}{2N}$$
[0085] In the formulas, $p_{j|i}$ is the conditional probability, in the high-dimensional space, of selecting point $x_j$ given $x_i$; $KL(P\,\|\,Q)$ is the Kullback-Leibler divergence (KL divergence) between the probability distribution P in the high-dimensional space and the probability distribution Q in the low-dimensional space; and $N$ is the total number of data points.
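The symmetrization, the t-distribution similarity, and the KL objective described above follow the standard t-SNE forms, and can be sketched as below. The function names are hypothetical, and the small `eps` guard inside the logarithm is an implementation convenience assumed here, not something stated in the source.

```python
import math

def symmetrize(P_cond):
    """p_ij = (p_{j|i} + p_{i|j}) / (2N), with P_cond[i][j] = p_{j|i}."""
    n = len(P_cond)
    return [[(P_cond[i][j] + P_cond[j][i]) / (2.0 * n) for j in range(n)]
            for i in range(n)]

def low_dim_similarities(Y):
    """q_ij from a Student-t kernel with one degree of freedom, normalized over all pairs."""
    n = len(Y)
    kernel = [[0.0 if i == j else
               1.0 / (1.0 + sum((a - b) ** 2 for a, b in zip(Y[i], Y[j])))
               for j in range(n)] for i in range(n)]
    total = sum(map(sum, kernel))
    return [[k / total for k in row] for row in kernel]

def kl_divergence(P, Q, eps=1e-12):
    """KL(P || Q) summed over all pairs with p_ij > 0."""
    return sum(p * math.log((p + eps) / (q + eps))
               for p_row, q_row in zip(P, Q)
               for p, q in zip(p_row, q_row) if p > 0)

P_cond = [[0.0, 0.9, 0.1],
          [0.6, 0.0, 0.4],
          [0.5, 0.5, 0.0]]
P = symmetrize(P_cond)
Q = low_dim_similarities([[0.0, 0.0], [1.0, 0.0], [0.0, 2.0]])
loss = kl_divergence(P, Q)
```

Both `P` and `Q` sum to 1 over all pairs, which is what makes the KL divergence a meaningful mismatch measure between the two spaces.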
[0086] S2.4, The adjacency probabilities of any two sample points in the sample feature matrix are symmetrized to obtain the joint probability. The low-dimensional matrix is optimized by minimizing the KL divergence between the joint probability distribution and the similarity probability distribution.
[0087] S2.5, Gradient descent is used to iteratively optimize the low-dimensional matrix $Y$. In each iteration, the position of each point $y_i$ is updated according to the gradient of the KL divergence with respect to that position, until the iteration stops, yielding the low-dimensional matrix. The gradient is:
[0088] $$\frac{\partial KL}{\partial y_i} = 4\sum_{j}\left(p_{ij} - q_{ij}\right)\left(y_i - y_j\right)\left(1+\lVert y_i - y_j\rVert^2\right)^{-1}$$
[0089] In the formula, $\lVert y_i - y_j\rVert^2$ is the square of the Euclidean distance between the pair of points $y_i$ and $y_j$ in the low-dimensional space.
[0090] Iteration stops when either of two conditions is met: the preset maximum number of iterations is reached (1000 is recommended), or the loss function (KL divergence) has converged (rate of change less than 1%). The rate of change of the KL divergence is:
[0091] $$\frac{\left|KL^{(S)} - KL^{(S-1)}\right|}{KL^{(S-1)}}$$
[0092] where S is the iteration number, $KL^{(S)}$ is the value of the KL divergence calculated in the S-th iteration, and $KL^{(S-1)}$ is the value calculated in the (S-1)-th iteration.
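The iteration and its two stopping conditions can be sketched with plain gradient descent. This is an illustrative sketch, not the patented optimizer: the function names, the learning rate, the unit-variance random initialization, and the seed are all assumptions, and the gradient used is the standard t-SNE gradient, which is assumed to match the lost formula.

```python
import math
import random

def similarities(Y):
    """Return the unnormalized t-kernel values and the normalized q_ij."""
    n = len(Y)
    kern = [[0.0 if i == j else
             1.0 / (1.0 + sum((a - b) ** 2 for a, b in zip(Y[i], Y[j])))
             for j in range(n)] for i in range(n)]
    total = sum(map(sum, kern))
    return kern, [[v / total for v in row] for row in kern]

def kl_divergence(P, Q):
    return sum(p * math.log(p / q)
               for p_row, q_row in zip(P, Q)
               for p, q in zip(p_row, q_row) if p > 0)

def optimize_embedding(P, dim=2, lr=0.5, max_iter=1000, tol=0.01, seed=0):
    """Gradient descent on KL(P || Q); P must be symmetric with a zero diagonal.

    Stops at max_iter iterations or when the relative KL change drops below tol.
    """
    rng = random.Random(seed)
    n = len(P)
    Y = [[rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(n)]
    kern, Q = similarities(Y)
    history = [kl_divergence(P, Q)]
    for _ in range(max_iter):
        # dKL/dy_i = 4 * sum_j (p_ij - q_ij) (y_i - y_j) / (1 + |y_i - y_j|^2)
        grad = [[4.0 * sum((P[i][j] - Q[i][j]) * kern[i][j] * (Y[i][d] - Y[j][d])
                           for j in range(n))
                 for d in range(dim)] for i in range(n)]
        Y = [[Y[i][d] - lr * grad[i][d] for d in range(dim)] for i in range(n)]
        kern, Q = similarities(Y)
        history.append(kl_divergence(P, Q))
        prev, cur = history[-2], history[-1]
        if prev > 0 and abs(cur - prev) / prev < tol:   # rate of change below 1%
            break
    return Y, history

P = [[0.00, 0.15, 0.05, 0.05],
     [0.15, 0.00, 0.05, 0.05],
     [0.05, 0.05, 0.00, 0.15],
     [0.05, 0.05, 0.15, 0.00]]
Y, history = optimize_embedding(P, max_iter=200)
```

`history` records the KL divergence after every update, so the relative-change criterion can be inspected after the run.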
[0093] After Dt-SNE dimensionality reduction, the sample feature matrix $X$ is mapped to the low-dimensional space to obtain the low-dimensional representation matrix $Y$ (two-dimensional).
[0094] S3, Anomaly Detection Model Construction Based on Importance Weighted Autoencoder: The Importance Weighted Autoencoder (IWAE) model is trained using dimensionality-reduced data under normal operating conditions. The data is mapped to multiple latent variables, and the input data is reconstructed from these latent variables. The probability distribution of the data is learned by minimizing the IWAE's loss function, and the model is then trained. The trained model is used to calculate anomaly scores, and a threshold for the anomaly scores is determined based on test data. This includes:
[0095] S3.1 defines the structure of the importance-weighted autoencoder (IWAE), which maps a low-dimensional matrix to multiple latent variables, each of which follows a Gaussian distribution. The encoder outputs the mean and log-variance of the Gaussian distribution.
[0096] S3.2, Each latent variable is reconstructed into an output with the same dimensions as the low-dimensional matrix, and a loss function over the low-dimensional matrix and the network parameters is defined. The negative log-likelihood is used as the anomaly score. The model is trained, the anomaly score is calculated using the trained model, and the threshold of the anomaly score is determined based on the test data.
[0097] K latent variables are sampled from the K Gaussian distributions output by the encoder:
[0098] $$z_k \sim \mathcal{N}\left(\mu_k, \sigma_k^2 I\right), \quad k = 1, \ldots, K$$
[0099] where $\sigma_k^2 I$ is a diagonal covariance matrix whose diagonal elements are the squares of $\sigma_k$; $z_k$ is a latent variable; $\mathcal{N}$ denotes the normal distribution; and $I$ is the identity matrix, which gives the covariance matrix its diagonal structure.
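Sampling the latent variables from the encoder's mean and log-variance outputs is commonly done with the reparameterization form z = mu + sigma * eps, where eps is standard normal. The sketch below assumes that form; the function name and the layout of `encoder_outputs` (one mean/log-variance pair per latent variable) are hypothetical.

```python
import math
import random

def sample_latents(encoder_outputs, rng):
    """encoder_outputs: list of K (mean, log_variance) pairs, each a list of floats.

    Returns K samples z_k ~ N(mean_k, diag(exp(log_variance_k))).
    """
    latents = []
    for mean, log_var in encoder_outputs:
        sigma = [math.exp(0.5 * lv) for lv in log_var]   # std dev from log-variance
        latents.append([m + s * rng.gauss(0.0, 1.0) for m, s in zip(mean, sigma)])
    return latents

rng = random.Random(0)
z = sample_latents(
    [([1.0, -2.0], [-100.0, -100.0]),   # near-zero variance: sample stays at the mean
     ([0.5, 0.5], [0.0, 0.0])],         # unit variance
    rng,
)
```

Working with the log-variance keeps the variance positive without constraining the encoder output.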
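[0100] Each latent variable $z_k$ is reconstructed into an output with the same dimension as the input $y$. Training then begins, with the loss function defined as follows:
[0101] $$\mathcal{L}_{IWAE}(y) = -\mathbb{E}_{z_1,\ldots,z_K \sim q_\phi(z \mid y)}\left[\log\frac{1}{K}\sum_{k=1}^{K}\frac{p_\theta(y, z_k)}{q_\phi(z_k \mid y)}\right]$$
[0102] where $q_\phi(z \mid y)$ is the posterior distribution defined by the encoder; $\mathbb{E}_{z_1,\ldots,z_K \sim q_\phi(z \mid y)}$ means sampling K latent variables from that distribution and taking the expected value; $p_\theta(y, z_k)$ is the joint probability of the data point $y$ and the k-th latent variable sample $z_k$ under the model defined by the parameters $\theta$; and $\mathcal{L}_{IWAE}$ is the loss function used to train the importance-weighted autoencoder.
[0103] $p_\theta(y, z_k)$ is a joint probability distribution, which can be decomposed into
[0104] $$p_\theta(y, z_k) = p_\theta(y \mid z_k)\,p(z_k)$$
[0105] where $p(z_k)$ is the prior distribution, usually chosen as the standard normal distribution, and $p_\theta(y \mid z_k)$ is the conditional probability distribution defined by the decoder, usually assumed to be Gaussian. In actual computation, to avoid numerical underflow, the loss function is typically calculated in logarithmic form:
[0106] $$\mathcal{L}_{IWAE}(y) \approx -\log\frac{1}{K}\sum_{k=1}^{K}\exp\bigl(\log p_\theta(y \mid z_k) + \log p(z_k) - \log q_\phi(z_k \mid y)\bigr)$$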
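[0107] $\log p_\theta(y \mid z_k)$ is calculated from the decoder output $\hat{y}_k$ under the assumed (Gaussian) distribution:
[0108] $$\log p_\theta(y \mid z_k) = -\frac{D}{2}\log\left(2\pi\sigma_y^2\right) - \frac{1}{2\sigma_y^2}\sum_{d=1}^{D}\left(y_d - \hat{y}_{k,d}\right)^2$$
[0109] where $D$ is the dimension of $y$; $\sigma_y^2$ is the variance of the decoder output; $(y_d - \hat{y}_{k,d})^2$ is the squared difference between the d-th dimension value of $y$ and the d-th dimension value of the reconstructed data $\hat{y}_k$ that the decoder outputs from the latent variable $z_k$; and $\hat{y}_{k,d}$ is the d-th dimension value of the reconstructed data $\hat{y}_k$ generated by the decoder from the latent variable $z_k$.
[0110] $\log p(z_k)$ can be calculated from the probability density function of the standard normal distribution:
[0111] $$\log p(z_k) = -\frac{J}{2}\log(2\pi) - \frac{1}{2}\sum_{j=1}^{J} z_{k,j}^2$$
[0112] where $J$ is the dimension of the latent variable.
[0113] $\log q_\phi(z_k \mid y)$ can be calculated from the mean and variance output by the encoder and the probability density function of the Gaussian distribution:
[0114] $$\log q_\phi(z_k \mid y) = -\frac{1}{2}\sum_{j=1}^{J}\left[\log\left(2\pi\sigma_{k,j}^2\right) + \frac{\left(z_{k,j} - \mu_{k,j}\right)^2}{\sigma_{k,j}^2}\right]$$
[0115] where $\mu_{k,j}$ and $\sigma_{k,j}^2$ are the j-th elements of the mean and variance output by the encoder, respectively;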
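The three log-density terms above combine into a log importance weight per latent sample, and the average over K samples is best taken with the log-sum-exp trick to avoid underflow. The sketch below assumes Gaussian forms for all three densities, as the text describes; the function names, the argument layout, and the fixed decoder variance `dec_var` are hypothetical.

```python
import math

def log_gaussian_diag(x, mean, var):
    """Log density of a Gaussian with diagonal covariance, evaluated at x."""
    return -0.5 * sum(math.log(2.0 * math.pi * v) + (a - m) ** 2 / v
                      for a, m, v in zip(x, mean, var))

def iwae_negative_log_likelihood(y, latents, recons, enc_means, enc_vars, dec_var=1.0):
    """-log( (1/K) * sum_k p(y|z_k) p(z_k) / q(z_k|y) ), computed in the log domain.

    latents[k], recons[k], enc_means[k], enc_vars[k] belong to the k-th latent sample.
    """
    log_w = []
    for z, y_hat, mu, var in zip(latents, recons, enc_means, enc_vars):
        log_lik = log_gaussian_diag(y, y_hat, [dec_var] * len(y))          # log p(y|z_k)
        log_prior = log_gaussian_diag(z, [0.0] * len(z), [1.0] * len(z))   # log p(z_k)
        log_post = log_gaussian_diag(z, mu, var)                           # log q(z_k|y)
        log_w.append(log_lik + log_prior - log_post)
    m = max(log_w)   # log-sum-exp shift for numerical stability
    log_mean_w = m + math.log(sum(math.exp(v - m) for v in log_w) / len(log_w))
    return -log_mean_w

latents = [[0.0], [0.1]]
recons = [[1.0, 2.0], [1.0, 2.0]]
enc_means, enc_vars = [[0.0], [0.0]], [[1.0], [1.0]]
score_good = iwae_negative_log_likelihood([1.0, 2.0], latents, recons, enc_means, enc_vars)
score_bad = iwae_negative_log_likelihood([4.0, 2.0], latents, recons, enc_means, enc_vars)
```

In this toy setup the encoder posterior equals the prior, so the weights reduce to the decoder likelihood and a worse reconstruction directly raises the score.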
[0116] The network parameters $\theta$ (decoder parameters) and $\phi$ (encoder parameters) are trained using optimization algorithms such as Adam or SGD. Typically, a small learning rate (e.g., 0.001 or 0.0001) and a large batch size (e.g., 64 or 128) are used.
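[0117] The test data $y$ is input into the trained model, and the K reconstructed values and the average reconstruction error are calculated:
[0118] $$\bar{e} = \frac{1}{K}\sum_{k=1}^{K}\lVert y - \hat{y}_k\rVert^2$$
[0119] where $\hat{y}_k$ is the k-th reconstructed value corresponding to the test data $y$.
[0120] The negative log-likelihood of the data $y$ is calculated as the anomaly score:
[0121] $$A(y) = -\log p_\theta(y)$$
[0122] In actual calculations, the following is used: $$A(y) \approx -\log\frac{1}{K}\sum_{k=1}^{K}\frac{p_\theta(y, z_k)}{q_\phi(z_k \mid y)}$$
[0123] Based on the anomaly score distribution of the test data, the 95th percentile is set as the threshold.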
[0124] S4, Real-time operating data is collected from the wind turbine and input into the trained model to calculate the anomaly score. If the anomaly score exceeds the set threshold, an alert is triggered. The trained Dt-SNE model maps the real-time data $X_{new}$ to the low-dimensional manifold space to obtain $Y_{new}$. Each mapped data point $y_{new}$ is input into the IWAE model:
[0125] The encoder calculates the K groups of means $\mu_k$ and log-variances $\log\sigma_k^2$.
[0126] K latent variables $z_k$ are obtained by sampling from $\mathcal{N}(\mu_k, \sigma_k^2 I)$.
[0127] The decoder reconstructs the output $\hat{y}_k$ from each $z_k$.
[0128] Calculate the anomaly score:
[0129] $$A(y_{new}) = -\log\frac{1}{K}\sum_{k=1}^{K}\frac{p_\theta(y_{new}, z_k)}{q_\phi(z_k \mid y_{new})}$$
[0130] If $A(y_{new}) > \tau$, where $\tau$ is the threshold calculated from the test data, an alarm is triggered.
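The thresholding and alarm logic can be sketched as follows. The 95th percentile comes from the text; the linear-interpolation percentile definition, the function names, and the synthetic scores are assumptions made for illustration.

```python
def percentile(values, q):
    """Percentile with linear interpolation between the closest ranks (assumed definition)."""
    data = sorted(values)
    pos = (len(data) - 1) * q / 100.0
    low = int(pos)
    high = min(low + 1, len(data) - 1)
    return data[low] + (pos - low) * (data[high] - data[low])

def fit_threshold(test_scores, q=95.0):
    """Threshold = q-th percentile of the anomaly scores on the test data."""
    return percentile(test_scores, q)

def check_alarm(score, threshold):
    """Trigger an early warning when the anomaly score exceeds the threshold."""
    return score > threshold

test_scores = [float(i) for i in range(1, 101)]   # synthetic scores, illustration only
threshold = fit_threshold(test_scores)
alarm_high = check_alarm(97.0, threshold)
alarm_low = check_alarm(50.0, threshold)
```

By construction, about 5% of the test scores lie above the threshold, so the choice of percentile directly sets the expected alarm rate on data that resembles the test set.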
[0131] Example 2
[0132] A wind turbine temperature anomaly early warning system based on dimensionality reduction and autoencoding, as shown in Figure 4, includes:
[0133] The monitoring data processing module is used to obtain, from the wind turbine SCADA system, data from a set time period that is free of downtime faults as the initial dataset, to clean, preprocess, and feature-engineer the initial data to obtain feature vectors, and to construct a sample feature matrix.
[0134] The manifold construction module, connected to the monitoring data processing module, is used to reduce the dimensionality of the feature vectors obtained in the monitoring data processing module by using the Dt-SNE algorithm, which combines the correlation of wind turbine components for weighting, and maps the high-dimensional features to the low-dimensional manifold space to obtain a low-dimensional representation matrix.
[0135] The autoencoder construction module, connected to the manifold construction module, is used to train an importance-weighted autoencoder model using dimensionality-reduced data under normal operating conditions, map the data to multiple latent variables, reconstruct the input data from the latent variables, learn the probability distribution of the data by minimizing the loss function of the importance-weighted autoencoder model, and then train the model. The trained model is then used to calculate the anomaly score and determine the threshold of the anomaly score.
[0136] The real-time early warning module, connected to the autoencoder construction module, is used to collect the operating data of the wind turbine in real time, input it into the trained model to calculate the anomaly score, and trigger an early warning if the anomaly score exceeds the set threshold.
[0137] Example 3
[0138] A dimensionality reduction and self-encoding wind turbine temperature anomaly early warning system includes a memory and one or more processors. The memory stores executable code, and when the one or more processors execute the executable code, they are used to implement the dimensionality reduction and self-encoding wind turbine temperature anomaly early warning method described in Embodiment 1.
[0139] Example 4
[0140] A computer-readable medium having a program stored thereon, which, when executed by a processor, implements the dimensionality reduction and autoencoding wind turbine temperature anomaly early warning method described in Example 1.
[0141] Application examples
[0142] A power generation group in China has deployed 74 wind turbine units with a total installed capacity of 301.2 MW, along with one 220 kV substation. Full-capacity grid connection was achieved in 2021. The complex and variable marine environment can easily affect the safety of equipment and structures. Currently, abnormal equipment temperatures are mainly identified manually. However, manual inspection is constrained by factors such as spare parts, shipping schedules, manpower, the number of turbines, and turbine type, resulting in a long anomaly detection cycle. Therefore, it is necessary to implement intelligent anomaly diagnosis for turbine temperatures to prevent serious accidents.
[0143] The wind turbine is equipped with monitoring systems related to unit temperature, including but not limited to generator bearing temperature monitoring, gearbox oil temperature monitoring, pitch motor temperature monitoring, ambient temperature monitoring, wind speed monitoring, and power monitoring.
[0144] Based on the aforementioned wind turbine temperature anomaly early warning method and system, the monitoring data processing module of module 10 was first used to obtain the key temperature parameters and operating parameters of the turbine from September 2005 to December 2005, as follows:
[0145] Ambient temperature is monitored by an air temperature probe outside the nacelle, with a measurement accuracy of 0.1 °C.
[0146] Generator bearing temperature is monitored by a PT100 resistance sensor, with a measurement accuracy of 0.1 °C.
[0147] Gearbox oil temperature is monitored by a PT100 resistance sensor, with a measurement accuracy of 0.1 °C.
[0148] Pitch motor temperature is monitored by a PT100 resistance sensor, with a measurement accuracy of 0.1 °C.
[0149] Wind speed is monitored by an ultrasonic anemometer, with a measurement accuracy of 0.1 m/s.
[0150] Power is monitored by a power module, with a measurement accuracy of 0.1 kW.
[0151] The above measurement data are merged on the collection timestamp (accurate to the minute). Outliers are detected with box plots and removed according to the criterion given by the formula in Example 1, yielding a processed sample matrix of length N:
[0152]
[0153] where p = 1 to 6 denotes ambient temperature, generator bearing temperature, gearbox oil temperature, pitch motor temperature, wind speed, and power, respectively. Three features are added:
[0154] Rate of temperature change: calculated separately for the generator bearing temperature, gearbox oil temperature, pitch motor temperature, and ambient temperature.
[0155]
[0156] Critical temperature difference: Calculate the temperature difference between each component and the ambient temperature.
[0157]
[0158] Temperature moving average: Calculate the moving average of each temperature variable. The window size can be empirically set to 10. Then:
[0159]
[0160] The resulting initial sample feature matrix is as follows:
[0161]
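The preprocessing and feature steps above can be sketched as follows. This is an illustration under stated assumptions, not the patent's formulas (which are not reproduced here): the box-plot rule uses the common 1.5 × IQR fences, the rate of change is taken as the first difference between consecutive one-minute samples, and the moving average is a trailing window. All function names and column names are hypothetical.

```python
from statistics import quantiles
from typing import Dict, List

TEMPS = ["generator_bearing_temp", "gearbox_oil_temp", "pitch_motor_temp", "ambient_temp"]


def iqr_keep_mask(values: List[float], k: float = 1.5) -> List[bool]:
    """Box-plot rule (assumed fences): keep points in [Q1 - k*IQR, Q3 + k*IQR]."""
    q1, _, q3 = quantiles(values, n=4)
    lo, hi = q1 - k * (q3 - q1), q3 + k * (q3 - q1)
    return [lo <= v <= hi for v in values]


def rate_of_change(series: List[float]) -> List[float]:
    """First difference per time step; the first sample has no predecessor, so 0.0."""
    return [0.0] + [b - a for a, b in zip(series, series[1:])]


def moving_average(series: List[float], window: int = 10) -> List[float]:
    """Trailing mean over up to `window` samples (shorter at the start)."""
    out = []
    for i in range(len(series)):
        chunk = series[max(0, i - window + 1): i + 1]
        out.append(sum(chunk) / len(chunk))
    return out


def build_features(cols: Dict[str, List[float]], window: int = 10) -> Dict[str, List[float]]:
    """Append the delta, diff, and avg columns to the six raw columns."""
    feats = dict(cols)
    for name in TEMPS:
        feats[name + "_delta"] = rate_of_change(cols[name])
        feats[name + "_avg"] = moving_average(cols[name], window)
        if name != "ambient_temp":
            # Temperature difference between the component and ambient.
            feats[name + "_diff"] = [t - a for t, a in zip(cols[name], cols["ambient_temp"])]
    return feats
```

With six raw columns, four rate-of-change columns, three temperature-difference columns, and four moving-average columns, the sketch produces 17 columns, matching the 17 curves listed for Figure 2.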
[0162] Dimensionality reduction of the preprocessed sample feature matrix includes:
[0163] Set the parameters of the Dt-SNE model as follows: perplexity = 30, number of iterations = 1000, learning rate = 200.
[0164] Using the Dt-SNE model to map the sample feature matrix to a two-dimensional space yields the dimensionality-reduced data matrix:
[0165]
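The building blocks of the mapping step can be sketched in plain Python. Several parts are assumptions rather than the patent's method: the component weights are taken as the mean absolute correlation of each feature with the others, they enter the neighbour probabilities through a weighted squared distance, and a fixed Gaussian bandwidth `sigma` stands in for the perplexity search. The Student-t similarity, the symmetrised joint probability, the KL divergence, and the gradient update follow standard t-SNE.

```python
import math
from typing import List

Matrix = List[List[float]]


def component_weights(corr: Matrix) -> List[float]:
    """ASSUMPTION: weight = mean |correlation| with the other features, scaled to mean 1."""
    n = len(corr)
    raw = [sum(abs(corr[i][j]) for j in range(n) if j != i) / (n - 1) for i in range(n)]
    mean = sum(raw) / n
    return [r / mean for r in raw]


def sq_dist(a: List[float], b: List[float], w: List[float]) -> float:
    """Weighted squared Euclidean distance."""
    return sum(wk * (ak - bk) ** 2 for ak, bk, wk in zip(a, b, w))


def joint_p(X: Matrix, w: List[float], sigma: float = 1.0) -> Matrix:
    """Symmetrised Gaussian neighbour probabilities p_ij (fixed sigma, no perplexity search)."""
    n = len(X)
    cond = []
    for i in range(n):
        row = [0.0 if j == i else math.exp(-sq_dist(X[i], X[j], w) / (2 * sigma ** 2))
               for j in range(n)]
        total = sum(row)  # assumes every point has at least one non-negligible neighbour
        cond.append([v / total for v in row])  # conditional p_{j|i}
    return [[(cond[i][j] + cond[j][i]) / (2 * n) for j in range(n)] for i in range(n)]


def student_t_q(Y: Matrix) -> Matrix:
    """Student-t (one degree of freedom) similarities q_ij in the low-dimensional space."""
    n, ones = len(Y), [1.0] * len(Y[0])
    num = [[0.0 if i == j else 1.0 / (1.0 + sq_dist(Y[i], Y[j], ones)) for j in range(n)]
           for i in range(n)]
    total = sum(map(sum, num))
    return [[v / total for v in row] for row in num]


def kl_divergence(P: Matrix, Q: Matrix, eps: float = 1e-12) -> float:
    """KL(P || Q) summed over all pairs with p_ij > 0."""
    return sum(p * math.log((p + eps) / (q + eps))
               for rp, rq in zip(P, Q) for p, q in zip(rp, rq) if p > 0.0)


def gradient_step(Y: Matrix, P: Matrix, lr: float) -> Matrix:
    """One plain gradient-descent update of every low-dimensional point."""
    Q = student_t_q(Y)
    n, d = len(Y), len(Y[0])
    ones = [1.0] * d
    new_Y = []
    for i in range(n):
        grad = [0.0] * d
        for j in range(n):
            if i != j:
                coef = 4.0 * (P[i][j] - Q[i][j]) / (1.0 + sq_dist(Y[i], Y[j], ones))
                for k in range(d):
                    grad[k] += coef * (Y[i][k] - Y[j][k])
        new_Y.append([Y[i][k] - lr * grad[k] for k in range(d)])
    return new_Y
```

A full run would repeat `gradient_step` for the configured number of iterations; practical t-SNE implementations also add momentum and early exaggeration, which this sketch omits.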
[0166] Construct an importance-weighted autoencoder model using manifold dimensionality reduction data under normal operating conditions:
[0167] 1. Encoder setup: input layer (2 nodes) and two hidden layers (16 nodes and 8 nodes, respectively) with the ReLU activation function. The output consists of two parallel layers that output the mean and the log-variance, respectively (each has 4 nodes, corresponding to the dimension of the latent variables).
[0168] 2. Decoder setup: input layer (4 nodes) and two hidden layers (8 nodes and 16 nodes, respectively) with the ReLU activation function. The output layer (2 nodes) uses a linear activation function.
[0169] 3. Set the number of importance-weighted samples to 5.
[0170] 4. Training parameters: optimizer: Adam; learning rate: 0.001; epoch = 100; batch_size = 64.
[0171] 5. After training is complete, calculate the negative log-likelihood of the training data as the anomaly score, and set its 95th percentile as the static threshold.
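The scoring and thresholding in step 5 can be sketched without a deep learning framework. The sketch assumes a standard normal prior and a unit-variance Gaussian decoder, neither of which is stated in the text, and takes the encoder and decoder as hypothetical callables `encode` and `decode`; the score is the usual importance-weighted estimate of the negative log-likelihood.

```python
import math
import random
from typing import Callable, List, Sequence, Tuple


def log_normal(x: Sequence[float], mean: Sequence[float], log_var: Sequence[float]) -> float:
    """Log-density of a diagonal Gaussian, summed over dimensions."""
    return sum(-0.5 * (math.log(2 * math.pi) + lv + (xi - m) ** 2 / math.exp(lv))
               for xi, m, lv in zip(x, mean, log_var))


def iwae_score(log_weights: List[float]) -> float:
    """Anomaly score = -log((1/K) * sum_k exp(log_w_k)), computed with log-sum-exp."""
    m = max(log_weights)
    lse = m + math.log(sum(math.exp(lw - m) for lw in log_weights))
    return -(lse - math.log(len(log_weights)))


def score_sample(x: Sequence[float],
                 encode: Callable[[Sequence[float]], Tuple[List[float], List[float]]],
                 decode: Callable[[List[float]], List[float]],
                 k: int = 5, rng=random) -> float:
    """Draw k latent samples and combine their importance weights into one score."""
    mean, log_var = encode(x)
    log_w = []
    for _ in range(k):
        z = [m + math.exp(0.5 * lv) * rng.gauss(0.0, 1.0) for m, lv in zip(mean, log_var)]
        log_p_x_z = log_normal(x, decode(z), [0.0] * len(x))   # ASSUMED unit-variance decoder
        log_p_z = log_normal(z, [0.0] * len(z), [0.0] * len(z))  # ASSUMED standard normal prior
        log_q_z = log_normal(z, mean, log_var)
        log_w.append(log_p_x_z + log_p_z - log_q_z)
    return iwae_score(log_w)


def percentile(values: Sequence[float], q: float) -> float:
    """Linear-interpolation percentile, q in [0, 100]."""
    s = sorted(values)
    pos = (len(s) - 1) * q / 100.0
    lo, hi = math.floor(pos), math.ceil(pos)
    return s[lo] + (s[hi] - s[lo]) * (pos - lo)
```

The static threshold would then be `percentile(training_scores, 95)`, where `training_scores` holds `score_sample` evaluated on every dimensionality-reduced training sample.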
[0172] The new real-time SCADA data undergo the same preprocessing and feature engineering as the training data to obtain the real-time sample feature matrix, part of which is as follows:
[0173]
[0174] All of the data are visualized in Figure 2. The horizontal axis of every curve is the data index, representing the relative position of the data in the time dimension, and the vertical axis of the temperature-related curves is in °C. Figure 2 (1) is the ambient temperature curve; Figure 2 (2) is the generator bearing temperature curve; Figure 2 (3) is the gearbox oil temperature curve; Figure 2 (4) is the pitch motor temperature curve; Figure 2 (5) is the wind speed curve, with the vertical axis in m/s; Figure 2 (6) is the active power curve, with the vertical axis in kW; Figure 2 (7) is the generator bearing temperature_delta curve (i.e., the rate of change of the generator bearing temperature); Figure 2 (8) is the gearbox oil temperature_delta curve (i.e., the rate of change of the gearbox oil temperature); Figure 2 (9) is the pitch motor temperature_delta curve (i.e., the rate of change of the pitch motor temperature); Figure 2 (10) is the ambient temperature_delta curve (i.e., the rate of change of the ambient temperature); Figure 2 (11) is the generator bearing temperature_diff curve (i.e., the difference between the generator bearing temperature and the ambient temperature); Figure 2 (12) is the gearbox oil temperature_diff curve (i.e., the difference between the gearbox oil temperature and the ambient temperature); Figure 2 (13) is the pitch motor temperature_diff curve (i.e., the difference between the pitch motor temperature and the ambient temperature); Figure 2 (14) is the generator bearing temperature_avg curve (i.e., the moving average of the generator bearing temperature); Figure 2 (15) is the gearbox oil temperature_avg curve (i.e., the moving average of the gearbox oil temperature); Figure 2 (16) is the pitch motor temperature_avg curve (i.e., the moving average of the pitch motor temperature); Figure 2 (17) is the ambient temperature_avg curve (i.e., the moving average of the ambient temperature).
[0175] The trained Dt-SNE model is used to map the real-time data to the low-dimensional manifold space. The mapped data are input into the IWAE model to calculate the anomaly score; if the anomaly score exceeds the threshold calculated from the training data, an alarm is triggered. Figure 3 shows the result for this example: the curve is the anomaly score trend, and the red dashed line is the anomaly score threshold given by the trained model. It can be seen that the overall temperature condition of the unit has deteriorated recently, especially around data indices 200 to 250, where the anomaly score significantly exceeds the threshold and requires attention.
[0176] It will be apparent to those skilled in the art that the present invention is not limited to the details of the exemplary embodiments described above, and that the invention can be implemented in other specific forms without departing from its spirit or essential characteristics. Therefore, the embodiments should be considered in all respects as exemplary and non-limiting, and the scope of the invention is defined by the appended claims rather than the foregoing description. Thus, all variations falling within the meaning and scope of equivalents of the claims are intended to be included within the present invention. No reference numerals in the claims should be construed as limiting the scope of the claims.
Claims
1. A wind turbine temperature anomaly early warning method based on dimensionality reduction and self-encoding, characterized in that it includes the following steps:
S1. Obtain data free of downtime faults within a set time period as the initial dataset; clean, preprocess, and feature-process the initial data to obtain feature vectors; and construct a sample feature matrix from the feature vectors. Step S1 specifically includes:
S1.1. Collect temperature data of key components, including generator bearing temperature, gearbox oil temperature, pitch motor temperature, ambient temperature, wind speed, and power; apply missing-value processing and outlier processing to each type of data in turn; and align the preprocessed monitoring data by collection timestamp to form a sample matrix.
S1.2. Add three features: the rate of temperature change of each component, the temperature difference between each component and the ambient temperature, and the moving average of each temperature variable; and merge the sample matrix with the three features to obtain the sample feature matrix.
S2. Use the Dt-SNE algorithm, weighted by the correlation between wind turbine components, to reduce the dimensionality of the feature vectors obtained in S1, mapping the high-dimensional features to a low-dimensional manifold space to obtain a low-dimensional representation matrix. Step S2 includes:
S2.1. Each row of the sample feature matrix obtained in step S1 represents a sample point. Construct a correlation matrix based on the physical connection and heat conduction relationships between the components of the wind turbine, where each element of the correlation matrix represents the correlation coefficient between two components. Calculate the component weights from the correlation matrix to obtain the overall correlation between the component represented by any sample point and the other components.
S2.2. Incorporate the component weights into the conditional probability calculation to obtain the probability that any two sample points in the sample feature matrix are adjacent.
S2.3. Initialize the low-dimensional matrix, where each row of the low-dimensional matrix represents a sample point, and use the t-distribution to calculate the similarity between any two sample points in the low-dimensional matrix to obtain the similarity probability.
S2.4. Symmetrize the probabilities that any two sample points in the sample feature matrix are adjacent to obtain the joint probability, and optimize the low-dimensional matrix by minimizing the KL divergence between the joint probability distribution and the similarity probability distribution.
S2.5. Iteratively optimize the low-dimensional matrix using gradient descent. In each iteration, update the position of each sample point in the low-dimensional matrix based on the gradient of the KL divergence with respect to the similarity between any two sample points, until the iteration stops, thus obtaining the low-dimensional matrix.
S3. Use the result of S2 to train an importance-weighted autoencoder model, which maps the data to multiple latent variables, reconstructs the input data from the latent variables, and learns the probability distribution of the data by minimizing the loss function of the importance-weighted autoencoder model; then use the trained model to calculate the anomaly score and determine the threshold of the anomaly score.
S4. Collect real-time operating data of the wind turbine and input it into the trained model to calculate the anomaly score. If the anomaly score exceeds the set threshold, trigger an early warning.
2. The wind turbine temperature anomaly early warning method based on dimensionality reduction and self-encoding according to claim 1, characterized in that, in step S1, missing values in the initial dataset are filled using linear interpolation, and outliers in the initial dataset are detected and removed using box plots.
3. The wind turbine temperature anomaly early warning method based on dimensionality reduction and self-encoding according to claim 1, characterized in that step S3 includes:
S3.1. Define the structure of the importance-weighted autoencoder, whose encoder maps the low-dimensional matrix to multiple latent variables, each of which follows a Gaussian distribution; the encoder outputs the mean and log-variance of the Gaussian distribution.
S3.2. Reconstruct each latent variable into an output with the same dimensions as the low-dimensional matrix; define the loss function in terms of the low-dimensional matrix and the network parameters; use the negative log-likelihood as the anomaly score; train the model; calculate the anomaly score using the trained model; and determine the threshold of the anomaly score.
4. The wind turbine temperature anomaly early warning method based on dimensionality reduction and self-encoding according to claim 3, characterized in that the expression for the anomaly score is:
$$\mathrm{score}(x) = -\log\left(\frac{1}{K}\sum_{k=1}^{K}\frac{p(x \mid z_k)\,p(z_k)}{q(z_k \mid x)}\right)$$
In the formula: $K$ is the number of latent variables obtained after the dimensionality reduction mapping; $p(x \mid z_k)$ is the conditional probability distribution defined by the decoder; $p(z_k)$ is the prior distribution; and $q(z_k \mid x)$ is the posterior distribution defined by the encoder.
5. A wind turbine temperature anomaly early warning system based on dimensionality reduction and self-encoding, characterized in that it includes:
A monitoring data processing module, used to acquire data free of downtime faults within a set time period as the initial dataset, to clean, preprocess, and feature-process the initial data to obtain feature vectors, and to construct a sample feature matrix. Specifically: temperature data of key components are collected, including generator bearing temperature, gearbox oil temperature, pitch motor temperature, ambient temperature, wind speed, and power; missing values and outliers of each type of data are processed in sequence; the preprocessed monitoring data are aligned by collection timestamp to form a sample matrix; three features are added, namely the rate of temperature change of each component, the temperature difference between each component and the ambient temperature, and the moving average of each temperature variable; and the sample matrix and the three features are combined to obtain the sample feature matrix.
A manifold construction module, connected to the monitoring data processing module, used to reduce the dimensionality of the feature vectors obtained by the monitoring data processing module using the Dt-SNE algorithm, which is weighted by the correlation between wind turbine components, mapping the high-dimensional features to a low-dimensional manifold space to obtain a low-dimensional representation matrix. Specifically: each row of the sample feature matrix represents a sample point. A correlation matrix is constructed based on the physical connection and heat conduction relationships between the components of the wind turbine, where each element of the correlation matrix represents the correlation coefficient between two components. The component weights are calculated from the correlation matrix to obtain the overall correlation between the component represented by any sample point and the other components. The component weights are incorporated into the conditional probability calculation to obtain the probability that any two sample points in the sample feature matrix are adjacent. A low-dimensional matrix is initialized, where each row represents a sample point, and the t-distribution is used to calculate the similarity between any two sample points in the low-dimensional matrix to obtain the similarity probability. The probabilities that any two sample points in the sample feature matrix are adjacent are symmetrized to obtain the joint probability, and the low-dimensional matrix is optimized by minimizing the KL divergence between the joint probability distribution and the similarity probability distribution. The low-dimensional matrix is optimized iteratively using gradient descent: in each iteration, the position of each sample point in the low-dimensional matrix is updated based on the gradient of the KL divergence with respect to the similarity between any two sample points, until the iteration stops and the low-dimensional matrix is obtained.
An autoencoder construction module, connected to the manifold construction module, used to train an importance-weighted autoencoder model on dimensionality-reduced data from normal operating conditions, where the model maps the data to multiple latent variables, reconstructs the input data from the latent variables, and learns the probability distribution of the data by minimizing the loss function of the importance-weighted autoencoder model. The trained model is then used to calculate the anomaly score and determine the threshold of the anomaly score.
A real-time early warning module, connected to the autoencoder construction module, used to collect the operating data of the wind turbine in real time, input it into the trained model to calculate the anomaly score, and trigger an early warning if the anomaly score exceeds the set threshold.
6. A wind turbine temperature anomaly early warning system based on dimensionality reduction and self-encoding, characterized in that it includes a memory and one or more processors, wherein the memory stores executable code, and the one or more processors execute the executable code to implement the dimensionality reduction and self-encoding wind turbine temperature anomaly early warning method according to any one of claims 1 to 4.
7. A computer-readable medium having a program stored thereon, which, when executed by a processor, implements the dimensionality reduction and self-encoding wind turbine temperature anomaly early warning method according to any one of claims 1 to 4.