A wind turbine yaw data anomaly detection method under long-term data loss characteristics
By using the SEF-Prophet model for bidirectional completion of wind turbine yaw data and an autoencoder for anomaly detection, the method addresses the loss of accuracy in wind turbine yaw data anomaly detection under long-term data loss and achieves high-precision anomaly detection, helping to ensure the stable operation of wind turbine units.
Patent Information
- Authority / Receiving Office
- CN · China
- Patent Type
- Patents(China)
- Current Assignee / Owner
- HEBEI UNIV OF TECH
- Filing Date
- 2023-07-28
- Publication Date
- 2026-06-19
Figure CN116717438B_ABST
Description
Technical Field
[0001] This invention belongs to the technical field of wind turbine yaw data anomaly detection, and in particular relates to a method for detecting wind turbine yaw data anomalies under the characteristic of long-term data loss.
Background Technology
[0002] Wind energy, as a clean and renewable energy source, is an important energy source for achieving sustainable development. Wind power generation converts wind energy into mechanical energy, and then mechanical energy into electrical energy. The stable operation of wind turbine units is a crucial prerequisite for ensuring the system's power generation efficiency. The yaw system is a unique component of wind turbine units. Yaw system data can largely reflect the operating status of the entire wind turbine unit. When yaw data is abnormal, it is generally believed that the entire system will experience abnormalities or malfunctions. Therefore, detecting anomalies in yaw data can ensure the stable operation of wind turbine units.
[0003] During wind turbine operation, the SCADA system continuously collects time-series data of various types, which is then used for wind turbine anomaly detection. During data acquisition, data gaps are inevitable due to factors such as SCADA system stability and signal transmission, and data preprocessing must fill in the missing data. A small amount of missing data can be filled using conventional methods such as interpolation. However, data missing because of planned shutdowns, database stoppages and restarts, server power outages, and buffer overflows makes up a significant portion of the entire sequence. Filling such gaps with conventional methods ignores the overall trend of the missing segment, so the neural network model represents features poorly and detection accuracy drops. This invention therefore proposes a wind turbine yaw data anomaly detection method that addresses the characteristic of long-term data gaps.
Summary of the Invention
[0004] To address the shortcomings of existing technologies, the technical problem this invention aims to solve is to provide a method for detecting abnormal yaw data of wind turbines under conditions of long-term data loss.
[0005] The present invention solves the aforementioned technical problem by adopting the following technical solution:
[0006] A method for detecting anomalies in wind turbine yaw data under long-term data loss characteristics, characterized by including the following:
[0007] Collect yaw data sequences during wind turbine shutdown and maintenance, extract yaw data from the collected yaw data sequences, and obtain preprocessed yaw data sequences;
[0008] The missing segment is divided into two parts, with lengths of the two parts as follows:
[0009]
[0010] $n = P - m$ (2)
[0011] Where $m$ is the length of the front part of the missing segment, $n$ is the length of the rear part of the missing segment, $M$ and $N$ are the lengths of the known data before and after the missing segment, respectively, and $P$ is the length of the missing segment.
[0012] The SEF-Prophet model is constructed based on the Prophet model. The SEF-Prophet model includes three models: S-Prophet, E-Prophet, and F-Prophet. The S-Prophet model is used to predict the first part of the missing data, the E-Prophet model is used to predict the second part of the missing data, and the F-Prophet model is used for overall fitting of the missing data.
[0013] The missing segments obtained by fitting are concatenated with the known data before and after the missing segments to obtain the completed yaw data sequence.
[0014] An anomaly detection model is built based on an autoencoder. The completed yaw data sequence is input into the anomaly detection model for reconstruction, and the reconstruction error is calculated. If the reconstruction error is greater than or equal to a set threshold, it is considered an anomaly; otherwise, it is considered normal.
[0015] Furthermore, the input to the S-Prophet model consists of the M known data points before the missing segment, with a predicted data length of m; the input to the E-Prophet model consists of the N known data points after the missing segment, input in reverse order, with a predicted data length of n; and the input to the F-Prophet model consists of the two predicted parts of the missing segment concatenated together, with a fitted data length of P.
[0016] Compared with the prior art, the beneficial effects of the present invention are:
[0017] 1. This invention addresses yaw data with long missing segments by employing a bidirectional completion strategy. The missing segment is divided into two parts of unequal length. The front part of the missing segment has higher autocorrelation with the known data preceding it, while the rear part has higher autocorrelation with the known data following it, which keeps the trend and periodicity of the missing segment consistent with the entire sequence. Prophet models are used to predict the two parts: the known data preceding the missing segment is input into the S-Prophet model to predict the front part, and the known data following the missing segment is input in reverse order into the E-Prophet model to predict the rear part, thereby completing the missing data. To eliminate the poor fit caused by trend evolution where the two parts are spliced, the two predicted parts are input into the F-Prophet model, which fits the entire missing segment.
[0018] 2. The completed yaw data is input into the autoencoder for anomaly detection, improving detection accuracy. The method is well suited to anomalous time-series data, enabling not only point anomaly detection but also detection of anomalous intervals of varying lengths, thus helping to ensure the normal and stable operation of wind turbine units.
Attached Figure Description
[0019] Figure 1 is an overall flowchart of the present invention;
[0020] Figure 2 is a diagram illustrating the division of the missing segment into two parts;
[0021] Figure 3 is a diagram illustrating the process of completing the missing segment;
[0022] Figure 4 shows the temperature variation of the generator stator windings before the missing segment is completed;
[0023] Figure 5 shows the temperature variation of the generator stator windings after the missing segment is completed;
[0024] Figure 6 is a schematic diagram of the data trend and periodic changes obtained with the Prophet model;
[0025] Figure 7 shows the reconstruction error distribution during the anomaly detection process.
Detailed Implementation
[0026] Specific embodiments are given below with reference to the accompanying drawings. These specific embodiments are only used to further illustrate the technical solution of the present invention and do not limit the scope of protection of this application.
[0027] This invention relates to a method for detecting anomalies in wind turbine yaw data under conditions of long-term data loss (hereinafter referred to as the method; see Figures 1-7), which includes the following steps:
[0028] Step 1: Install various types of sensors on the wind turbine and collect yaw data of the wind turbine during shutdown maintenance through the SCADA system as the raw yaw data sequence; select data points at certain time intervals, extract yaw data from the raw yaw data sequence, and obtain the preprocessed yaw data sequence.
[0029] This embodiment collects a total of 37 types of yaw data, including wind speed, wind turbine speed, blade pitch angle, nacelle rotational acceleration, wind direction, yaw angle, generator output power, generator current, generator operating frequency, generator torque, wind turbine azimuth angle, converter current, converter voltage, converter temperature, hub temperature, main bearing temperature, yaw motor power, and hydraulic braking force.
[0030] Because the SCADA system samples at second-level resolution, the volume of collected data is very large, which leads to long detection times. Since the data changes little over short periods, data points are generally extracted at 10-minute intervals. Although the SCADA system collects data continuously, shutdowns for maintenance can cause long continuous gaps, so that missing data makes up a relatively large share of the entire sequence. When the proportion of missing data is greater than or equal to 8%, the sequence is considered to have the long-term data loss characteristic.
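The preprocessing described in this step can be sketched as follows. This is a minimal illustration and not the patented implementation: the function names, the use of `None` to mark a missing sample, and the assumption of exactly one sample per second (so that a step of 600 gives 10-minute data) are all hypothetical choices made for the example. Only the 10-minute interval and the 8% criterion come from the text.

```python
from typing import List, Optional, Sequence, Tuple


def downsample(samples: Sequence[Optional[float]], step: int) -> List[Optional[float]]:
    """Keep every `step`-th sample (step=600 assumes 1 s data -> 10 min data)."""
    return list(samples[::step])


def longest_gap(series: Sequence[Optional[float]]) -> Tuple[int, int]:
    """Return (start index, length) of the longest run of missing (None) values."""
    best_start, best_len = 0, 0
    run_start, run_len = 0, 0
    for i, value in enumerate(series):
        if value is None:
            if run_len == 0:
                run_start = i
            run_len += 1
            if run_len > best_len:
                best_start, best_len = run_start, run_len
        else:
            run_len = 0
    return best_start, best_len


def has_long_term_gap(series: Sequence[Optional[float]], ratio: float = 0.08) -> bool:
    """True if the longest missing segment covers at least `ratio` of the sequence."""
    if not series:
        return False
    _, gap_len = longest_gap(series)
    return gap_len / len(series) >= ratio
```

For a sequence of 100 points with a single gap of 10 points, `has_long_term_gap` returns `True`, because 10% is at or above the 8% criterion.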
[0031] Step 2: Construct the SEF-Prophet model based on the Prophet model, and use the SEF-Prophet model to fill in the missing segments of the yaw data sequence to obtain the completed yaw data sequence.
[0032] Assume that any preprocessed yaw data sequence has a total of S data points, of which M are known data points before the missing segment, P are the data points in the missing segment, and N are known data points after the missing segment. The P data points in the missing segment need to be filled in.
[0033] The autocorrelation of a time series depends on the characteristics of the data and the observation interval; in general, data points that are closer together in time have higher autocorrelation. Therefore, to predict the missing data more accurately, the missing segment is divided into two parts whose lengths depend on the position of the missing segment in the entire sequence. The front part of the missing segment is expected to have higher autocorrelation with the known data preceding the missing segment, and the rear part with the known data following it. The lengths of the two parts are therefore as follows:
[0034]
[0035] $n = P - m$ (2)
[0036] Where $m$ is the length of the front part of the missing segment, and $n$ is the length of the rear part of the missing segment;
[0037] The SEF-Prophet model comprises three Prophet models, denoted as S-Prophet, E-Prophet, and F-Prophet. The S-Prophet model predicts the preceding segment of the missing data. Its input consists of M data points preceding the missing segment, with a prediction data length of m. The output is the preceding segment of the missing data. The E-Prophet model predicts the following segment of the missing data. N data points following the missing segment are input in reverse order, with a prediction data length of n. The output is the following segment of the missing data. Both the S-Prophet and E-Prophet models have 5 variables, and the variable range is global.
[0038] The two predicted parts are concatenated. To eliminate the poor fit caused by the evolution of the trend term at the splice, the F-Prophet model is used to fit the concatenated missing segment, with the fitted data length set to P. The fitted missing segment is then concatenated with the M known data points before the missing segment and the N known data points after it to obtain the completed yaw data sequence.
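The flow of the bidirectional completion in Step 2 can be sketched as below. This is a hedged illustration of the data flow only. The forecaster `linear_forecast` and the fitter `moving_average` are simple stand-ins chosen so the example runs without dependencies; they are not the Prophet model, and all function names are hypothetical. The split length `m` is passed in as a parameter because equation (1) is not reproduced in this text.

```python
from typing import Callable, List, Sequence

Forecaster = Callable[[Sequence[float], int], List[float]]
Fitter = Callable[[Sequence[float]], List[float]]


def linear_forecast(history: Sequence[float], horizon: int) -> List[float]:
    """Stand-in forecaster (NOT Prophet): extend the average slope of `history`."""
    if len(history) < 2:
        return [history[-1]] * horizon
    slope = (history[-1] - history[0]) / (len(history) - 1)
    return [history[-1] + slope * (i + 1) for i in range(horizon)]


def moving_average(values: Sequence[float], window: int = 3) -> List[float]:
    """Stand-in for the overall fit: centered moving average, same length as input."""
    half = window // 2
    out = []
    for i in range(len(values)):
        chunk = values[max(0, i - half): i + half + 1]
        out.append(sum(chunk) / len(chunk))
    return out


def complete_gap(before: Sequence[float], after: Sequence[float], P: int, m: int,
                 forecast: Forecaster = linear_forecast,
                 fit: Fitter = moving_average) -> List[float]:
    """Fill a gap of length P between `before` (M points) and `after` (N points)."""
    if not 0 <= m <= P:
        raise ValueError("m must satisfy 0 <= m <= P")
    n = P - m                                        # equation (2)
    front = forecast(before, m)                      # S step: predict forward in time
    back = forecast(list(reversed(after)), n)[::-1]  # E step: predict on reversed data
    fitted = fit(front + back)                       # F step: refit the spliced segment
    return list(before) + fitted + list(after)
```

Calling `complete_gap([0, 1, 2, 3], [8, 9, 10, 11], P=4, m=2, fit=list)` skips the smoothing and returns the integers 0 through 11 (as floats in the filled part), which shows the forward and reversed predictions meeting in the middle of the gap. Passing Prophet-based callables for `forecast` and `fit` would be the natural substitution, under whatever configuration the models use.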
[0039] Step 3: Construct an anomaly detection model based on an autoencoder. The anomaly detection model includes an encoder and a decoder. The encoder consists of one input layer and two encoding layers, while the decoder consists of two decoding layers and one output layer. Both the encoding and decoding layers are constructed using LSTM. The encoder encodes the input to obtain encoded features. The encoder completes the encoding of the input through linear mapping and nonlinear activation. The calculation formula is as follows:
[0040] $H = g(W_m X + b_m)$ (3)
[0041] In the formula, $H$ represents the encoded features, $X$ represents the input yaw data sequence, $W_m$ is the weight matrix of the encoding layer, $b_m$ is the node bias of the encoding layer, and $g(\cdot)$ is the node activation function.
[0042] The decoder completes the decoding of the encoded features and reconstructs the input; the decoder's calculation formula is:
[0043] $\hat{X} = g(W_d H + b_d)$
[0044] In the formula, $\hat{X}$ is the predicted yaw data sequence, $W_d$ is the weight matrix of the decoding layer, and $b_d$ is the node bias of the decoding layer;
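To make the linear mapping and nonlinear activation concrete, the following sketch evaluates one encoding layer as in equation (3) and one decoding layer on plain Python lists. It is an assumption-laden simplification: the method builds its layers from LSTM units, whereas this example uses a single dense layer per side, assumes the decoder mirrors the encoder's form, and uses `tanh` as a hypothetical choice for the activation `g`. The function names are illustrative.

```python
import math
from typing import Callable, List, Sequence

Matrix = Sequence[Sequence[float]]
Activation = Callable[[float], float]


def affine(W: Matrix, x: Sequence[float], b: Sequence[float]) -> List[float]:
    """Compute W x + b for a matrix given as a list of rows."""
    return [sum(w * v for w, v in zip(row, x)) + bias for row, bias in zip(W, b)]


def encode(x: Sequence[float], W_m: Matrix, b_m: Sequence[float],
           g: Activation = math.tanh) -> List[float]:
    """Encoded features H = g(W_m X + b_m), as in equation (3)."""
    return [g(v) for v in affine(W_m, x, b_m)]


def decode(h: Sequence[float], W_d: Matrix, b_d: Sequence[float],
           g: Activation = math.tanh) -> List[float]:
    """Reconstruction from features; assumed to mirror the encoder's form."""
    return [g(v) for v in affine(W_d, h, b_d)]
```

With a 2-to-1 encoder and a 1-to-2 decoder, `decode(encode(x, W_m, b_m), W_d, b_d)` returns a reconstruction with the same length as `x`, which is the quantity compared against the input when computing the reconstruction error.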
[0045] Step 4: Train the anomaly detection model using the completed yaw data sequence; calculate the model loss using the loss function, and use the SGD gradient descent algorithm to backpropagate the model loss to adjust the parameters of the anomaly detection model until the model loss reaches its minimum value.
[0046] The loss function can be the mean squared error loss function, the squared error loss function, or the cross-entropy loss function. These three functions are expressed as follows:
[0047]
[0048]
[0049]
[0050] In the formulas, the loss is computed between the yaw data sequence $X$ and its reconstruction $\hat{X}$; $x_i$ and $\hat{x}_i$ represent the i-th time-series data point and its predicted value; and $k$ is the length of the yaw data sequence;
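For orientation, the three loss functions named above can be written in their common textbook forms, as in the sketch below. These forms are assumptions: the exact expressions used in the method (for example, any constant factor in the squared error, or the variant of cross-entropy) are not recoverable from this text. The cross-entropy version shown is the binary form and assumes the data has been scaled to the range [0, 1]; the function names are illustrative.

```python
import math
from typing import Sequence


def mse_loss(x: Sequence[float], x_hat: Sequence[float]) -> float:
    """Mean squared error over a sequence of length k."""
    return sum((a - b) ** 2 for a, b in zip(x, x_hat)) / len(x)


def squared_error_loss(x: Sequence[float], x_hat: Sequence[float]) -> float:
    """Sum of squared errors (no averaging)."""
    return sum((a - b) ** 2 for a, b in zip(x, x_hat))


def cross_entropy_loss(x: Sequence[float], x_hat: Sequence[float],
                       eps: float = 1e-12) -> float:
    """Binary cross-entropy, averaged over k; assumes values scaled to [0, 1]."""
    total = 0.0
    for a, b in zip(x, x_hat):
        b = min(max(b, eps), 1.0 - eps)  # clamp to keep log() finite
        total -= a * math.log(b) + (1.0 - a) * math.log(1.0 - b)
    return total / len(x)
```

The mean squared error and the squared error differ only by the factor 1/k, so they lead to the same minimizer and differ only in the effective step size under gradient descent.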
[0051] Assuming the learning rate is η, the update formulas for the weight matrix and node bias of the encoding layer of the anomaly detection model are shown in equations (7) and (8), and the update formulas for the weight matrix and node bias of the decoder layer are similar.
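[0052] $W_m' = W_m - \eta \frac{\partial J(W_m, b_m)}{\partial W_m}$ (7)
[0053] $b_m' = b_m - \eta \frac{\partial J(W_m, b_m)}{\partial b_m}$ (8)
[0054] In the formulas, $W_m'$ and $b_m'$ are the updated weight matrix and node bias of the encoding layer, and $J(W_m, b_m)$ represents the update error of the weight matrix and node bias;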
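The gradient descent update with learning rate η can be illustrated on a deliberately tiny case. The sketch below assumes a scalar linear model `w * x + b` with a mean squared error objective so that the gradients can be written by hand; it stands in for the weight matrix and node bias of a real layer, where backpropagation would supply the gradients. The function names are hypothetical.

```python
from typing import List, Sequence, Tuple


def mse_grads(w: float, b: float, xs: Sequence[float],
              ys: Sequence[float]) -> Tuple[float, float]:
    """Gradients of J = mean((w*x + b - y)^2) with respect to w and b."""
    k = len(xs)
    dw = sum(2.0 * (w * x + b - y) * x for x, y in zip(xs, ys)) / k
    db = sum(2.0 * (w * x + b - y) for x, y in zip(xs, ys)) / k
    return dw, db


def sgd_step(params: Sequence[float], grads: Sequence[float], lr: float) -> List[float]:
    """One update: each parameter moves against its gradient, scaled by lr."""
    return [p - lr * g for p, g in zip(params, grads)]
```

Starting from `w = 0, b = 0` on the data `xs = [1, 2]`, `ys = [2, 4]` with `lr = 0.1`, one call to `sgd_step` moves the parameters to roughly `w = 1.0, b = 0.6` and lowers the loss, and repeating the step continues to reduce it for this learning rate.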
[0055] The reconstruction error is calculated using equation (9):
[0056]
[0057] If the reconstruction error is greater than or equal to the set threshold, the data is considered abnormal; otherwise, it is considered normal. When k=1, the method performs point anomaly detection; when k>1, it performs interval anomaly detection.
[0058] Matters not described in this invention are applicable to the prior art.
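The thresholding rule can be sketched as follows. Because equation (9) is not reproduced in this text, the reconstruction error here is assumed to be the mean squared error over a window of length k; the use of non-overlapping windows and the function names are likewise illustrative assumptions. Only the "greater than or equal to the threshold" rule and the meaning of k=1 versus k>1 come from the text.

```python
from typing import List, Sequence, Tuple


def reconstruction_error(x: Sequence[float], x_hat: Sequence[float]) -> float:
    """Assumed form of the error: mean squared error over the window."""
    return sum((a - b) ** 2 for a, b in zip(x, x_hat)) / len(x)


def detect(x: Sequence[float], x_hat: Sequence[float], k: int,
           threshold: float) -> List[Tuple[int, bool]]:
    """Flag each non-overlapping window of length k as (start index, is_anomaly)."""
    flags = []
    for start in range(0, len(x) - k + 1, k):
        err = reconstruction_error(x[start:start + k], x_hat[start:start + k])
        flags.append((start, err >= threshold))  # ">=" matches the stated rule
    return flags
```

With `k=1` each data point is judged on its own (point anomaly detection); with `k=2` the same inputs are judged in pairs, so a single large deviation marks its whole window as anomalous (interval anomaly detection).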
Claims
1. A method for detecting anomalies in wind turbine yaw data under long-term data loss characteristics, characterized in that it includes the following steps:
Collect yaw data sequences during wind turbine shutdown and maintenance, extract yaw data from the collected yaw data sequences, and obtain preprocessed yaw data sequences;
The missing segment is divided into two parts, with the lengths of the two parts given by equation (1) and by $n = P - m$ (2), where $m$ is the length of the front part of the missing segment, $n$ is the length of the rear part of the missing segment, $M$ and $N$ are the lengths of the known data located before and after the missing segment, respectively, and $P$ is the length of the missing segment;
The SEF-Prophet model is constructed based on the Prophet model. The SEF-Prophet model includes three sub-models: S-Prophet, E-Prophet, and F-Prophet. The S-Prophet model predicts the front part of the missing segment, the E-Prophet model predicts the rear part, and the F-Prophet model performs an overall fit to the missing segment. The input to the S-Prophet model is the $M$ known data points located before the missing segment, and the length of the predicted data is $m$; the $N$ known data points located after the missing segment are input into the E-Prophet model in reverse order, and the length of the predicted data is $n$; the two predicted parts of the missing segment are concatenated and then input into the F-Prophet model for fitting, and the length of the fitted data is $P$;
The missing segment obtained by fitting is concatenated with the known data before and after the missing segment to obtain the completed yaw data sequence;
An anomaly detection model is built based on an autoencoder. The completed yaw data sequence is input into the anomaly detection model for reconstruction, and the reconstruction error is calculated. If the reconstruction error is greater than or equal to a set threshold, the data is considered an anomaly; otherwise, it is considered normal;
If the length of the missing segment accounts for 8% or more of the length of the preprocessed yaw data sequence, the long-term data loss characteristic is satisfied.