A method and device for classifying wind speed fluctuation processes based on a sliding window
By combining sliding window and swing window algorithms with wavelet analysis and k-means clustering, the classification of wind speed fluctuation processes is optimized, overcoming the shortcomings of static window and traditional clustering methods. This achieves more accurate analysis and prediction of wind speed fluctuation processes, improving the accuracy of wind power prediction and the stability of the power system.
Patent Information
- Authority / Receiving Office
- CN · China
- Patent Type
- Patents(China)
- Current Assignee / Owner
- CHINA THREE GORGES CORPORATION
- Filing Date
- 2022-09-07
- Publication Date
- 2026-06-19
AI Technical Summary
In the segmentation of wind speed fluctuations, existing technologies based on static equal-length windows lack reasonable indicators for determining the end position of a fluctuation segment, which affects the accuracy and effectiveness of subsequent work. Furthermore, traditional clustering methods do not consider time series similarity, resulting in unsatisfactory classification results.
A sliding window-based approach, combined with the swing window algorithm and wavelet analysis, is used to denoise the wind speed sequence. The wind speed fluctuation segment is divided by the swing window, and the k-means clustering method is used for primary and secondary clustering. The clustering effect is optimized based on feature values and time series similarity metrics.
It improves the classification accuracy of wind speed fluctuation processes, simplifies the prediction model, enhances the real-time prediction accuracy of wind power, and ensures the safe and stable operation of the power system in scenarios with a high proportion of renewable energy.
Figure CN115564184B_ABST
Abstract
Description
Technical Field
[0001] This invention relates to the fields of wind turbine power prediction and wind speed fluctuation process analysis, specifically to a method and apparatus for classifying wind speed fluctuation processes based on a sliding window.
Background Technology
[0002] With a high proportion of wind power integrated into the power grid, predicting wind speed conditions over a future period allows for proactive measures. These measures include determining turbine operating modes in advance, developing power generation and dispatch plans, ensuring power supply and demand balance, reducing grid spinning reserve capacity and generation costs, and mitigating the uncertainties brought about by large-scale wind power grid integration to some extent. Wind speed fluctuations exhibit a certain degree of weak periodicity. By dividing the wind speed fluctuation process and revealing its patterns, prediction models can be simplified, effectively improving the accuracy of real-time wind power prediction and thus ensuring the safe and stable operation of the power system in scenarios with a high proportion of renewable energy.
[0003] Currently, domestic research on the division of fluctuation processes is limited. Some studies are limited to static fixed-scale windows, but wind speed fluctuation processes are often not of equal length. Static equal-length time windows often lack reasonable indicators to determine the end position of the fluctuation segment. Too small a time window cannot clearly reflect a complete fluctuation process, while too large a time window will result in a loss of division accuracy, which is not conducive to subsequent work.
[0004] In recent years, a method for identifying wind speed fluctuations has emerged, employing a swing window algorithm to segment wind power prediction data. This method effectively divides wind power data into fluctuation processes, thus enabling estimation of maximum error. However, the swing window algorithm can only divide a complete wind speed sequence into several wind speed fluctuation processes; it cannot further classify and analyze these fluctuation processes based on features or time series.
[0005] Based on the use of swing windows to segment fluctuation processes, similar fluctuation processes are clustered according to their features to obtain classification results. Traditional clustering methods often target hard or static data, meaning the feature values of the data do not change over time. Time series, on the other hand, are sequences of data changes recorded in chronological order. Therefore, time series data are not static but dynamic data. Traditional clustering methods do not consider the impact of time series similarity on clustering results, leading to less than ideal classification outcomes.
Summary of the Invention
[0006] The purpose of this invention is to overcome the shortcomings of existing technologies and propose a method and apparatus for classifying wind speed fluctuation processes based on a sliding window. This invention comprehensively considers the influence of time series similarity and fluctuation process characteristics, performing secondary clustering on the fluctuation processes to optimize the clustering effect and obtain more accurate wind speed fluctuation classification results.
[0007] A first aspect of this invention proposes a method for classifying wind speed fluctuation processes based on a sliding window, comprising:
[0008] Obtain the original wind speed sequence and perform noise reduction;
[0009] The denoised wind speed sequence is divided into multiple wind speed fluctuation segments using a swing window algorithm;
[0010] The wind speed fluctuation segment is converted into an equal-length wind speed time series, and a first-order clustering based on feature values is performed on the equal-length wind speed time series to obtain the preliminary classification result of the wind speed series.
[0011] Based on the preliminary classification results, a secondary clustering based on time series similarity is performed on the equal-length wind speed time series under each category to obtain the final classification result of the wind speed series.
[0012] In one specific embodiment of the present invention, the noise reduction employs wavelet analysis.
[0013] In a specific embodiment of the present invention, both the first-order clustering and the second-order clustering adopt the k-means clustering method, and the optimal number of clusters is determined by the elbow method based on the sum of squared errors.
[0014] In a specific embodiment of the present invention, the denoised wind speed sequence is divided into multiple wind speed fluctuation segments using a swing window algorithm, including:
[0015] 1) Take the initial moment of the denoised wind speed sequence as the initial moment of the first wind speed fluctuation segment;
[0016] 2) Take the first wind speed fluctuation segment as the current wind speed fluctuation segment, record the initial time of the current wind speed fluctuation segment as t=0, and record the initial wind speed of the current wind speed fluctuation segment as v0.
[0017] 3) Calculate the swing window for the current wind speed fluctuation segment, as shown in the following expression:
[0018]
[0019] In the formula, $S_u$ is the upper swing window; $S_d$ is the lower swing window; ε is the width of the swing window; i is the time index within the current wind speed fluctuation segment; v(i) is the wind speed at time i; t is the time index of the last moment of the current wind speed fluctuation segment.
[0020] Starting from t=0, calculate the upper and lower swing windows respectively, and take the minimum time $t_p$ that satisfies $S_u \ge S_d$ as the end time of the current wind speed fluctuation segment. $t_p$ must meet the following condition:
[0021]
[0022] 4) Take $t_p$ as the initial moment of the next current wind speed fluctuation segment and the wind speed at time $t_p$ as the updated v0, then return to step 3) to continue identifying the next fluctuation segment, until the wind speed at every moment in the denoised wind speed sequence has been assigned to a wind speed fluctuation segment and the division is complete.
[0023] In a specific embodiment of the present invention, the step of converting the wind speed fluctuation segment into a wind speed time series of equal length, and performing a first-order clustering based on feature values on the wind speed time series to obtain a preliminary classification result of the wind speed series includes:
[0024] 1) Convert each wind speed fluctuation segment into a wind speed time series of equal length;
[0025] 2) Based on the feature values selected for the time series wind speed data, calculate the feature values of each wind speed time series of equal length to form a feature value matrix.
[0026] 3) Perform a clustering operation on the feature value matrix to obtain the clustering results of each wind speed fluctuation segment as the preliminary classification result of the wind speed sequence.
[0027] In a specific embodiment of the present invention, the step of performing secondary clustering based on time series similarity metric on the equal-length wind speed time series under each category according to the preliminary classification result to obtain the final classification result of the wind speed series includes:
[0028] 1) For any wind speed fluctuation classification obtained after the primary clustering, each equal-length wind speed time series under that classification is taken as a sample. Calculate three distance indicators among the samples under that classification: absolute distance, growth rate distance, and fluctuation distance, as follows:
[0029] 1-1) Absolute distance;
[0030] The absolute distance between sample i and sample j is:
[0031]
[0032] In the formula, $x_{itk}$ represents the value of the k-th feature of the i-th sample at time t, where i = 1, 2, ..., N; t = 1, 2, ..., T; k = 1, 2, ..., n; N is the number of samples under this wind speed fluctuation classification; n is the number of feature values for each sample; and T is the length of the time series for each sample.
[0033] 1-2) Growth rate distance;
[0034] The growth rate distance between sample i and sample j is:
[0035]
[0036] In the formula, $\Delta x_{itk} = x_{itk} - x_{i,t-1,k}$ represents the absolute increment of the k-th feature value of the i-th sample over adjacent times [t-1, t], and $\Delta x_{itk} / x_{i,t-1,k}$ represents the corresponding relative increment.
[0037] 1-3) Fluctuation distance;
[0038] The fluctuation distance between sample i and sample j is:
[0039]
[0040] in:
[0041]
[0042]
[0043]
[0044] In the formula, $\bar{x}_{ik}$ and $S_{ik}$ represent the mean and standard deviation, respectively, of the k-th feature value of the i-th sample over period T; $c_{ik}$ represents the ratio of the standard deviation to the mean of the k-th feature value of the i-th sample;
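The three distance expressions are not reproduced in the text above. Going only by the symbol definitions given (Euclidean form, relative increments, ratio of standard deviation to mean), one plausible reading is sketched below. The summation limits and the $1/(T-1)$ normalisation of the standard deviation are assumptions, as is the use of the subscripts AQED, ISED, and VCED, which are borrowed from the combined-distance expression later in this description.

```latex
\begin{aligned}
d_{ij,\mathrm{AQED}} &= \Big[\sum_{k=1}^{n}\sum_{t=1}^{T}\big(x_{itk}-x_{jtk}\big)^{2}\Big]^{1/2} \\
d_{ij,\mathrm{ISED}} &= \Big[\sum_{k=1}^{n}\sum_{t=2}^{T}\Big(\frac{\Delta x_{itk}}{x_{i,t-1,k}}-\frac{\Delta x_{jtk}}{x_{j,t-1,k}}\Big)^{2}\Big]^{1/2} \\
d_{ij,\mathrm{VCED}} &= \Big[\sum_{k=1}^{n}\big(c_{ik}-c_{jk}\big)^{2}\Big]^{1/2} \\
\bar{x}_{ik} &= \frac{1}{T}\sum_{t=1}^{T}x_{itk}, \qquad
S_{ik} = \sqrt{\frac{1}{T-1}\sum_{t=1}^{T}\big(x_{itk}-\bar{x}_{ik}\big)^{2}}, \qquad
c_{ik} = \frac{S_{ik}}{\bar{x}_{ik}}
\end{aligned}
```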
[0045] 2) Determine the weights of each distance indicator under this wind speed fluctuation category; the specific steps are as follows:
[0046] 2-1) Standardize each distance indicator;
[0047] Let $X_{ij}$ be the distance index value corresponding to the j-th sample under any wind speed fluctuation category;
[0048] The value of $X_{ij}$ after data standardization is $Y_{ij}$:
[0049]
[0050] 2-2) Calculate the information entropy of the distance index;
[0051] For any wind speed fluctuation category, the information entropy calculation expression for the j-th distance index is as follows:
[0052]
[0053] in,
[0054]
[0055] The information entropy of each distance indicator under this fluctuation category is obtained as follows:
[0056] $E_1, E_2, \ldots, E_r$
[0057] In the formula, r is the number of distance indicators;
[0058] 2-3) Determine the weights of each distance indicator:
[0059]
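The standardization, entropy, and weight expressions of steps 2-1) to 2-3) are not reproduced in the text above. For orientation only, the commonly used form of the entropy weight method is sketched below; whether the patent uses exactly this min-max standardization and this normalisation is an assumption. Here $m$ (a symbol introduced for this sketch) is the number of values available for each distance index, and $p_{ij}\ln p_{ij}$ is taken as 0 when $p_{ij}=0$.

```latex
\begin{aligned}
Y_{ij} &= \frac{X_{ij}-\min_{i}X_{ij}}{\max_{i}X_{ij}-\min_{i}X_{ij}} \\
p_{ij} &= \frac{Y_{ij}}{\sum_{i=1}^{m}Y_{ij}} \\
E_{j} &= -\frac{1}{\ln m}\sum_{i=1}^{m}p_{ij}\ln p_{ij} \\
w_{j} &= \frac{1-E_{j}}{r-\sum_{j=1}^{r}E_{j}}
\end{aligned}
```

Under this form, an index whose values are spread unevenly has low entropy and therefore receives a larger weight.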
[0060] 3) Calculate the overall distance between samples under this wind speed fluctuation category based on the weights;
[0061] The expression for calculating the combined distance between the i-th sample and the j-th sample is as follows:
[0062] $d_{ij,\mathrm{CED}} = w_1 z(d_{ij,\mathrm{AQED}}) + w_2 z(d_{ij,\mathrm{ISED}}) + w_3 z(d_{ij,\mathrm{VCED}})$
[0063] In the formula, $z(d_{ij,\mathrm{AQED}})$, $z(d_{ij,\mathrm{ISED}})$, and $z(d_{ij,\mathrm{VCED}})$ are the standardized values of $d_{ij,\mathrm{AQED}}$, $d_{ij,\mathrm{ISED}}$, and $d_{ij,\mathrm{VCED}}$, respectively; $w_1$, $w_2$, and $w_3$ are the weights of the distance indicators, and $w_1 + w_2 + w_3 = 1$;
[0064] 4) Based on the comprehensive distance between samples, perform secondary clustering on samples under the same wind speed fluctuation category to obtain the final classification result of the wind speed sequence.
[0065] In a specific embodiment of the present invention, the feature values include: mean, variance, fluctuation segment length, maximum wind speed, minimum wind speed, maximum value location, minimum value location, difference between maximum and minimum values, positive fluctuation duration, and negative fluctuation duration.
[0066] A second aspect of the present invention provides a wind speed fluctuation process classification device based on a sliding window, comprising:
[0067] The wind speed sequence acquisition module is used to acquire the original wind speed sequence and perform noise reduction.
[0068] The wind speed fluctuation segment division module is used to divide the noise-reduced wind speed sequence into multiple wind speed fluctuation segments using a swing window algorithm.
[0069] A primary clustering module is used to convert the wind speed fluctuation segment into an equal-length wind speed time series, and to perform primary clustering based on feature values on the equal-length wind speed time series to obtain the preliminary classification result of the wind speed series.
[0070] The secondary clustering module is used to perform secondary clustering based on time series similarity metric on the equal-length wind speed time series under each category according to the preliminary classification results, so as to obtain the final classification result of the wind speed series.
[0071] A third aspect of the present invention provides an electronic device comprising:
[0072] At least one processor; and a memory communicatively connected to said at least one processor;
[0073] The memory stores instructions that can be executed by the at least one processor, the instructions being configured to perform a sliding window-based wind speed fluctuation process classification method.
[0074] A fourth aspect of the present invention provides a computer-readable storage medium storing computer instructions for causing the computer to execute the above-described sliding window-based wind speed fluctuation process classification method.
[0075] The features and beneficial effects of this invention are as follows:
[0076] This invention employs wavelet analysis to denoise wind speed time series data and extract the main fluctuation trends. Then, a swing window algorithm is used to segment the wind speed fluctuation process. The segmented wind speed fluctuation segments are clustered twice with k-means: first based on feature values, and then based on a time series similarity metric. Clustering evaluation metrics are then used to evaluate and analyze the results of both clustering stages. The secondary clustering further refines the primary clustering results, yielding more accurate clusters and ultimately a better classification of the wind speed fluctuation process.
[0077] This invention provides valuable guidance for revealing the patterns of wind speed fluctuations, simplifying prediction models, and effectively improving the real-time prediction accuracy of wind power, thereby ensuring the safe and stable operation of power systems in scenarios with a high proportion of renewable energy.
Attached Figure Description
[0078] Figure 1 is a flowchart of the overall method for classifying wind speed fluctuation processes based on a sliding window according to an embodiment of the present invention;
[0079] Figure 2 is a schematic diagram of the fluctuation process division result obtained with the swing window in a specific embodiment of the present invention;
[0080] Figure 3 is a schematic diagram illustrating the optimal number of clusters for the primary clustering based on feature values in a specific embodiment of the present invention;
[0081] Figure 4 is a schematic diagram of the primary clustering result based on feature values in a specific embodiment of the present invention;
[0082] Figure 5 is a schematic diagram illustrating the optimal number of clusters for the secondary clustering based on time series similarity in a specific embodiment of the present invention;
[0083] Figure 6 is a schematic diagram of the secondary clustering results based on time series similarity in a specific embodiment of the present invention.
Detailed Implementation
[0084] This invention proposes a method and apparatus for classifying wind speed fluctuation processes based on a sliding window, which will be further described in detail below with reference to the accompanying drawings and specific embodiments.
[0085] A first aspect of this invention proposes a method for classifying wind speed fluctuation processes based on a sliding window, comprising:
[0086] Obtain the original wind speed sequence and perform noise reduction;
[0087] The denoised wind speed sequence is divided into multiple wind speed fluctuation segments using a swing window algorithm;
[0088] The wind speed fluctuation segment is converted into an equal-length wind speed time series, and a first-order clustering based on feature values is performed on the equal-length wind speed time series to obtain the preliminary classification result of the wind speed series.
[0089] Based on the preliminary classification results, a secondary clustering based on time series similarity is performed on the equal-length wind speed time series under each category to obtain the final classification result of the wind speed series.
[0090] In a specific embodiment of the present invention, the overall process of the wind speed fluctuation process classification method based on a sliding window is shown in Figure 1 and includes the following steps:
[0091] 1) Obtain the original wind speed sequence:
[0092] In this embodiment, the length of the original wind speed sequence can be greater than or equal to one year. In a specific embodiment of the present invention, wind speed SCADA data is collected as the original wind speed sequence, with a time resolution of 1 hour and a time length of one year.
[0093] 2) The original wind speed sequence obtained in step 1) is denoised to obtain the denoised wind speed sequence:
[0094] In one embodiment of the present invention, the DB4 wavelet is selected to decompose and denoise the original wind speed sequence data, and the decomposition scale is set to 4. The selection of the decomposition scale is determined by the frequency range of the required wavelet decomposition. If the decomposition scale is set to p, the range of each frequency band is as shown in equation (1):
[0095]
[0096] where $F_s$ is the sampling frequency.
[0097] In this embodiment, for each decomposition scale, a suitable threshold is selected to quantize the corresponding high-frequency wavelet coefficients. (Performing a wavelet transform on a given signal expands the signal over a family of wavelet functions, representing it as a linear combination of wavelet functions with different scales and time shifts; the coefficient of each term is called a wavelet coefficient. The high-frequency wavelet coefficients are typically obtained with the DWT of the Mallat algorithm, which extracts the high-frequency detail information of the signal through a high-pass filter.) Common threshold rules include the unbiased risk estimation threshold, the minimax threshold, the fixed threshold, and the heuristic threshold; each has its own characteristics, and a suitable one can be selected according to the specific denoising requirements. The original signal (in this embodiment, the original wind speed sequence data) is then reconstructed from the thresholded coefficients, thereby suppressing the noise and effectively extracting the main trend of wind speed fluctuations.
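The decompose, threshold, and reconstruct sequence of step 2) can be illustrated with a minimal, dependency-free sketch. To stay short it swaps in the Haar wavelet for the DB4 wavelet named in this embodiment, and it assumes a universal (fixed-form) threshold with soft thresholding and a median-based noise estimate; none of these choices is stated in the source, and the function names are hypothetical.

```python
import math
import statistics


def haar_decompose(signal, levels):
    """Multi-level Haar DWT: returns (approximation, [detail level 1, ..., detail level N])."""
    approx = [float(x) for x in signal]
    details = []
    for _ in range(levels):
        if len(approx) % 2:  # odd length: repeat the last sample before pairing
            approx.append(approx[-1])
        pairs = list(zip(approx[0::2], approx[1::2]))
        details.append([(a - b) / math.sqrt(2) for a, b in pairs])
        approx = [(a + b) / math.sqrt(2) for a, b in pairs]
    return approx, details


def haar_reconstruct(approx, details, length):
    """Invert haar_decompose and trim any padding back to the original length."""
    rec = list(approx)
    for det in reversed(details):
        rec = rec[:len(det)]  # drop the padding sample added at the deeper level
        out = []
        for s, d in zip(rec, det):
            out.append((s + d) / math.sqrt(2))
            out.append((s - d) / math.sqrt(2))
        rec = out
    return rec[:length]


def soft_threshold(x, thr):
    """Shrink a coefficient towards zero by thr (soft thresholding)."""
    return math.copysign(max(abs(x) - thr, 0.0), x)


def denoise(signal, levels=4):
    """Haar stand-in for the DB4 denoising step; decomposition scale 4 as in the embodiment."""
    approx, details = haar_decompose(signal, levels)
    # Assumption: noise level estimated from the finest detail coefficients.
    sigma = statistics.median(abs(d) for d in details[0]) / 0.6745
    thr = sigma * math.sqrt(2 * math.log(len(signal)))  # universal threshold
    details = [[soft_threshold(d, thr) for d in det] for det in details]
    return haar_reconstruct(approx, details, len(signal))
```

In practice a wavelet library would normally be used so that DB4 and the other threshold rules listed above are available; the sketch only shows where the threshold acts in the pipeline.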
[0098] 3) The original wind speed sequence after noise reduction is divided into fluctuation processes using the swing window algorithm to obtain multiple wind speed fluctuation segments; the specific steps are as follows:
[0099] 3-1) Take the initial moment of the denoised wind speed sequence as the initial moment of the first wind speed fluctuation segment;
[0100] 3-2) Take the first wind speed fluctuation segment as the current wind speed fluctuation segment, record the initial time of the current wind speed fluctuation segment as t=0, and record the initial wind speed of the current wind speed fluctuation segment as v0. Then the initial wind speed of the first wind speed fluctuation segment is the initial wind speed of the denoised wind speed sequence.
[0101] 3-3) Calculate the swing window for the current wind speed fluctuation segment, as shown in the following expression:
[0102]
[0103] In the formula, $S_u$ is the upper swing window; $S_d$ is the lower swing window; ε is the width of the swing window, which in this embodiment is taken as an empirical value and set to 5% of the maximum wind speed; i is the time index within the current wind speed fluctuation segment; v(i) is the wind speed at time i; t is the time index of the last moment of the current wind speed fluctuation segment.
[0104] Starting from t=0, calculate the upper and lower swing windows, and take the minimum time $t_p$ that satisfies $S_u \ge S_d$ as the termination time of the current fluctuation process. $t_p$ must meet the condition shown in equation (3):
[0105]
[0106] 3-4) Take $t_p$ as the initial time of the next current wind speed fluctuation segment (i.e., the time t=0 of the new current wind speed fluctuation segment) and the wind speed at time $t_p$ as the new v0, then return to step 3-3) to continue identifying the next fluctuation process. This continues until every moment of the wind speed sequence has been assigned to a fluctuation process, which completes the division of the entire wind speed fluctuation process. It should be noted that the last wind speed fluctuation segment in the division result may not be a complete fluctuation process, but it is still treated as an independent wind speed fluctuation segment.
[0107] In one specific embodiment of the present invention, the denoised wind speed sequence is divided into wind speed fluctuation processes using the swing window algorithm, resulting in 513 wind speed fluctuation segments. Figure 2 is a schematic diagram of the swing window division and shows a portion of the wind speed fluctuation process: the points mark the start and end of each fluctuation process, the curve is the wind speed sequence after noise reduction, and the line between two adjacent points represents one fluctuation process.
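The division procedure of step 3) can be sketched as follows. Because the swing window expression itself is not reproduced in the text, the sketch assumes the usual swinging-door form: the upper window is the largest slope from the point (0, v0 + ε) to any later sample, and the lower window is the smallest slope from (0, v0 - ε). The function name and the return format are hypothetical.

```python
import math


def swing_window_segments(v, eps):
    """Split a wind speed sequence into (start, end) index pairs.

    Consecutive segments share their boundary point, since t_p becomes
    the start of the next segment.
    """
    segments = []
    start = 0
    n = len(v)
    while start < n - 1:
        v0 = v[start]
        s_up, s_down = -math.inf, math.inf
        end = n - 1  # default: trailing, possibly incomplete, segment
        for t in range(start + 1, n):
            i = t - start
            # Assumed form of the upper and lower swing windows.
            s_up = max(s_up, (v[t] - (v0 + eps)) / i)
            s_down = min(s_down, (v[t] - (v0 - eps)) / i)
            if s_up >= s_down:  # first time the windows cross: t is t_p
                end = t
                break
        segments.append((start, end))
        start = end
    return segments
```

Following the embodiment, the width could be set as `eps = 0.05 * max(v)`. For the toy sequence `[0, 1, 2, 3, 2, 1, 0]` with `eps = 0.5`, the sketch returns `[(0, 4), (4, 6)]`: a first segment that ends once the fall has become too large for the windows to contain, and a trailing segment that runs to the end of the data.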
[0108] 4) The wind speed fluctuation segments obtained in step 3) are converted into equal-length wind speed time series. A first-order clustering based on feature values is then performed on the wind speed time series to obtain a preliminary classification result of the wind speed fluctuation process. The specific steps are as follows:
[0109] 4-1) Convert the wind speed fluctuation segments obtained in step 3) into wind speed time series of equal length.
[0110] In this embodiment, the wind speed fluctuation segments after being divided by the swing window are unequal time length sequences. Therefore, the length of the longest fluctuation segment is selected as the baseline. Sequences shorter than this length are padded with zeros to complete the time length, thus transforming the unequal time length sequences into equal time length sequences, which are then stored in the processed equal-length wind speed time series matrix. In one embodiment of this invention, an equal-length wind speed time series matrix with a capacity of 513×87 is obtained, where 513 is the number of fluctuation segments obtained by the swing window division, and 87 is the length of the longest fluctuation segment.
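A minimal sketch of the zero-padding in step 4-1), assuming each fluctuation segment is held as a plain list of wind speeds (the function name is hypothetical):

```python
def pad_to_equal_length(segments):
    """Right-pad every segment with zeros up to the length of the longest one."""
    longest = max(len(seg) for seg in segments)
    return [list(seg) + [0.0] * (longest - len(seg)) for seg in segments]
```

Applied to the 513 segments of this embodiment, whose longest segment has 87 samples, this would give the 513×87 matrix described above.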
[0111] 4-2) Based on the feature values selected for the time series wind speed data, calculate the feature values of each wind speed time series of equal length to form a feature value matrix.
[0112] The feature values extracted in one embodiment of the present invention include: mean, variance, fluctuation segment length, maximum wind speed, minimum wind speed, maximum value location, minimum value location, difference between maximum and minimum values, positive fluctuation duration, negative fluctuation duration, totaling 10 parameters.
[0113] The ten feature values are calculated for each wind speed time series of equal length to obtain a feature value matrix. In this embodiment, a 513×10 matrix is obtained, where 513 is the number of fluctuation segments obtained by the swing window division, and 10 is the number of feature values calculated for each fluctuation segment.
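The ten feature values could be computed per segment roughly as below. Several details are assumptions, because the text does not define them: the features are taken from the unpadded segment so that the zero padding does not distort them, the variance is the population variance, and the positive and negative fluctuation durations are counted as the number of rising and falling steps.

```python
import statistics


def segment_features(seg):
    """Return the ten feature values for one (unpadded) fluctuation segment."""
    diffs = [b - a for a, b in zip(seg, seg[1:])]
    v_max, v_min = max(seg), min(seg)
    return [
        statistics.fmean(seg),           # mean
        statistics.pvariance(seg),       # variance (population form, assumed)
        len(seg),                        # fluctuation segment length
        v_max,                           # maximum wind speed
        v_min,                           # minimum wind speed
        seg.index(v_max),                # location of the maximum
        seg.index(v_min),                # location of the minimum
        v_max - v_min,                   # difference between maximum and minimum
        sum(1 for d in diffs if d > 0),  # positive fluctuation duration (assumed: rising steps)
        sum(1 for d in diffs if d < 0),  # negative fluctuation duration (assumed: falling steps)
    ]
```

Stacking the result for every segment gives one row per segment and ten columns, matching the shape described above.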
[0114] 4-3) Perform a clustering operation on the feature value matrix to obtain the clustering results of each wind speed fluctuation segment as the preliminary classification result of the wind speed sequence.
[0115] In this embodiment, the clustering method used is k-means clustering, and the elbow method based on the sum of squared errors (SSE) is used to determine the optimal number of clusters K1 for the primary clustering.
[0116] Figure 3 is a schematic diagram illustrating the optimal number of clusters for the primary clustering based on feature values in a specific embodiment of the present invention. In the elbow method, the optimal number of clusters is found at the inflection point of the SSE curve. As shown in Figure 3, the first inflection point is at 2, indicating that the optimal number of clusters for the primary clustering in this embodiment is 2, which yields 2 wind speed fluctuation classifications. The clustering result based on feature values in this embodiment is shown in Figure 4: the segments are divided into two categories, A and B, shown in Figure 4 a) and Figure 4 b), respectively.
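A minimal sketch of k-means with an SSE curve for the elbow method is given below. It is plain Lloyd's algorithm with a seeded random initialisation; the initialisation, the iteration cap, and the function names are assumptions, and the sketch does not rescale the features, which a real feature matrix with mixed units would probably need.

```python
import random


def sq_dist(p, q):
    """Squared Euclidean distance between two feature vectors."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def kmeans(points, k, iters=100, seed=0):
    """Lloyd's k-means. Returns (labels, centers, sse)."""
    rng = random.Random(seed)
    centers = [list(p) for p in rng.sample(points, k)]
    labels = None
    for _ in range(iters):
        new_labels = [min(range(k), key=lambda c: sq_dist(p, centers[c])) for p in points]
        if new_labels == labels:  # assignments stable: converged
            break
        labels = new_labels
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:  # keep the old center if a cluster becomes empty
                centers[c] = [sum(col) / len(members) for col in zip(*members)]
    sse = sum(sq_dist(p, centers[lab]) for p, lab in zip(points, labels))
    return labels, centers, sse


def elbow_curve(points, k_max):
    """SSE for k = 1..k_max; the elbow is read off this curve."""
    return [kmeans(points, k)[2] for k in range(1, k_max + 1)]
```

The elbow itself is judged from the curve, as in Figure 3: the number of clusters is taken where adding another cluster stops reducing the SSE appreciably.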
[0117] 5) Based on the preliminary classification results of wind speed fluctuations obtained in step 4), perform secondary clustering based on time series similarity metrics on the equal-length wind speed time series under each category to obtain the final classification results of the wind speed fluctuation process. The specific steps are as follows:
[0118] To provide a more comprehensive characterization of wind speed time series, this invention employs an improved method for traditional similarity metrics. A weighted average of the absolute distance, growth rate distance, and fluctuation distance is used to obtain a comprehensive distance, which serves as a replacement for Euclidean distance in the improved k-means clustering method. This is then used to perform a secondary clustering on the primary clustering results obtained in step 4). The secondary clustering involves re-clustering each classification result obtained after the primary clustering.
[0119] In this embodiment, for any wind speed fluctuation classification obtained after the first clustering, the specific steps of the second clustering are as follows:
[0120] 5-1) Treat each equal-length wind speed time series under this wind speed fluctuation category as a data sample, and calculate three distance indicators between the data samples under this wind speed fluctuation category: absolute distance, growth rate distance, and fluctuation distance, as follows:
[0121] a) Absolute quantity Euclidean distance (AQED);
[0122] The absolute distance between sample i and sample j is:
[0123]
[0124] In the formula, $x_{itk}$ (i = 1, 2, ..., N; t = 1, 2, ..., T; k = 1, 2, ..., n) is the value of the k-th feature of the i-th sample at time t; N is the number of samples under this wind speed fluctuation classification; n is the number of feature values for each sample; T is the length of the time series for each sample (i.e., the length of the fluctuation process after being divided by the swing window algorithm and padded into an equal-length time series). The absolute distance index reflects the overall Euclidean distance between the indicators of the samples over the entire period T.
[0125] b) Increment speed Euclidean distance (ISED);
[0126] The growth rate distance between sample i and sample j is:
[0127]
[0128] In the formula, $\Delta x_{itk} = x_{itk} - x_{i,t-1,k}$ and $\Delta x_{jtk} = x_{jtk} - x_{j,t-1,k}$ represent the absolute increments of the k-th feature value of the i-th and j-th samples, respectively, over adjacent times [t-1, t]; $\Delta x_{itk} / x_{i,t-1,k}$ and $\Delta x_{jtk} / x_{j,t-1,k}$ represent the corresponding relative increments.
[0129] c) Variation coefficient Euclidean distance (VCED);
[0130] The fluctuation distance between sample i and sample j is:
[0131]
[0132] in:
[0133]
[0134]
[0135]
[0136] In the formula, $\bar{x}_{ik}$ and $S_{ik}$ are the mean and standard deviation, respectively, of the k-th feature value of the i-th sample over period T; $c_{ik}$ and $c_{jk}$ are the ratios of the standard deviation to the mean of the k-th feature value of the i-th and j-th samples, respectively. This ratio, also known as the coefficient of variation, is a normalized measure of the degree of variation of the k-th feature value within period T.
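The three distance indicators of step 5-1) might be computed as in the sketch below. It assumes each sample is a single wind speed series (n = 1), reads the distances in the Euclidean form suggested by their names, and uses the sample standard deviation. It also skips time steps whose previous value is zero in either series, an assumption made here because zero-padded series would otherwise divide by zero in the relative increment.

```python
import math
import statistics


def aqed(x, y):
    """Absolute quantity Euclidean distance between two equal-length series."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)))


def ised(x, y):
    """Increment speed Euclidean distance, based on relative increments."""
    total = 0.0
    for t in range(1, len(x)):
        if x[t - 1] == 0 or y[t - 1] == 0:
            continue  # assumption: skip steps where the relative increment is undefined
        rx = (x[t] - x[t - 1]) / x[t - 1]
        ry = (y[t] - y[t - 1]) / y[t - 1]
        total += (rx - ry) ** 2
    return math.sqrt(total)


def coeff_var(x):
    """Coefficient of variation: standard deviation divided by the mean (mean assumed nonzero)."""
    return statistics.stdev(x) / statistics.fmean(x)


def vced(x, y):
    """Variation coefficient Euclidean distance; with one feature it is |c_i - c_j|."""
    return abs(coeff_var(x) - coeff_var(y))
```

Two series that differ only by a constant scale factor, such as `[1, 2, 4]` and `[2, 4, 8]`, have zero growth rate distance and zero fluctuation distance but a nonzero absolute distance, which is why the three indicators are combined rather than used alone.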
[0137] 5-2) Determine the weights of each distance indicator;
[0138] This embodiment uses the entropy method to determine the objective weight of each distance indicator, and calculates the weight of each indicator by analyzing its information entropy. The specific steps are as follows:
[0139] 5-2-1) Standardize the various distance indicators:
[0140] According to the standardization formula, let $X_{ij}$ be the distance index value corresponding to the j-th sample under any wind speed fluctuation category; the value of $X_{ij}$ after data standardization is $Y_{ij}$:
[0141]
[0142] Where i = 1, 2, ..., N, N is the number of samples under this wind speed fluctuation category, and i ≠ j.
[0143] In this embodiment, after the standardization process is completed, a total of 3*2 normalized distance index matrices can be obtained (divided into two categories, with three distance indices in each category).
[0144] 5-2-2) Calculate the information entropy of each distance indicator;
[0145] For any wind speed fluctuation category, the information entropy calculation expression for the j-th distance index is as follows:
[0146]
[0147] in,
[0148]
[0149] The information entropy of each distance indicator under this fluctuation category is obtained as follows:
[0150] $E_1, E_2, \ldots, E_r$
[0151] In the formula, r is the number of distance indicators, and in this embodiment, r = 3.
[0152] 5-2-3) Determine the weight of each distance indicator;
[0153] In this embodiment, the weights of each distance indicator are calculated using information entropy:
[0154]
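Steps 5-2-1) to 5-2-3) can be sketched with the commonly used form of the entropy weight method. Since the expressions are not reproduced in the text, the min-max standardization and the normalisation by the logarithm of the number of values are assumptions, and the sketch further assumes that no indicator is constant. The function name is hypothetical.

```python
import math


def entropy_weights(indicators):
    """indicators: list of r lists, one list of distance values per indicator.

    Returns r weights that sum to 1.
    """
    entropies = []
    for values in indicators:
        lo, hi = min(values), max(values)
        y = [(v - lo) / (hi - lo) for v in values]  # min-max standardization (assumes hi > lo)
        total = sum(y)
        p = [v / total for v in y]
        # Information entropy, with p*ln(p) taken as 0 when p == 0.
        e = -sum(q * math.log(q) for q in p if q > 0) / math.log(len(values))
        entropies.append(e)
    denom = len(indicators) - sum(entropies)
    return [(1 - e) / denom for e in entropies]
```

With r = 3 as in this embodiment, the function would be called once per wind speed fluctuation category with the AQED, ISED, and VCED values of that category.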
[0155] 5-3) Calculate the overall distance between samples based on the weights;
[0156] In this embodiment, the comprehensive Euclidean distance (CED) between the i-th and j-th samples is calculated:
[0157] $d_{ij,\mathrm{CED}} = w_1 z(d_{ij,\mathrm{AQED}}) + w_2 z(d_{ij,\mathrm{ISED}}) + w_3 z(d_{ij,\mathrm{VCED}})$ (13)
[0158] In the formula, $z(d_{ij,\mathrm{AQED}})$, $z(d_{ij,\mathrm{ISED}})$, and $z(d_{ij,\mathrm{VCED}})$ are the standardized values of $d_{ij,\mathrm{AQED}}$, $d_{ij,\mathrm{ISED}}$, and $d_{ij,\mathrm{VCED}}$, respectively, which avoids the influence of differences in order of magnitude. $w_1$, $w_2$, and $w_3$ are the weights of the corresponding distance indicators, and $w_1 + w_2 + w_3 = 1$.
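Equation (13) can be sketched as below. The text does not say which standardization z(·) denotes; min-max scaling to [0, 1] is assumed here so that the combined distance stays non-negative, although a z-score would be another possible reading. Each input list holds the values of one distance indicator for the same ordered set of sample pairs.

```python
def min_max(values):
    """Scale values to [0, 1]; a constant list maps to all zeros."""
    lo, hi = min(values), max(values)
    return [(v - lo) / (hi - lo) if hi > lo else 0.0 for v in values]


def composite_distance(aqed_vals, ised_vals, vced_vals, weights):
    """Weighted sum of the three standardized distance indicators."""
    w1, w2, w3 = weights
    za, zi, zv = min_max(aqed_vals), min_max(ised_vals), min_max(vced_vals)
    return [w1 * a + w2 * b + w3 * c for a, b, c in zip(za, zi, zv)]
```

Because every standardized term lies in [0, 1] and the weights sum to 1, the combined distance under this assumption also lies in [0, 1].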
[0159] 5-4) Based on the comprehensive distance between samples, perform secondary clustering on samples under the same wind speed fluctuation category to obtain the final classification result of wind speed fluctuation.
[0160] In this embodiment, k-means clustering is used for the secondary clustering, and the elbow method based on the sum of squared errors (SSE) is again used to determine the optimal number of clusters for the secondary clustering based on time series similarity. As shown in Figure 5, the number of secondary k-means clusters is determined to be 2 for both class A and class B of the primary clustering result: Figure 5 a) shows the optimal number of secondary clusters for class A, and Figure 5 b) shows that for class B.
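The elbow method can be sketched as follows: run k-means for a range of k, record the SSE for each, and look for the k after which the SSE stops dropping sharply. In the embodiment the elbow is read from Figure 5; the automatic rule used here (largest second difference of the SSE curve) is an assumption, as are the function names and the plain Euclidean k-means used to produce the curve.

```python
import random


def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length sequences."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans_sse(points, k, iters=100, seed=0):
    """Run plain Lloyd k-means and return the sum of squared errors (SSE)."""
    rng = random.Random(seed)
    centers = [list(p) for p in rng.sample(points, k)]
    labels = None
    for _ in range(iters):
        # Assignment step: each point goes to its nearest center.
        new_labels = [min(range(k), key=lambda c: sq_dist(p, centers[c]))
                      for p in points]
        # Update step: each center becomes the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, new_labels) if lab == c]
            if members:  # an empty cluster keeps its previous center
                centers[c] = [sum(col) / len(members) for col in zip(*members)]
        if new_labels == labels:
            break
        labels = new_labels
    return sum(sq_dist(p, centers[lab]) for p, lab in zip(points, labels))


def elbow_k(points, k_max):
    """Return (k, sse_by_k) for k in 1..k_max, with k_max >= 3.

    ASSUMPTION: the elbow is chosen as the k with the largest second
    difference of the SSE curve; the source reads it from a plot instead.
    """
    sse = {k: kmeans_sse(points, k) for k in range(1, k_max + 1)}
    best = max(range(2, k_max),
               key=lambda k: sse[k - 1] - 2 * sse[k] + sse[k + 1])
    return best, sse
```

For example, `elbow_k(points, 5)` on a list of equal-length samples returns the selected k together with the SSE values, which can also be plotted and inspected by eye as in Figure 5.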
[0161] After the improved k-means algorithm based on time series similarity measure is used to perform secondary clustering on all wind speed fluctuation classifications obtained from the first clustering, the wind speed fluctuation process of the wind speed sequence obtained in step 1) is identified.
[0162] In this embodiment, the primary clustering result obtained by the k-means clustering method based on feature values is used as the input data for the secondary clustering. The Euclidean distance in the original k-means clustering is replaced with a comprehensive distance that weights the absolute distance, acceleration distance, and fluctuation distance, yielding the secondary clustering result based on the time series similarity measure, which is the final result of the wind speed fluctuation process classification. As shown in Figure 6, the two types of fluctuation processes, A and B, are each subjected to secondary clustering, resulting in four types of fluctuation processes: A1, A2, B1, and B2. The clustering results are shown in Figures 6 a), 6 b), 6 c), and 6 d). Category A1 separates the short-duration fluctuations from category A, and its fluctuation process is relatively simple, mostly unidirectional. Compared with category B2, category B1 has a more intense and shorter fluctuation process. A more accurate classification of wind speed fluctuation processes is thus obtained.
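A minimal sketch of clustering the samples of one primary category from a precomputed comprehensive distance matrix is given below. The embodiment describes k-means with the Euclidean distance replaced; because a pairwise distance matrix provides no centroid to average, this sketch swaps in a k-medoids-style update (each cluster is represented by one of its own samples). That substitution, the farthest-first initialization, and the function name `cluster_by_distance` are assumptions for illustration.

```python
def cluster_by_distance(dist, k, iters=100):
    """k-medoids-style clustering on a symmetric distance matrix (k <= n).

    ASSUMPTION: medoids stand in for k-means centroids because only
    pairwise distances are available. Returns (labels, medoids).
    """
    n = len(dist)
    # Deterministic farthest-first initialization of the medoids.
    medoids = [0]
    while len(medoids) < k:
        candidates = [i for i in range(n) if i not in medoids]
        medoids.append(max(candidates,
                           key=lambda i: min(dist[i][m] for m in medoids)))
    labels = []
    for _ in range(iters):
        # Assign each sample to its nearest medoid.
        labels = [min(range(k), key=lambda c: dist[i][medoids[c]])
                  for i in range(n)]
        # Re-pick each medoid as the member closest to the rest of its cluster.
        new_medoids = []
        for c in range(k):
            members = [i for i in range(n) if labels[i] == c]
            if members:
                new_medoids.append(
                    min(members, key=lambda i: sum(dist[i][j] for j in members)))
            else:
                new_medoids.append(medoids[c])
        if new_medoids == medoids:
            break
        medoids = new_medoids
    return labels, medoids
```

Run with k = 2 on the comprehensive distance matrix of category A, the two resulting labels would correspond to sub-categories such as A1 and A2; the same call on category B would give B1 and B2.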
[0163] After dividing the fluctuation processes of the wind speed sequence with the swing window algorithm, this invention employs secondary clustering that combines two approaches to time series clustering: first, feature values are extracted from the time series as the data for the initial clustering; second, the similarity metric (distance) in the clustering algorithm is improved; the two are then combined to obtain the final result. This overcomes the shortcomings of either approach when used alone and further refines the clustering results. The research in this embodiment shows that, regarding the order of the two clustering passes based on different metrics, clustering first on feature values and then applying k-means clustering to the time series yields better clustering results.
[0164] To implement the above embodiments, a second aspect of the present invention proposes a wind speed fluctuation process classification device based on a sliding window, comprising:
[0165] The wind speed sequence acquisition module is used to acquire the original wind speed sequence and perform noise reduction.
[0166] The wind speed fluctuation segment division module is used to divide the noise-reduced wind speed sequence into multiple wind speed fluctuation segments using a swing window algorithm.
[0167] A primary clustering module is used to convert the wind speed fluctuation segment into an equal-length wind speed time series, and to perform primary clustering based on feature values on the equal-length wind speed time series to obtain the preliminary classification result of the wind speed series.
[0168] The secondary clustering module is used to perform secondary clustering based on time series similarity metric on the equal-length wind speed time series under each category according to the preliminary classification results, so as to obtain the final classification result of the wind speed series.
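The way the four modules hand data to one another can be sketched as a small pipeline. The class name, field names, and the label format (for example "A1") are hypothetical; each stage is injected as a callable so that any denoising, segmentation, or clustering routine can be plugged in. The equal-length conversion, which the text assigns to the primary clustering module, is shown as a separate callable only to make the data flow explicit.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

Series = List[float]


@dataclass
class WindFluctuationClassifier:
    """Hypothetical wiring of the four modules described above."""
    denoise: Callable[[Series], Series]                    # acquisition module
    segment: Callable[[Series], List[Series]]              # segment division module
    equalize: Callable[[List[Series]], List[Series]]       # equal-length conversion
    primary_cluster: Callable[[List[Series]], List[str]]   # primary clustering module
    secondary_cluster: Callable[[List[Series]], List[int]] # secondary clustering module

    def classify(self, raw: Series) -> List[str]:
        segments = self.equalize(self.segment(self.denoise(raw)))
        primary = self.primary_cluster(segments)
        # Group segment indices by their primary category.
        groups: Dict[str, List[int]] = {}
        for idx, label in enumerate(primary):
            groups.setdefault(label, []).append(idx)
        # Secondary clustering runs separately inside each primary category.
        final = [""] * len(segments)
        for label, idxs in groups.items():
            sub_labels = self.secondary_cluster([segments[i] for i in idxs])
            for i, sub in zip(idxs, sub_labels):
                final[i] = f"{label}{sub + 1}"
        return final
```

The key design point is that the secondary stage never sees samples from two primary categories at once, which mirrors the per-category secondary clustering of the method.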
[0169] It should be noted that the foregoing explanation of the embodiment of the sliding window-based wind speed fluctuation process classification method also applies to the sliding window-based wind speed fluctuation process classification device of this embodiment, and is not repeated here. According to an embodiment of the present invention, the device acquires the original wind speed sequence and performs noise reduction; divides the noise-reduced wind speed sequence into multiple wind speed fluctuation segments using a swing window algorithm; converts the wind speed fluctuation segments into equal-length wind speed time series and performs primary clustering based on feature values on them to obtain a preliminary classification result; and, based on the preliminary classification result, performs secondary clustering based on a time series similarity measure on the equal-length wind speed time series under each category to obtain the final classification result. The present invention therefore considers both time series similarity and the characteristics of the fluctuation process, applies secondary clustering to the fluctuation processes to optimize the clustering effect, and can obtain more accurate wind speed fluctuation classification results.
[0170] To implement the above embodiments, a third aspect of the present invention provides an electronic device, comprising:
[0171] At least one processor; and a memory communicatively connected to said at least one processor;
[0172] The memory stores instructions that can be executed by the at least one processor, and the instructions are configured to perform the above-described sliding window-based wind speed fluctuation process classification method.
[0173] To implement the above embodiments, a fourth aspect of the present invention provides a computer-readable storage medium storing computer instructions for causing the computer to execute the above-described method for classifying wind speed fluctuation processes based on a sliding window.
[0174] It should be noted that the computer-readable medium described in this disclosure can be a computer-readable signal medium or a computer-readable storage medium, or any combination thereof. A computer-readable storage medium can be, for example,—but not limited to—an electrical, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any combination thereof. More specific examples of a computer-readable storage medium may include, but are not limited to: an electrical connection having one or more wires, a portable computer disk, a hard disk, random access memory (RAM), read-only memory (ROM), erasable programmable read-only memory (EPROM or flash memory), optical fiber, portable compact disk read-only memory (CD-ROM), optical storage device, magnetic storage device, or any suitable combination thereof. In this disclosure, a computer-readable storage medium can be any tangible medium containing or storing a program that can be used by or in connection with an instruction execution system, apparatus, or device. In this disclosure, a computer-readable signal medium can include a data signal propagated in baseband or as part of a carrier wave, carrying computer-readable program code. Such propagated data signals can take various forms, including but not limited to electromagnetic signals, optical signals, or any suitable combination thereof. A computer-readable signal medium can be any computer-readable medium other than a computer-readable storage medium, which can send, propagate, or transmit a program for use by or in connection with an instruction execution system, apparatus, or device. The program code contained on the computer-readable medium can be transmitted using any suitable medium, including but not limited to: wires, optical fibers, RF (radio frequency), etc., or any suitable combination thereof.
[0175] The aforementioned computer-readable medium may be included in the aforementioned electronic device; or it may exist independently and not assembled into the electronic device. The aforementioned computer-readable medium carries one or more programs, which, when executed by the electronic device, cause the electronic device to perform a sliding window-based wind speed fluctuation process classification method according to the above embodiments.
[0176] Computer program code for performing the operations of this disclosure can be written in one or more programming languages or a combination thereof, including object-oriented programming languages such as Java, Smalltalk, and C++, and conventional procedural programming languages such as the "C" language or similar programming languages. The program code can be executed entirely on the user's computer, partially on the user's computer, as a standalone software package, partially on the user's computer and partially on a remote computer, or entirely on a remote computer or server. In cases involving remote computers, the remote computer can be connected to the user's computer via any type of network—including a local area network (LAN) or a wide area network (WAN)—or can be connected to an external computer (e.g., via the Internet using an Internet service provider).
[0177] In the description of this specification, the references to terms such as "one embodiment," "some embodiments," "example," "specific example," or "some examples," etc., refer to specific features, structures, materials, or characteristics described in connection with that embodiment or example, which are included in at least one embodiment or example of this application. In this specification, the illustrative expressions of the above terms do not necessarily refer to the same embodiment or example. Furthermore, the specific features, structures, materials, or characteristics described may be combined in any suitable manner in one or more embodiments or examples. Moreover, without contradiction, those skilled in the art can combine and integrate the different embodiments or examples described in this specification, as well as the features of different embodiments or examples.
[0178] Furthermore, the terms "first" and "second" are used for descriptive purposes only and should not be construed as indicating or implying relative importance or implicitly specifying the number of technical features indicated. Thus, a feature defined as "first" or "second" may explicitly or implicitly include at least one of that feature. In the description of this application, "multiple" means at least two, such as two, three, etc., unless otherwise explicitly specified.
[0179] Any process or method described in the flowchart or otherwise herein can be understood as representing a module, segment, or portion of code comprising one or more executable instructions for implementing a particular logical function or process, and the scope of the preferred embodiments of this application includes additional implementations in which functions may be performed not in the order shown or discussed, including substantially simultaneously or in reverse order depending on the function involved, as will be understood by those skilled in the art to which embodiments of this application pertain.
[0180] The logic and / or steps represented in the flowchart or otherwise described herein, for example, can be considered as a sequenced list of executable instructions for implementing logical functions, and can be embodied in any computer-readable medium for use by, or in conjunction with, an instruction execution system, apparatus, or device (such as a computer-based system, a processor-including system, or other system that can fetch and execute instructions from, an instruction execution system, apparatus, or device). For the purposes of this specification, "computer-readable medium" can be any means that can contain, store, communicate, propagate, or transmit programs for use by, or in conjunction with, an instruction execution system, apparatus, or device. More specific examples (a non-exhaustive list) of computer-readable media include: an electrical connection having one or more wires (electronic device), a portable computer disk drive (magnetic device), random access memory (RAM), read-only memory (ROM), erasable and programmable read-only memory (EPROM or flash memory), fiber optic devices, and portable optical disc read-only memory (CDROM). Furthermore, computer-readable media can even be paper or other suitable media on which programs can be printed, because programs can be obtained electronically, for example, by optically scanning the paper or other media, followed by editing, interpreting, or otherwise processing as necessary, and then stored in computer memory.
[0181] It should be understood that various parts of this application can be implemented using hardware, software, firmware, or a combination thereof. In the above embodiments, multiple steps or methods can be implemented using software or firmware stored in memory and executed by a suitable instruction execution system. For example, if implemented in hardware, as in another embodiment, it can be implemented using any one or a combination of the following techniques known in the art: discrete logic circuits having logic gates for implementing logical functions on data signals, application-specific integrated circuits (ASICs) having suitable combinational logic gates, programmable gate arrays (PGAs), field-programmable gate arrays (FPGAs), etc.
[0182] Those skilled in the art will understand that all or part of the steps of the methods in the above embodiments can be implemented by a program instructing related hardware. The program can be stored in a computer-readable storage medium, and when executed, the program includes one or a combination of the steps of the method embodiments.
[0183] Furthermore, the functional units in the various embodiments of this application can be integrated into a processing module, or each unit can exist physically separately, or two or more units can be integrated into a module. The integrated module can be implemented in hardware or as a software functional module. If the integrated module is implemented as a software functional module and sold or used as an independent product, it can also be stored in a computer-readable storage medium.
[0184] The storage medium mentioned above can be a read-only memory, a disk, or an optical disk, etc. Although embodiments of this application have been shown and described above, it is understood that the above embodiments are exemplary and should not be construed as limiting this application. Those skilled in the art can make changes, modifications, substitutions, and variations to the above embodiments within the scope of this application.
Claims
1. A method for classifying wind speed fluctuation processes based on a sliding window, characterized in that it comprises:
Obtaining an original wind speed sequence and performing noise reduction;
Dividing the noise-reduced wind speed sequence into multiple wind speed fluctuation segments using a swing window algorithm;
Converting the wind speed fluctuation segments into equal-length wind speed time series, and performing primary clustering based on feature values on the equal-length wind speed time series to obtain a preliminary classification result of the wind speed sequence;
Based on the preliminary classification result, performing secondary clustering based on time series similarity on the equal-length wind speed time series under each category to obtain a final classification result of the wind speed sequence;
wherein performing the secondary clustering based on time series similarity on the equal-length wind speed time series under each category to obtain the final classification result of the wind speed sequence comprises:
1) For any wind speed fluctuation category obtained from the primary clustering, taking each equal-length wind speed time series under that category as a sample, and calculating three distance indicators among the samples under that category: absolute distance, growth rate distance, and fluctuation distance, as follows:
1-1) Absolute distance; the absolute distance between sample i and sample j is:
The quantities in the formula are: the value of the k-th feature of the i-th sample at the t-th time point; N, the number of samples in the wind speed fluctuation category; the number of feature values of each sample; and T, the length of the time series of each sample;
1-2) Growth rate distance; the growth rate distance between sample i and sample j is:
The quantities in the formula are the absolute increment of the k-th feature value of the i-th sample over the adjacent times [t-1, t], and the relative increment of the k-th feature value of the i-th sample over the adjacent times [t-1, t];
1-3) Fluctuation distance; the fluctuation distance between sample i and sample j is:
The quantities in the formula are the mean and the standard deviation of the k-th feature value of the i-th sample over the T time points, and the ratio of that standard deviation to that mean;
2) Determining the weight of each distance indicator under this wind speed fluctuation category, with the following specific steps:
2-1) Standardizing each distance indicator; for any wind speed fluctuation category, the value of the j-th distance indicator corresponding to the i-th sample is standardized to obtain its value after data standardization:
2-2) Calculating the information entropy of each distance indicator; for any wind speed fluctuation category, the information entropy of the j-th distance indicator is calculated as follows:
The information entropy of each distance indicator under this fluctuation category is thereby obtained, where r is the number of distance indicators;
2-3) Determining the weight of each distance indicator:
3) Calculating the comprehensive distance between samples under this wind speed fluctuation category based on the weights; the comprehensive distance between the i-th sample and the j-th sample is calculated as follows:
The quantities in the formula are the standardized values of the three distance indicators and the weights corresponding to each distance indicator, with the weights summing to 1;
4) Based on the comprehensive distance between samples, performing secondary clustering on the samples under the same wind speed fluctuation category to obtain the final classification result of the wind speed sequence.
2. The method of claim 1, wherein the noise reduction employs wavelet analysis.
3. The method of claim 1, wherein both the primary clustering and the secondary clustering use the k-means clustering method, and the optimal number of clusters is determined by the elbow method based on the sum of squared errors.
4. The method of claim 1, wherein dividing the noise-reduced wind speed sequence into multiple wind speed fluctuation segments using a swing window algorithm comprises:
1) Taking the initial moment of the noise-reduced wind speed sequence as the initial moment of the first wind speed fluctuation segment;
2) Taking the first wind speed fluctuation segment as the current wind speed fluctuation segment, recording the initial time of the current wind speed fluctuation segment as t = 0, and recording the initial wind speed of the current wind speed fluctuation segment as... ;
3) Calculating the swing windows for the current wind speed fluctuation segment using the following expression:
In the formula, the upper swing window is defined; ε is the width of the swing window; i is the time index within the current wind speed fluctuation segment; v(i) is the wind speed at time i; and t is the time index of the last time point of the current wind speed fluctuation segment. Taking the time t = 0 as the starting point, the upper and lower swing windows are calculated respectively, and the minimum time satisfying the condition shown in the following formula is taken as the end time of the current wind speed fluctuation segment:
4) Setting the initial moment of the next current wind speed fluctuation segment and taking the wind speed at that moment as the updated initial wind speed, then returning to step 3) to identify the next fluctuation segment, until the wind speed at every time in the noise-reduced wind speed sequence has been assigned to a wind speed fluctuation segment, at which point the division is complete.
5. The method of claim 1, wherein converting the wind speed fluctuation segments into equal-length wind speed time series, and performing primary clustering based on feature values on the equal-length wind speed time series to obtain a preliminary classification result of the wind speed sequence, comprises:
1) Converting each wind speed fluctuation segment into an equal-length wind speed time series;
2) Based on the selected feature values of time-series wind speed data, calculating the feature values of each equal-length wind speed time series to form a feature value matrix;
3) Performing a clustering operation on the feature value matrix to obtain the clustering result of each wind speed fluctuation segment as the preliminary classification result of the wind speed sequence.
6. The method of claim 5, wherein the feature values include: mean, variance, fluctuation segment length, maximum wind speed, minimum wind speed, location of the maximum value, location of the minimum value, difference between the maximum and minimum values, duration of positive fluctuation, and duration of negative fluctuation.
7. A sliding window-based wind speed fluctuation process classification apparatus, characterized in that it comprises:
A wind speed sequence acquisition module, used to acquire the original wind speed sequence and perform noise reduction;
A wind speed fluctuation segment division module, used to divide the noise-reduced wind speed sequence into multiple wind speed fluctuation segments using a swing window algorithm;
A primary clustering module, used to convert the wind speed fluctuation segments into equal-length wind speed time series, and to perform primary clustering based on feature values on the equal-length wind speed time series to obtain the preliminary classification result of the wind speed sequence;
A secondary clustering module, used to perform secondary clustering based on a time series similarity measure on the equal-length wind speed time series under each category according to the preliminary classification result, so as to obtain the final classification result of the wind speed sequence;
wherein the secondary clustering module is used to perform the following steps:
1) For any wind speed fluctuation category obtained from the primary clustering, taking each equal-length wind speed time series under that category as a sample, and calculating three distance indicators among the samples under that category: absolute distance, growth rate distance, and fluctuation distance, as follows:
1-1) Absolute distance; the absolute distance between sample i and sample j is:
The quantities in the formula are: the value of the k-th feature of the i-th sample at time t; N, the number of samples under this wind speed fluctuation category; the number of feature values of each sample; and T, the length of the time series of each sample;
1-2) Growth rate distance; the growth rate distance between sample i and sample j is:
The quantities in the formula are the absolute increment of the k-th feature value of the i-th sample over the adjacent times [t-1, t], and the relative increment of the k-th feature value of the i-th sample over the adjacent times [t-1, t];
1-3) Fluctuation distance; the fluctuation distance between sample i and sample j is:
The quantities in the formula are the mean and the standard deviation of the k-th feature value of the i-th sample over the T time points, and the ratio of that standard deviation to that mean;
2) Determining the weight of each distance indicator under this wind speed fluctuation category, with the following specific steps:
2-1) Standardizing each distance indicator; for any wind speed fluctuation category, the value of the j-th distance indicator corresponding to the i-th sample is standardized to obtain its value after data standardization:
2-2) Calculating the information entropy of each distance indicator; for any wind speed fluctuation category, the information entropy of the j-th distance indicator is calculated as follows:
The information entropy of each distance indicator under this fluctuation category is thereby obtained, where r is the number of distance indicators;
2-3) Determining the weight of each distance indicator:
3) Calculating the comprehensive distance between samples under this wind speed fluctuation category based on the weights; the comprehensive distance between the i-th sample and the j-th sample is calculated as follows:
The quantities in the formula are the standardized values of the three distance indicators and the weights corresponding to each distance indicator, with the weights summing to 1;
4) Based on the comprehensive distance between samples, performing secondary clustering on the samples under the same wind speed fluctuation category to obtain the final classification result of the wind speed sequence.
8. An electronic device, comprising: at least one processor; and a memory communicatively connected to the at least one processor; wherein the memory stores instructions executable by the at least one processor, the instructions being configured to perform the method according to any one of claims 1-6.
9. A computer-readable storage medium, characterized in that the computer-readable storage medium stores computer instructions for causing a computer to perform the method according to any one of claims 1-6.