A fatigue test data anomaly-oriented preprocessing method
By identifying and repairing abnormal data segments in fatigue test data using the interpolation method, and combining this with filtering algorithms, the problems of incomplete and abnormal data in fatigue tests were solved, thus improving the accuracy and efficiency of data analysis.
Patent Information
- Authority / Receiving Office
- CN · China
- Patent Type
- Applications(China)
- Current Assignee / Owner
- CHINA AIRPLANT STRENGTH RES INST
- Filing Date
- 2026-03-17
- Publication Date
- 2026-06-19
AI Technical Summary
In full-scale fatigue strength tests, a large amount of incomplete, inconsistent, and abnormal data exists in the massive amount of raw data, which affects the efficiency of data mining and modeling, and may even lead to deviations in results.
Abnormal data segments that are zeroed out, duplicated, or have abrupt changes are identified and then repaired by the interpolation method, and non-significant noise is smoothed by first-order low-pass filtering or Kalman filtering.
Effective cleaning of fatigue test data eliminates the influence of noise and anomalies, improving the accuracy of data monitoring and early warning.
Figure CN122241002A_ABST
Abstract
Description
Technical Field
[0001] This application belongs to the field of electrical data processing technology, and specifically relates to a preprocessing method for fatigue test data anomalies.
Background Technology
[0002] Full-scale fatigue strength testing uses a support system to simulate the boundary conditions of an aircraft structure. Hydraulic and pneumatic equipment, together with the corresponding mechanical connections and control systems, applies a quasi-static load sequence to the structure to simulate its actual stress history. Because fatigue testing involves long-term repeated loading, strain gauges inevitably malfunction over the course of a test. The massive raw data therefore contain a large amount of incomplete (missing-value), inconsistent, and anomalous data, which severely reduces the efficiency of data mining and modeling and can lead to low data utilization. Defining and identifying anomalies is thus particularly important. In data analysis, experimental data are frequently missing, incomplete, or inaccurate, which can bias the analysis results or even prevent correct conclusions from being drawn.
Summary of the Invention
[0003] To address the aforementioned problems, this application provides a preprocessing method for fatigue test data anomalies, comprising:
[0004] Step 1: Obtain one-dimensional time-series fatigue test data;
[0005] Step 2: Preset the abnormal feature definitions for data returning to zero, data duplication, and data jump; based on these definitions, traverse the test data to identify abnormal data segments of the three types.
[0006] Step 3: Use the interpolation method to repair the abnormal data segments that are zeroed out, duplicated, or have abrupt changes.
[0007] Preferably, the abnormal characteristic of data returning to zero is defined as follows: in continuous data acquisition, the data collected X times in a row are all zero, and the variance calculated between the consecutive zero data segment and the non-zero data before and after it exceeds the threshold W, where X and W are preset hyperparameters.
[0008] Preferably, the specific steps for identifying abnormal data segments that have returned to zero include:
[0009] Step 21: Set hyperparameters X and W, and input one-dimensional time-series fatigue test data for target detection;
Step 22: Start traversing from the data corresponding to the first index of the fatigue test data. When a data value of zero is encountered, start counting the consecutive zeros. When a non-zero value is encountered, stop counting and obtain the number of consecutive zeros and the variance calculated from the consecutive zero data segment together with the data before and after it;
Step 23: If the number of consecutive zeros is greater than or equal to X, and the calculated variance is greater than or equal to W, output the start and end indices of the consecutive zero data segment;
Step 24: Continue traversing from the next index after the ending index of the consecutive zero data segment until the end of the data.
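Steps 21 to 24 can be sketched in Python as follows. This is a minimal illustration, not the applicant's implementation: the function name `find_zero_runs` is hypothetical, indices are 0-based, and the variance is assumed to be the population variance over the zero run plus one neighbouring point on each side, since the text does not specify the window or the variance convention.

```python
from statistics import pvariance


def find_zero_runs(data, X, W):
    """Return (start, end) pairs (0-based, inclusive) of zero runs flagged as anomalous."""
    runs = []
    i, n = 0, len(data)
    while i < n:
        if data[i] != 0:
            i += 1
            continue
        # A zero was met: count how long the run of zeros lasts (Step 22).
        start = i
        while i < n and data[i] == 0:
            i += 1
        end = i - 1
        count = end - start + 1
        # Assumption: variance over the run plus one neighbour on each side,
        # clipped at the ends of the data.
        window = data[max(start - 1, 0):min(end + 2, n)]
        variance = pvariance(window) if len(window) > 1 else 0.0
        # Step 23: both thresholds must be met.
        if count >= X and variance >= W:
            runs.append((start, end))
        # Step 24: the outer loop resumes right after the run.
    return runs


# Hypothetical sample: a run of three zeros and an isolated zero.
sample = [10, 11, 0, 0, 0, 12, 13, 0, 14]
print(find_zero_runs(sample, X=3, W=1))
```

With these illustrative thresholds only the run of three zeros is reported; the isolated zero is shorter than X and is left alone.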
[0010] Preferably, the abnormal feature of data duplication is defined as follows: in continuous data acquisition, the data collected X times in a row are all the same, and the variance calculated between the continuously repeated data segment and the unequal data before and after it exceeds the threshold W.
[0011] Preferably, the specific steps for identifying anomalous data segments with duplicate data include:
[0012] Step 31: Set hyperparameters X and W, and input one-dimensional time-series fatigue test data for target detection;
[0013] Step 32: Start traversing the data from the index of the first fatigue test data and initialize a temporary array to record the indices of consecutively identical values;
[0014] Step 33: Determine if the current data value is the same as the previous data value: If they are the same, add the current index to the temporary array; if they are not the same, determine if the number of elements recorded in the temporary array is greater than or equal to X; if so, calculate the variance of the data segment corresponding to the temporary array and the data before and after it. If the variance is greater than or equal to W, output the start and end indices of the temporary array; regardless of whether the variance condition is met, then clear the temporary array.
[0015] Step 34: Continue traversing from the index following those recorded in the temporary array and return to step 33, until all fatigue test data have been traversed.
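A hedged Python sketch of Steps 31 to 34 is given below. The name `find_repeated_runs` is hypothetical, and two points the text leaves open are filled by assumption: the run length counts every identical value including the first one, and the variance is the population variance over the run plus one neighbour on each side. Instead of a temporary array of indices, the sketch tracks only the start of the current run, which carries the same information.

```python
from statistics import pvariance


def find_repeated_runs(data, X, W):
    """Return (start, end) pairs (0-based, inclusive) of repeated-value runs flagged as anomalous."""
    runs = []
    n = len(data)
    start = 0  # first index of the current run of identical values
    for i in range(1, n + 1):
        # A run ends when the value changes or the data ends (Step 33).
        if i == n or data[i] != data[start]:
            end = i - 1
            count = end - start + 1
            if count >= X:
                # Assumption: variance over the run plus one neighbour per side.
                window = data[max(start - 1, 0):min(end + 2, n)]
                if len(window) > 1 and pvariance(window) >= W:
                    runs.append((start, end))
            # The run is cleared whether or not it was reported (Step 33),
            # and the traversal continues from the next value (Step 34).
            start = i
    return runs


# Hypothetical sample: the value 20 repeated four times, and 6 repeated twice.
sample = [1, 2, 20, 20, 20, 20, 5, 6, 6, 7]
print(find_repeated_runs(sample, X=4, W=1))
```

Note that a run of zeros also satisfies this definition, so the same segment can be reported by both the zeroing check and the duplication check.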
[0016] Preferably, the abnormal characteristics of data jumps are defined as follows: for three consecutive data collection values [V1,V2,V3], the variance S1 of the three data and the variance S2 of the data [V1,0.5(V1+V3),V3] are calculated, and the ratio of variance S1 to variance S2 exceeds Q, where Q is a preset hyperparameter.
[0017] Preferably, the specific steps for identifying abnormal data jumps include: Step 41: Set the hyperparameter Q and input the one-dimensional time-series fatigue test data of the target detection;
[0018] Step 42: Start traversing from the fatigue test data corresponding to the second index to the fatigue test data corresponding to the second-to-last index;
[0019] Step 43: For the fatigue test data value $V_i$ corresponding to the current index i,
[0020] take the adjacent values $V_{i-1}$ and $V_{i+1}$, construct the array $[V_{i-1}, V_i, V_{i+1}]$, and calculate its variance S1;
[0021] construct the array $[V_{i-1}, 0.5(V_{i-1}+V_{i+1}), V_{i+1}]$ and calculate its variance S2;
[0022] where, if $V_{i-1}$ and $V_{i+1}$ are equal, 1 is added to $V_{i+1}$ to avoid the denominator being 0 when calculating the variance;
[0023] Step 44: Determine if the ratio of S1 to S2 is greater than Q. If it is, output index i.
[0024] Step 45: Continue traversing from the index of the next fatigue test data, return to step 43, until all fatigue test data has been traversed.
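The jump check in Steps 41 to 45 can be illustrated with the following Python sketch. The function name `find_jumps` is hypothetical, indices are 0-based, and population variance is assumed (the ratio is the same for sample variance because both arrays have three elements). The text does not say whether the +1 offset applies to both arrays; the sketch assumes it does.

```python
from statistics import pvariance


def find_jumps(data, Q):
    """Return the 0-based indices whose variance ratio S1/S2 exceeds Q."""
    jumps = []
    # Step 42: from the second value to the second-to-last value.
    for i in range(1, len(data) - 1):
        prev, cur, nxt = data[i - 1], data[i], data[i + 1]
        if prev == nxt:
            # Offset from the text: keeps S2 from being zero.
            # Assumption: the shifted value is used for S1 as well.
            nxt = nxt + 1
        s1 = pvariance([prev, cur, nxt])                   # actual triple
        s2 = pvariance([prev, 0.5 * (prev + nxt), nxt])    # midpoint triple
        if s1 / s2 > Q:  # Step 44
            jumps.append(i)
    return jumps


# Hypothetical sample: a single spike at index 2.
sample = [10, 11, 30, 12, 13]
print(find_jumps(sample, Q=5))
```

The neighbours of a spike also see an elevated ratio (about 1.3 to 1.4 in this sample), so Q has to be chosen large enough to separate them from the spike itself.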
[0025] Preferably, in step 3, the specific steps for repair using the interpolation method are as follows:
[0026] Let the abnormal data segment be $[V_{i+1}, V_{i+2}, V_{i+3}, \ldots, V_{i+n}]$;
[0027] Record the data value $V_i$ immediately before the abnormal data segment, the data value $V_{i+n+1}$ immediately after it, and the number n of abnormal data in the segment;
[0028] The replacement value for each data in the abnormal data segment is calculated using the following formula:
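[0029] (the formula is not reproduced in this text; the recorded quantities $V_i$, $V_{i+n+1}$, and n are consistent with linear interpolation, $V'_{i+k} = V_i + \frac{k}{n+1}\,(V_{i+n+1} - V_i)$ for $k = 1, \ldots, n$);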
[0030] Replace each data in the abnormal data segment with the replacement value of each data.
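The repair step can be sketched as below. Because the formula itself is missing from this text, the sketch assumes plain linear interpolation between the value before and the value after the abnormal segment; the function name `repair_segment` and the 0-based inclusive `start`/`end` arguments are hypothetical.

```python
def repair_segment(data, start, end):
    """Return a copy of data with data[start..end] (inclusive) replaced by
    values linearly interpolated between the two neighbouring points."""
    if start < 1 or end > len(data) - 2 or start > end:
        # Assumption: the segment must have a valid neighbour on both sides.
        raise ValueError("segment needs a neighbouring value on each side")
    repaired = list(data)
    left, right = data[start - 1], data[end + 1]
    n = end - start + 1                  # number of abnormal values
    step = (right - left) / (n + 1)      # equal increments across the gap
    for k in range(1, n + 1):
        repaired[start - 1 + k] = left + k * step
    return repaired


# Hypothetical sample: three zeroed values between 10 and 18.
print(repair_segment([10, 0, 0, 0, 18], 1, 3))
```

A segment that touches the first or last sample has only one neighbour; the text does not cover that case, so the sketch rejects it rather than guess.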
[0031] Preferably, after completing step 3, the method further includes step 4: applying a filtering algorithm to the repaired data to smooth out non-significant noise, wherein the filtering algorithm is a first-order low-pass filter or a Kalman filter.
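For the first-order low-pass option, a common discrete form is the exponential smoothing recurrence y[k] = a*x[k] + (1 - a)*y[k-1]. The sketch below uses that form as an assumption, since the text does not give the filter equation; the name `low_pass` and the smoothing factor `alpha` are illustrative.

```python
def low_pass(data, alpha):
    """First-order low-pass (exponential smoothing) of a sequence.

    alpha in (0, 1]: smaller values smooth more strongly.
    """
    if not 0 < alpha <= 1:
        raise ValueError("alpha must be in (0, 1]")
    out = []
    for x in data:
        if not out:
            out.append(x)  # assumption: start from the first sample
        else:
            out.append(alpha * x + (1 - alpha) * out[-1])
    return out


# Hypothetical sample: a step from 0 to 10 is approached gradually.
print(low_pass([0, 10, 10], alpha=0.5))
```

Applying the filter only after the repair step matters: a zeroed or spiking segment would otherwise be smeared into its neighbours instead of being removed.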
[0032] This application can be effectively used for cleaning strength fatigue test data, eliminating the impact of data noise and anomalies on data monitoring and early warning.
Attached Figure Description
[0033] Figure 1 is a flowchart of the data zeroing identification method.
[0034] Figure 2 is a flowchart of the data duplication identification method.
[0035] Figure 3 is a flowchart of the data jump identification method.
[0036] Figure 4 is a line chart of the raw strain data.
[0037] Figure 5 is a line chart of the errors detected in the strain data.
[0038] Figure 6 is a line chart showing the results of abnormal data repair.
Detailed Implementation
[0039] To make the objectives, technical solutions, and advantages of this application clearer, the technical solutions in the embodiments of this application will be described in more detail below with reference to the accompanying drawings. In the drawings, the same or similar reference numerals denote the same or similar elements or elements having the same or similar functions throughout. The described embodiments are only some embodiments of this application, not all embodiments. The embodiments described below with reference to the accompanying drawings are exemplary and intended to explain this application, and should not be construed as limiting this application. All other embodiments obtained by those skilled in the art based on the embodiments in this application without creative effort are within the scope of protection of this application. The embodiments of this application will be described in detail below with reference to the accompanying drawings. This application provides a preprocessing method for fatigue test data anomalies, including:
[0040] Step 1: Obtain one-dimensional time-series fatigue test data. Each data point has an index i and is arranged in order.
[0041] Step 2: Preset the abnormal feature definitions for data returning to zero, data duplication, and data jump; based on these definitions, traverse the test data to identify abnormal data segments of the three types.
[0042] Step 3: Use the interpolation method to repair the abnormal data segments that are zeroed out, duplicated, or have abrupt changes.
[0043] In some alternative implementations, the abnormal characteristic of data returning to zero is defined as follows: in continuous data acquisition, the data collected X times in a row are all zero, and the variance calculated between the consecutive zero data segment and the non-zero data before and after it exceeds a threshold W, where X and W are preset hyperparameters.
[0044] In some alternative implementations, the specific steps for identifying anomalous data segments that have returned to zero include:
[0045] Step 21: Set hyperparameters X and W, and input one-dimensional time-series fatigue test data for target detection;
[0046] Step 22: As shown in Figure 1, traverse the fatigue test data starting from the first data point, that is, from index 1;
[0047] When a data value of zero is encountered, start recording the number of times the consecutive zero data segment is zero;
[0048] When a non-zero value is encountered again, recording stops, and the number of consecutive zeros and the variance of the consecutive zero data segment calculated from the data before and after it are obtained.
[0049] Step 23: When the number of consecutive zeros $x_i$ is greater than or equal to X and the calculated variance $w_i$ is greater than or equal to W, output the start and end indices of the consecutive zero data segment. Here, $x_i$ is the number of consecutive zeros in the segment and $w_i$ is the variance calculated from the consecutive zero data segment and the data before and after it; in both, i is the ending index of the consecutive zero data segment.
[0050] Step 24: Continue traversing from the next index after the ending index of the consecutive zero data segment until the end of the data.
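As a worked illustration of the variance used in Steps 22 and 23: the text does not state how many neighbouring points enter the calculation or which variance convention applies, so the numbers below assume one neighbour on each side and the population variance. For the hypothetical window [11, 0, 0, 0, 12]:

```latex
\bar{v} = \frac{11 + 0 + 0 + 0 + 12}{5} = 4.6, \qquad
w = \frac{(11 - 4.6)^2 + 3\,(0 - 4.6)^2 + (12 - 4.6)^2}{5}
  = \frac{40.96 + 63.48 + 54.76}{5} = 31.84
```

With illustrative thresholds X = 3 and W = 1, this run of three zeros would be reported, because 3 is at least X and 31.84 is at least W. A run of zeros inside a stretch of near-zero readings would give a small variance and would not be reported, which is the purpose of the second condition.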
[0051] In some alternative implementations, the abnormal characteristic of data duplication is defined as follows: in continuous data acquisition, the data collected X times in a row are all the same, and the variance calculated between the continuously repeated data segment and the unequal data before and after it exceeds a threshold W.
[0052] In some alternative implementations, as shown in Figure 2, the specific steps for identifying abnormal data segments with duplicate data include:
[0053] Step 31: Set hyperparameters X and W, and input one-dimensional time-series fatigue test data for target detection;
[0054] Step 32: Start traversing the data from the index of the first fatigue test data and initialize a temporary array to record the indices of consecutively identical values;
[0055] Step 33: Determine if the current data value is the same as the previous data value: If they are the same, add the current index to the temporary array; if they are not the same, determine if the number of elements 'a' recorded in the temporary array is greater than or equal to X; if so, calculate the variance 'b' between the data segment corresponding to the temporary array and the data before and after it. If the variance 'b' is greater than or equal to W, output the start and end indices of the temporary array; regardless of whether the variance condition is met, then clear the temporary array.
[0056] Step 34: Continue traversing from the index following those recorded in the temporary array and return to step 33, until all fatigue test data have been traversed.
[0057] In some alternative implementations, the abnormal characteristics of data jumps are defined as follows: for three consecutive data collection values [V1, V2, V3], the variance S1 of the three data and the variance S2 of the data [V1, 0.5(V1+V3), V3] are calculated, and the ratio of variance S1 to variance S2 exceeds Q, where Q is a preset hyperparameter.
[0058] In some alternative implementations, the specific steps for identifying abnormal data jumps include: Step 41: Set the hyperparameter Q and input the one-dimensional time-series fatigue test data of the target detection;
[0059] Step 42: Start traversing from the fatigue test data corresponding to the second index to the fatigue test data corresponding to the second-to-last index;
[0060] Step 43: For the fatigue test data value $V_i$ corresponding to the current index i,
[0061] take the adjacent values $V_{i-1}$ and $V_{i+1}$, construct the array $[V_{i-1}, V_i, V_{i+1}]$, and calculate its variance S1;
[0062] construct the array $[V_{i-1}, 0.5(V_{i-1}+V_{i+1}), V_{i+1}]$ and calculate its variance S2;
[0063] where, if $V_{i-1}$ and $V_{i+1}$ are equal, 1 is added to $V_{i+1}$ to avoid the denominator being 0 when calculating the variance;
[0064] Step 44: Determine if the ratio of S1 to S2 is greater than Q. If it is, output index i.
[0065] Step 45: Continue traversing from the index of the next fatigue test data, return to step 43, until all fatigue test data has been traversed.
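The reason for the offset in the equal-neighbour case follows from a short derivation, sketched here under the assumption that population variance is used (the ratio S1/S2 is the same for sample variance, since both arrays contain three elements). The midpoint array has mean $0.5(V_{i-1}+V_{i+1})$, so its middle element contributes nothing:

```latex
S_2 = \frac{1}{3}\left[\left(\frac{V_{i+1} - V_{i-1}}{2}\right)^2 + 0
      + \left(\frac{V_{i+1} - V_{i-1}}{2}\right)^2\right]
    = \frac{(V_{i+1} - V_{i-1})^2}{6}
% Hypothetical triple [11, 30, 12]:
S_1 = \frac{11^2 + 30^2 + 12^2}{3} - \left(\frac{53}{3}\right)^2 = \frac{686}{9} \approx 76.22,
\qquad S_2 = \frac{(12 - 11)^2}{6} \approx 0.167,
\qquad \frac{S_1}{S_2} \approx 457.3
```

S2 is therefore zero exactly when the two neighbours are equal, which is the case the offset guards against. When the middle value lies on the line between its neighbours the ratio is 1, and it grows as the middle value departs from that line.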
[0066] In some alternative implementations, in step 3, as shown in Figure 3, the specific steps for repair using the interpolation method are as follows:
[0067] Let the abnormal data segment be $[V_{i+1}, V_{i+2}, V_{i+3}, \ldots, V_{i+n}]$;
[0068] Record the data value $V_i$ immediately before the abnormal data segment, the data value $V_{i+n+1}$ immediately after it, and the number n of abnormal data in the segment;
[0069] The replacement value for each data in the abnormal data segment is calculated using the following formula:
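[0070] (the formula is not reproduced in this text; the recorded quantities $V_i$, $V_{i+n+1}$, and n are consistent with linear interpolation, $V'_{i+k} = V_i + \frac{k}{n+1}\,(V_{i+n+1} - V_i)$ for $k = 1, \ldots, n$);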
[0071] Replace each data in the abnormal data segment with the replacement value of each data.
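Since the formula is missing from this text, the worked example below assumes linear interpolation between the two recorded neighbours, which is one natural reading of the interpolation method. The replacement symbol $V'_{i+k}$ and the sample values are illustrative only.

```latex
V'_{i+k} = V_i + \frac{k}{n+1}\,\bigl(V_{i+n+1} - V_i\bigr), \qquad k = 1, \ldots, n
% Example: V_i = 10, V_{i+n+1} = 18, n = 3
V'_{i+1} = 10 + \tfrac{1}{4}\cdot 8 = 12, \quad
V'_{i+2} = 10 + \tfrac{2}{4}\cdot 8 = 14, \quad
V'_{i+3} = 10 + \tfrac{3}{4}\cdot 8 = 16
```

Under this reading the repaired values are equally spaced between the last good value before the segment and the first good value after it.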
[0072] In some alternative implementations, after completing step 3, a further step 4 is included: applying a filtering algorithm to the repaired data to smooth out non-significant noise, wherein the filtering algorithm is a first-order low-pass filter or a Kalman filter.
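For the Kalman option, a one-dimensional filter is enough to illustrate the idea. The sketch below is a minimal scalar version under stated assumptions, none of which come from the source: the state is modelled as a slowly drifting constant, `process_var` and `measurement_var` are user-chosen noise variances, and the initial error variance is set to 1.0.

```python
def kalman_1d(data, process_var, measurement_var):
    """Scalar Kalman filter with a constant-value (random-walk) state model."""
    if not data:
        return []
    estimate = float(data[0])  # assumption: start from the first measurement
    error = 1.0                # assumption: initial error variance
    out = [estimate]
    for z in data[1:]:
        # Predict: the value is carried over, its uncertainty grows.
        error += process_var
        # Update: blend prediction and measurement using the Kalman gain.
        gain = error / (error + measurement_var)
        estimate += gain * (z - estimate)
        error *= (1 - gain)
        out.append(estimate)
    return out


# Hypothetical sample: noisy readings around 20.
print(kalman_1d([20.0, 20.4, 19.7, 20.1, 19.9], process_var=0.01, measurement_var=0.5))
```

A larger `measurement_var` relative to `process_var` makes the filter trust its prediction more and smooth harder; the ratio would need tuning against real strain data.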
[0073] In some alternative implementations, the raw strain data shown in Figure 4 are processed by traversing the array and checking in turn for data zeroing, data duplication, and data jumps. The union of the index sets returned by the three checks is the set of data indices to be repaired. The program output is as follows:
- indices 71 to 76 have an abnormal data value of 0;
- indices 32 to 39 have data repeated 8 times, with a repeated value of 20;
- indices 71 to 76 have data repeated 6 times, with a repeated value of 0;
- index 8 has a value of 30 with a large variance fluctuation, possibly indicating an abnormal jump point;
- index 43 has a value of 50 with a large variance fluctuation, possibly indicating an abnormal jump point.
The identified anomalies are shown in Figure 5. The zeroed, duplicated, and jump data were then repaired using the interpolation method, and the repair results are shown in Figure 6.
Non-significant noise can optionally be smoothed, provided this does not affect the judgment of changes in the experimental data, and the smoothing must be performed on data in which the first three types of anomalies have already been repaired. First-order low-pass filtering or Kalman filtering can be used for this purpose. First-order low-pass filtering removes the high-frequency components of the signal while retaining the low-frequency components. Kalman filtering calculates the current optimal value from the current "measured value" and the "predicted value" and "error" of the previous moment, and then predicts the value for the next moment.
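The union step in this example can be sketched as follows, using the index ranges reported above. The helper name `group_indices` is hypothetical; it merges the union into contiguous segments so that each one can be repaired in a single pass.

```python
def group_indices(indices):
    """Group indices into sorted, contiguous (start, end) segments (inclusive)."""
    segments = []
    for idx in sorted(set(indices)):
        if segments and idx == segments[-1][1] + 1:
            # Extends the current segment by one index.
            segments[-1] = (segments[-1][0], idx)
        else:
            # Starts a new segment.
            segments.append((idx, idx))
    return segments


# Index sets taken from the example output in the text.
zero_idx = set(range(71, 77))                            # indices 71 to 76
repeat_idx = set(range(32, 40)) | set(range(71, 77))     # 32 to 39 and 71 to 76
jump_idx = {8, 43}

to_repair = zero_idx | repeat_idx | jump_idx             # union of the three checks
segments = group_indices(to_repair)
print(segments)
```

Taking the union removes the double report of indices 71 to 76, which were flagged by both the zeroing check and the duplication check.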
[0074] This application can be effectively used for cleaning strength fatigue test data during or after the test, eliminating the impact of data noise and anomalies on data monitoring and early warning.
[0075] The above description is merely a specific embodiment of this application, but the scope of protection of this application is not limited thereto. Any variations or substitutions that can be easily conceived by those skilled in the art within the technical scope disclosed in this application should be included within the scope of protection of this application. Therefore, the scope of protection of this application should be determined by the scope of the claims.
Claims
1. A preprocessing method for fatigue test data anomalies, characterized in that it includes: Step 1: Obtain one-dimensional time-series fatigue test data; Step 2: Preset the abnormal feature definitions for data returning to zero, data duplication, and data jump; based on these definitions, traverse the test data to identify abnormal data segments of the three types; Step 3: Use the interpolation method to repair the abnormal data segments that are zeroed out, duplicated, or have abrupt changes.
2. The preprocessing method for fatigue test data anomalies as described in claim 1, characterized in that, The abnormal characteristic of data returning to zero is defined as follows: in continuous data acquisition, the data collected for X consecutive times is all zero, and the variance calculated between the consecutive zero data segment and the non-zero data before and after it exceeds the threshold W, where X and W are preset hyperparameters.
3. The preprocessing method for fatigue test data anomalies as described in claim 2, characterized in that, The specific steps for identifying abnormal data segments that have returned to zero include: Step 21: Set hyperparameters X and W, and input one-dimensional time-series fatigue test data for target detection; Step 22: Start traversing from the data corresponding to the first index of the fatigue test data; When a data value of zero is encountered, start recording the number of times the consecutive zero data segment is zero; When a non-zero data is encountered, recording stops, and the number of consecutive zero data points and the variance of the consecutive zero data segment calculated from the data before and after it are obtained. Step 23: If the number of consecutive zeros is greater than or equal to X, and the calculated variance is greater than or equal to W, then output the start and end indices of the consecutive zero data segment. Step 24: Continue traversing from the next index after the ending index of the consecutive zero data segment until the end of the data.
4. The preprocessing method for fatigue test data anomalies as described in claim 1, characterized in that, The abnormal characteristic of data duplication is defined as follows: in continuous data collection, the data collected X times in a row are all the same, and the variance calculated between the continuously repeated data segment and the unequal data before and after it exceeds the threshold W.
5. The preprocessing method for fatigue test data anomalies as described in claim 4, characterized in that, The specific steps for identifying anomalous data segments with duplicate data include: Step 31: Input one-dimensional time-series fatigue test data for target detection; Step 32: Start traversing the data from the index of the first fatigue test data and initialize a temporary array to record the indices of consecutively identical values; Step 33: Determine if the current data value is the same as the previous data value: If they are the same, add the current index to the temporary array; if they are not the same, determine if the number of elements recorded in the temporary array is greater than or equal to X; if so, calculate the variance of the data segment corresponding to the temporary array and the data before and after it. If the variance is greater than or equal to W, output the start and end indices of the temporary array; regardless of whether the variance condition is met, then clear the temporary array. Step 34: Continue traversing from the index of the next fatigue test data recorded in the temporary array, return to step 33, until all fatigue test data has been traversed.
6. The preprocessing method for fatigue test data anomalies as described in claim 1, characterized in that, The abnormal characteristics of data jumps are defined as follows: for three consecutive data collection values [V1,V2,V3], the variance S1 of the three data and the variance S2 of the data [V1,0.5(V1+V3),V3] are calculated. The ratio of variance S1 to variance S2 exceeds Q, where Q is a preset hyperparameter.
7. The preprocessing method for fatigue test data anomalies as described in claim 6, characterized in that, Step 41: Set the hyperparameter Q and input the one-dimensional time-series fatigue test data of the target detection; Step 42: Start traversing from the fatigue test data corresponding to the second index to the fatigue test data corresponding to the second-to-last index; Step 43: For the fatigue test data value $V_i$ corresponding to the current index i, take the adjacent values $V_{i-1}$ and $V_{i+1}$, construct the array $[V_{i-1}, V_i, V_{i+1}]$, and calculate its variance S1; construct the array $[V_{i-1}, 0.5(V_{i-1}+V_{i+1}), V_{i+1}]$ and calculate its variance S2; where, if $V_{i-1}$ and $V_{i+1}$ are equal, 1 is added to $V_{i+1}$ to avoid the denominator being 0 when calculating the variance; Step 44: Determine if the ratio of S1 to S2 is greater than Q. If it is, output index i. Step 45: Continue traversing from the index of the next fatigue test data, return to step 43, until all fatigue test data has been traversed.
8. The preprocessing method for fatigue test data anomalies as described in claim 1, characterized in that, In step 3, the specific steps for repair using the interpolation method are as follows: Let the abnormal data segment be $[V_{i+1}, V_{i+2}, V_{i+3}, \ldots, V_{i+n}]$; Record the data value $V_i$ immediately before the abnormal data segment, the data value $V_{i+n+1}$ immediately after it, and the number n of abnormal data in the segment; The replacement value for each data in the abnormal data segment is calculated using a formula (not reproduced in this text); Replace each data in the abnormal data segment with the replacement value of each data.
9. The preprocessing method for fatigue test data anomalies as described in claim 1, characterized in that, After completing step 3, the process further includes step 4: applying a filtering algorithm to smooth the repaired data for non-significant noise, wherein the filtering algorithm is a first-order low-pass filter or a Kalman filter.