A method and system for predicting hardware device failures
By denoising load and vibration signals, fusing multi-source data, and combining a support vector machine with a Wiener process model, the method addresses the lag in fault prediction under complex working conditions and provides accurate fault warning and adaptive monitoring of hardware equipment.
Patent Information
- Authority / Receiving Office
- CN · China
- Patent Type
- Applications(China)
- Current Assignee / Owner
- BEIJING XURI SHENZHOU TECH CO LTD
- Filing Date
- 2026-03-13
- Publication Date
- 2026-06-12
AI Technical Summary
Existing technologies struggle to balance fault prediction accuracy and response timeliness under complex and variable operating conditions, resulting in delayed fault warnings for hardware devices.
By acquiring load intensity and vibration signals, low-pass filtering and wavelet transform denoising are performed to construct a damage feature matrix. Combined with a support vector machine classifier and Wiener process model, the cumulative fatigue loss and remaining life index are calculated to form a closed-loop monitoring system to optimize the loss model.
It enables precise quantification of fatigue loss under dynamic loads, improves the accuracy and timeliness of fault prediction, adapts to dynamic changes in equipment operating conditions, and enhances the long-term reliability and self-learning capability of fault prediction.
Figure CN122196710A_ABST
Description
Technical Field
[0001] This invention relates to the field of industrial equipment monitoring and fault early warning technology, and in particular to a method and system for predicting hardware equipment faults.
Background Technology
[0002] With the rapid improvement of industrial intelligence, fault prediction and health management of hardware equipment have become key research directions in the field of industrial equipment management.
[0003] Current technology typically collects surface-level operating data from the equipment, such as load and vibration, performs a preliminary analysis, and predicts faults using fixed preset thresholds or a single algorithm, which provides only preliminary monitoring of the equipment's operating status. However, when faced with diverse operating conditions and frequent fluctuations in dynamic loads, current technologies cannot fuse and accurately process multi-source signals. This makes it difficult to quantify how fatigue wear accumulates within the equipment and to adapt to dynamic changes in complex operating conditions. Consequently, the fault prediction results deviate significantly from the actual fault situation, and the accuracy and timeliness of early warning cannot meet practical needs.
[0004] Therefore, existing technologies lack the ability to accurately quantify fatigue loss under dynamic loads and to fuse and analyze multi-source data. They cannot balance fault prediction accuracy against response timeliness under complex and ever-changing working conditions, which results in delayed fault warnings for hardware devices.
Summary of the Invention
[0005] This invention provides a hardware device fault prediction method and system to solve the technical problem that existing technologies cannot achieve a balance between fault prediction accuracy and response timeliness under complex and ever-changing working conditions, resulting in delayed fault warnings in hardware devices.
[0006] Firstly, in order to solve the above-mentioned technical problems, the present invention provides a hardware device fault prediction method, comprising:
acquiring load intensity and vibration signals, and preprocessing the load intensity and vibration signals to obtain denoised load data and a vibration feature sequence;
calculating the load fluctuation amplitude and cumulative stress cycle based on the denoised load data to obtain a damage feature matrix, and inputting the damage feature matrix into a pre-trained loss model to obtain a cumulative fatigue loss value;
when the cumulative fatigue loss value does not exceed a preset safety threshold, acquiring load monitoring data, calculating a loss growth rate based on the load monitoring data, extracting time series features from the loss growth rate, and classifying the time series features using a preset support vector machine classifier to obtain an operating status classification result, wherein the operating status classification result includes a normal operating status and a high-risk status;
when the operating status classification result is the high-risk status, obtaining a historical loss sequence, and integrating the operating status classification result and the historical loss sequence according to a pre-trained Wiener process model to calculate a remaining life index;
acquiring real-time load data, calculating a risk probability based on the real-time load data and the remaining life index, generating a preventive maintenance signal when the risk probability exceeds a preset risk threshold, and otherwise adjusting the monitoring frequency and generating a risk assessment report;
based on the risk assessment report, fusing the vibration feature sequence and the cumulative fatigue loss value to update the parameters of the pre-trained loss model to obtain an optimized loss model, and using the optimized loss model in the next data acquisition process to form closed-loop monitoring.
[0007] Preferably, the step of acquiring load intensity and vibration signal, and preprocessing the load intensity and vibration signal to obtain denoised load data and vibration feature sequence includes: The load intensity and vibration signal are acquired, and the load intensity is low-pass filtered to obtain denoised load data. The vibration signal is subjected to wavelet transform to obtain a pure vibration signal, and feature parameters are extracted from the pure vibration signal to generate a vibration feature sequence.
[0008] Preferably, the step of calculating the load fluctuation amplitude and cumulative stress cycle based on the denoised load data to obtain a damage feature matrix, and inputting the damage feature matrix into a pre-trained loss model to obtain the cumulative fatigue loss value, includes: The waveform geometric features of the denoised load data are analyzed, and the amplitude difference between adjacent peaks and troughs is calculated based on the waveform geometric features to obtain the load fluctuation amplitude. The cumulative stress cycle is calculated based on the waveform geometric features. The load fluctuation amplitude is paired with the cumulative stress cycle to obtain a cyclic feature vector. All the cyclic feature vectors are concatenated to obtain a damage feature matrix. The damage feature matrix is input into the pre-trained loss model, and the cumulative fatigue loss value is output through nonlinear regression calculation.
[0009] Preferably, calculating the loss growth rate based on the load monitoring data includes: Differential calculations are performed on the load monitoring data to obtain a differential sequence, and the differential sequence is smoothed to form a loss gradient vector; The loss gradient vector is mapped to a dynamic environmental state space consisting of the load mean, fluctuation standard deviation, and temperature deviation, and the loss growth rate is obtained by fitting the data using a multivariate regression method.
[0010] Preferably, the step of extracting time-series features from the loss growth rate and classifying the time-series features using a preset support vector machine classifier to obtain the running state classification result includes: The loss growth rate is organized into a data sequence, and the data sequence is truncated using a preset sliding window. The mean, variance, and peak statistics of the data within each sliding window are calculated to construct a time-domain statistical feature set. The data sequence of the loss growth rate is subjected to discrete wavelet transform and decomposed into sub-bands at multiple scales. The energy coefficients of each sub-band are extracted. The time series features of the loss growth rate are the set of time-domain statistical features and the set of energy coefficients. The time-domain statistical feature set and the energy coefficient are standardized and then concatenated in a preset order to form a multi-dimensional comprehensive feature vector. The multidimensional integrated feature vector is input into a preset support vector machine classifier for nonlinear transformation to determine the optimal decision hyperplane. Calculate the geometric distance from the multidimensional integrated feature vector to the optimal decision hyperplane, and output the operation state classification result based on the sign attribute of the geometric distance; wherein, the operation state classification result includes normal operation state and high-risk state.
[0011] Preferably, when the operational status classification result indicates a high-risk state, a historical loss sequence is obtained. Based on a pre-trained Wiener process model, the operational status classification result and the historical loss sequence are integrated and calculated to obtain a remaining lifetime index, including: When the operation status classification result is a high-risk state, the unique device identifier of the current hardware device to be monitored and the timestamp information corresponding to the high-risk state are obtained, and the historical fatigue loss accumulation value within a preset time span is filtered in the preset device loss database. The historical loss sequence is input into the pre-trained Wiener process model to obtain the drift coefficient and diffusion coefficient; Based on the drift coefficient and the diffusion coefficient, derive the time probability density function of the time required for the cumulative fatigue loss value to first reach the preset failure threshold; Integrating the time probability density function yields the remaining lifetime index.
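The Wiener process step can be illustrated with a minimal sketch. It assumes a linear-drift Wiener process X(t) = x0 + mu*t + sigma*B(t), estimates the drift and diffusion coefficients by maximum likelihood from the increments of the historical loss sequence, and uses the inverse Gaussian first-passage density. Reading the remaining life index as the expected first-passage time is an assumption, as are the function names, the sample data, and the failure threshold of 0.05; none of them come from the patent text.

```python
import math

def estimate_wiener(times, losses):
    """Maximum-likelihood drift and diffusion coefficients from loss increments."""
    dts = [t1 - t0 for t0, t1 in zip(times, times[1:])]
    dxs = [x1 - x0 for x0, x1 in zip(losses, losses[1:])]
    mu = sum(dxs) / sum(dts)
    sigma2 = sum((dx - mu * dt) ** 2 / dt for dx, dt in zip(dxs, dts)) / len(dxs)
    return mu, math.sqrt(sigma2)

def first_passage_pdf(t, gap, mu, sigma):
    """Inverse Gaussian density of the first time the loss has risen by `gap`."""
    scale = gap / (sigma * math.sqrt(2 * math.pi * t ** 3))
    return scale * math.exp(-(gap - mu * t) ** 2 / (2 * sigma ** 2 * t))

def expected_first_passage(gap, mu, sigma, t_max, steps=20000):
    """Integrate t * f(t) over (0, t_max) with the midpoint rule."""
    h = t_max / steps
    total = 0.0
    for i in range(steps):
        t = (i + 0.5) * h
        total += t * first_passage_pdf(t, gap, mu, sigma) * h
    return total

# Hypothetical historical loss sequence sampled once per time unit.
times = [0, 1, 2, 3, 4]
losses = [0.010, 0.012, 0.013, 0.016, 0.018]
failure_threshold = 0.05  # assumed value, for illustration only

mu, sigma = estimate_wiener(times, losses)
gap = failure_threshold - losses[-1]
remaining_life = expected_first_passage(gap, mu, sigma, t_max=60.0)
```

For a positive drift the expected first-passage time has the closed form gap / mu, so the numerical integral serves mainly as a check on the density.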
[0012] Preferably, the step of acquiring real-time load data, calculating the risk probability based on the real-time load data and the remaining lifespan indicator, generating a preventive maintenance signal when the risk probability exceeds a preset risk threshold, and otherwise adjusting the monitoring frequency and generating a risk assessment report includes: Acquire real-time load data and extract the load fluctuation characteristics of the real-time load data; The load fluctuation characteristics are fused with the remaining life index to construct an operating status feature vector. The operating status feature vector is then input into a pre-trained joint probability distribution model to obtain the risk probability. Determine whether the risk probability exceeds a preset risk threshold. If yes, generate a preventive maintenance signal; otherwise, reset the monitoring frequency based on the risk probability to obtain the reset monitoring frequency. The remaining lifespan index is mapped to a preset failure risk assessment matrix, and a quantified failure risk assessment level is output. The system obtains the historical maintenance signal trigger frequency, calculates the system stability level value based on the preventive maintenance signal and the reset monitoring frequency, and generates a risk assessment report based on the system stability level value, the risk probability, and the fault risk assessment level.
[0013] Preferably, the step of updating the parameters of the pre-trained loss model based on the risk assessment report, fusing the vibration characteristic sequence and the cumulative fatigue loss value to obtain an optimized loss model, and using the optimized loss model in the next data acquisition process to form a closed-loop monitoring includes: The abnormal state markers in the risk assessment report are analyzed, and the vibration feature sequence and fatigue loss cumulative value corresponding to the abnormal state markers are extracted to construct an incremental learning sample set. The incremental learning sample set is input into the pre-trained loss model to obtain the model's predicted state value; Calculate the deviation between the model's predicted state value and the abnormal state marker, and generate the model weight gradient vector based on the deviation value; The internal parameters of the loss model are adjusted according to the model weight gradient vector to obtain the optimized loss model. Based on the prediction accuracy of the optimized loss model, the data sampling interval is calculated and written into the acquisition register to drive the sensor to acquire data in a new round, thus forming a closed-loop monitoring.
[0014] Secondly, the present invention provides a hardware device fault prediction system, comprising:
a data preprocessing module, used to acquire load intensity and vibration signals and preprocess the load intensity and vibration signals to obtain denoised load data and a vibration feature sequence;
a loss calculation module, used to calculate the load fluctuation amplitude and cumulative stress cycle based on the denoised load data to obtain a damage feature matrix, and to input the damage feature matrix into a pre-trained loss model to obtain a cumulative fatigue loss value;
a status classification module, used to acquire load monitoring data when the cumulative fatigue loss value does not exceed a preset safety threshold, calculate a loss growth rate based on the load monitoring data, extract time series features from the loss growth rate, and classify the time series features using a preset support vector machine classifier to obtain an operating status classification result, wherein the operating status classification result includes a normal operating status and a high-risk status;
a lifetime calculation module, used to obtain a historical loss sequence when the operating status classification result is the high-risk status, and to integrate the operating status classification result and the historical loss sequence according to a pre-trained Wiener process model to calculate a remaining life index;
a risk assessment module, used to acquire real-time load data, calculate a risk probability based on the real-time load data and the remaining life index, generate a preventive maintenance signal when the risk probability exceeds a preset risk threshold, and otherwise adjust the monitoring frequency and generate a risk assessment report;
a closed-loop update module, used to fuse the vibration feature sequence and the cumulative fatigue loss value, based on the risk assessment report, to update the parameters of the pre-trained loss model and obtain an optimized loss model, and to use the optimized loss model in the next data acquisition process to form closed-loop monitoring.
[0015] Compared with the prior art, the present invention has the following beneficial effects: (1) This invention uses low-pass filtering and wavelet transform to perform targeted denoising on load intensity data and vibration signals respectively. Combined with waveform geometric feature extraction and two-dimensional damage feature matrix construction, the loss model can learn the combined influence of load fluctuation and cycle number on fatigue loss, quantify fatigue loss under dynamic load, and improve the accuracy of loss assessment.
[0016] (2) This invention fits the historical loss sequence through the Wiener process model to capture the random characteristics of damage evolution, calculates the remaining lifetime index by integrating the first arrival time probability density function, and integrates the real-time load characteristics and remaining lifetime index by using the joint probability distribution model to realize the quantification of fault risk and avoid the deviation caused by single parameter evaluation.
[0017] (3) This invention drives the update of loss model parameters by incremental learning sample set, and dynamically adjusts the data sampling interval in combination with prediction accuracy to form full-process adaptive closed-loop monitoring, so that the loss model can continuously adapt to the dynamic changes of equipment operating conditions, and improves the long-term reliability and self-learning ability of fault prediction.
[0018] (4) This invention extracts the time-domain statistical features and frequency-domain energy features of the loss growth rate, standardizes and splices them to form a multi-dimensional comprehensive feature vector, and uses a support vector machine to construct the optimal decision hyperplane, thereby improving the timeliness and reliability of fault identification.
Attached Figure Description
[0019] Figure 1 is a schematic flowchart of a hardware device fault prediction method provided in the first embodiment of the present invention; Figure 2 is a schematic diagram of a hardware device fault prediction system provided in the second embodiment of the present invention.
Detailed Implementation
[0020] The technical solutions of the embodiments of the present invention will be clearly and completely described below with reference to the accompanying drawings. Obviously, the described embodiments are only some embodiments of the present invention, and not all embodiments. Based on the embodiments of the present invention, all other embodiments obtained by those skilled in the art without creative effort are within the scope of protection of the present invention.
[0021] Referring to Figure 1, the first embodiment of the present invention provides a hardware device fault prediction method, including the following steps:
S11, acquire load intensity and vibration signals, and preprocess the load intensity and vibration signals to obtain denoised load data and a vibration feature sequence;
S12, based on the denoised load data, calculate the load fluctuation amplitude and cumulative stress cycle to obtain a damage feature matrix, and input the damage feature matrix into the pre-trained loss model to obtain the cumulative fatigue loss value;
S13, when the cumulative fatigue loss value does not exceed the preset safety threshold, acquire load monitoring data, calculate the loss growth rate based on the load monitoring data, extract time series features from the loss growth rate, and classify the time series features using a preset support vector machine classifier to obtain the operating status classification result, wherein the operating status classification result includes a normal operating status and a high-risk status;
S14, when the operating status classification result is the high-risk status, obtain the historical loss sequence, and integrate the operating status classification result and the historical loss sequence according to the pre-trained Wiener process model to calculate the remaining life index;
S15, acquire real-time load data, calculate the risk probability based on the real-time load data and the remaining life index, generate a preventive maintenance signal when the risk probability exceeds a preset risk threshold, and otherwise adjust the monitoring frequency and generate a risk assessment report;
S16, based on the risk assessment report, fuse the vibration feature sequence and the cumulative fatigue loss value to update the parameters of the pre-trained loss model and obtain the optimized loss model, and use the optimized loss model in the next data acquisition process to form closed-loop monitoring.
[0022] In step S11, load intensity and vibration signals are acquired, and the load intensity and vibration signals are preprocessed to obtain denoised load data and vibration feature sequences, including: The load intensity and vibration signal are acquired, and the load intensity is low-pass filtered to obtain denoised load data. The vibration signal is subjected to wavelet transform to obtain a pure vibration signal, and feature parameters are extracted from the pure vibration signal to generate a vibration feature sequence.
[0023] It should be noted that load intensity refers to the magnitude of the force exerted on the equipment during operation, such as tension, compression, and torque, which is collected by a pressure sensor; vibration signal refers to the acceleration signal generated when the equipment vibrates, collected by a triaxial accelerometer. Both directly reflect the operating status of the equipment. Low-pass filtering uses a Butterworth filter to remove high-frequency noise from the load signal, such as electromagnetic interference and sensor noise, ultimately obtaining stable, denoised load data that can identify peaks and troughs. Wavelet transform uses the db4 wavelet basis function for three-level decomposition, separating invalid noise from effective vibration components in the vibration signal to obtain a clean, interference-free vibration signal.
[0024] It is worth noting that the characteristic parameters of the pure vibration signal include amplitude, frequency, and energy distribution. The vibration signal acquisition time is divided into multiple 1-second time windows. The amplitude is the maximum amplitude of the pure vibration signal within the time window, measured in millimeters per second, and is obtained by finding the maximum absolute value of the vibration signal waveform within that window. The frequency is the peak frequency within the effective frequency band after the pure vibration signal undergoes a Fourier transform. The effective frequency band is 5Hz to 500Hz. This range is determined by analyzing the fault characteristic frequencies of core components such as industrial gearboxes and bearings, for example the gear meshing frequency calculated from the transmission ratio and rotational frequency and the bearing pass frequency, together with their harmonic components, so that it covers the frequency bands of their main fault modes. The energy distribution is the proportion of the energy in each frequency band to the total energy. The effective frequency band is divided into three intervals: a low frequency band (5Hz to 50Hz), corresponding to equipment foundation vibration and shaft imbalance faults; a mid frequency band (50Hz to 200Hz), corresponding to gear meshing and bearing rolling element faults; and a high frequency band (200Hz to 500Hz), corresponding to bearing inner and outer ring defects and gear wear faults. The amplitude, frequency, and energy distribution of each time window are spliced into a five-dimensional feature vector. The five-dimensional feature vectors of all time windows are sorted by acquisition time to form the vibration feature sequence.
Specifically, the amplitude scalar, peak frequency scalar, and the three energy proportion scalars of low frequency, mid frequency, and high frequency bands are combined into a five-dimensional feature vector in the order of [amplitude, peak frequency, low frequency energy proportion, mid frequency energy proportion, high frequency energy proportion].
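The construction of one five-dimensional feature vector can be sketched as follows. This is a minimal illustration in standard-library Python using a direct discrete Fourier transform. The function name, the sampling rate of 1024 Hz, the synthetic test signal, the half-open band edges, and the choice to normalize by the total energy inside 5Hz to 500Hz are all assumptions made for the example.

```python
import cmath
import math

def window_features(window, fs):
    """Return [amplitude, peak frequency, low, mid, high energy share] for one window."""
    n = len(window)
    amplitude = max(abs(x) for x in window)
    energy = {}
    for k in range(1, n // 2 + 1):
        freq = k * fs / n
        if 5 <= freq <= 500:  # effective band stated in the text
            coeff = sum(x * cmath.exp(-2j * math.pi * k * i / n)
                        for i, x in enumerate(window))
            energy[freq] = abs(coeff) ** 2
    total = sum(energy.values()) or 1.0  # guard against an all-zero window
    peak_freq = max(energy, key=energy.get)
    shares = [0.0, 0.0, 0.0]  # low (5-50 Hz), mid (50-200 Hz), high (200-500 Hz)
    for freq, e in energy.items():
        band = 0 if freq < 50 else 1 if freq < 200 else 2
        shares[band] += e / total
    return [amplitude, peak_freq] + shares

# Synthetic 1-second window: a strong 25 Hz component plus a weak 120 Hz component.
fs = 1024
window = [2.5 * math.sin(2 * math.pi * 25 * i / fs)
          + 0.5 * math.sin(2 * math.pi * 120 * i / fs) for i in range(fs)]
features = window_features(window, fs)
```

Applying the same function to consecutive windows and appending the results in acquisition order would give the vibration feature sequence.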
[0025] For example, for an industrial gearbox, load intensity and vibration signals during operation are collected using pressure sensors and triaxial accelerometers. A 4th-order Butterworth low-pass filter is applied to the load intensity, with a cutoff frequency of 100Hz. After convolution, high-frequency interference above 100Hz is filtered out, resulting in stable, denoised load data. The vibration signal is decomposed into three levels using the db4 wavelet basis function. Based on the statistical characteristics of the noise components of the vibration signal, a detail coefficient threshold of 0.05 is set; this threshold can be adjusted according to the signal-to-noise ratio. Values below 0.05 in the detail coefficient are set to zero, and a clean vibration signal is obtained after inverse transformation. A 1-second time window is set, and feature parameters are extracted window by window from the clean vibration signal. The amplitude is found to be 2.5 mm / s, with frequencies concentrated in the range of 20Hz to 30Hz. In the energy distribution, the 20Hz to 30Hz frequency band belongs to the low-frequency band of 5Hz to 50Hz, corresponding to equipment foundation vibration and shaft imbalance faults, accounting for 68%. The amplitude, frequency, and energy distribution of each window are concatenated into a five-dimensional feature vector. The feature vectors of all windows are arranged in chronological order to generate a vibration feature sequence.
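The thresholding step in this example can be sketched in isolation. The snippet assumes the detail coefficients of one decomposition level are already available as a list, and it assumes that "below 0.05" refers to the coefficient magnitude, which is the usual hard-thresholding convention. The function name and the sample coefficients are hypothetical.

```python
def hard_threshold(detail_coeffs, threshold=0.05):
    """Zero every detail coefficient whose magnitude is below the threshold."""
    return [c if abs(c) >= threshold else 0.0 for c in detail_coeffs]

# Hypothetical detail coefficients from one decomposition level.
level_coeffs = [0.30, -0.02, 0.04, -0.12, 0.05]
cleaned = hard_threshold(level_coeffs)
```

The same function would be applied to each of the three detail levels before the inverse transform.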
[0026] In step S12, based on the denoised load data, the load fluctuation amplitude and cumulative stress cycle are calculated to obtain a damage feature matrix. This damage feature matrix is then input into a pre-trained loss model to obtain the cumulative fatigue loss value, including: The waveform geometric features of the denoised load data are analyzed, and the amplitude difference between adjacent peaks and troughs is calculated based on the waveform geometric features to obtain the load fluctuation amplitude. The cumulative stress cycle is calculated based on the waveform geometric features. The load fluctuation amplitude is paired with the cumulative stress cycle to obtain a cyclic feature vector. All the cyclic feature vectors are concatenated to obtain a damage feature matrix. The damage feature matrix is input into the pre-trained loss model, and the cumulative fatigue loss value is output through nonlinear regression calculation.
[0027] It should be noted that waveform geometric feature analysis involves scanning the complete time-series waveform of the denoised load data to identify peaks and troughs. Specifically, a sliding window is used for scanning and identification, with the window size set to 50 sampling points and the step size to 10 sampling points. A single complete stress cycle is the entire process from one trough to the next. The load fluctuation amplitude is the load swing borne by the equipment in a single stress cycle and directly reflects the degree of fatigue damage caused by that cycle: the larger the load fluctuation amplitude, the more significant the fatigue damage. This parameter is calculated by subtracting the load data of the corresponding trough from the load data of the peak. The overall damage to the equipment gradually increases with the number of stress cycles. The cumulative stress cycle is a parameter that measures the cumulative fatigue loss of the equipment. It is obtained by counting the total number of complete stress cycles in the denoised load data, with each complete cycle assigned a unique, incrementing cycle number.
[0028] It should be noted that the cyclic feature vectors are obtained by pairing the load fluctuation amplitude corresponding to a single complete stress cycle with the cycle number. A single cyclic feature vector contains two core pieces of information: the load fluctuation amplitude and the cycle number. Vector concatenation arranges all single cyclic feature vectors in ascending order of cycle number to form a damage feature matrix. The damage feature matrix is a two-dimensional matrix. The number of rows represents the cumulative stress cycles, and the number of columns is fixed at 2. The parameter in the first column is the load fluctuation amplitude corresponding to each complete cycle, and the parameter in the second column is the cycle number corresponding to each complete cycle, incrementing from 1 sequentially according to the order in which the cycles occur. Each row in the matrix corresponds to a cyclic feature vector of a complete stress cycle.
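A minimal sketch of the matrix construction is given here. It simplifies the 50-point sliding-window scan to a direct search for strict local extrema, and it measures each amplitude from the trough that opens the cycle to the highest peak before the next trough. Both simplifications, the function names, and the sample load values are assumptions for illustration.

```python
def find_extrema(load):
    """Return (index, kind) for strict local peaks and troughs, in time order."""
    extrema = []
    for i in range(1, len(load) - 1):
        if load[i] > load[i - 1] and load[i] > load[i + 1]:
            extrema.append((i, "peak"))
        elif load[i] < load[i - 1] and load[i] < load[i + 1]:
            extrema.append((i, "trough"))
    return extrema

def damage_feature_matrix(load):
    """One row [fluctuation amplitude, cycle number] per trough-to-trough cycle."""
    matrix = []
    start_trough = None  # load value at the trough that opened the current cycle
    peak = None          # highest peak seen since that trough
    for i, kind in find_extrema(load):
        if kind == "peak":
            if start_trough is not None:
                peak = load[i] if peak is None else max(peak, load[i])
        else:
            if start_trough is not None and peak is not None:
                matrix.append([peak - start_trough, len(matrix) + 1])
            start_trough, peak = load[i], None
    return matrix

# Hypothetical denoised load samples in newtons.
load = [200, 180, 300, 480, 350, 180, 260, 400, 310, 200, 240]
matrix = damage_feature_matrix(load)
```

Rows are appended in the order the cycles complete, so the second column increases from 1 and the row count equals the cumulative stress cycle.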
[0029] It is worth noting that the pre-trained loss model is constructed using the random forest regression algorithm. The training process takes the damage feature matrix as input and the historical fatigue loss values actually generated by the equipment under the corresponding cyclic conditions as output. The historical fatigue loss values are obtained by collecting data on the actual damage levels corresponding to various load fluctuations within the lifecycle of the same type of equipment. R² is the coefficient of determination, which measures the goodness of fit of the model. Model training is stopped when R² is greater than or equal to 0.9 or the number of iterations reaches 500. When R² is greater than or equal to 0.9, the model has high fitting accuracy and a strong correlation between model predictions and actual losses. The limit of 500 iterations balances training efficiency and convergence: most models of the same type reach the accuracy requirement within this number of iterations, which avoids wasting computing power on excessive iterations.
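The stopping rule can be expressed as a short sketch. The coefficient of determination below is the standard definition, one minus the ratio of residual to total sum of squares. What counts as one iteration of random forest training is not specified in the text, so the `iteration` argument is left abstract, and the function names and sample values are hypothetical.

```python
def r_squared(actual, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean = sum(actual) / len(actual)
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    ss_tot = sum((a - mean) ** 2 for a in actual)
    return 1.0 - ss_res / ss_tot

def should_stop(r2, iteration, r2_target=0.9, max_iterations=500):
    """Stop when the fit is good enough or the iteration budget is spent."""
    return r2 >= r2_target or iteration >= max_iterations

# Hypothetical measured and predicted fatigue loss values.
actual = [0.002, 0.004, 0.006, 0.008]
predicted = [0.0021, 0.0039, 0.0062, 0.0079]
r2 = r_squared(actual, predicted)
stop_now = should_stop(r2, iteration=10)
```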
[0030] The specific process of nonlinear regression to calculate the cumulative fatigue loss value is as follows: multiple independent decision trees are started in parallel within the model. Each decision tree uses the load fluctuation amplitude and cycle number in the damage feature matrix as input features. Based on the nonlinear mapping relationship between load fluctuation, cycle number and fatigue loss learned in its own training phase, it outputs the fatigue loss prediction value of a single decision tree. After all decision trees have completed their predictions, the arithmetic mean of the prediction values output by all decision trees is calculated to obtain the final output cumulative fatigue loss value.
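The averaging step alone can be sketched as below. The "trees" here are hypothetical stand-in functions, not trained decision trees. They exist only to show that the forest output is the arithmetic mean of the individual predictions, and the per-tree coefficients are invented for the example.

```python
def forest_predict(trees, damage_matrix):
    """Average the predictions of all trees for one damage feature matrix."""
    predictions = [tree(damage_matrix) for tree in trees]
    return sum(predictions) / len(predictions)

def make_stub_tree(loss_per_newton):
    """Hypothetical stand-in for a trained tree: loss scales with summed amplitude."""
    def tree(damage_matrix):
        return loss_per_newton * sum(row[0] for row in damage_matrix)
    return tree

damage_matrix = [[300, 1], [220, 2], [280, 3]]
trees = [make_stub_tree(c) for c in (1.0e-5, 1.1e-5, 0.9e-5)]
cumulative_loss = forest_predict(trees, damage_matrix)
```

In a real random forest each tree would be fitted on a bootstrap sample of the training data, but the final averaging is the same.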
[0031] For example, subsequent calculations are performed on the denoised load data of an industrial machine. The complete time series of the denoised load data is scanned, and all peak and trough data points are identified. In a complete cycle, the peak load data is 480 Newtons, and the corresponding trough load data is 180 Newtons. The load fluctuation amplitude of this cycle is calculated to be 300 Newtons, and the cycle is assigned the sequence number 1, forming a cycle feature vector [300, 1]. All complete stress cycles are processed in the same way. After continuous monitoring for 8 hours, a total of 720 complete stress cycles are identified, with a cumulative stress cycle of 720 times, corresponding to 720 cycle feature vectors. All cycle feature vectors are concatenated in ascending order of cycle number to construct a 720-row, 2-column damage feature matrix. The first column of the matrix represents the load fluctuation amplitude of each cycle, and the second column represents the cycle number from 1 to 720. The damage feature matrix is input into the pre-trained loss model. Multiple decision trees within the model output their respective predicted values in parallel. After taking the arithmetic mean of all predicted values, the final output fatigue loss cumulative value of 0.0085 is obtained. This value indicates that the equipment has consumed 0.85% of its expected fatigue life, which can be used as the basis for subsequent estimation of the equipment's remaining life.
[0032] In step S13, based on the load monitoring data, the loss growth rate is calculated, time-series features are extracted from the loss growth rate, and a preset support vector machine classifier is used to classify the time-series features to obtain the operating status classification result; wherein, the operating status classification result includes normal operating status and high-risk status, including: Differential calculations are performed on the load monitoring data to obtain a differential sequence, and the differential sequence is smoothed to form a loss gradient vector; The loss gradient vector is mapped to a dynamic environmental state space consisting of the load mean, fluctuation standard deviation, and temperature deviation, and the loss growth rate is obtained by fitting the data using a multivariate regression method. The loss growth rate is organized into a data sequence, and the data sequence is truncated using a preset sliding window. The mean, variance, and peak statistics of the data within each sliding window are calculated to construct a time-domain statistical feature set. The data sequence of the loss growth rate is subjected to discrete wavelet transform and decomposed into sub-bands at multiple scales. The energy coefficients of each sub-band are extracted. The time series features of the loss growth rate are the set of time-domain statistical features and the set of energy coefficients. The time-domain statistical feature set and the energy coefficient are standardized and then concatenated in a preset order to form a multi-dimensional comprehensive feature vector. The multidimensional integrated feature vector is input into a preset support vector machine classifier for nonlinear transformation to determine the optimal decision hyperplane. 
Calculate the geometric distance from the multidimensional integrated feature vector to the optimal decision hyperplane, and output the operation state classification result based on the sign attribute of the geometric distance; wherein, the operation state classification result includes normal operation state and high-risk state.
[0033] It should be noted that the load monitoring data is acquired only when the cumulative fatigue loss of the equipment does not exceed a preset safety threshold. This preset safety threshold is determined based on statistical analysis of a large amount of historical fatigue life test data from similar equipment, combined with the equipment's factory-calibrated fatigue limit damage value. For example, 0.025 corresponds to 2.5% of the equipment's expected fatigue life. The load monitoring data includes a load intensity sequence arranged chronologically, the corresponding acquisition timestamp, and the synchronously acquired real-time operating temperature of the equipment. The acquisition timestamp marks the specific moment the data was generated, and the real-time temperature is synchronously acquired through a temperature sensor, reflecting both equipment load changes and real-time operating conditions.
[0034] Differential calculation operates on adjacent sampling points of the chronologically ordered load monitoring data. Specifically, the load data of the previous sampling point is subtracted from the load data of the next sampling point, and the resulting differences are arranged in sampling order to form a differential sequence. The differential sequence is smoothed using a moving average. The sliding window size is set to 5 sampling points, and the window is slid sequentially over time. The arithmetic mean of the differential sequence elements within each window is calculated and replaces the element at the center position of the window, which completes the smoothing of the full sequence and filters out instantaneous fluctuation noise.
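A minimal sketch of the differencing and 5-point moving-average smoothing is given below. The function names are hypothetical, and leaving the first and last two elements unsmoothed is an assumption, because the text does not say how window edges are handled.

```python
def difference(load):
    """Next sample minus previous sample, in sampling order."""
    return [load[i + 1] - load[i] for i in range(len(load) - 1)]


def centered_moving_average(seq, window=5):
    """Replace each element that has a full window around it by the window mean.

    Assumption: elements too close to either end keep their original value.
    """
    half = window // 2
    smoothed = list(seq)
    for center in range(half, len(seq) - half):
        smoothed[center] = sum(seq[center - half:center + half + 1]) / window
    return smoothed


# Illustrative load samples (made-up values).
diffs = difference([10, 12, 15, 15, 20, 18, 19])
loss_gradient_vector = centered_moving_average(diffs, window=5)
```

For a load series of n samples the differential sequence has n - 1 elements, which matches the 1439 elements obtained from 1440 samples in the gearbox example.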
[0035] It should be noted that the dynamic environmental state space consists of three dimensions: load mean, standard deviation of fluctuation, and temperature deviation. The load mean is the arithmetic mean of the current 24-hour load monitoring data; the standard deviation of fluctuation reflects the dispersion of the current load monitoring data from the load mean; and the temperature deviation is the difference between the equipment's real-time operating temperature and its rated operating temperature. This state space characterizes the equipment's operating environment from multiple dimensions, making the loss growth rate calculation more closely reflect real-world scenarios and improving calculation accuracy. The loss gradient vector mapping uses the load mean, standard deviation of fluctuation, and temperature deviation calculated throughout the entire analysis period as a unified environmental condition feature for that period. This is correlated with the loss gradient vector calculated concurrently for subsequent multivariate regression analysis to establish a quantitative relationship between environmental conditions and the loss gradient.
[0036] It should be noted that the loss growth rate is obtained by fitting the data using a multiple linear regression method. The specific process is as follows. First, the core equation of the multiple linear regression is determined, using the elements of the loss gradient vector as the dependent variable and the load mean, standard deviation of fluctuation, and temperature deviation as the three independent variables. The regression equation is set as $y = \beta_1 x_1 + \beta_2 x_2 + \beta_3 x_3 + b$, where $y$ is an element of the loss gradient vector, $x_1$ is the load mean, $x_2$ is the standard deviation of fluctuation, $x_3$ is the temperature deviation, $\beta_1$, $\beta_2$, and $\beta_3$ are the regression coefficients of the respective independent variables, and $b$ is the constant term. The least squares method is then used to solve for the regression coefficients and the constant term: the optimal values of $\beta_1$, $\beta_2$, $\beta_3$, and $b$ are obtained by minimizing the sum of squared deviations between the predicted and actual values of the dependent variable. The fitting effect is then verified by calculating the goodness of fit $R^2$. When $R^2$ is greater than or equal to 0.9, the fitting is considered effective. If $R^2$ is less than 0.9, historical data under the same working conditions are added, and the parameters are solved again. Finally, the independent variables at each sampling time are substituted into the fitted regression equation to calculate the predicted loss gradient value at each time. The arithmetic mean of all predicted values is taken to obtain the loss growth rate under the current working condition.
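The least-squares criterion, the goodness-of-fit check, and the final averaging step can be written out as below. The notation is assumed here for illustration only: index $i$ runs over the $n$ sampling times, $\hat{y}_i$ is the predicted loss gradient value, $\bar{y}$ is the mean of the observed values, and $v$ denotes the loss growth rate; $y$ and $b$ follow the text.

```latex
\begin{aligned}
\hat{y}_i &= \beta_1 x_{1,i} + \beta_2 x_{2,i} + \beta_3 x_{3,i} + b \\
(\beta_1, \beta_2, \beta_3, b) &= \arg\min \sum_{i=1}^{n} \left( y_i - \hat{y}_i \right)^2 \\
R^2 &= 1 - \frac{\sum_{i=1}^{n} \left( y_i - \hat{y}_i \right)^2}{\sum_{i=1}^{n} \left( y_i - \bar{y} \right)^2} \\
v &= \frac{1}{n} \sum_{i=1}^{n} \hat{y}_i
\end{aligned}
```

The fit is accepted when $R^2 \ge 0.9$; otherwise historical data are added and the minimization is repeated.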
[0037] It is worth noting that the loss growth rate data sequence is formed by organizing the loss growth rates in all windows according to the order of acquisition. The preset sliding window size is determined based on the statistical analysis of loss fluctuation patterns of similar equipment. The example window length is 24 hours, and the sliding step size is 6 hours, which can completely capture the local fluctuation patterns of loss growth. The discrete wavelet transform decomposition scale is set to 3 levels, decomposing the loss growth rate data sequence into sub-bands of three scales: high frequency, mid frequency, and low frequency. The high frequency sub-band corresponds to sudden loss changes caused by equipment shocks, the mid frequency sub-band corresponds to fluctuations under normal operating conditions, and the low frequency sub-band corresponds to long-term loss trends. The energy coefficient of each sub-band is the arithmetic mean of the sum of squares of the signal values within the sub-band, used to quantify the signal strength of sub-bands at different scales. The time series feature of the loss growth rate is the set of time-domain statistical features and the energy coefficients of each sub-band. The time-domain statistical feature set and energy coefficients are standardized using the min-max standardization method, mapping all feature values to the interval between 0 and 1, eliminating the dimensional differences between different features. Specifically, the calculation method is to subtract the minimum feature value from the current feature value and then divide by the difference between the maximum and minimum feature values.
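The sliding-window statistics and the min-max standardization can be sketched as follows. The function names are hypothetical, the window length and step are given in samples rather than hours, and the use of the population variance is an assumption because the text does not specify the variance estimator. The wavelet decomposition is not covered by this sketch.

```python
from statistics import fmean, pvariance


def window_features(seq, length, step):
    """Return (mean, variance, peak) for each full sliding window."""
    features = []
    for start in range(0, len(seq) - length + 1, step):
        window = seq[start:start + length]
        features.append((fmean(window), pvariance(window), max(window)))
    return features


def min_max(values):
    """Map values to [0, 1]: (value - min) / (max - min)."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # Assumption: a constant feature is mapped to 0 to avoid division by zero.
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


# Illustrative sequence (made-up values); window of 4 samples, step of 2.
stats = window_features([1, 2, 3, 4, 5, 6], length=4, step=2)
scaled_means = min_max([mean for mean, _, _ in stats])
```

With hourly loss growth rate samples, the 24-hour window and 6-hour step of the example would correspond to `length=24` and `step=6`.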
[0038] It is worth noting that the preset splicing order places the standardized time-domain statistical feature set first, followed by the standardized energy coefficients of each sub-band. Splicing in this order forms the multi-dimensional comprehensive feature vector. The preset support vector machine classifier is a classification model trained on historical data and used specifically to identify the operating status of equipment. Its kernel function is the radial basis function, the penalty coefficient is set to 1.2, and the slack variable is set to 0.1. These presets are derived from statistics on historical operating status classification data of similar equipment; multiple experiments showed that this parameter combination balances the classification accuracy and generalization ability of the model. The nonlinear transformation maps the multi-dimensional comprehensive feature vector to a high-dimensional feature space through the radial basis function. The optimal decision hyperplane is determined with the core objective of maximizing the margin between the two classes of samples, namely the normal state and the high-risk state. Using the preset penalty coefficient and slack variable, training minimizes the classification loss while maximizing the margin from the two classes of samples to the hyperplane. The normal vector and intercept of the hyperplane are thereby obtained, which completes the determination of the optimal decision hyperplane.
[0039] It is worth noting that the geometric distance from the multidimensional integrated feature vector to the optimal decision hyperplane is calculated as follows: the multidimensional integrated feature vector is substituted into the mathematical expression of the optimal decision hyperplane, the hyperplane intercept is added to the inner product of the vector and the hyperplane normal vector, and the sum is divided by the magnitude of the hyperplane normal vector. The result is the signed geometric distance, whose sign is that of the inner product of the vector and the hyperplane normal vector plus the intercept. The sign attribute corresponds to a fixed classification rule: a positive sign corresponds to the normal operation state, and a negative sign corresponds to the high-risk state. The larger the absolute value of the geometric distance, the higher the confidence of the classification result.
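The sign-based classification rule can be illustrated with the sketch below. It assumes the standard signed-distance form, (inner product plus intercept) divided by the norm of the normal vector, with an explicitly available normal vector and intercept; with a radial basis function kernel the normal vector lives in the transformed feature space, so the inputs here stand for already-transformed values. The function names are hypothetical, and treating a distance of exactly zero as high-risk is an assumption.

```python
import math


def signed_distance(x, normal, intercept):
    """Signed geometric distance from x to the hyperplane normal . x + intercept = 0."""
    score = sum(w * xi for w, xi in zip(normal, x)) + intercept
    return score / math.sqrt(sum(w * w for w in normal))


def classify(x, normal, intercept):
    """Positive distance: normal operation; otherwise high-risk.

    Assumption: a distance of exactly zero is treated as high-risk.
    """
    return "normal" if signed_distance(x, normal, intercept) > 0 else "high-risk"


# Illustrative two-dimensional hyperplane (made-up values).
label = classify([0.0, 0.0], normal=[3.0, 4.0], intercept=-5.0)
```

In this illustration the distance is -1.0, so the sample falls on the high-risk side, in the same way as the distance of -1.5 in the gearbox example.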
[0040] For example, for an industrial gearbox, its current cumulative fatigue loss value is 0.018, which does not exceed the preset safety threshold of 0.025, and load monitoring data acquisition is then initiated. Load monitoring data for the equipment over the past 24 hours is collected at a sampling frequency of once per minute, including 1440 sets of real-time load data, acquisition timestamps, and synchronous temperature data, forming complete load monitoring data. Differential calculation is performed on the real-time load data in this load monitoring data, subtracting the load data of the previous sampling point from the load data of the next sampling point, resulting in a differential sequence containing 1439 elements. The differential sequence is then smoothed using a moving average method with 5 sampling points to filter out instantaneous fluctuation noise, forming a loss gradient vector. Most elements of this vector are distributed in the range [-2.5, 3.1]. During a certain period, the equipment experiences a sudden impact, and some elements jump to 12.7, reflecting a sudden acceleration in loss during that period. Next, the parameters of the dynamic environment state space were calculated. The average load was 350 Newtons, the standard deviation of the fluctuation was 42 Newtons, the real-time operating temperature of the equipment was 65℃, the rated operating temperature was 60℃, and the temperature deviation was 5℃.
[0041] Then, the loss gradient vector is matched to these three parameters in terms of dimensionality, and a multiple linear regression method is used for fitting. First, the regression equation $y = \beta_1 x_1 + \beta_2 x_2 + \beta_3 x_3 + b$ is set, where $y$ is an element of the loss gradient vector, the load mean is $x_1 = 350$, the standard deviation of the fluctuation is $x_2 = 42$, and the temperature deviation is $x_3 = 5$. The solution obtained by the least squares method is $\beta_1 = 0.0000011$, $\beta_2 = 0.0000023$, $\beta_3 = 0.000015$, and $b = 0.00008$, and the goodness of fit $R^2 = 0.94$ meets the requirement. Then, each parameter is substituted into the equation to calculate the predicted loss gradient value at each time point. After taking the arithmetic mean of all predicted values, the loss growth rate under the current working condition is found to be 0.00042 per cycle, which is more than double the baseline working condition value of 0.00019. The loss growth rate is organized into an ordered data sequence. A preset sliding window with a length of 24 hours and a step size of 6 hours is used to extract the data. Within a certain window, the mean is calculated to be 0.0003, the variance is 0.00005, and the peak value is 0.00045, and a time-domain statistical feature set is constructed from these values. Then, a 3-level discrete wavelet transform is performed on the data sequence to decompose it into three sub-bands: high frequency, mid frequency, and low frequency. The extracted energy coefficients are 2.3 for the high frequency sub-band, 1.2 for the mid frequency sub-band, and 0.8 for the low frequency sub-band. The high frequency coefficient is much higher than the baseline value of 1.0, reflecting the existence of local accelerated damage. The baseline value of 1.0 was determined based on experimental statistics: normal operation data of 100 similar devices were selected, their high-frequency sub-band energy coefficients were extracted, and the statistical mean was calculated to be 0.95. The baseline was set to 1.0, and after multiple verifications its misjudgment rate was less than 5%.
[0042] The time-domain statistical feature set and energy coefficients are min-max standardized and concatenated, with the time-domain features first and the energy coefficients last, to form a multi-dimensional comprehensive feature vector. This vector is input into the preset support vector machine classifier and transformed into a high-dimensional space through the radial basis function kernel. By maximizing the margin between the two classes of samples using the preset parameters, the normal vector and intercept parameters of the optimal decision hyperplane are determined. The geometric distance from the multi-dimensional comprehensive feature vector to the optimal decision hyperplane is then calculated and found to be -1.5. Based on the sign attribute, the operating status is classified as a high-risk state, indicating potential damage to the equipment, and the subsequent maintenance warning and model parameter optimization processes need to be initiated.
[0043] In step S14, when the operational status classification result indicates a high-risk state, a historical loss sequence is obtained. Based on a pre-trained Wiener process model, the operational status classification result and the historical loss sequence are integrated and calculated to obtain a remaining lifetime index, including: When the operation status classification result is a high-risk state, the unique device identifier of the current hardware device to be monitored and the timestamp information corresponding to the high-risk state are obtained, and the historical fatigue loss accumulation value within a preset time span is filtered in the preset device loss database. The historical loss sequence is input into the pre-trained Wiener process model to obtain the drift coefficient and diffusion coefficient; Based on the drift coefficient and the diffusion coefficient, derive the time probability density function of the time required for the cumulative fatigue loss value to first reach the preset failure threshold; Integrating the time probability density function yields the remaining lifetime index.
[0044] It should be noted that the unique device identifier of the currently monitored hardware device is the exclusive code assigned when the device leaves the factory, or the unique identification number assigned by the system when the device enters the network. This identifier corresponds one-to-one with the device and can be directly retrieved from the device's basic information storage module. The timestamp information corresponding to the high-risk state is the current time information automatically synchronized and marked when the preset support vector machine classifier outputs the high-risk state result, used to define the trigger time of the high-risk state. The preset equipment loss database is a database that stores the cumulative fatigue loss values of each device throughout its entire life cycle. The database is classified and stored according to the unique device identifier. Each loss data is associated with the corresponding collection timestamp. The specific process of filtering the historical fatigue loss cumulative values within the preset time span is as follows: first, the loss data directory of the corresponding device is matched in the database by the retrieved unique device identifier; then, based on the high-risk state timestamp, all loss data within the preset time span is traced back. The preset time span is determined based on the statistical analysis of the loss evolution cycle of the same type of equipment, which can completely cover the key stages of the loss growth pattern, such as 180 days. After filtering, the data is sorted from earliest to latest by collection timestamp to form an ordered historical loss sequence.
[0045] It should be noted that the drift coefficient describes the average growth rate of equipment fatigue loss, expressed as dimensionless loss per day. A larger value indicates a faster average accumulation rate of equipment loss, reflecting the overall trend of loss evolution. The diffusion coefficient reflects the intensity of random fluctuations in the growth process of equipment fatigue loss, expressed as dimensionless loss per square root of day. A larger value indicates a more significant impact of random factors such as operating condition fluctuations and environmental disturbances on the accumulation of equipment loss. The pre-trained Wiener process model is trained by using historical loss sequences of similar equipment as input and estimating the corresponding drift and diffusion coefficients from the input sequences. During fitting, a likelihood function is first constructed based on the core formula of damage evolution in the Wiener process, combined with the time intervals and corresponding loss values of the historical loss sequences. The optimal drift and diffusion coefficients are then obtained by solving for the maximum value of the likelihood function. The preset iteration termination condition for training is that the sum of squared residuals of the model fitting is less than 0.0001 or the number of iterations reaches 300; model training stops when either condition is met. The termination condition is set based on common knowledge in the field and experimental statistics. A sum of squared residuals less than 0.0001 is a standard accuracy requirement for similar Wiener process models, ensuring that the fitting errors of the drift coefficient and diffusion coefficient are within an acceptable range. Statistical training with data from similar equipment shows that over 95% of the models can meet the residual requirement within 300 iterations, avoiding wasted computing power due to excessive iterations. In use, the historical loss sequence of the currently monitored equipment is input into the trained Wiener process model to obtain the drift coefficient and diffusion coefficient corresponding to the loss evolution of the current equipment.
[0046] It is worth noting that the specific form of the time probability density function for the time required for the cumulative fatigue loss value to first reach the preset failure threshold is an inverse Gaussian distribution. Its derivation process is based on the core equation of stochastic damage evolution of the Wiener process. The obtained drift coefficient and diffusion coefficient are substituted into the equation. At the same time, the current cumulative fatigue loss value and the preset failure threshold of the equipment are defined. The time when the loss first reaches the preset failure threshold is used as a variable to derive the mapping relationship between time and the corresponding probability density, and finally form the time probability density function.
[0047] The preset failure threshold is determined using a dual approach that combines industry-wide common knowledge with experimental statistics. With reference to the fatigue damage assessment standards of the equipment's industry, the critical value of cumulative fatigue loss at which irreversible damage occurs to the core structure of the equipment is identified. Based on experimental statistics, numerous accelerated fatigue life tests are conducted on similar equipment, and the cumulative fatigue loss values at functional failure are statistically analyzed for all tested equipment. The median value of the statistical results within the concentrated range is taken and then slightly adjusted for actual operating conditions. For example, for industrial gearbox equipment, the industry-wide common knowledge is the relevant standard for gear transmission, GB/T 3480, which specifies a fatigue limit damage value of 0.4 when irreversible pitting or wear occurs on the gear tooth surface. The experimental statistics involve selecting 100 gearboxes of the same model for accelerated fatigue tests. The cumulative loss values at failure are concentrated in the range of 0.38 to 0.42. The median value of 0.4 is taken, and after adjustment for actual load fluctuations in the industrial environment, the preset failure threshold is finally determined to be 0.4.
[0048] This function can reflect the probability distribution of the time required for equipment wear to first reach the failure threshold from the current state, and also the probability distribution characteristics of the remaining lifespan of the equipment.
[0049] It is worth noting that integrating the time probability density function means performing definite integral operations on it, using the current moment as the lower limit of integration and infinity as the upper limit. This yields core indicators such as the median remaining lifetime, the average remaining lifetime, and the 90% confidence lower limit, which together constitute the remaining lifetime index. The integration transforms the probability distribution information of the time probability density function into quantifiable remaining lifetime reference indicators that can directly guide operations and maintenance. The median is the remaining lifetime that is equally likely to be exceeded or not, the average remaining lifetime reflects the overall average level, and the confidence lower limit reflects the minimum guaranteed remaining lifetime at the 90% confidence level, together providing a multi-dimensional basis for maintenance decisions.
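The inverse Gaussian first-passage density and the integration into remaining lifetime indicators can be sketched as below. This is an illustration under stated assumptions rather than the claimed implementation: the function names are hypothetical, the integral is approximated by the trapezoidal rule up to a finite horizon, and the 90% confidence lower limit is interpreted as the time that the remaining lifetime exceeds with 90% probability (the 10th percentile). The parameter values in the usage line are made up and are not the figures of the gearbox example.

```python
import math


def first_passage_pdf(t, drift, diffusion, distance):
    """Inverse Gaussian density of the time for a Wiener process with the given
    drift and diffusion to first cover `distance` (failure threshold minus
    current cumulative loss)."""
    if t <= 0:
        return 0.0
    scale = distance / math.sqrt(2 * math.pi * diffusion ** 2 * t ** 3)
    return scale * math.exp(-(distance - drift * t) ** 2 / (2 * diffusion ** 2 * t))


def remaining_life_summary(drift, diffusion, distance, horizon, steps=30000):
    """Trapezoidal integration of the density from now (t = 0) to `horizon`."""
    dt = horizon / steps
    cdf = mean = 0.0
    median = lower_90 = None
    prev_t = prev_f = 0.0
    for k in range(1, steps + 1):
        t = k * dt
        f = first_passage_pdf(t, drift, diffusion, distance)
        cdf += 0.5 * (prev_f + f) * dt
        mean += 0.5 * (prev_t * prev_f + t * f) * dt
        if lower_90 is None and cdf >= 0.10:
            lower_90 = t  # assumption: 10th percentile of remaining lifetime
        if median is None and cdf >= 0.50:
            median = t
        prev_t, prev_f = t, f
    return {"median": median, "mean": mean, "lower_90": lower_90, "mass": cdf}


# Hypothetical parameters: drift per day, diffusion per sqrt(day), distance to threshold.
summary = remaining_life_summary(0.0012, 0.01, 0.12, horizon=3000.0)
```

The `mass` entry should be close to 1 when the horizon is long enough; a noticeably smaller value suggests that the horizon truncates the tail of the distribution.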
[0050] For example, after classification in step S13, an industrial gearbox is determined to be in a high-risk operating status. First, the unique equipment identifier of the gearbox is retrieved, and the timestamp corresponding to the high-risk status is obtained as 15:30 on the 720th day of equipment operation. Based on this unique equipment identifier, the corresponding data is matched in the preset equipment loss database. Using the preset time span of 180 days, the historical fatigue loss cumulative values from the 540th day to the 720th day of equipment operation are filtered out and sorted by time to form the historical loss sequence. This sequence contains 180 data points collected at 1-day intervals. The values increase gradually from an initial 0.01 to the current 0.28, showing a slow but accelerating growth trend. The historical loss sequence is input into the pre-trained Wiener process model, and the drift coefficient is calculated by fitting. First, the loss difference between adjacent days is calculated, and the arithmetic mean of all differences is 0.0012; with a one-day time interval, the final drift coefficient is 0.0012 per day. Then, the diffusion coefficient is calculated by summing the squares of the deviations of the daily loss differences from 0.0012 per day and dividing by the 180 sampling times and the one-day time interval, resulting in a diffusion coefficient of 0.00018 per square root of day. The drift coefficient indicates that the gearbox loss increases by an average of 0.0012 units per day, while the diffusion coefficient indicates a small random fluctuation in loss growth. The preset failure threshold for this gearbox is 0.4, set based on the known gear industry standard GB/T 3480 combined with the statistical results of accelerated fatigue tests on 100 gearboxes of the same model.
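The drift and diffusion estimation from a historical loss sequence can be sketched as follows, using the closed-form maximum-likelihood estimates for equally spaced observations of a Wiener process with drift. The function name `estimate_wiener` is hypothetical. Two points are assumptions of this sketch: the averages are taken over the number of increments (one fewer than the number of data points), and the square root of the variance rate is returned so that the result has units of loss per square root of day.

```python
import math


def estimate_wiener(losses, dt=1.0):
    """Estimate (drift, diffusion) from cumulative loss values sampled every dt days."""
    increments = [later - earlier for earlier, later in zip(losses, losses[1:])]
    n = len(increments)
    drift = sum(increments) / (n * dt)
    variance_rate = sum((d - drift * dt) ** 2 for d in increments) / (n * dt)
    return drift, math.sqrt(variance_rate)


# Illustrative daily cumulative loss values (made-up, not the gearbox data).
drift, diffusion = estimate_wiener([0.0, 0.1, 0.3, 0.4], dt=1.0)
```

The two estimates, together with the current loss value and the failure threshold, are the inputs needed for the first-passage time distribution.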
[0051] Substituting the drift coefficient, diffusion coefficient, current loss value of 0.28, and failure threshold of 0.4 into the Wiener process equation, the time probability density function required for the loss to first reach 0.4 is derived. This function exhibits a right-skewed distribution, with the peak value corresponding to approximately 95 days, indicating that the remaining lifetime is likely concentrated around this time. By integrating this time probability density function from the current moment to infinity and calculating the cumulative probability at different time points, the remaining lifetime index is finally obtained. The median remaining lifetime is 92 days, the average remaining lifetime is 108 days, and the 90% confidence lower limit is 65 days. This remaining lifetime index quantifies the time window from the gearbox's current state to potential failure and can be directly used for subsequent failure risk assessment and maintenance planning.
[0052] In step S15, real-time load data is acquired, and a risk probability is calculated based on the real-time load data and the remaining lifespan indicator. When the risk probability exceeds a preset risk threshold, a preventative maintenance signal is generated; otherwise, the monitoring frequency is adjusted, and a risk assessment report is generated, including: Acquire real-time load data and extract the load fluctuation characteristics of the real-time load data; The load fluctuation characteristics are fused with the remaining life index to construct an operating status feature vector. The operating status feature vector is then input into a pre-trained joint probability distribution model to obtain the risk probability. Determine whether the risk probability exceeds a preset risk threshold. If yes, generate a preventive maintenance signal; otherwise, reset the monitoring frequency based on the risk probability to obtain the reset monitoring frequency. The remaining lifespan index is mapped to a preset failure risk assessment matrix, and a quantified failure risk assessment level is output. The system obtains the historical maintenance signal trigger frequency, calculates the system stability level value based on the preventive maintenance signal and the reset monitoring frequency, and generates a risk assessment report based on the system stability level value, the risk probability, and the fault risk assessment level.
[0053] It should be noted that the specific method for extracting load fluctuation characteristics is as follows: first, abnormal noise data is removed, and then the standard deviation of fluctuation amplitude, the peak factor, and the number of short-term mutations are extracted. The standard deviation of fluctuation amplitude is obtained by calculating the sum of squared deviations between the real-time load data and the load mean, dividing by the total number of data points, and then taking the square root; it reflects the overall dispersion of load fluctuations. The peak factor is the ratio of the maximum value in the real-time load data to the load mean; it reflects the degree to which the instantaneous load peak deviates from the average level. The number of short-term mutations is obtained by setting a load mutation threshold and counting the number of times the real-time load data exceeds this threshold per unit time; it reflects the frequency of load mutations. The load mutation threshold is determined by combining industry-standard knowledge with experimental statistics. For example, 1.3 times the rated load of the equipment is used as a benchmark reference value, and from the normal operating load data of 100 similar devices the 99th percentile of load fluctuation is calculated to be 1.28 times the rated load; the average of these two values, 1.29 times the rated load, is taken as the initial threshold. To fuse the load fluctuation characteristics with the remaining lifetime index and construct the operating status feature vector, min-max standardization is first performed on the three extracted load fluctuation characteristics and the remaining lifetime indicators, mapping all feature values to the range of 0 to 1. Then, the standardized load fluctuation characteristics and remaining lifetime indicators are concatenated sequentially in a preset order to form the operating status feature vector.
The complete preset order is: standard deviation of fluctuation amplitude, peak factor, number of short-term mutations, median remaining lifetime, average remaining lifetime, and 90% confidence lower limit.
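The three load fluctuation characteristics can be sketched as follows. The function name is hypothetical, and counting an exceedance only when the load rises across the threshold (so that a sustained overload counts once) is an assumption; the text only says that the number of times the load exceeds the threshold per unit time is counted.

```python
import math


def load_fluctuation_features(load, mutation_threshold, hours):
    """Return (standard deviation, peak factor, mutations per hour)."""
    mean = sum(load) / len(load)
    std = math.sqrt(sum((v - mean) ** 2 for v in load) / len(load))
    peak_factor = max(load) / mean
    # Assumption: one mutation per upward crossing of the threshold.
    crossings = sum(
        1 for prev, cur in zip(load, load[1:])
        if prev <= mutation_threshold < cur
    )
    return std, peak_factor, crossings / hours


# Illustrative load samples over one hour (made-up values).
std, peak_factor, mutations = load_fluctuation_features(
    [90, 110, 90, 110], mutation_threshold=105, hours=1
)
```

In practice the threshold would be set relative to the rated load, for example 1.29 times the rated load as in the text, and the three values would then be min-max standardized before concatenation with the remaining lifetime indicators.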
[0054] It should be noted that the specific type of the pre-trained joint probability distribution model is the Copula joint probability distribution model. The training process involves using historical load fluctuation characteristics and historical remaining life indicators of similar equipment as input. Supervision labels are assigned to whether the equipment failed under corresponding historical operating conditions, with failures marked as 1 and normal operation as 0. This process trains the joint probability distribution model to ensure that the model's output risk probability is highly correlated with historical failure scenarios. The preset iteration termination condition during training is either that the average absolute error between the model's predicted risk probability and the actual failure event marker is less than 0.03, or that the number of iterations reaches 200. Training stops when either condition is met. The usage process involves concatenating the standardized load fluctuation characteristics and remaining life indicators of the current equipment to form an operating state feature vector, which is then input into the trained joint probability distribution model to obtain the real-time risk probability under the current operating conditions.
[0055] It's worth noting that the preventative maintenance signal is an operation and maintenance trigger signal generated when the risk probability exceeds a preset risk threshold. It includes core information such as the device's unique identifier, signal trigger time, current risk probability, and remaining lifespan indicators, used to alert maintenance personnel to a high risk of equipment failure. The preset risk threshold is confirmed based on experimental statistics. Long-term historical operation and maintenance data from 100 similar devices were collected to record the actual failure rate under different risk probabilities, and the threshold was determined by the statistical results. For example, experimental statistics showed that when the risk probability exceeded 0.65, the failure rate of the device increased from 12% to 48% within 30 days, while when the risk probability was below 0.65, the failure rate was below 15%. Therefore, 0.65 was determined as the preset risk threshold.
[0056] It is worth noting that the specific method for resetting the monitoring frequency is as follows: three risk probability intervals are preset, each corresponding to a different monitoring frequency level. The interval range is set based on engineering practice. When the risk probability is below 0.3, the monitoring frequency is reset to once every 4 hours; when the risk probability is between 0.3 and 0.65, the monitoring frequency is reset to once every 2 hours; and when the risk probability is above 0.65 but does not exceed a preset threshold, the monitoring frequency is reset to once every 1 hour. The reset monitoring frequency is determined according to the current risk probability interval. The preset fault risk assessment matrix is a two-dimensional matrix. The rows of the matrix represent the median remaining lifespan interval, and the columns represent the real-time risk probability interval. The row intervals are determined based on the statistical distribution of remaining lifespan of similar equipment, divided into four levels: less than 30 days, 31 to 90 days, 91 to 180 days, and greater than 180 days. The column intervals are divided into three levels: less than 0.3, 0.3 to 0.65, and greater than 0.65. The parameters in the matrix represent quantitative fault risk assessment levels, categorized into five levels: extremely low risk, low risk, medium risk, high risk, and extremely high risk. These parameters are obtained based on well-known equipment risk assessment standards in the field, combined with historical operation and maintenance data and fault statistics of a large number of similar equipment, and have been verified under multiple operating conditions.
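The monitoring frequency reset and the row and column lookup in the fault risk assessment matrix can be sketched as below. The function names are hypothetical and the handling of boundary values (a probability of exactly 0.3 or 0.65, a median of exactly 30 days) is an assumption, since the stated intervals do not cover these cases. The sketch returns only the matrix indices, because the risk levels stored in the matrix cells are not listed in the text.

```python
def monitoring_interval_hours(risk_probability):
    """Hours between monitoring runs for a given risk probability."""
    if not 0.0 <= risk_probability <= 1.0:
        raise ValueError("risk probability must lie in [0, 1]")
    if risk_probability < 0.3:
        return 4
    if risk_probability <= 0.65:  # assumption: boundaries belong to the middle interval
        return 2
    return 1


def risk_matrix_cell(median_remaining_days, risk_probability):
    """Return (row, column) indices into the fault risk assessment matrix."""
    # Rows: up to 30 days, up to 90 days, up to 180 days, more than 180 days.
    row_bounds = [30, 90, 180]
    row = sum(1 for bound in row_bounds if median_remaining_days > bound)
    # Columns: below 0.3, 0.3 to 0.65, above 0.65.
    if risk_probability < 0.3:
        column = 0
    elif risk_probability <= 0.65:
        column = 1
    else:
        column = 2
    return row, column


# A median remaining lifetime of 92 days with a hypothetical risk probability of 0.5.
cell = risk_matrix_cell(92, 0.5)
```

The returned indices would be used to read the quantified fault risk assessment level from the preset matrix.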
[0057] It is worth noting that the system stability level is calculated by selecting three core parameters: historical maintenance signal trigger frequency, current risk probability, and monitoring frequency change rate, each assigned a different weight. The monitoring frequency change rate is defined as the difference between the current monitoring frequency and the previous cycle's monitoring frequency, divided by the previous cycle's monitoring frequency, to quantify the dynamic adjustment range of the monitoring frequency. In this example, since the monitoring frequency has not been reset, the change rate is 0. The historical maintenance signal trigger frequency is assigned a weight of 0.4, the current risk probability a weight of 0.3, and the monitoring frequency change rate a weight of 0.3. After standardizing the three parameters, multiplying them by their corresponding weights, and summing them, the system stability level value is obtained. The value ranges from 0 to 1; the closer the value is to 1, the more stable the system operation; the closer the value is to 0, the worse the system operation stability. The risk assessment report is generated by integrating all core data, including system stability level values, real-time risk probability, fault risk assessment level, remaining life indicators, monitoring frequency after reset, and historical maintenance signal trigger frequency, and compiling them in a fixed format. The report specifically includes basic equipment information, current operating status parameters, core risk assessment indicators, and system stability level analysis, which are used to comprehensively present the current risk status of the equipment.
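The weighted sum described above can be written out as a short sketch. The function and parameter names are hypothetical, and the inputs are assumed to be already standardized to the range 0 to 1, because the description does not specify the standardization method.

```python
# Weights stated in the description.
W_TRIGGER_FREQUENCY = 0.4
W_RISK_PROBABILITY = 0.3
W_FREQUENCY_CHANGE_RATE = 0.3


def frequency_change_rate(current: float, previous: float) -> float:
    """(current - previous) / previous, as defined in the description."""
    if previous == 0:
        raise ValueError("previous monitoring frequency must be non-zero")
    return (current - previous) / previous


def stability_level(trigger_frequency: float,
                    risk_probability: float,
                    change_rate: float) -> float:
    """Weighted sum of the three standardized parameters."""
    return (W_TRIGGER_FREQUENCY * trigger_frequency
            + W_RISK_PROBABILITY * risk_probability
            + W_FREQUENCY_CHANGE_RATE * change_rate)
```

With the standardized values used in the gearbox example of paragraph [0060] (0.2, 0.78, and 0), this sketch gives 0.314.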
[0058] For example, the remaining life index of a wind turbine gearbox is calculated in step S14, where the median remaining life is 92 days, the average remaining life is 108 days, and the 90% confidence lower limit is 65 days. Real-time load data of the gearbox is obtained, including parameters such as torque, speed, and power output. After preprocessing the real-time load data, load fluctuation characteristics are extracted, resulting in a standard deviation of fluctuation amplitude of 0.12, a peak factor of 1.3, and a short-term mutation frequency of 3 times per hour.
[0059] The three load fluctuation characteristics and the remaining life indicators are standardized and concatenated in a preset order (standard deviation of fluctuation amplitude, peak factor, number of short-term mutations, median remaining life, average remaining life, and 90% confidence lower limit) to construct an operating status feature vector. This feature vector is input into a pre-trained joint probability distribution model, which outputs a real-time risk probability of 0.78. The preset risk threshold is 0.65. Because the current risk probability exceeds the threshold, a preventive maintenance signal is generated. The signal includes the unique identifier of the gearbox, the trigger time of 15:30, the risk probability of 0.78, and the median remaining life of 92 days.
[0060] The median remaining lifespan of 92 days was then mapped to a preset fault risk assessment matrix. 92 days falls within the row range of 91 to 180 days, and the risk probability of 0.78 falls within the column range greater than 0.65, thus determining the fault risk assessment level as high risk. The historical maintenance signal trigger frequency for the gearbox over the past 30 days was obtained as once. Since a preventative maintenance signal is currently being generated, the monitoring frequency does not need to be reset and remains at once per hour. The system stability level was calculated. After standardization, the historical maintenance signal trigger frequency is 0.2, the current risk probability is 0.78, and the monitoring frequency change rate is 0. Substituting these values into the weights, the system stability level is calculated as 0.2 × 0.4 + 0.78 × 0.3 + 0 × 0.3 = 0.314. Based on all the core data mentioned above, a risk assessment report is generated. The report includes basic information such as the equipment model and unique identifier of the wind turbine gearbox, current real-time load fluctuation characteristics, remaining life indicators, risk probability and other operating status parameters, assessment results such as system stability level 0.314 and high risk level, and maintenance suggestions such as completing shutdown and disassembly inspection within one week and preparing spare parts for core transmission components, providing maintenance personnel with comprehensive decision-making reference.
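The matrix lookup in this example can be sketched as a pair of interval searches. All names are hypothetical. Only the one matrix cell confirmed by the example (row 91 to 180 days, column greater than 0.65, "high risk") is filled in; the other cells are not given in the description, so the sketch returns `None` for them rather than guessing. Values that fall on or between the stated interval edges (for example exactly 30 days) are assigned to the lower row as an assumption.

```python
from bisect import bisect_left
from typing import Optional

# Upper edges of the first three row intervals, in days; the fourth row is > 180.
ROW_UPPER_BOUNDS_DAYS = [30, 90, 180]


def row_index(median_life_days: float) -> int:
    """0: up to 30 days, 1: up to 90 days, 2: up to 180 days, 3: above 180 days."""
    return bisect_left(ROW_UPPER_BOUNDS_DAYS, median_life_days)


def column_index(risk_probability: float) -> int:
    """0: below 0.3, 1: 0.3 to 0.65, 2: above 0.65."""
    if risk_probability < 0.3:
        return 0
    if risk_probability <= 0.65:
        return 1
    return 2


# Only the cell confirmed by the worked example is known.
KNOWN_LEVELS = {(2, 2): "high risk"}


def assess(median_life_days: float, risk_probability: float) -> Optional[str]:
    """Look up the risk level, or None if the cell is not given in the text."""
    key = (row_index(median_life_days), column_index(risk_probability))
    return KNOWN_LEVELS.get(key)
```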
[0061] In step S16, based on the risk assessment report, the vibration characteristic sequence and the cumulative fatigue loss value are fused to update the parameters of the pre-trained loss model, resulting in an optimized loss model. This optimized loss model is then used in the next data acquisition process to form a closed-loop monitoring system. The step includes: parsing the abnormal state markers in the risk assessment report, and extracting the vibration feature sequence and cumulative fatigue loss value corresponding to the abnormal state markers to construct an incremental learning sample set; inputting the incremental learning sample set into the pre-trained loss model to obtain the model's predicted state value; calculating the deviation between the model's predicted state value and the abnormal state marker, and generating the model weight gradient vector based on the deviation value; adjusting the internal parameters of the loss model according to the model weight gradient vector to obtain the optimized loss model; and calculating the data sampling interval based on the prediction accuracy of the optimized loss model and writing it into the acquisition register to drive the sensor to acquire a new round of data, thus forming closed-loop monitoring.
[0062] It should be noted that the specific process of parsing the abnormal state markers in the risk assessment report is as follows: First, the structured data fields of the risk assessment report are read, and the specific field for the abnormal state marker is located. This field contains four core pieces of information: the time period of the abnormality, the type of abnormal state, the associated equipment location, and the level of abnormality quantification. Then, equipment operation data within the corresponding time range is filtered according to the time period of the abnormality, and the vibration characteristic sequence and cumulative fatigue loss value within that time period are matched to construct an incremental learning sample set. The deviation between the cumulative fatigue loss value predicted by the model and the actual cumulative fatigue loss reference value corresponding to the abnormal state marker is calculated. The abnormal state marker needs to be associated with an actual, typical fatigue loss value that can represent the abnormal state as the optimization target. For example, for a medium-risk state, a sample of historical data with a loss value in the range of 0.3-0.4 is used, and its mean of 0.35 is taken as the reference value. The deviation value is the actual fatigue loss reference value minus the cumulative fatigue loss value predicted by the model.
[0063] It should be noted that the model weight gradient vector is generated as follows: The loss function of the pre-trained loss model is the mean squared error function. The deviation values are substituted into the loss function, and the partial derivative of the loss with respect to each weight parameter within the model is calculated. These partial derivatives are then combined according to a preset weight order to form the model weight gradient vector. The direction of this vector reflects the adjustment direction of the weight parameters, and the magnitude of the vector reflects the adjustment magnitude. The preset weight order is: feature weight of a single decision tree branch, loss prediction weight of decision tree leaf nodes, and ensemble weight between decision trees. The ensemble weight between decision trees refers to the contribution weight of the prediction result of each independent decision tree in the final ensemble output. The weight values are dynamically allocated based on the prediction accuracy of each decision tree in the historical training data.
[0064] It is worth noting that the internal parameters of the loss model are adjusted based on the model weight gradient vector as follows: Gradient descent is used for iterative parameter adjustment. First, a preset learning rate is established, determined based on statistical data from iterative optimization of similar equipment models (example value: 0.01). The learning rate controls the step size of parameter adjustment. The product of the learning rate and the corresponding element of the gradient vector is subtracted from the current weight parameter to obtain the adjusted weight parameter. After each parameter adjustment, the incremental learning sample set is input back into the model to calculate the new deviation value and gradient vector, and the adjustment is repeated. A preset termination condition is also established: the deviation between the model's predicted cumulative fatigue loss value and the actual cumulative fatigue loss reference value is less than 0.02, or the number of iterations reaches 100. Parameter adjustment stops when either condition is met, yielding the optimized loss model. A deviation threshold of 0.02 ensures no substantial deviation in state judgment, and the limit of 100 iterations balances training efficiency against parameter convergence; over 90% of similar models converge within this number of iterations.
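The update rule and termination condition can be sketched in a few lines. This is a simplified illustration, not the loss model of the description: it assumes the prediction is a weighted sum of fixed per-tree outputs and updates only the ensemble weights, and it applies the 0.02 criterion to the largest absolute deviation over the sample set. The function names and the sample layout are hypothetical.

```python
from typing import List, Sequence, Tuple

Sample = Tuple[Sequence[float], float]  # (per-tree outputs, reference loss value)


def predict(weights: Sequence[float], tree_outputs: Sequence[float]) -> float:
    """Assumed model: weighted sum of the individual tree outputs."""
    return sum(w * t for w, t in zip(weights, tree_outputs))


def fine_tune(weights: Sequence[float],
              samples: List[Sample],
              learning_rate: float = 0.01,
              tolerance: float = 0.02,
              max_iterations: int = 100) -> Tuple[List[float], int, float]:
    """Gradient descent on mean squared error with the two stop conditions."""
    w = list(weights)
    n = len(samples)
    iterations = 0
    while True:
        # Deviation = reference value minus model prediction.
        deviations = [ref - predict(w, x) for x, ref in samples]
        worst = max(abs(d) for d in deviations)
        if worst < tolerance or iterations >= max_iterations:
            return w, iterations, worst
        # Partial derivative of the MSE with respect to each weight.
        gradient = [
            -2.0 / n * sum(d * x[j] for d, (x, _) in zip(deviations, samples))
            for j in range(len(w))
        ]
        # Adjusted weight = current weight - learning rate * gradient element.
        w = [wj - learning_rate * gj for wj, gj in zip(w, gradient)]
        iterations += 1
```

The function returns the adjusted weights, the number of updates performed, and the final worst-case deviation, so a caller can tell which of the two stop conditions was reached.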
[0065] It is worth noting that the data sampling interval is calculated based on the prediction accuracy of the optimized loss model. Taking the Mean Absolute Error (MAE) as an example, the baseline sampling interval is set based on the monitoring requirements during normal equipment operation, for example, 60 minutes. Then, the prediction accuracy range is divided: a prediction accuracy MAE less than 0.02 is considered high accuracy, MAE greater than or equal to 0.02 and less than 0.05 is considered medium accuracy, and MAE greater than or equal to 0.05 is considered low accuracy. Combining the fault risk assessment level in the risk assessment report, the final sampling interval is calculated according to the preset adjustment coefficient. When the optimized loss model is in the high accuracy range and the corresponding fault risk assessment level of the equipment is extremely low or low, the corresponding adjustment coefficient is 1.5. When the optimized loss model is in the medium accuracy range and the corresponding fault risk assessment level of the equipment is medium, the corresponding adjustment coefficient is 1. When the optimized loss model is in the low accuracy range and the corresponding fault risk assessment level of the equipment is high or extremely high, the corresponding adjustment coefficient is between 0.3 and 0.5. The data sampling interval is obtained by multiplying the baseline sampling interval by the corresponding adjustment coefficient.
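The interval calculation can be sketched as below. The function names are hypothetical. The description defines adjustment coefficients only for three combinations of accuracy range and risk level, so this sketch raises an error for any other combination instead of inventing a coefficient. For the low accuracy case the description gives a range of 0.3 to 0.5, and the choice within that range is left to a parameter.

```python
def accuracy_band(mae: float) -> str:
    """Classify prediction accuracy by mean absolute error."""
    if mae < 0.02:
        return "high"
    if mae < 0.05:
        return "medium"
    return "low"


def sampling_interval_minutes(mae: float,
                              risk_level: str,
                              baseline_minutes: float = 60.0,
                              low_accuracy_coefficient: float = 0.5) -> float:
    """Baseline interval multiplied by the preset adjustment coefficient."""
    band = accuracy_band(mae)
    if band == "high" and risk_level in ("extremely low", "low"):
        coefficient = 1.5
    elif band == "medium" and risk_level == "medium":
        coefficient = 1.0
    elif band == "low" and risk_level in ("high", "extremely high"):
        if not 0.3 <= low_accuracy_coefficient <= 0.5:
            raise ValueError("coefficient must lie between 0.3 and 0.5")
        coefficient = low_accuracy_coefficient
    else:
        # Combination not covered by the description.
        raise ValueError(f"no coefficient defined for {band!r} / {risk_level!r}")
    return baseline_minutes * coefficient
```

For an MAE of 0.025 and a medium risk level, the result is the 60-minute baseline.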
[0066] For example, an abnormal state marker exists in the risk assessment report of a wind turbine gearbox. The marker indicates that the abnormality occurred between 10:00 and 10:30, the abnormal state type is vibration abnormality, the associated equipment part is the high-speed shaft bearing, and the abnormality quantification level is medium risk (quantification value 0.5). This abnormal state marker is analyzed, and the vibration characteristic sequence (vibration amplitude 0.35 mm, frequency distribution concentrated at 200 Hz) and the cumulative fatigue loss value of 0.32 for the corresponding time period are selected and linked to construct an incremental learning sample set. This incremental learning sample set is input into a pre-trained loss model. The predicted fatigue loss value for this time period is 0.34. Based on the abnormal state marker being medium risk, and using historical statistics, the typical fatigue loss reference value for this state is set to 0.35. The calculated deviation is 0.35 - 0.34 = 0.01. The incremental learning sample set is substituted into the mean squared error loss function to calculate the mean squared error loss of the model on the entire sample set. Then, the partial derivative of this loss function with respect to the weight parameters within the model is calculated, and the combined values generate the model weight gradient vector.
[0067] The gradient descent method was used to adjust the model's internal parameters. The preset learning rate was 0.01, and the iteration termination condition was a deviation less than 0.02 or 100 iterations. After 35 iterations, the deviation between the model's predicted state value and the abnormal state marker decreased to 0.018, meeting the termination condition, resulting in the optimized loss model. The mean absolute error (MAE) of the optimized model on the test set was calculated to be 0.025, which falls within the medium accuracy range (0.02 ≤ MAE < 0.05). Because the fault risk assessment level was medium risk, the adjustment coefficient was set to 1. With a baseline sampling interval of 60 minutes, the final sampling interval was 60 × 1 = 60 minutes. Since the model accuracy improved and the risk did not escalate, the baseline interval was maintained. The data sampling interval value was written to the acquisition register, driving the vibration sensor and load sensor at the high-speed shaft bearing to begin a new round of data acquisition at 60-minute intervals. After acquisition, the new data was again input into the optimized loss model for analysis, forming a closed-loop monitoring mechanism.
[0068] Referring to Figure 2, the second embodiment of the present invention provides a hardware device fault prediction system, comprising: The data preprocessing module is used to acquire load intensity and vibration signals, and to preprocess the load intensity and vibration signals to obtain denoised load data and a vibration feature sequence; The loss calculation module is used to calculate the load fluctuation amplitude and cumulative stress cycle based on the denoised load data, obtain the damage feature matrix, and input the damage feature matrix into the pre-trained loss model to obtain the cumulative fatigue loss value; The status classification module is used to acquire load monitoring data when the cumulative fatigue loss value does not exceed a preset safety threshold, calculate the loss growth rate based on the load monitoring data, extract time series features from the loss growth rate, and classify the time series features using a preset support vector machine classifier to obtain the operating status classification result, wherein the operating status classification result includes normal operating status and high-risk status; The lifetime calculation module is used to obtain the historical loss sequence when the operating status classification result is a high-risk status, and to integrate the operating status classification result and the historical loss sequence according to the pre-trained Wiener process model to calculate the remaining lifetime index; The risk assessment module is used to acquire real-time load data, calculate the risk probability based on the real-time load data and the remaining lifetime index, generate a preventive maintenance signal when the risk probability exceeds a preset risk threshold, and otherwise adjust the monitoring frequency and generate a risk assessment report.
The closed-loop update module is used to update the parameters of the pre-trained loss model based on the risk assessment report, by integrating the vibration characteristic sequence and the cumulative fatigue loss value, to obtain the optimized loss model. The optimized loss model is then used in the next data acquisition process to form a closed-loop monitoring system.
[0069] It should be noted that the hardware device fault prediction system provided in this embodiment of the invention is used to execute all the process steps of the hardware device fault prediction method in the above embodiment. The working principles and beneficial effects of the two correspond one-to-one, so they are not described again here.
[0070] It should be noted that the system embodiments described above are merely illustrative. The units described as separate components may or may not be physically separate, and the components shown as units may or may not be physical units; that is, they may be located in one place or distributed across multiple network units. Some or all of the modules can be selected to achieve the purpose of this embodiment according to actual needs. Furthermore, in the accompanying drawings of the system embodiments provided by this invention, the connection relationships between modules indicate that they have communication connections, which can be specifically implemented as one or more communication buses or signal lines. Those skilled in the art can understand and implement this without any creative effort.
[0071] The specific embodiments described above further illustrate the purpose, technical solution, and beneficial effects of the present invention. It should be understood that the above descriptions are merely specific embodiments of the present invention and are not intended to limit the scope of protection of the present invention. In particular, any modifications, equivalent substitutions, improvements, and the like made by those skilled in the art within the spirit and principles of the present invention should be included within the scope of protection of the present invention.
Claims
1. A method for predicting hardware device faults, characterized in that it comprises: acquiring load intensity and vibration signals, and preprocessing the load intensity and vibration signals to obtain denoised load data and a vibration feature sequence; calculating the load fluctuation amplitude and cumulative stress cycle based on the denoised load data to obtain a damage feature matrix, and inputting the damage feature matrix into a pre-trained loss model to obtain a cumulative fatigue loss value; when the cumulative fatigue loss value does not exceed a preset safety threshold, acquiring load monitoring data, calculating the loss growth rate based on the load monitoring data, extracting time series features from the loss growth rate, and classifying the time series features using a preset support vector machine classifier to obtain an operating status classification result, wherein the operating status classification result includes normal operating status and high-risk status; when the operating status classification result is a high-risk status, obtaining a historical loss sequence, and integrating the operating status classification result and the historical loss sequence according to a pre-trained Wiener process model to calculate a remaining lifetime index; acquiring real-time load data, calculating a risk probability based on the real-time load data and the remaining lifetime index, generating a preventive maintenance signal when the risk probability exceeds a preset risk threshold, and otherwise adjusting the monitoring frequency and generating a risk assessment report; and based on the risk assessment report, fusing the vibration feature sequence and the cumulative fatigue loss value to update the parameters of the pre-trained loss model to obtain an optimized loss model, the optimized loss model being used in the next data acquisition process to form a closed-loop monitoring system.
2. The hardware device fault prediction method according to claim 1, characterized in that, The process of acquiring load intensity and vibration signals, preprocessing the load intensity and vibration signals to obtain denoised load data and vibration feature sequences includes: The load intensity and vibration signal are acquired, and the load intensity is low-pass filtered to obtain denoised load data. The vibration signal is subjected to wavelet transform to obtain a pure vibration signal, and feature parameters are extracted from the pure vibration signal to generate a vibration feature sequence.
3. The hardware device fault prediction method according to claim 1, characterized in that, The step of calculating the load fluctuation amplitude and cumulative stress cycle based on the denoised load data to obtain a damage feature matrix, and inputting the damage feature matrix into a pre-trained loss model to obtain the cumulative fatigue loss value, includes: The waveform geometric features of the denoised load data are analyzed, and the amplitude difference between adjacent peaks and troughs is calculated based on the waveform geometric features to obtain the load fluctuation amplitude. The cumulative stress cycle is calculated based on the waveform geometric features. The load fluctuation amplitude is paired with the cumulative stress cycle to obtain a cyclic feature vector. All the cyclic feature vectors are concatenated to obtain the damage feature matrix. The damage feature matrix is input into the pre-trained loss model, and the cumulative fatigue loss value is output through nonlinear regression calculation.
4. The hardware device fault prediction method according to claim 1, characterized in that, The step of calculating the loss growth rate based on the load monitoring data includes: Differential calculations are performed on the load monitoring data to obtain a differential sequence, and the differential sequence is smoothed to form a loss gradient vector; The loss gradient vector is mapped to a dynamic environmental state space consisting of the load mean, fluctuation standard deviation, and temperature deviation, and the loss growth rate is obtained by fitting the data using a multivariate regression method.
5. The hardware device fault prediction method according to claim 1, characterized in that, The step of extracting time-series features from the loss growth rate and classifying the time-series features using a preset support vector machine classifier to obtain the running state classification result includes: The loss growth rate is organized into a data sequence, and the data sequence is truncated using a preset sliding window. The mean, variance, and peak statistics of the data within each sliding window are calculated to construct a time-domain statistical feature set. The data sequence of the loss growth rate is subjected to discrete wavelet transform and decomposed into sub-bands at multiple scales. The energy coefficients of each sub-band are extracted. The time series features of the loss growth rate are the set of time-domain statistical features and the set of energy coefficients. The time-domain statistical feature set and the energy coefficient are standardized and then concatenated in a preset order to form a multi-dimensional comprehensive feature vector. The multidimensional integrated feature vector is input into a preset support vector machine classifier for nonlinear transformation to determine the optimal decision hyperplane. Calculate the geometric distance from the multidimensional integrated feature vector to the optimal decision hyperplane, and output the operation state classification result based on the sign attribute of the geometric distance; wherein, the operation state classification result includes normal operation state and high-risk state.
6. The hardware device fault prediction method according to claim 1, characterized in that, The step of obtaining a historical loss sequence when the operating status classification result is a high-risk status, and integrating the operating status classification result and the historical loss sequence according to a pre-trained Wiener process model to calculate a remaining lifetime index, includes: When the operating status classification result is a high-risk status, the unique device identifier of the current hardware device to be monitored and the timestamp information corresponding to the high-risk status are obtained, and the historical cumulative fatigue loss values within a preset time span are filtered from the preset device loss database to obtain the historical loss sequence; The historical loss sequence is input into the pre-trained Wiener process model to obtain the drift coefficient and diffusion coefficient; Based on the drift coefficient and the diffusion coefficient, the probability density function of the time required for the cumulative fatigue loss value to first reach the preset failure threshold is derived; The probability density function is integrated to obtain the remaining lifetime index.
7. The hardware device fault prediction method according to claim 1, characterized in that, The step of acquiring real-time load data, calculating the risk probability based on the real-time load data and the remaining lifetime index, generating a preventive maintenance signal when the risk probability exceeds a preset risk threshold, and otherwise adjusting the monitoring frequency and generating a risk assessment report, includes: Real-time load data is acquired, and the load fluctuation characteristics of the real-time load data are extracted; The load fluctuation characteristics are fused with the remaining lifetime index to construct an operating status feature vector, and the operating status feature vector is input into a pre-trained joint probability distribution model to obtain the risk probability; Whether the risk probability exceeds a preset risk threshold is determined; if yes, a preventive maintenance signal is generated; otherwise, the monitoring frequency is reset based on the risk probability to obtain the reset monitoring frequency; The remaining lifetime index is mapped to a preset fault risk assessment matrix, and a quantified fault risk assessment level is output; The historical maintenance signal trigger frequency is obtained, the system stability level value is calculated based on the preventive maintenance signal and the reset monitoring frequency, and a risk assessment report is generated based on the system stability level value, the risk probability, and the fault risk assessment level.
8. The hardware device fault prediction method according to claim 1, characterized in that, The step of fusing the vibration feature sequence and the cumulative fatigue loss value based on the risk assessment report to update the parameters of the pre-trained loss model to obtain an optimized loss model, the optimized loss model being used in the next data acquisition process to form a closed-loop monitoring system, includes: The abnormal state markers in the risk assessment report are parsed, and the vibration feature sequence and cumulative fatigue loss value corresponding to the abnormal state markers are extracted to construct an incremental learning sample set; The incremental learning sample set is input into the pre-trained loss model to obtain the model's predicted state value; The deviation between the model's predicted state value and the abnormal state marker is calculated, and the model weight gradient vector is generated based on the deviation value; The internal parameters of the loss model are adjusted according to the model weight gradient vector to obtain the optimized loss model; Based on the prediction accuracy of the optimized loss model, the data sampling interval is calculated and written into the acquisition register to drive the sensor to acquire a new round of data, thus forming closed-loop monitoring.
9. A hardware device fault prediction system, characterized in that it comprises: The data preprocessing module is used to acquire load intensity and vibration signals, and to preprocess the load intensity and vibration signals to obtain denoised load data and a vibration feature sequence; The loss calculation module is used to calculate the load fluctuation amplitude and cumulative stress cycle based on the denoised load data, obtain the damage feature matrix, and input the damage feature matrix into the pre-trained loss model to obtain the cumulative fatigue loss value; The status classification module is used to acquire load monitoring data when the cumulative fatigue loss value does not exceed a preset safety threshold, calculate the loss growth rate based on the load monitoring data, extract time series features from the loss growth rate, and classify the time series features using a preset support vector machine classifier to obtain the operating status classification result, wherein the operating status classification result includes normal operating status and high-risk status; The lifetime calculation module is used to obtain the historical loss sequence when the operating status classification result is a high-risk status, and to integrate the operating status classification result and the historical loss sequence according to the pre-trained Wiener process model to calculate the remaining lifetime index; The risk assessment module is used to acquire real-time load data, calculate the risk probability based on the real-time load data and the remaining lifetime index, generate a preventive maintenance signal when the risk probability exceeds a preset risk threshold, and otherwise adjust the monitoring frequency and generate a risk assessment report.
The closed-loop update module is used to update the parameters of the pre-trained loss model based on the risk assessment report, by integrating the vibration characteristic sequence and the cumulative fatigue loss value, to obtain the optimized loss model. The optimized loss model is then used in the next data acquisition process to form a closed-loop monitoring system.