A rotating machinery remaining useful life prediction method based on GTFDAU network
By designing the GTFDAU network and combining transient fluctuation capture and attention mechanisms, we can deeply mine historical information of rotating machinery, solve the problem of insufficient information integration in the RUL prediction of rotating machinery, and achieve higher accuracy and robustness in prediction.
Patent Information
- Authority / Receiving Office
- CN · China
- Patent Type
- Patents(China)
- Current Assignee / Owner
- CHONGQING UNIV OF POSTS & TELECOMM
- Filing Date
- 2023-06-19
- Publication Date
- 2026-06-23
Figure CN116756872B_ABST
Abstract
Description
Technical Field
[0001] This invention belongs to the field of rotating machinery monitoring, and specifically relates to a rotating machinery RUL prediction method based on a GTFDAU network.
Background Technology
[0002] Rotating machinery plays a major role in modern Industry 4.0 production. However, due to prolonged operation in varying environments and conditions, rotating machinery is prone to failure, leading to significant safety accidents and economic losses. Therefore, monitoring degradation and performing remaining useful life (RUL) prediction are crucial for ensuring the safety and reliability of rotating machinery.
[0003] Generally, RUL prediction methods for rotating machinery can be categorized into model-driven, data-driven, and hybrid approaches. Model-driven methods establish mathematical or physical models based on the equipment's failure mechanisms to describe degradation information; however, they require a deep understanding of the equipment's internal structure and mechanistic characteristics, which makes them difficult to apply. Hybrid approaches face similar issues. Data-driven methods, on the other hand, use sensors to monitor equipment operation and acquire a large amount of data reflecting equipment health, from which models are built. This approach does not require explicit modeling of the mechanical structure, operating conditions, or failure mechanisms, can improve the accuracy of remaining life prediction, and has attracted widespread attention in recent years.
[0004] In essence, RUL prediction is time series prediction, so in-depth mining and learning of temporal features is crucial. Deep learning-based recurrent neural networks (RNNs), long short-term memory (LSTM) networks, and gated recurrent units (GRUs) are the most widely used in RUL prediction. However, standalone prediction networks have insufficient ability to learn important temporal features. Attention mechanisms can compensate for this deficiency: by prioritizing resources towards important information, they improve task efficiency and accuracy, and they are widely applied in time series prediction, natural language processing, and computer vision.
[0005] In the field of RUL prediction, attention mechanisms are often used as an auxiliary module to enhance the predictive ability of the model. Qin et al. proposed a macroscopic-microscopic attention mechanism to enhance the learning of important features and combined it with LSTM to achieve effective RUL prediction for gears and bearings (Y. Qin, S. Xiang, Y. Chai, H. Chen, Macroscopic-microscopic attention in LSTM networks based on fusion features for gear remaining life prediction, IEEE Trans. Ind. Electron. 67(12) (2020) 10865-10875, https://doi.org/10.1109/TIE.2019.2959492).
[0006] Zhong et al. integrated a novel attention gate into the GRU to form a recurrent attention unit (RAU), which was validated in areas such as sentiment classification (G. Zhong, G. Yue, X. Ling, Recurrent attention unit, 2018, arXiv:1810.12754). Furthermore, Qin et al. designed a gated attention unit (GAU) to enhance the updating of hidden state information, and combined it with the attention gate in the RAU to form a gated dual attention unit (GDAU) that achieves high-precision prediction for rolling bearings (Y. Qin, D. Chen, S. Xiang, C. Zhu, Gated dual attention unit neural networks for remaining useful life prediction of rolling bearings, IEEE Trans. Ind. Informat. 17(9) (2021) 6438-6447, https://doi.org/10.1109/TIE.2019.2959492). Although the above methods enhance the learning of temporal features by improving the network structure, they focus mainly on the current input information and shallow historical information, and thus neglect the in-depth mining of historical information.
Summary of the Invention
[0007] To address the aforementioned technical problems, this invention proposes a rotating machinery RUL prediction method based on GTFDAU networks, comprising the following steps:
[0008] S1: Acquire vibration signals of rotating machinery and perform multi-dimensional time-frequency domain feature extraction and preprocessing;
[0009] S2: The Adam optimizer is selected, the mini-batch size is set to 64, the number of training iterations is set to 100, and the mean squared error is used as the loss function to train and optimize the DRSSN model. The HI curve is constructed on the preprocessed data using the trained and optimized DRSSN model.
[0010] S3: Construct the GTFDAU network, use the mean squared error as the loss function, train the GTFDAU network using the real-time recurrent learning algorithm, and obtain the trained GTFDAU network when the loss function is minimized.
[0011] S4: Use the trained GTFDAU network based on the HI curve to perform RUL prediction for rotating machinery.
[0012] The beneficial effects of this invention are:
[0013] This invention proposes a transient fluctuation capture mechanism and designs a novel gate structure, namely the transient fluctuation gate, which can be embedded in the network. By performing a differential operation on the hidden states of the previous two historical time steps, transient fluctuation information is effectively captured. In addition, by integrating the transient fluctuation gate and two attention gates, a novel GTFDAU network is designed, and the state transition update formula for time-series information is derived. This network can deeply mine transient fluctuation information and long-term overall information in historical information, and combined with the attention mechanism, it strengthens the adaptive learning and state update of historical and current information, thereby improving the long-term and short-term prediction capabilities and robustness of RUL.
Attached Figure Description
[0014] Figure 1 is a flowchart of the rotating machinery RUL prediction method based on a GTFDAU network according to the present invention;
[0015] Figure 2 shows the HI curves of rotating machinery constructed according to the present invention;
[0016] Figure 3 is a schematic diagram of the unit structure of the GTFDAU of the present invention.
Detailed Implementation
[0017] The technical solutions of the embodiments of the present invention will be clearly and completely described below with reference to the accompanying drawings. Obviously, the described embodiments are only some embodiments of the present invention, and not all embodiments. Based on the embodiments of the present invention, all other embodiments obtained by those skilled in the art without creative effort are within the scope of protection of the present invention.
[0018] A method for predicting the RUL of rotating machinery based on a GTFDAU network, as shown in Figure 1, includes:
[0019] S1: Acquire vibration signals of rotating machinery and perform multi-dimensional time-frequency domain feature extraction and preprocessing;
[0020] S2: The Adam optimizer is selected, the mini-batch size is set to 64, the number of training iterations is set to 100, and the mean squared error is used as the loss function to train and optimize the DRSSN model. The HI curve is constructed on the preprocessed data using the trained and optimized DRSSN model.
[0021] S3: Construct the GTFDAU network, use the mean squared error as the loss function, train the GTFDAU network using the real-time recurrent learning algorithm, and obtain the trained GTFDAU network when the loss function is minimized.
[0022] S4: Use the trained GTFDAU network based on the HI curve to perform RUL prediction for rotating machinery.
[0023] First, the full-life-cycle vibration signals S of the rotating machinery under different operating conditions are collected using an accelerometer. The total number of samples collected is N, and the sampling duration and the interval between adjacent sampling points are $T$ and $T_\Delta$, respectively. Twenty-one commonly used features (13 time-domain features, 4 frequency-domain features, and 4 envelope spectrum features) are extracted from S to form a feature set, which is then labeled with HI and divided into training and testing sets. The feature set is represented as... where e represents the number of elements in the first time series, and s represents the total number of feature sequences. Let q∈[1,s]; then the HI label l of the q-th time series is given by equation (1). To accelerate network convergence, the training set and the test set are batch standardized and normalized according to equations (2) and (3), respectively, and the data are reshaped according to the characteristics of the dataset to match the DRSSN input layer.
[0024]
[0025]
[0026]
[0027] where $x_{BN}$ denotes the batch-standardized data; $\mu$ denotes the mean of the data sample; $\sigma$ denotes the standard deviation of the data sample; $x_{NL}$ denotes the normalized data; and $x_{\max}$ and $x_{\min}$ denote the maximum and minimum values for each sampling period.
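Equations (2) and (3) are not legible in this text, so the following is only a minimal sketch of the two preprocessing steps under the assumption that they take their usual forms, $(x-\mu)/\sigma$ and $(x-x_{\min})/(x_{\max}-x_{\min})$. The function names and the toy feature values are hypothetical, and applying the normalization to the standardized output is also an assumption.

```python
import statistics


def batch_standardize(xs):
    """Assumed form of eq. (2): subtract the mean, divide by the standard deviation."""
    mu = statistics.fmean(xs)
    sigma = statistics.pstdev(xs)  # population standard deviation of the sample
    return [(x - mu) / sigma for x in xs]


def min_max_normalize(xs):
    """Assumed form of eq. (3): rescale linearly so the values span [0, 1]."""
    x_min, x_max = min(xs), max(xs)
    return [(x - x_min) / (x_max - x_min) for x in xs]


# Toy feature sequence (illustrative values only, not from the patent).
feature = [2.0, 4.0, 6.0, 8.0]
x_bn = batch_standardize(feature)
x_nl = min_max_normalize(x_bn)
```

Both helpers divide by a spread term, so a constant feature sequence would need separate handling before being passed in.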
[0028] The preprocessed training set data is input into the DRSSN network, and the parameters are set as shown in the table below:
[0029] Table 1 DRSSN Network Parameter Settings
[0030]
[0031]
[0032] The network consists of one convolutional layer, four residual blocks, one batch normalization (BN) layer, one SELU activation layer, one global average pooling layer, and one fully connected layer. The notation (3, 2, 64) indicates a kernel size of 3×3, a stride of 2, and 64 kernels. The preprocessed time-frequency domain features are first input into the convolutional layer, configured as (3, 2, 64), for initial feature extraction. The extracted features are then passed through four residual blocks, whose skip connections preserve gradients, alleviating the vanishing and exploding gradient problems caused by excessive model depth and improving training efficiency. The output of the residual blocks is normalized by the BN layer to improve the network's generalization ability. A SELU activation layer then applies a nonlinear transformation to the BN output, enriching the expressive power while helping to avoid overfitting. Next, a global average pooling layer performs dimensionality reduction and regularization. Finally, a fully connected layer flattens the output, yielding a one-dimensional HI.
[0033] In addition, the model uses the Adam optimizer, with a mini-batch size of 64, 100 training iterations, and the mean squared error shown in equation (4) as the loss function.
[0034] $$L = \frac{1}{s}\sum_{q=1}^{s}\left(y_q - \hat{y}_q\right)^2 \quad (4)$$
[0035] where $\hat{y}_q$ is the predicted value of the q-th sample, $y_q$ is the true value of the q-th sample, and $s$ is the total number of samples. The batch-standardized and normalized test set data are input into the trained DRSSN model, whose output is the HI, represented as a one-dimensional vector $V = [v_1, v_2, \ldots, v_N]^T$. Taking rolling bearings and fatigued gears as examples, the HI curves constructed by this invention are shown in Figure 2.
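As a small illustration of the mean squared error loss named in this step, the sketch below averages the squared differences between true and predicted HI values. The function name and the sample values are hypothetical.

```python
def mean_squared_error(y_true, y_pred):
    """Average of squared differences between true and predicted values."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    s = len(y_true)  # total number of samples
    return sum((y - y_hat) ** 2 for y, y_hat in zip(y_true, y_pred)) / s


# Toy HI labels and predictions (illustrative values only).
hi_true = [0.0, 0.5, 1.0]
hi_pred = [0.1, 0.5, 0.8]
loss = mean_squared_error(hi_true, hi_pred)
```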
[0036] The health indicator (HI) of rotating machinery typically consists of two parts: long-term overall information and transient fluctuation information. Long-term overall information exhibits good stability and monotonicity, reflecting the overall trend of HI degradation. Transient fluctuation information is often unstable and irregular, reflecting the transient trend of HI degradation. Overemphasizing long-term overall information while neglecting transient fluctuation information can lead to predictions that follow the correct general direction but fail to adjust to specific transient changes, resulting in deviations in the time at which the HI reaches the threshold. Conversely, overemphasizing transient fluctuation information while neglecting long-term overall information can capture local changes but lose the correct direction of degradation because the overall trend is not learned, which also causes prediction errors. Therefore, a reasonable strategy is needed to effectively learn both types of information in the HI. LSTM, GRU, and their variants are widely used in RUL prediction, where the hidden state acts as a bridge connecting the past and present, storing the information the network has learned up to the current moment.
[0037] Currently, two main problems hinder the in-depth mining of historical information in networks. First, while acquiring long-term overall information is relatively easy, acquiring transient fluctuation information is difficult. Second, the unit capacity of networks like GRU is limited and tends to saturate with increasing learning time. This relatively simple state transition pattern makes the network more inclined to retain long-term overall information, leaving no room to retain transient fluctuation information. Regarding the first problem, combining the patterns of historical changes, we found that although the difference between hidden states at adjacent time points does not show significant changes in the long-term overall trend, it can effectively characterize the transient fluctuation trend. Therefore, we use difference operations to capture transient fluctuation information, proposing a transient fluctuation capture mechanism. Regarding the second problem, since the network unit capacity is limited, we need a reasonable importance allocation strategy to adaptively learn the importance of the two types of information. Here, we utilize an attention mechanism to assign different weights to the two types of information, thereby making the information in the unit network more useful. Previously, we focused on mining and learning historical information. To combine it with the current input information, we use the attention gate in GAU to adaptively learn historical and current information, thereby better updating the hidden state at the current time point and providing effective information for subsequent predictions. Based on the above ideas, this study proposes a novel RUL prediction network called GTFDAU.
[0038] The unit structure of GTFDAU is shown in Figure 3. GTFDAU has three inputs: the current input information $x_t$ and the hidden states of the previous two historical moments, $h_{t-1}$ and $h_{t-2}$. Aside from the basic reset and update gates, the network mainly consists of three novel gate structures. First, based on the proposed transient fluctuation capture mechanism, a transient fluctuation gate is constructed and embedded in the network. Second, the output $TF_t$ of the transient fluctuation gate, that is, the transient fluctuation information, is combined with the hidden state $h_{t-1}$ from the previous time step and input into a historical attention gate formed by an attention mechanism, enabling adaptive attention learning of historical information. Finally, the mined historical information $\Phi_t$ is combined with the current input information $x_t$, and a current attention gate modified from the attention gate in GAU performs adaptive attention learning over historical and current information, thereby better updating the hidden state $h_t$ at the current moment.
[0039] The derivation process of the state transition update formula for time-series information in GTFDAU is as follows:
[0040] First, a differential operation is performed on the hidden states at the two historical moments, $h_{t-1}$ and $h_{t-2}$, to capture the transient fluctuation information $TF_t$:
[0041] $$TF_t = h_{t-1} - h_{t-2}$$
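The differential operation of the transient fluctuation gate can be sketched in a few lines. The hidden states are represented here as plain Python lists, and the function name and values are illustrative only.

```python
def transient_fluctuation(h_prev, h_prev2):
    """TF_t = h_{t-1} - h_{t-2}, computed element by element."""
    return [a - b for a, b in zip(h_prev, h_prev2)]


# Toy hidden states of the two previous time steps (illustrative values only).
h_t2 = [0.25, 0.5, -0.5]   # h_{t-2}
h_t1 = [0.5, 0.5, -1.0]    # h_{t-1}
tf_t = transient_fluctuation(h_t1, h_t2)
```

A component that did not change between the two steps yields zero, so only the short-term variation of the hidden state is passed on.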
[0042] The historical attention gate performs adaptive attention learning on $TF_t$ and $h_{t-1}$ to mine the historical information $\Phi_t$:
[0043] $$s(TF_t, h_{t-1}) = V^T \tanh(W_s TF_t + U_s h_{t-1})$$
[0044]
[0045]
[0046] where $\Phi_t$ denotes the historical information mined by the historical attention gate; $s(\cdot)$ denotes the attention scoring function; $V$, $W_s$, and $U_s$ denote the first, second, and third matrix parameters of the historical attention gate, respectively; $TF_t$ denotes the transient fluctuation information; $h_{t-1}$ denotes the hidden state at the previous time step; $\gamma_t$ denotes the attention distribution; softmax denotes the softmax function; exp denotes the exponential function; tanh denotes the tanh function; … and … denote the fourth and fifth matrix parameters, respectively; and the superscript $T$ denotes the matrix transpose.
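The scoring function $s(TF_t, h_{t-1})$ can be sketched as follows. Only the scoring step is shown, because the expressions for the attention distribution and for $\Phi_t$ are not legible in this text. The helper `matvec`, the parameter shapes, and all numeric values are assumptions made for illustration.

```python
import math


def matvec(W, x):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(w * xi for w, xi in zip(row, x)) for row in W]


def attention_score(tf_t, h_prev, V, W_s, U_s):
    """s(TF_t, h_{t-1}) = V^T tanh(W_s TF_t + U_s h_{t-1})."""
    pre = [a + b for a, b in zip(matvec(W_s, tf_t), matvec(U_s, h_prev))]
    return sum(v * math.tanh(p) for v, p in zip(V, pre))


# Toy parameters for a hidden size of 2 (illustrative values only).
V = [1.0, -1.0]
W_s = [[1.0, 0.0], [0.0, 1.0]]
U_s = [[0.0, 0.0], [0.0, 0.0]]
score = attention_score([1.0, 0.0], [0.3, -0.2], V, W_s, U_s)
```

With `U_s` set to zero the score depends only on the transient fluctuation term, which makes the toy result easy to check by hand.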
[0047] The historical information $\Phi_t$ and the current input information $x_t$ are jointly input into the reset gate and the update gate to obtain their outputs $r_t$ and $z_t$, respectively, including:
[0048] Reset gate output:
[0049] $$r_t = \sigma(W_r x_t + U_r \Phi_t + \lambda_r)$$
[0050] Update gate output:
[0051] $$z_t = \sigma(W_z x_t + U_z \Phi_t + \lambda_z)$$
[0052] where $r_t$ and $z_t$ denote the outputs of the reset gate and the update gate, respectively; $x_t$ denotes the current input information; $\Phi_t$ denotes the historical information mined by the historical attention gate; $\sigma(x)$ denotes the sigmoid function; $W_r$ and $U_r$ denote the first and second weight matrices of the reset gate, respectively; $W_z$ and $U_z$ denote the first and second weight matrices of the update gate, respectively; and $\lambda_r$ and $\lambda_z$ denote the bias matrices of the reset gate and the update gate, respectively.
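Both gates share the same form, so one helper can serve for $r_t$ and $z_t$ with different parameters. The following is a minimal sketch with a one-dimensional input and a hidden size of 2; the helper names, shapes, and parameter values are assumptions for illustration.

```python
import math


def sigmoid(a):
    return 1.0 / (1.0 + math.exp(-a))


def matvec(W, x):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(w * xi for w, xi in zip(row, x)) for row in W]


def gate(x_t, phi_t, W, U, lam):
    """sigma(W x_t + U Phi_t + lambda), applied element by element."""
    wx, up = matvec(W, x_t), matvec(U, phi_t)
    return [sigmoid(a + b + c) for a, b, c in zip(wx, up, lam)]


# Toy input and mined historical information (illustrative values only).
x_t = [1.0]
phi_t = [0.5, -0.5]

# Reset gate with all-zero parameters: every component equals sigmoid(0).
r_t = gate(x_t, phi_t, [[0.0], [0.0]], [[0.0, 0.0], [0.0, 0.0]], [0.0, 0.0])
# Update gate with non-zero parameters.
z_t = gate(x_t, phi_t, [[1.0], [-1.0]], [[1.0, 0.0], [0.0, 1.0]], [0.1, 0.1])
```

Because the sigmoid maps every pre-activation into the open interval (0, 1), each gate output can act as a soft switch on the corresponding hidden-state component.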
[0053] The historical information $\Phi_t$ is combined with the output $r_t$ of the reset gate and the current input information $x_t$ to generate the temporary hidden state $\tilde{h}_t$, including:
[0054]
[0055] where $\tilde{h}_t$ denotes the temporary hidden state; $\delta(x)$ denotes the tanh function; $W_h$ and $U_h$ denote the first and second weight matrices used in generating the temporary hidden state, respectively; $x_t$ denotes the current input information; $r_t$ denotes the output of the reset gate; $\Phi_t$ denotes the historical information mined by the historical attention gate; $\lambda_h$ denotes the bias matrix used in generating the temporary hidden state; and ⊙ denotes the element-wise product.
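The formula for the temporary hidden state is not legible in this text. The sketch below assumes, by analogy with the candidate state of a standard GRU, that it has the form $\tanh(W_h x_t + U_h (r_t \odot \Phi_t) + \lambda_h)$, with the mined historical information taking the place of the previous hidden state. This form, the helper names, and the values are assumptions, not the patented formula.

```python
import math


def matvec(W, x):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(w * xi for w, xi in zip(row, x)) for row in W]


def temporary_hidden_state(x_t, r_t, phi_t, W_h, U_h, lam_h):
    """Assumed form: tanh(W_h x_t + U_h (r_t * Phi_t) + lambda_h)."""
    gated = [r * p for r, p in zip(r_t, phi_t)]  # element-wise product
    wx, ug = matvec(W_h, x_t), matvec(U_h, gated)
    return [math.tanh(a + b + c) for a, b, c in zip(wx, ug, lam_h)]


# Toy values (illustrative only). A reset gate output of zero removes
# the historical contribution entirely.
x_t = [0.5]
phi_t = [0.8, -0.3]
W_h = [[1.0], [0.0]]
U_h = [[1.0, 0.0], [0.0, 1.0]]
lam_h = [0.0, 0.0]

h_tilde_closed = temporary_hidden_state(x_t, [0.0, 0.0], phi_t, W_h, U_h, lam_h)
h_tilde_open = temporary_hidden_state(x_t, [1.0, 1.0], phi_t, W_h, U_h, lam_h)
```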
[0056] The outputs $r_t$ and $z_t$ of the reset gate and the update gate are input into the current attention gate, which adaptively learns from historical and current information and, based on the temporary hidden state $\tilde{h}_t$, updates the current hidden state $h_t$, including:
[0057]
[0058]
[0059]
[0060]
[0061] where the first two quantities denote, respectively, the degree of attention paid to the reset and update gate outputs and the sign (positive or negative) of those outputs; $\Psi_t$ denotes the overall attention distribution that combines the degree of attention with the sign; $\delta(x)$ denotes the tanh function; four weight matrices and two bias matrices are used in the adaptive learning process; ⊙ denotes the element-wise product; $r_t$ and $z_t$ denote the outputs of the reset gate and the update gate, respectively; $h_t$ denotes the current hidden state; $\Phi_t$ denotes the historical information; and $\tilde{h}_t$ denotes the temporary hidden state.
[0062] Based on the updated current hidden state $h_t$, the prediction result $o_t$ of GTFDAU is output, including:
[0063] $$o_t = \varphi(W_o h_t + b_o)$$
[0064] where $o_t$ denotes the output prediction result of GTFDAU, $\varphi(\cdot)$ denotes the linear activation function, and $W_o$ and $b_o$ denote the weight and bias matrices of the output layer, respectively.
[0065] To ensure the GTFDAU network exhibits good HI fitting and stability, mean square error (MSE) is chosen as its loss function.
[0066]
[0067] In this loss, the reference term is the true value of the HI, and $N$ denotes the length of the output vector.
[0068] Compared to traditional GRUs and their variants, the proposed GTFDAU features three novel gate structures that enable deep mining of historical information, adaptive learning of the useful parts of that information, and a second stage of enhanced learning that incorporates the current input information. Therefore, GTFDAU is better suited to predicting the RUL of rotating machinery.
[0069] A Hankel matrix is constructed based on the elements in the HI curve. The Hankel matrix is then input into the trained GTFDAU network for prediction at the first time step, resulting in the output of the first prediction point. The output of the first prediction time step is added to the Hankel matrix to update it. The updated matrix is then input into the GTFDAU network for prediction at the next time step. By repeating the above prediction process, the HI value at future time steps can be predicted step by step. When the predicted HI value exceeds the preset fault threshold, the RUL is obtained by multiplying the number of prediction points by the prediction time.
[0070] The Hankel matrix X can be constructed as follows:
[0071]
[0072] where i is the number of neurons in the input layer, and $X_j$ can be represented as:
[0073] $$X_j = [v_j, v_{j+1}, \ldots, v_{k-i+j+1}]$$
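A minimal sketch of building the Hankel columns from an HI sequence is given below. It assumes i + 1 columns of equal length, chosen so that the last column ends at the final HI sample; this window length is an assumption for illustration and may differ from the index bound printed above. Under the mapping view used in this description, the first i columns serve as inputs and the last column as the target. The function name and values are hypothetical.

```python
def hankel_columns(v, i):
    """Return columns X_1..X_{i+1}; column j holds n consecutive HI values
    starting at v_j, where n = len(v) - i (assumed window length)."""
    n = len(v) - i
    if n < 1:
        raise ValueError("HI sequence is too short for i input neurons")
    return [v[j:j + n] for j in range(i + 1)]


# Toy HI curve (illustrative values only) and i = 2 input neurons.
hi_curve = [0.1, 0.2, 0.3, 0.4, 0.5]
columns = hankel_columns(hi_curve, 2)
inputs, target = columns[:-1], columns[-1]  # first i columns in, last column out
```

Each column is the previous one shifted by one sample, which is the constant anti-diagonal structure that makes the matrix a Hankel matrix.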
[0074] GTFDAU is equivalent to a mapping function ξ that takes the first i vectors in matrix X as its input and the last vector $X_{i+1}$ as its output. Therefore, the loss function of GTFDAU can be rewritten as:
[0075]
[0076] The model is trained using a real-time recurrent learning (RTRL) algorithm so that the loss function $L(X_{i+1} - \xi(X_1, X_2, \ldots, X_i))$ is minimized. After GTFDAU completes training, the output at the first prediction time can be expressed as:
[0077]
[0078] Simultaneously, due to the addition of new output, the Hankel matrix X is updated as follows:
[0079]
[0080] Repeating the above process allows for the stepwise prediction of HI, where the output at the m-th prediction time can be expressed as:
[0081]
[0082] Finally, when the predicted HI value exceeds the preset fault threshold, the RUL can be obtained by multiplying the number of prediction points p by the sampling time.
[0083]
[0084] where $p$ represents the number of prediction points, $T$ represents the prediction duration, and $T_\Delta$ represents the time interval between adjacent prediction points.
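The stepwise prediction loop described above can be sketched as follows. A simple linear extrapolation stands in for the trained GTFDAU network, and `step_time` stands for the time represented by one prediction point; both are assumptions for illustration, as are the function names and values.

```python
def predict_rul(history, predictor, window, threshold, step_time, max_steps=10_000):
    """Predict HI one step at a time, feeding each prediction back as input,
    until the predicted HI exceeds the fault threshold."""
    series = list(history)
    for p in range(1, max_steps + 1):
        next_hi = predictor(series[-window:])
        series.append(next_hi)  # the new output updates the input data
        if next_hi > threshold:
            return p * step_time  # number of prediction points times step time
    return None  # threshold not reached within max_steps


def linear_extrapolation(window_values):
    """Hypothetical stand-in for the trained network: extend the last slope."""
    return 2 * window_values[-1] - window_values[-2]


rul = predict_rul([0.5, 0.75], linear_extrapolation,
                  window=2, threshold=1.5, step_time=10.0)
```

In this toy run the predicted HI values are 1.0, 1.25, 1.5, and 1.75, so the threshold of 1.5 is first exceeded at the fourth prediction point.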
[0085] Although embodiments of the invention have been shown and described, it will be understood by those skilled in the art that various changes, modifications, substitutions and alterations can be made to these embodiments without departing from the principles and spirit of the invention, the scope of which is defined by the appended claims and their equivalents.
Claims
1. A rotating machinery RUL prediction method based on a GTFDAU network, characterized in that it includes: S1: Acquire vibration signals of rotating machinery and perform multi-dimensional time-frequency domain feature extraction and preprocessing; S2: The Adam optimizer is selected, the mini-batch size is set to 64, the number of training iterations is set to 100, and the mean squared error is used as the loss function to train and optimize the DRSSN model. The HI curve is constructed on the preprocessed data using the trained and optimized DRSSN model. S3: Construct the GTFDAU network, use the mean squared error as the loss function, train the GTFDAU network using the real-time recurrent learning algorithm, and obtain the trained GTFDAU network when the loss function is minimized. The GTFDAU network includes: a transient fluctuation gate constructed based on a transient fluctuation capture mechanism, a historical attention gate formed by an attention mechanism, a current attention gate obtained by modifying the attention gate in GAU, a reset gate, and an update gate; the hidden states of the previous two historical moments, $h_{t-1}$ and $h_{t-2}$, are input into the transient fluctuation gate, which performs a differential operation to capture the transient fluctuation information $TF_t$; the transient fluctuation information $TF_t$ and the hidden state $h_{t-1}$ from the previous moment are jointly input into the historical attention gate, which performs adaptive attention learning on the historical information to mine the historical information $\Phi_t$; the historical information $\Phi_t$ and the current input information $x_t$ are jointly input into the reset gate and the update gate, in which the input information is multiplied by the weight matrices, summed with the bias, and processed by the sigmoid function to obtain the outputs $r_t$ and $z_t$ of the reset gate and the update gate, respectively; meanwhile, the historical information $\Phi_t$ is combined with the output $r_t$ of the reset gate and the current input information $x_t$ to generate the temporary hidden state $\tilde{h}_t$; the outputs $r_t$ and $z_t$ of the reset gate and the update gate are input into the current attention gate, which adaptively learns from historical and current information and, based on the temporary hidden state $\tilde{h}_t$, updates the current hidden state $h_t$; and the prediction result $o_t$ of GTFDAU is output according to the updated current hidden state $h_t$; S4: Use the trained GTFDAU network to predict the RUL of rotating machinery based on the HI curve.
2. The rotating machinery RUL prediction method based on GTFDAU network according to claim 1, characterized in that the preprocessing of the time-frequency domain features includes: the extracted time-domain and frequency-domain features are assigned HI labels and divided into training and test sets; the feature data in the training and test sets are batch standardized and normalized; and the data are reshaped according to the characteristics of the dataset to match the DRSSN input layer.
3. The rotating machinery RUL prediction method based on GTFDAU network according to claim 1, characterized in that, The DRSSN model includes: one convolutional layer, four residual blocks, one batch normalization layer, one SELU activation layer, one global average pooling layer, and one fully connected layer. The preprocessed time-frequency domain features are input into a convolutional layer for initial feature extraction. After extraction by the convolutional layer, the features are input into four residual blocks for gradient preservation. After the residual blocks are output, the features are input into a batch normalization layer for normalization. The data output by the batch normalization layer is then subjected to nonlinear transformation by a SELU activation layer. Dimensionality reduction and regularization are achieved through a global average pooling layer. Finally, the output is flattened by a fully connected layer to obtain a one-dimensional HI curve.
4. The rotating machinery RUL prediction method based on GTFDAU network according to claim 1, characterized in that jointly inputting the transient fluctuation information $TF_t$ and the hidden state $h_{t-1}$ from the previous moment into the historical attention gate, performing adaptive attention learning on the historical information, and mining the historical information $\Phi_t$ includes the scoring function $s(TF_t, h_{t-1}) = V^T \tanh(W_s TF_t + U_s h_{t-1})$, where $\Phi_t$ denotes the historical information mined by the historical attention gate; $s(\cdot)$ denotes the attention scoring function; $V$, $W_s$, and $U_s$ denote the first, second, and third matrix parameters of the historical attention gate, respectively; $TF_t$ denotes the transient fluctuation information; $h_{t-1}$ denotes the hidden state at the previous moment; $\gamma_t$ denotes the learned adaptive attention distribution; softmax denotes the softmax function; exp denotes the exponential function; tanh denotes the tanh function; … and … denote the fourth and fifth matrix parameters, respectively; and the superscript $T$ denotes the matrix transpose.
5. The rotating machinery RUL prediction method based on GTFDAU network according to claim 1, characterized in that jointly inputting the historical information $\Phi_t$ and the current input information $x_t$ into the reset gate and the update gate to obtain their outputs $r_t$ and $z_t$, respectively, includes: reset gate output: $r_t = \sigma(W_r x_t + U_r \Phi_t + \lambda_r)$; update gate output: $z_t = \sigma(W_z x_t + U_z \Phi_t + \lambda_z)$; where $r_t$ and $z_t$ denote the outputs of the reset gate and the update gate, respectively; $x_t$ denotes the current input information; $\Phi_t$ denotes the historical information mined by the historical attention gate; $\sigma(x)$ denotes the sigmoid function; $W_r$ and $U_r$ denote the first and second weight matrices of the reset gate, respectively; $W_z$ and $U_z$ denote the first and second weight matrices of the update gate, respectively; and $\lambda_r$ and $\lambda_z$ denote the bias matrices of the reset gate and the update gate, respectively.
6. The rotating machinery RUL prediction method based on GTFDAU network according to claim 1, characterized in that combining the historical information $\Phi_t$ with the output $r_t$ of the reset gate and the current input information $x_t$ to generate the temporary hidden state includes computing $\tilde{h}_t$, where $\tilde{h}_t$ denotes the temporary hidden state; $\delta(x)$ denotes the tanh function; $W_h$ and $U_h$ denote the first and second weight matrices used in generating the temporary hidden state, respectively; $x_t$ denotes the current input information; $r_t$ denotes the output of the reset gate; $\Phi_t$ denotes the historical information mined by the historical attention gate; $\lambda_h$ denotes the bias matrix used in generating the temporary hidden state; and ⊙ denotes the element-wise product.
7. The rotating machinery RUL prediction method based on GTFDAU network according to claim 1, characterized in that inputting the outputs $r_t$ and $z_t$ of the reset gate and the update gate into the current attention gate, adaptively learning from historical and current information, and updating the current hidden state $h_t$ based on the temporary hidden state $\tilde{h}_t$ includes computing: the degree of attention paid to the reset and update gate outputs; the sign (positive or negative) of those outputs; and the overall attention distribution $\Psi_t$ that combines the degree of attention with the sign; where $\delta(x)$ denotes the tanh function; four weight matrices and two bias matrices are used in the adaptive learning process; ⊙ denotes the element-wise product; $r_t$ and $z_t$ denote the outputs of the reset gate and the update gate, respectively; $h_t$ denotes the current hidden state; $\Phi_t$ denotes the historical information; and $\tilde{h}_t$ denotes the temporary hidden state.
8. The rotating machinery RUL prediction method based on GTFDAU network according to claim 1, characterized in that outputting the prediction result $o_t$ of GTFDAU based on the updated current hidden state $h_t$ includes: $o_t = \varphi(W_o h_t + b_o)$, where $o_t$ denotes the output prediction result of GTFDAU, $\varphi(\cdot)$ denotes the linear activation function, and $W_o$ and $b_o$ denote the weight and bias matrices of the output layer, respectively.
9. The method for predicting the RUL of rotating machinery based on a GTFDAU network according to claim 1, characterized in that, Based on the HI curve, the RUL prediction of rotating machinery is achieved using the GTFDAU network, including: A Hankel matrix is constructed based on the elements in the HI curve. The Hankel matrix is then input into the trained GTFDAU network for prediction at the first time step, resulting in the output of the first prediction point. The output of the first prediction time step is added to the Hankel matrix to update it. The updated matrix is then input into the GTFDAU network for prediction at the next time step. This prediction process is repeated to achieve progressive prediction of HI at future time steps. When the predicted HI value exceeds the preset fault threshold, the RUL is obtained by multiplying the number of prediction points by the sampling time.