Device health prediction method and apparatus based on cluster center trajectory

CN121743786B · Active · Publication Date: 2026-06-23 · SHANDONG ENERGY DIGITAL CLOUD TECH CO LTD

Patent Information

Authority / Receiving Office: CN · China
Patent Type: Patents(China)
Current Assignee / Owner: SHANDONG ENERGY DIGITAL CLOUD TECH CO LTD
Filing Date: 2026-02-27
Publication Date: 2026-06-23


Abstract

The application provides a device health prediction method and apparatus based on cluster center trajectories. The method comprises: performing adaptive noise reduction and feature enhancement, based on multi-scale wavelet transform, on vibration data collected in real time during device operation to obtain a denoised and enhanced vibration data vector; calculating similarity weights between this vector and the cluster center trajectories, dynamically adjusting the attention distribution, and determining a trajectory context vector; extracting local multi-scale features of the denoised and enhanced vibration data vector through multi-branch dilated convolution, modeling sequence dependencies with a gated recurrent unit that takes the trajectory context vector as its initial state, and outputting a hidden state sequence with enhanced perception of health evolution; and calculating temporal attention weights between the hidden state sequence and the cluster center trajectories, generating a trajectory-aware sequence summary vector, and combining it with the multi-scale features to predict the device health through fully connected layer regression. The application can improve the accuracy of device health prediction.

Description

Technical Field

[0001] This application relates to the field of data processing technology, and in particular to a method and apparatus for predicting equipment health based on cluster center trajectories.

Background Technology

[0002] With the continuous improvement of the intelligence and automation level of industrial equipment, equipment health management has become an important task for ensuring production safety and improving efficiency. Equipment health prediction can identify potential faults in advance, reduce downtime, extend equipment life, and lower maintenance costs. Traditional equipment health prediction methods, whether based on statistical analysis, machine learning, or deep learning models, often rely on single features or local data and usually cannot fully capture the complex temporal characteristics and multi-dimensional health evolution trajectory of equipment operation. In practical applications, equipment operating data often contains noise, non-stationarity, and long-term changes, making traditional models susceptible to interference from data fluctuations, resulting in inaccurate or lagging predictions. In addition, many existing technologies lack effective time-series modeling methods, cannot fully consider the dynamic process of equipment health changes over time, and have difficulty comprehensively assessing the health status of equipment.

Summary of the Invention

[0003] The purpose of this application is to provide a method and apparatus for predicting equipment health based on cluster center trajectories, so as to improve the accuracy of equipment health prediction.

[0004] In a first aspect, this application provides a method for predicting device health based on cluster center trajectories. The method includes: acquiring vibration data in real time during device operation; performing adaptive denoising and feature enhancement on the vibration data based on multi-scale wavelet transform to obtain a denoised and enhanced vibration data vector; calculating the similarity weights between the denoised and enhanced vibration data vector and the cluster center trajectories stored in the pre-trained model, dynamically adjusting the attention distribution, and determining the trajectory context vector, wherein the cluster center trajectories are extracted, based on dynamic time warping, from the denoised and enhanced vibration data vectors in the training dataset; extracting local multi-scale features of the denoised and enhanced vibration data vector through multi-branch dilated convolution, modeling sequence dependencies with gated recurrent units, and using the trajectory context vector as the initial state of the gated recurrent units to output a hidden state sequence with enhanced perception of health evolution; and calculating the temporal attention weights between the hidden state sequence and the cluster center trajectories, generating a trajectory-aware sequence summary vector, and predicting the device health by regression through a fully connected layer in combination with the multi-scale features.

[0005] Furthermore, the steps described above for performing adaptive denoising and feature enhancement on vibration data based on multi-scale wavelet transform to obtain a denoised and enhanced vibration data vector include: performing multi-scale wavelet decomposition and adaptive soft thresholding on the vibration data to obtain soft threshold wavelet coefficients; and performing wavelet reconstruction and feature enhancement weighted fusion on the soft threshold wavelet coefficients to obtain a denoised and enhanced vibration data vector.

[0006] Furthermore, the above-mentioned cluster center trajectories are obtained as follows: based on the denoised and enhanced vibration data vectors in the training dataset, the dynamic time warping distance between all sample pairs is calculated to form a dynamic time warping distance matrix; based on the dynamic time warping distance matrix, hierarchical clustering is performed to extract cluster centers, and the cluster centers are arranged in order of equipment operating time to form trajectories, yielding a set of cluster center trajectories that characterizes the typical evolution path of equipment health status.

[0007] Furthermore, the steps described above for calculating the similarity weights between the denoised and enhanced vibration data vector and the cluster center trajectories stored in the pre-trained model, dynamically adjusting the attention distribution, and determining the trajectory context vector include: calculating the cosine similarity between the denoised and enhanced vibration data vector and each cluster center, and calculating the attention weight of each cluster center by combining the average local signal-to-noise ratio and the dynamic time warping distance; based on the attention weight of each cluster center, performing a weighted summation of the cluster center trajectories, and adjusting it by combining the feature enhancement weights to generate a trajectory context vector that integrates global information of the health status trajectory.

[0008] Furthermore, the steps described above for extracting local multi-scale features from the denoised and enhanced vibration data vector through multi-branch dilated convolution, modeling sequence dependencies using gated recurrent units, and using the trajectory context vector as the initial state of the gated recurrent units to output a hidden state sequence with enhanced perception of health evolution include: applying multi-branch dilated convolution to the denoised and enhanced vibration data vector, adjusting the threshold mean and standard deviation to adapt to vibration features at different scales, and outputting multi-scale feature maps that capture vibration patterns under different receptive fields, wherein each branch uses convolution kernels with a different dilation rate; concatenating the multi-scale feature maps along the channel dimension to generate a concatenated multi-scale feature map that integrates local details and global trend information at different scales; and initializing the hidden state with the linearly transformed trajectory context vector, introducing the feature enhancement weights and the trajectory context vector into the gated recurrent units, and dynamically adjusting the processing intensity of historical information through the update gate and reset gate, so as to model long-term dependencies in the vibration sequence, fuse prior knowledge of health state trajectories, and output a hidden state sequence with enhanced perception of health evolution.

[0009] Furthermore, the steps described above for calculating the temporal attention weights between the hidden state sequence and the cluster center trajectories, generating a trajectory-aware sequence summary vector, and combining multi-scale features to predict the device health through regression in a fully connected layer include: calculating the cosine similarity between the hidden state at each time step and all cluster centers, and calculating the temporal attention weights by combining the inverse of the dynamic time warping distance; performing a weighted summation of the cluster center trajectories based on the temporal attention weights, and fusing the result with the hidden state sequence through a gating mechanism to generate a trajectory-enhanced sequence summary vector that captures the health status evolution information of the entire time series; and concatenating the sequence summary vector with the max-pooling result of the multi-scale convolutional feature map, and performing regression through a fully connected layer to output the predicted health of the device.

[0010] Furthermore, the loss value of the above model is calculated as follows: a health distance score is calculated based on the minimum dynamic time warping distance between the sample and all cluster centers; a weighted mean square error loss is then calculated from the health distance score and the sample prediction error to obtain the model loss value.

[0011] In a second aspect, this application also provides an apparatus for predicting device health based on cluster center trajectories. The apparatus includes: a data acquisition module for acquiring vibration data in real time during device operation; a noise reduction and feature enhancement module for performing adaptive noise reduction and feature enhancement on the vibration data based on multi-scale wavelet transform to obtain a denoised and enhanced vibration data vector; a context vector determination module for calculating the similarity weights between the denoised and enhanced vibration data vector and the cluster center trajectories stored in the pre-trained model, dynamically adjusting the attention distribution, and determining the trajectory context vector, wherein the cluster center trajectories are extracted, based on dynamic time warping, from the denoised and enhanced vibration data vectors in the training dataset; a hidden state sequence output module for extracting local multi-scale features of the denoised and enhanced vibration data vector through multi-branch dilated convolution, modeling sequence dependencies with gated recurrent units, and using the trajectory context vector as the initial state of the gated recurrent units to output a hidden state sequence with enhanced perception of health evolution; and a health prediction module for calculating the temporal attention weights between the hidden state sequence and the cluster center trajectories, generating a trajectory-aware sequence summary vector, and predicting the device health by regression through a fully connected layer in combination with the multi-scale features.

[0012] In a third aspect, this application also provides an electronic device, including a processor and a memory, wherein the memory stores computer-executable instructions that can be executed by the processor, and the processor executes the computer-executable instructions to implement the method described in the first aspect above.

[0013] In a fourth aspect, this application also provides a computer-readable storage medium storing computer-executable instructions, which, when invoked and executed by a processor, cause the processor to implement the method described in the first aspect above.

[0014] The device health prediction method and apparatus based on cluster center trajectories provided in this application employ dynamic time warping to calculate the similarity between samples and extract cluster center trajectories through hierarchical clustering, solving the problem that traditional methods cannot capture the dynamic path of health evolution. Adaptive denoising and feature enhancement are performed based on multi-scale wavelet transform, and adaptive thresholds are calculated from the local signal-to-noise ratio, avoiding the excessive smoothing of fault features caused by traditional denoising methods. A trajectory-aware attention mechanism is introduced, enabling the model to dynamically focus on historical health evolution paths related to the current health status, improving its ability to perceive time series. A gated recurrent network combining multi-scale convolution and trajectory enhancement captures local details through dilated convolution and incorporates global trajectory information as prior knowledge into the recurrent neural network, strengthening the modeling of long-term dependencies in health prediction.

Description of the Drawings

[0015] To illustrate the technical solutions in the specific embodiments of this application or in the prior art more clearly, the drawings used in the description of the specific embodiments or the prior art are briefly introduced below. The drawings described below show some embodiments of this application. For those skilled in the art, other drawings can be obtained from these drawings without creative effort.

[0016] Figure 1 is a flowchart of a device health prediction method based on cluster center trajectories provided in an embodiment of this application;

[0017] Figure 2 is a comparative diagram of the trend predictions of different methods provided in an embodiment of this application;

[0018] Figure 3 is a multi-dimensional performance radar chart comparing different methods provided in an embodiment of this application;

[0019] Figure 4 is a schematic diagram comparing the prediction error distributions of different methods provided in an embodiment of this application;

[0020] Figure 5 is a schematic diagram comparing the overall performance ranking of different methods provided in an embodiment of this application;

[0021] Figure 6 is a structural block diagram of a device health prediction apparatus based on cluster center trajectories provided in an embodiment of this application;

[0022] Figure 7 is a schematic diagram of the structure of an electronic device provided in an embodiment of this application.

Detailed Implementation

[0023] The technical solutions of this application will be clearly and completely described below with reference to the embodiments. Obviously, the described embodiments are only some embodiments of this application, not all embodiments. Based on the embodiments of this application, all other embodiments obtained by those skilled in the art without creative effort are within the scope of protection of this application.

[0024] The shortcomings of existing technologies are as follows:

[0025] 1. Traditional methods generally rely on fixed threshold noise reduction or simple filtering techniques, which cannot fully distinguish between noise and fault characteristics, resulting in large prediction errors.

[0026] 2. Most existing technologies ignore the temporal continuity of equipment health evolution and usually make predictions through classification or regression of independent samples, lacking modeling of dynamic evolution over time series.

[0027] 3. Traditional health prediction models often rely on single features or local information, failing to combine global health trajectories with local fault characteristics, making it difficult to comprehensively assess the operating status of equipment.

[0028] 4. Existing technologies fail to effectively address the imbalance of health categories, resulting in weak predictive ability of the model for a few fault states.

[0029] Based on this, embodiments of this application provide a method and apparatus for predicting equipment health status based on cluster center trajectories. The method extracts health status evolution trajectories from complex vibration data, thereby improving the accuracy of equipment health status prediction. It also improves prediction sensitivity to fault states, especially when equipment is about to fail, enabling early detection and prediction. By using dynamic time warping and multi-scale feature fusion, it makes complex time-series data easier to process than with traditional methods and avoids information loss. Furthermore, it addresses the class imbalance problem: a weighted loss function based on trajectory distance improves the prediction accuracy for fault samples.

[0030] To facilitate understanding of this embodiment, a detailed description of a device health prediction method based on cluster center trajectory disclosed in this application embodiment will be provided first.

[0031] Figure 1 is a flowchart of a device health prediction method based on cluster center trajectories provided in an embodiment of this application. The method specifically includes the following steps:

[0032] Step S102: Real-time acquisition of vibration data during equipment operation;

[0033] Vibration data can be acquired by vibration sensors installed on key components of the equipment (such as bearings or gearboxes). The sensors continuously acquire vibration signals during equipment operation at a fixed sampling frequency. Each sample may contain 8192 data points, forming a high-dimensional time-series vector.

[0034] Step S104: Perform adaptive denoising and feature enhancement on the vibration data based on multi-scale wavelet transform to obtain a denoised and enhanced vibration data vector.

[0035] The high-dimensional vectors acquired by vibration sensors contain both high-frequency noise and low-frequency fault features, and the vibration signal exhibits non-stationary characteristics in the frequency domain. Conventional processing methods employ fixed-threshold wavelet denoising or low-pass filtering techniques. However, since the fault features overlap with the noise frequency bands, a fixed threshold may result in over-smoothing of useful high-frequency components or excessive retention of noise, thus causing feature distortion.

[0036] In this embodiment, wavelet coefficients at different scales are obtained through multi-scale wavelet transform, and an adaptive threshold is calculated based on the local signal-to-noise ratio. A soft thresholding function is applied to process the wavelet coefficients to suppress noise. Simultaneously, the fault-sensitive frequency band is highlighted through wavelet coefficient reconstruction and feature enhancement modules, thereby achieving effective separation of noise and fault features. The specific implementation process will be detailed later.

[0037] Step S106: Calculate the similarity weight between the denoised and enhanced vibration data vector and the cluster center trajectory stored in the pre-trained model, dynamically adjust the attention distribution, and determine the trajectory context vector; wherein, the cluster center trajectory is obtained by extracting the cluster center trajectory based on dynamic time warping from the denoised and enhanced vibration data vector in the training dataset.

[0038] The denoised and enhanced vibration data vectors contain time-series patterns of equipment health status evolution. However, conventional clustering methods such as K-means assume that the data are independent and identically distributed, and therefore cannot capture the dynamic evolution trajectory of vibration patterns in the time dimension, so that health assessment ignores the continuity of state transitions.

[0039] This embodiment calculates the similarity between samples through dynamic time warping, extracts cluster centers by combining hierarchical clustering, and constructs cluster center trajectories based on time sequence, thereby characterizing the typical evolution path of equipment health status.

[0040] The denoised and enhanced vibration data vector needs to be combined with the cluster center trajectories to capture health evolution patterns. However, conventional neural network attention mechanisms rely solely on the input data and do not incorporate external knowledge, potentially ignoring the guiding role of global health state trajectories.

[0041] This embodiment dynamically adjusts the attention distribution by calculating the similarity weight between the denoised and enhanced vibration data vector and the cluster center trajectory, so that the model focuses on the trajectory point most relevant to the current state. The specific implementation process will be detailed later.
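The attention step described above can be illustrated with a minimal sketch. It is not the patented weighting: it scores each cluster center by cosine similarity only and normalizes with a softmax, whereas the method also combines the average local signal-to-noise ratio and the dynamic time warping distance in a way not reproduced in this passage. The function names `cosine_similarity`, `softmax`, and `trajectory_context` are hypothetical.

```python
import math

def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0

def softmax(scores):
    """Numerically stable softmax so that the attention weights sum to 1."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def trajectory_context(vector, centers):
    """Assumed form: softmax over cosine similarities, then a weighted sum of centers."""
    weights = softmax([cosine_similarity(vector, c) for c in centers])
    context = [sum(w * c[t] for w, c in zip(weights, centers))
               for t in range(len(vector))]
    return weights, context

# Toy example: the first center points in the same direction as the input vector.
weights, context = trajectory_context([1.0, 0.0], [[2.0, 0.0], [0.0, 3.0]])
```

In this toy case the first center receives the larger weight, so the context vector is pulled toward it.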

[0042] Step S108: Extract local multi-scale features of the noise-reduced and enhanced vibration data vector through multi-branch dilated convolution, combine it with gated recurrent unit to model sequence dependency, and use the trajectory context vector as the initial state of the gated recurrent unit to output a hidden state sequence that enhances the perception of health evolution.

[0043] Vibration data has multi-scale spatiotemporal characteristics and long-term dependencies, but conventional deep neural networks use fixed-scale convolutions or independent recurrent layers, which cannot simultaneously capture local high-frequency vibration patterns and global state evolution trends, and do not integrate trajectory context information.

[0044] This embodiment extracts local multi-scale features through multi-branch dilated convolution, models sequence dependencies using gated recurrent units, and introduces trajectory context vectors as the initial state of the gated recurrent units, thereby enhancing the model's perception of health status evolution. The specific implementation process will be detailed later.
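To make the multi-branch dilated convolution concrete, the following sketch applies one shared kernel at several dilation rates and stacks the outputs as channels. It is an illustration under assumptions: the dilation rates `(1, 2, 4)`, the single shared kernel, the zero padding, and the function names are all hypothetical, and the gated recurrent unit stage is omitted.

```python
def dilated_conv1d(signal, kernel, dilation):
    """Same-length 1-D dilated convolution (cross-correlation form) with zero padding.

    Assumes an odd kernel length so that the kernel is centered on each sample.
    """
    half = (len(kernel) // 2) * dilation
    out = []
    for t in range(len(signal)):
        acc = 0.0
        for k, w in enumerate(kernel):
            idx = t - half + k * dilation
            if 0 <= idx < len(signal):  # samples outside the signal count as zero
                acc += w * signal[idx]
        out.append(acc)
    return out

def multi_branch(signal, kernel, dilations=(1, 2, 4)):
    """One output channel per dilation rate; stacking mirrors channel-wise concatenation."""
    return [dilated_conv1d(signal, kernel, d) for d in dilations]

def receptive_field(kernel_size, dilation):
    """Number of input samples spanned by one dilated kernel."""
    return (kernel_size - 1) * dilation + 1

branches = multi_branch([1.0, 2.0, 3.0, 4.0, 5.0], [1.0, 0.0, -1.0], dilations=(1, 2))
```

A kernel of size 3 spans 3 samples at dilation 1 and 9 samples at dilation 4, which is how the branches observe different receptive fields without adding parameters.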

[0045] Step S110: Calculate the temporal attention weights of the hidden state sequence and the cluster center trajectory, generate a trajectory-aware sequence summary vector, and combine multi-scale features to predict the health of the device through regression in a fully connected layer.

[0046] Gated recurrent networks with multi-scale convolution and trajectory enhancement output a sequence of hidden states. However, health prediction requires the integration of information from the entire sequence and the global evolution pattern of the cluster center trajectory. Conventional methods directly use the last hidden state or average pooling, but ignore the dynamic prior of the health state trajectory and do not make full use of the local details of multi-scale features, which may lead to prediction bias.

[0047] This embodiment generates a trajectory-aware sequence summary vector by calculating the temporal attention weights between the hidden state sequence and the cluster center trajectory, and then predicts health status through regression using a fully connected layer by combining multi-scale features. The specific implementation process will be detailed later.

[0048] The specific implementation process of each of the above steps is described in detail below:

[0049] Step S104 above involves performing adaptive denoising and feature enhancement on the vibration data based on multi-scale wavelet transform to obtain a denoised and enhanced vibration data vector, specifically including:

[0050] (1) Multi-scale wavelet decomposition and adaptive soft thresholding denoising are performed on the vibration data to obtain soft threshold wavelet coefficients;

[0051] In practice, the vibration data vector corresponding to the vibration data is subjected to multi-scale wavelet decomposition to obtain wavelet coefficients at different scales. Then, an adaptive threshold is calculated based on the local signal-to-noise ratio at each scale, and the wavelet coefficients are processed by a soft thresholding function. This process suppresses noise while preserving fault characteristics, as shown below:

[0052] $\hat{W}_{j,k} = \operatorname{sgn}(W_{j,k}) \cdot \max\left(\lvert W_{j,k} \rvert - \lambda_j,\ 0\right)$;

[0053] In the formula, $\hat{W}_{j,k}$ represents the soft-threshold wavelet coefficient, which suppresses noise by shrinking the absolute value of the wavelet coefficient while preserving its sign to avoid distortion; $\operatorname{sgn}(\cdot)$ represents the sign function, which outputs 1 when the input is greater than 0, -1 when the input is less than 0, and 0 when the input equals 0; $\max(\cdot)$ represents the function that takes the maximum value; $W_{j,k}$ represents the wavelet coefficient at the $k$-th position of the $j$-th scale, used to capture the local characteristics of the vibration data in different frequency bands, calculated as [formula missing]; $\lambda_j$ represents the adaptive threshold of the $j$-th scale, dynamically adjusted according to the noise standard deviation and the local signal-to-noise ratio at that scale and used for soft thresholding, calculated as [formula missing]; $j$ is the wavelet decomposition scale index, with a value range of [value range missing]; the maximum decomposition scale has a preferred value of 8; $t$ is the time variable, representing the index of consecutive time points, with a value range of [value range missing]; $k$ is the wavelet coefficient position index, with a value range of [value range missing]; $x(t)$ represents the $t$-th data point of the original vibration data vector, that is, the raw time-series data collected by the vibration sensor, characterized by non-stationarity and by the presence of both high-frequency noise and low-frequency fault features; the scaling factor of the $j$-th scale controls the scaling of the wavelet basis function, with the value sequence [formula missing]; the wavelet basis function used for multi-scale decomposition is taken from the Daubechies wavelet family; the integral is taken over the time variable $t$ from negative infinity to positive infinity; the number of wavelet coefficients at the $j$-th scale is used for normalization in the threshold calculation; the smoothing parameter controls the degree to which the signal-to-noise ratio adjusts the threshold, with a preferred value of 0.5; the local signal-to-noise ratio at the $j$-th scale is calculated as [formula missing], such that when it is high the threshold is lowered to retain more fault features, and when it is low the threshold is raised to suppress noise; the noise mean difference and the noise standard deviation of the $j$-th scale wavelet coefficients are each estimated by the median absolute deviation; $\exp(\cdot)$ represents the natural exponential function; $\ln(\cdot)$ represents the natural logarithm function; and $\lvert \cdot \rvert$ represents the absolute value.

[0054] It should be noted that the Daubechies wavelet is a compactly supported orthogonal wavelet with high regularity and vanishing moments, which can effectively capture local features of the signal and is especially suitable for processing non-stationary vibration signals. Common choices include db4 and db8.
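The decomposition and soft-thresholding step can be sketched as follows. This is a minimal illustration, not the patented implementation. It uses the Haar wavelet (db1, the simplest member of the Daubechies family) instead of db4 or db8 so that no external library is needed. The threshold rule (a universal threshold `sigma * sqrt(2 ln N)` scaled by `exp(-alpha * snr)`) and the SNR estimate are assumptions, because the exact formulas are not reproduced in the text. All function names are hypothetical.

```python
import math
import statistics

def haar_step(signal):
    """One Haar analysis level: returns (approximation, detail) coefficients."""
    s = math.sqrt(2.0)
    approx = [(signal[i] + signal[i + 1]) / s for i in range(0, len(signal), 2)]
    detail = [(signal[i] - signal[i + 1]) / s for i in range(0, len(signal), 2)]
    return approx, detail

def haar_inverse_step(approx, detail):
    """Inverse of haar_step: interleaves the reconstructed sample pairs."""
    s = math.sqrt(2.0)
    out = []
    for a, d in zip(approx, detail):
        out.append((a + d) / s)
        out.append((a - d) / s)
    return out

def soft_threshold(w, lam):
    """sgn(w) * max(|w| - lam, 0): shrink the magnitude, keep the sign."""
    return math.copysign(max(abs(w) - lam, 0.0), w)

def adaptive_threshold(detail, alpha=0.5):
    """Assumed rule: universal threshold, lowered as the estimated SNR grows."""
    n = len(detail)
    sigma = statistics.median(abs(w) for w in detail) / 0.6745  # MAD noise estimate
    if sigma == 0.0 or n < 2:
        return 0.0
    power = sum(w * w for w in detail) / n
    snr = max(power / (sigma * sigma) - 1.0, 0.0)  # assumed SNR estimate
    return sigma * math.sqrt(2.0 * math.log(n)) * math.exp(-alpha * snr)

def denoise(signal, levels=3, alpha=0.5):
    """Decompose, soft-threshold each detail band, then reconstruct.

    len(signal) must be divisible by 2 ** levels (8192 = 2 ** 13 satisfies this).
    """
    approx, details = list(signal), []
    for _ in range(levels):
        approx, detail = haar_step(approx)
        lam = adaptive_threshold(detail, alpha)
        details.append([soft_threshold(w, lam) for w in detail])
    for detail in reversed(details):
        approx = haar_inverse_step(approx, detail)
    return approx
```

With a library such as PyWavelets, the same structure would presumably carry over to db4 or db8 by swapping the analysis and synthesis steps, with the thresholding logic unchanged.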

[0055] (2) Wavelet reconstruction and feature enhancement weighted fusion of soft threshold wavelet coefficients are performed to obtain noise-reduced and enhanced vibration data vector.

[0056] In practice, the wavelet coefficients after thresholding (i.e., soft-threshold wavelet coefficients) are used to reconstruct the vibration data through inverse wavelet transform. Then, feature enhancement weights are calculated based on the reconstruction error, and the original data and reconstructed data are weighted and superimposed to highlight the fault-sensitive frequency band. The noise-reduced and enhanced vibration data vector is output, represented as follows:

[0057] ;

[0058] In the formula, the noise-reduced and enhanced vibration data vector has a dimension of 8192; it characterizes the enhanced vibration signal, highlights the fault-sensitive frequency band, and avoids excessive smoothing of useful features during noise reduction. The denoised vibration data vector has a dimension of 8192 and represents the basic components of the clean vibration signal; its $t$-th data point is calculated as [formula missing]. The feature enhancement weight vector has a dimension of 8192; its $t$-th element is $\omega(t)$, the weight of the $t$-th data point, used to emphasize regions with significant residual noise, calculated as [formula missing]. The enhancement intensity coefficient controls the contribution of the original data to the enhancement, with a preferred value of 0.3. The product in the formula is element-wise multiplication. The index $i$, used to distinguish it from the time variable $t$, also runs over consecutive time points, with a value range of [value range missing]. The $t$-th and $i$-th data points of the original vibration data vector and of the denoised vibration data vector are indexed accordingly.
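The weighted fusion can be sketched as below. Because the formulas are not reproduced in the text, the sketch assumes one plausible form: the weight of each data point is its absolute reconstruction error divided by the largest error, and the enhanced vector is the denoised vector plus `beta` times the weighted original. Both the normalization and the additive form are assumptions, and the function names are hypothetical.

```python
def enhancement_weights(original, denoised):
    """Per-point weights from the reconstruction error, scaled to [0, 1] (assumed form)."""
    errors = [abs(x - d) for x, d in zip(original, denoised)]
    peak = max(errors)
    if peak == 0.0:
        return [0.0] * len(errors)  # perfect reconstruction: nothing to emphasize
    return [e / peak for e in errors]

def fuse(original, denoised, beta=0.3):
    """Assumed fusion: denoised + beta * (weights * original), element by element."""
    weights = enhancement_weights(original, denoised)
    return [d + beta * w * x for x, d, w in zip(original, denoised, weights)]

original = [1.0, 2.0, 4.0]
denoised = [1.0, 1.5, 3.0]
enhanced = fuse(original, denoised)
```

Points that the denoiser left unchanged get weight 0 and pass through as the denoised value, while points with the largest reconstruction error receive the full `beta` share of the original signal.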

[0059] Furthermore, the above-mentioned cluster center trajectories are obtained as follows:

[0060] (1) Based on the noise-reduced and enhanced vibration data vectors in the training dataset, calculate the dynamic time warping distance between all sample pairs and form a dynamic time warping distance matrix;

[0061] In practice, the dynamic time warping distance between all sample pairs is calculated based on the noise-reduced and enhanced vibration data vectors. The time series are aligned by finding the optimal warping path, the morphological differences of vibration modes in the time dimension are quantified, and a similarity matrix is constructed for subsequent clustering analysis, represented as:

[0062] $D(a,b) = \min_{P} \sum_{(m,n) \in P} d\left(x_a(m), x_b(n)\right)$;

[0063] In the formula, $D(a,b)$ represents the dynamic time warping distance between samples $a$ and $b$, used to measure the morphological similarity of the two vibration sequences on the time axis; $x_a$ represents the $a$-th noise-reduced and enhanced vibration data vector, that is, the result of computing the noise-reduced and enhanced vibration data vector for the $a$-th sample, and $x_a(m)$ represents its $m$-th data point; $x_b$ represents the $b$-th noise-reduced and enhanced vibration data vector, computed in the same way for the $b$-th sample, and $x_b(n)$ represents its $n$-th data point; $a$ is the sample index, with a value range of [value range missing]; $b$, used to distinguish it from the sample index $a$, has a value range of [value range missing]; both ranges are defined with respect to the total number of samples; $P$ represents a warping path, that is, a sequence of time-index pairs that satisfies the boundary, continuity, and monotonicity conditions; $\min_{P}$ indicates taking the minimum over all warping paths, used to find the optimal time alignment path; $(m,n)$ represents a time-index pair in the warping path $P$; and $d\left(x_a(m), x_b(n)\right)$ represents the absolute distance between the point pair $x_a(m)$ and $x_b(n)$, used to quantify the difference between the two data points, calculated as $d\left(x_a(m), x_b(n)\right) = \lvert x_a(m) - x_b(n) \rvert$.
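The distance described above corresponds to the classic dynamic-programming form of dynamic time warping with an absolute-difference point cost. The sketch below is a minimal, unconstrained version; the function names `dtw_distance` and `dtw_matrix` are hypothetical, and nothing in the text says whether the method applies a window constraint or other speed-ups.

```python
def dtw_distance(seq_a, seq_b):
    """Minimum accumulated |a - b| cost over all monotone, continuous alignments."""
    n, m = len(seq_a), len(seq_b)
    inf = float("inf")
    prev = [inf] * (m + 1)
    prev[0] = 0.0  # boundary condition: the path starts at the first pair
    for i in range(1, n + 1):
        curr = [inf] * (m + 1)
        for j in range(1, m + 1):
            cost = abs(seq_a[i - 1] - seq_b[j - 1])
            # Extend the cheapest of the three admissible predecessor cells.
            curr[j] = cost + min(prev[j], curr[j - 1], prev[j - 1])
        prev = curr
    return prev[m]

def dtw_matrix(samples):
    """Symmetric matrix of pairwise DTW distances with a zero diagonal."""
    n = len(samples)
    matrix = [[0.0] * n for _ in range(n)]
    for a in range(n):
        for b in range(a + 1, n):
            matrix[a][b] = matrix[b][a] = dtw_distance(samples[a], samples[b])
    return matrix

dist = dtw_matrix([[1.0, 2.0, 3.0], [1.0, 2.0, 2.0, 3.0], [2.0, 3.0, 4.0]])
```

The first two sequences differ only by a repeated sample, so their distance is 0, which is the time-axis tolerance that a point-by-point distance lacks. For 8192-point vectors the quadratic cost per pair is substantial, so a practical implementation would likely restrict the warping window.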

[0064] (2) Hierarchical clustering is performed on the dynamic time warping distance matrix to extract cluster centers; the cluster centers are arranged in order of equipment operating time to form a trajectory, and the set of cluster center trajectories is output to represent the typical evolution path of equipment health status.

[0065] In practical implementation, hierarchical clustering is performed on the dynamic time warping distance matrix to extract cluster centers. The cluster centers are then arranged in order of equipment operating time to form a trajectory, and the set of cluster center trajectories is output to characterize the typical evolution path of the equipment's health status, represented as follows:

[0066] $$T=[c_1,c_2,\dots,c_K],\qquad c_k=\frac{1}{\lvert S_k\rvert}\sum_{\tilde{x}\in S_k}\tilde{x};$$

[0067] In the formula, $T$ denotes the cluster center trajectory, that is, the sequence of cluster centers arranged in chronological order, which characterizes the evolution path of health status; $k$ is the cluster index, ranging over the $K$ clusters; $K$ denotes the number of clusters and controls the granularity of the trajectory, with a preferred setting of 5; $c_k$ denotes the vector of the $k$-th cluster center, obtained by averaging all samples in the sample set $S_k$; $c_1$ denotes the vector of the first cluster center and $c_2$ the vector of the second cluster center; $S_k$ denotes the set of samples belonging to the $k$-th cluster, and $\tilde{x}$ ranges over the noise-reduced and enhanced vibration data vectors in $S_k$.
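Step (2) can be sketched as follows. This is a minimal illustration under stated assumptions: the text does not name a linkage criterion, so average linkage is assumed; ordering the centers by the mean operating time of their member samples is one possible reading of "arranged in the order of equipment operation time"; the function names, the `times` list, and the toy data are hypothetical. A plain L1 distance stands in for the dynamic time warping distance matrix to keep the example short.

```python
def agglomerative(dist, k):
    """Average-linkage clustering on a precomputed distance matrix; returns k index lists."""
    clusters = [[i] for i in range(len(dist))]
    while len(clusters) > k:
        best = None
        for p in range(len(clusters)):
            for q in range(p + 1, len(clusters)):
                # Average pairwise distance between the members of the two clusters.
                d = sum(dist[i][j] for i in clusters[p] for j in clusters[q])
                d /= len(clusters[p]) * len(clusters[q])
                if best is None or d < best[0]:
                    best = (d, p, q)
        _, p, q = best
        clusters[p] = clusters[p] + clusters[q]
        del clusters[q]  # q > p, so index p is unaffected
    return clusters


def center_trajectory(samples, times, clusters):
    """Mean vector per cluster, ordered by the mean operating time of its members."""
    centers = []
    for members in clusters:
        dim = len(samples[members[0]])
        mean = [sum(samples[i][t] for i in members) / len(members) for t in range(dim)]
        mean_time = sum(times[i] for i in members) / len(members)
        centers.append((mean_time, mean))
    centers.sort(key=lambda item: item[0])
    return [center for _, center in centers]


samples = [[1.0, 1.0], [1.2, 0.8], [5.0, 5.0], [5.2, 4.8]]
times = [10.0, 12.0, 1.0, 3.0]  # hypothetical operating time of each sample, in hours
# Stand-in for the DTW distance matrix (assumption): L1 distance between samples.
dist = [[sum(abs(u - v) for u, v in zip(a, b)) for b in samples] for a in samples]
clusters = agglomerative(dist, k=2)
trajectory = center_trajectory(samples, times, clusters)
```

Here the two later samples form one cluster and the two earlier samples the other, and the trajectory lists the earlier cluster's center first.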

[0068] Step S106 above, which calculates the similarity weights between the noise-reduced and enhanced vibration data vector and the cluster center trajectories stored in the pre-trained model, dynamically adjusts the attention distribution, and determines the trajectory context vector, specifically includes:

[0069] (1) Calculate the cosine similarity between the noise-reduced and enhanced vibration data vector and each cluster center, and combine the average local signal-to-noise ratio and dynamic time warping distance to calculate the attention weight of each cluster center;

[0070] In practice, the cosine similarity between the denoised and enhanced vibration data vector and each cluster center is calculated. Attention weights are then calculated by combining the average local signal-to-noise ratio and the dynamic time warping distance to achieve attention allocation for trajectory perception. This allows the model to focus on the trajectory points most relevant to the current state, expressed as:

[0071] ;

[0072] In the formula, the cosine similarity between the noise-reduced and enhanced vibration data vector and the vector of the k-th cluster center measures how well the current vibration mode matches that trajectory point, and is computed as the inner product of the two vectors divided by the product of their L2 norms; the dynamic time warping distance is taken between the noise-reduced and enhanced vibration data vector and the vector of each cluster center; the multi-feature fusion coefficient balances the similarity indices, with a preferred value of 0.2; the average local signal-to-noise ratio is obtained from the multi-scale wavelet decomposition and is used to enhance the weight of the fault band; the local signal-to-noise ratio at each scale is calculated during the multi-scale wavelet decomposition and quantifies the signal-to-noise ratio at that scale, thereby adjusting the adaptive threshold; the attention weight of the k-th cluster center indicates how important that trajectory point is to the representation of the current state, with a larger weight meaning greater importance; the smoothness parameter controls the smoothness of the weight distribution, with a preferred value of 0.1; a second cluster index, used to distinguish it from k, runs over all cluster centers in the normalization of the attention weights; and the normalization uses the natural exponential function.

[0073] (2) Based on the attention weight of each cluster center, the cluster center trajectory is weighted and summed, and then adjusted in combination with the feature enhancement weight to generate a trajectory context vector that integrates global information of the health status trajectory.

[0074] In practice, the cluster center trajectories are weighted and summed based on attention weights, and then adjusted using feature enhancement weights to generate trajectory context vectors as supplementary features of the input data. This incorporates global information from the health status trajectories, and is represented as follows:

[0075] ;

[0076] In the formula, the output is the trajectory context vector, with a dimension of 8192, which integrates global information from the health status trajectory; the enhancement intensity coefficient, with a preferred value of 0.3, controls how strongly the feature enhancement weights adjust the trajectory context vector.
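The weighting and summation of step S106 can be sketched as follows. This is a minimal illustration under stated assumptions, not the formula of this embodiment: the way the cosine similarity, the dynamic time warping distance, and the average local signal-to-noise ratio are fused into one score (`cos + lam * snr_avg / (1 + dtw)`), the scalar form of the feature enhancement weight `enh`, and all function and parameter names are hypothetical. The dynamic time warping distances are passed in precomputed.

```python
import math


def cosine(u, v):
    """Cosine similarity: inner product divided by the product of the L2 norms."""
    dot = sum(a * b for a, b in zip(u, v))
    return dot / (math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v)))


def attention_weights(x, centers, dtw_dists, snr_avg, lam=0.2, tau=0.1):
    """Softmax over cluster centers of an ASSUMED fused score, with smoothness tau."""
    scores = [cosine(x, c) + lam * snr_avg / (1.0 + d) for c, d in zip(centers, dtw_dists)]
    peak = max(scores)  # subtract the maximum for numerical stability
    exps = [math.exp((s - peak) / tau) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def trajectory_context(weights, centers, enh=1.0, beta=0.3):
    """Attention-weighted sum of centers, scaled by an ASSUMED factor (1 + beta * enh)."""
    dim = len(centers[0])
    summed = [sum(w * c[d] for w, c in zip(weights, centers)) for d in range(dim)]
    return [(1.0 + beta * enh) * v for v in summed]


# Toy 3-dimensional data (hypothetical); the real vectors have 8192 dimensions.
x = [1.0, 0.0, 1.0]
centers = [[1.0, 0.1, 0.9], [0.0, 1.0, 0.0], [-1.0, 0.0, -1.0]]
dtw_dists = [0.2, 3.0, 6.0]
alpha = attention_weights(x, centers, dtw_dists, snr_avg=1.5)
context = trajectory_context(alpha, centers)
```

With a small smoothness parameter such as 0.1 the softmax is sharp, so almost all of the weight falls on the center that best matches the input; a larger value spreads the weight across more trajectory points.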

[0077] Step S108 above extracts local multi-scale features of the noise-reduced and enhanced vibration data vector through multi-branch dilated convolution, combines it with gated recurrent units to model sequence dependencies, and uses the trajectory context vector as the initial state of the gated recurrent unit to output a hidden state sequence that enhances the perception of health evolution. Specifically, it includes:

[0078] (1) Apply multi-branch dilated convolution to the noise-reduced and enhanced vibration data vector, adapt to vibration features at different scales by adjusting the threshold mean and standard deviation, and output multi-scale feature maps that capture vibration patterns under different receptive fields; wherein each branch uses a convolution kernel with a different dilation rate.

[0079] In practice, multi-branch dilated convolution is applied to the noise-reduced and enhanced vibration data vector. Each branch uses a convolution kernel with a different dilation rate. By adjusting the threshold mean and standard deviation, the system adapts to vibration features at different scales, outputting a multi-scale feature map that captures vibration patterns under different receptive fields, as shown below:

[0080] ;

[0081] In the formula, the output feature map of each branch has a dimension of 8192×32 and preserves vibration features at a specific scale; the branch index runs over the branches; the threshold mean and the threshold standard deviation of each branch are trainable parameters used to adjust the weight distribution of the dilated convolution so that it adapts to vibration features at different scales; the formula uses element-wise multiplication; the number of branches controls the richness of the multi-scale features, with a preferred setting of 4; the convolution kernel weight matrix and the bias term of each branch are trainable parameters; the dilated convolution operation with a given dilation rate expands the receptive field without increasing the number of parameters; the dilation rate of each branch controls the degree of expansion of the convolution kernel and takes the values 1, 2, 4, and 8 for the four branches; and the activation function is the rectified linear unit (ReLU).

[0082] (2) The multi-scale feature maps are concatenated along the channel dimension to generate a concatenated multi-scale feature map that integrates multi-scale spatiotemporal features, combining local details and global trend information at different scales;

[0083] In practice, the feature maps extracted by multi-branch dilated convolution are concatenated along the channel dimension to generate a joint feature map that integrates multi-scale spatiotemporal features, combining local details and global trend information at different scales, represented as:

[0084] $$F=\mathrm{Concat}(F_1,F_2,F_3,F_4);$$

[0085] In the formula, $F$ denotes the concatenated multi-scale feature map, with dimensions of 8192×128, which characterizes vibration patterns that integrate local details and global trends and carries multi-scale spatiotemporal features; $F_1$ denotes the output feature map of the first branch, with dimensions of 8192×32, corresponding to the dilated convolution result with a dilation rate of 1; $F_2$ denotes the output feature map of the second branch, with dimensions of 8192×32, corresponding to the dilated convolution result with a dilation rate of 2; $F_3$ denotes the output feature map of the third branch, with dimensions of 8192×32, corresponding to the dilated convolution result with a dilation rate of 4; $F_4$ denotes the output feature map of the fourth branch, with dimensions of 8192×32, corresponding to the dilated convolution result with a dilation rate of 8; $\mathrm{Concat}(\cdot)$ denotes the concatenation operation along the channel dimension.
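The multi-branch dilated convolution and the channel-wise concatenation can be sketched as follows. This is a simplified illustration, assuming a single input channel, a single output channel per branch (the embodiment uses 32), a shared fixed kernel, and zero padding that keeps the sequence length; the modulation by the trainable threshold mean and standard deviation is omitted because its exact form is not given in the text. The function names are hypothetical.

```python
def dilated_conv1d(signal, kernel, dilation, bias=0.0):
    """Length-preserving 1D dilated convolution (cross-correlation form) followed by ReLU."""
    half = (len(kernel) - 1) * dilation // 2
    out = []
    for t in range(len(signal)):
        acc = bias
        for j, w in enumerate(kernel):
            idx = t + j * dilation - half  # taps are spaced `dilation` samples apart
            if 0 <= idx < len(signal):     # positions outside the signal count as zero
                acc += w * signal[idx]
        out.append(max(0.0, acc))          # ReLU
    return out


def multi_branch(signal, kernel, rates=(1, 2, 4, 8)):
    """Run one branch per dilation rate and concatenate along the channel dimension."""
    branches = [dilated_conv1d(signal, kernel, r) for r in rates]
    # Result: one row per time step, one column (channel) per branch.
    return [list(step) for step in zip(*branches)]


signal = [0.0] * 8 + [1.0] + [0.0] * 8  # unit impulse at index 8
features = multi_branch(signal, kernel=[1.0, 1.0, 1.0])
```

Feeding an impulse makes the receptive fields visible: with a three-tap kernel, the branch with dilation rate r responds at offsets -r, 0, and +r from the impulse, so larger rates see a wider context with the same number of weights.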

[0086] (3) Initialize the hidden state with the linearly transformed trajectory context vector, introduce the feature enhancement weights and the trajectory context vector into the gated recurrent unit, and, by dynamically adjusting how strongly the update gate and the reset gate process historical information, model the long-term dependencies of the vibration sequence and fuse the prior knowledge of the health status trajectory, outputting a hidden state sequence with enhanced perception of health evolution.

[0087] In practical implementation, the hidden state is initialized with the linearly transformed trajectory context vector. The feature enhancement weights and the trajectory context vector are introduced into the gated recurrent unit. By dynamically adjusting how strongly the update gate and the reset gate process historical information, the long-term dependencies of the vibration sequence are modeled and the prior knowledge of the health status trajectory is fused, expressed as:

[0088] ;

[0089] In the formula, the hidden state vector at each time step has a dimension of 256 and stores sequence information; at the initial time step, the hidden state vector is the initial hidden state obtained from the trajectory context vector; the trajectory fusion coefficient, with a preferred value of 0.1, controls the weight of the trajectory context vector in the hidden state update; the linear transformation layer projects the trajectory context vector from 8192 dimensions to 256 dimensions; the candidate hidden state at each time step carries the new information computed at the current time step; the update gate at each time step controls the degree to which historical information is retained; the reset gate at each time step controls the degree to which historical information is forgotten; the time step index runs over the time steps of the sequence; the weight matrices of the candidate hidden state, the update gate, and the reset gate are trainable parameters, as are the corresponding bias terms; the input at each time step is the feature vector of the concatenated multi-scale feature map at that time step, with a dimension of 128; and the formula uses element-wise multiplication, the Sigmoid activation function, and the hyperbolic tangent activation function.

[0090] In one implementation, the linear transformation layer is a fully connected layer with 8192 input neurons and 256 output neurons and a linear activation function, which projects the trajectory context vector from 8192 dimensions to 256 dimensions.
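The initialization of the hidden state from the projected trajectory context vector can be sketched with a standard gated recurrent unit. This is an illustration only: the additional trajectory and feature-enhancement terms that this embodiment introduces into the gates are omitted because their exact form is not given in the text, the update rule follows one common convention, the dimensions are reduced from 8192, 256, and 128 to 4, 2, and 3, and the zero-valued parameters are placeholders for trained weights. All names are hypothetical.

```python
import math


def matvec(W, v):
    """Matrix-vector product for a matrix stored as a list of rows."""
    return [sum(w * x for w, x in zip(row, v)) for row in W]


def sigmoid(a):
    return 1.0 / (1.0 + math.exp(-a))


def gru_step(h, x, p):
    """One standard GRU step; p holds weight matrices over [h; x] and bias vectors."""
    hx = h + x  # list concatenation [h; x]
    z = [sigmoid(a + b) for a, b in zip(matvec(p["Wz"], hx), p["bz"])]  # update gate
    r = [sigmoid(a + b) for a, b in zip(matvec(p["Wr"], hx), p["br"])]  # reset gate
    rhx = [ri * hi for ri, hi in zip(r, h)] + x
    cand = [math.tanh(a + b) for a, b in zip(matvec(p["Wh"], rhx), p["bh"])]
    # Convention used here: z weights the candidate, (1 - z) the previous state.
    return [(1.0 - zi) * hi + zi * ci for zi, hi, ci in zip(z, h, cand)]


def run_gru(context, W_proj, inputs, p):
    """Project the context vector to the hidden size, use it as h_0, then unroll."""
    h = matvec(W_proj, context)
    states = []
    for x in inputs:
        h = gru_step(h, x, p)
        states.append(h)
    return states


def zeros(rows, cols):
    return [[0.0] * cols for _ in range(rows)]


hidden, in_dim = 2, 3
params = {name: zeros(hidden, hidden + in_dim) for name in ("Wz", "Wr", "Wh")}
params.update({name: [0.0] * hidden for name in ("bz", "br", "bh")})
W_proj = [[1.0, 0.0, 0.0, 0.0], [0.0, 0.0, 0.0, 1.0]]  # placeholder 4 -> 2 projection
states = run_gru([1.0, 2.0, 3.0, 4.0], W_proj, [[0.1, 0.2, 0.3]] * 2, params)
```

With all-zero placeholder weights every gate evaluates to 0.5 and the candidate to 0, so the projected context simply halves at each step; this makes the influence of the initial state on the hidden state sequence easy to trace.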

[0091] Step S110 above calculates the temporal attention weights of the hidden state sequence and the cluster center trajectory, generates a trajectory-aware sequence summary vector, and combines multi-scale features to predict the device's health through a fully connected layer regression. Specifically, this includes:

[0092] (1) Calculate the cosine similarity between the hidden state at each time step and all cluster centers, and combine it with the inverse of the dynamic time warping distance to calculate the temporal attention weight;

[0093] In practice, the cosine similarity between the hidden state at each time step and all cluster centers is calculated, and the attention weight is calculated by combining it with the inverse of the dynamic time warping distance. Softmax normalization is then used to focus the model on the trajectory point most relevant to the current hidden state, as shown below:

[0094] ;

[0095] In the formula, the attention weight between the hidden state at the t-th time step and the k-th cluster center indicates how important that trajectory point is to the representation of the current time step, with a larger weight meaning greater importance; the vector of the k-th cluster center has a dimension of 8192 and is taken from the cluster center trajectory; the dynamic time warping distance is computed between the hidden state vector at the t-th time step and the vector of each cluster center, and because their dimensions differ, the hidden state vector is first projected to 8192 dimensions by a linear transformation; the fusion coefficient balances the contributions of the cosine similarity and the dynamic time warping distance, with a preferred value of 0.1.

[0096] (2) The cluster center trajectory is weighted and summed based on the temporal attention weight, and then fused with the hidden state sequence through a gating mechanism to generate a trajectory-enhanced sequence summary vector to capture the health status evolution information of the entire time series;

[0097] In practice, the cluster center trajectories are weighted and summed based on the temporal attention weights, and then fused with the hidden state sequence through a gating mechanism to generate a trajectory-enhanced sequence summary vector, capturing the health status evolution information of the entire time series, represented as:

[0098] ;

[0099] In the formula, the trajectory-enhanced sequence summary vector has a dimension of 256 and characterizes the health status of the entire vibration sequence; the linear transformation layer projects the vector of each cluster center from 8192 dimensions to 256 dimensions; the fusion gate value at each time step balances the contributions of the hidden state and the trajectory center; the fusion gate weight matrix and the fusion gate bias term are trainable parameters; and the formula also uses the concatenation of the hidden state vector at each time step with the trajectory context vector.
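The gated fusion of step (2) can be sketched as follows. This is an assumption-laden illustration, since the exact formula is not reproduced in the text: the gate is taken to be element-wise, the fused states are averaged over time to form the summary vector, and the per-step trajectory vectors are assumed to be already attention-weighted and projected to the hidden size. The names `gated_fusion` and `sequence_summary` are hypothetical.

```python
import math


def sigmoid(a):
    return 1.0 / (1.0 + math.exp(-a))


def gated_fusion(h_t, c_t, W_g, b_g):
    """Blend a hidden state with a trajectory vector using a sigmoid gate over [h_t; c_t]."""
    hc = h_t + c_t  # list concatenation [h_t; c_t]
    gate = [sigmoid(sum(w * v for w, v in zip(row, hc)) + b) for row, b in zip(W_g, b_g)]
    return [g * h + (1.0 - g) * c for g, h, c in zip(gate, h_t, c_t)]


def sequence_summary(hidden_states, traj_vectors, W_g, b_g):
    """Fuse every time step, then average over time (ASSUMED pooling choice)."""
    fused = [gated_fusion(h, c, W_g, b_g) for h, c in zip(hidden_states, traj_vectors)]
    steps = len(fused)
    return [sum(f[d] for f in fused) / steps for d in range(len(fused[0]))]


# Toy data (hypothetical): 2 time steps, hidden size 2, zero-valued placeholder gate weights.
hidden_states = [[1.0, 0.0], [3.0, 2.0]]
traj_vectors = [[0.0, 0.0], [1.0, 0.0]]
W_g = [[0.0] * 4 for _ in range(2)]
b_g = [0.0, 0.0]
summary = sequence_summary(hidden_states, traj_vectors, W_g, b_g)
```

With placeholder weights of zero the gate is 0.5 everywhere, so each fused state is the midpoint of the hidden state and the trajectory vector; trained weights would shift this balance per dimension and per time step.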

[0100] (3) The sequence summary vector is concatenated with the max pooling result of the multi-scale convolutional feature map, and the health is predicted by regression through a fully connected layer. The predicted health of the device is then output.

[0101] In practice, the sequence summary vector is concatenated with the max-pooling result of the multi-scale convolutional feature map, and the health score is predicted through regression using a fully connected layer. The predicted health score of the i-th sample is output as follows:

[0102] ;

[0103] In the formula, the output is the predicted health score of the i-th sample; the weight matrix and the bias term of the fully connected layer are trainable parameters; the input to the fully connected layer is the concatenation of the trajectory-enhanced sequence summary vector with the max-pooling vector; and the max-pooling vector is the result of max pooling the multi-scale feature map along the time dimension, with a dimension of 256.
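The regression head of step (3) can be sketched as follows, assuming a single linear output unit with no output activation (the text does not state one). The function names, toy dimensions, and weight values are hypothetical.

```python
def max_pool_time(feature_map):
    """Max over the time axis of a feature map stored as rows = time steps, columns = channels."""
    return [max(channel) for channel in zip(*feature_map)]


def predict_health(summary, feature_map, weights, bias):
    """Concatenate the summary vector with the pooled features and apply one linear unit."""
    z = summary + max_pool_time(feature_map)  # list concatenation
    return sum(w * v for w, v in zip(weights, z)) + bias


# Toy data (hypothetical): summary of size 1, feature map with 2 time steps and 2 channels.
summary_vec = [0.5]
feature_map = [[0.1, 0.9], [0.4, 0.2]]
score = predict_health(summary_vec, feature_map, weights=[1.0, 0.5, 0.0], bias=0.1)
```

Max pooling keeps, for each channel, the strongest response anywhere in the sequence, which preserves short transient features that averaging over 8192 time steps would dilute.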

[0104] Equipment health prediction faces the problem of class imbalance, with a large number of healthy samples and a small number of faulty samples. Conventional loss functions such as mean squared error loss function treat all samples equally, resulting in the model being insensitive to fault states.

[0105] This embodiment calculates a weighted loss function by measuring the distance between a sample and the cluster center trajectory, emphasizing samples that are close to a faulty state, thereby improving the model's ability to detect a decline in health. The specific steps are as follows:

[0106] 1) Calculate the health distance score based on the minimum dynamic time warping distance between the sample and all cluster centers, i.e., the health distance score calculation process:

[0107] A health distance score is calculated based on the minimum dynamic time warping distance between the sample and all cluster centers. This score quantifies the deviation of the current state from a typical healthy state and is expressed as:

[0108] ;

[0109] In the formula, the minimum dynamic time warping distance is obtained by calculating the dynamic time warping distance between the noise-reduced and enhanced vibration data vector of the sample and the vector of every cluster center and taking the minimum, which gives the distance from the sample to its nearest cluster center and thereby quantifies the degree of deviation of its health status; the health distance score is the normalized result, and the smaller its value, the worse the health; the set of minimum dynamic time warping distances of all samples is used to normalize the health distance score, and the normalization uses the maximum value in this set.
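One way to realize this normalization is sketched below. The form `1 - d_min / d_max` is an assumption chosen to match the statements that the score is normalized by the maximum of the minimum distances and that a smaller score means worse health; the text does not reproduce the actual formula. The function name and the toy distances are hypothetical.

```python
def health_distance_scores(min_dists):
    """Map each sample's minimum DTW distance to a score in [0, 1] (ASSUMED form)."""
    d_max = max(min_dists)
    if d_max == 0:
        # All samples coincide with a cluster center: treat them as fully healthy.
        return [1.0 for _ in min_dists]
    return [1.0 - d / d_max for d in min_dists]


# Hypothetical minimum DTW distances of three samples to their nearest cluster center.
scores = health_distance_scores([0.0, 2.0, 4.0])
```

Under this assumed form, the sample farthest from every cluster center receives a score of 0 and a sample lying on a center receives 1.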

[0110] 2) Calculate the weighted mean squared error loss based on the health distance score and sample prediction error to obtain the model loss value, i.e., the adaptive weighted mean squared error loss calculation process:

[0111] The weighted mean squared error loss is calculated by combining the health distance score and prediction error. Samples with poor health are assigned higher weights, and the imbalance adjustment intensity is controlled by a weight scaling factor to improve the model's sensitivity to fault states. This is expressed as:

[0112] $$\mathcal{L}=\frac{1}{B}\sum_{i=1}^{B} w_i\,(y_i-\hat{y}_i)^2;$$

[0113] In the formula, $\mathcal{L}$ denotes the adaptive weighted loss function, which dynamically adjusts the loss weight of each sample based on its health distance score, assigning higher weights to samples with poor health, thereby alleviating the class imbalance problem and improving the model's sensitivity to fault states; $B$ denotes the number of samples in the batch; $y_i$ denotes the true health label of the $i$-th sample, a value between 0 and 1; $\hat{y}_i$ denotes the predicted health of the $i$-th sample; $w_i$ denotes the loss weight of the $i$-th sample, calculated from its health distance score and the weight scaling factor; the weight scaling factor controls the intensity of the imbalance adjustment, with a preferred value of 2.0; and the health distance score of the $i$-th sample is the health distance score calculated for that sample.
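The adaptive weighted mean squared error can be sketched as follows. The weight rule `1 + lam * (1 - h)` is an assumption: it gives a weight of 1 to fully healthy samples and a larger weight as the health distance score `h` falls, which matches the described behavior, but the text does not reproduce the actual weight formula. The function name and toy values are hypothetical.

```python
def weighted_mse(y_true, y_pred, health_scores, lam=2.0):
    """Mean of per-sample squared errors, each scaled by an ASSUMED health-based weight."""
    total = 0.0
    for y, y_hat, h in zip(y_true, y_pred, health_scores):
        weight = 1.0 + lam * (1.0 - h)  # lower health score -> larger weight
        total += weight * (y - y_hat) ** 2
    return total / len(y_true)


# Toy batch (hypothetical): one healthy sample and one failed sample, same absolute error.
loss = weighted_mse(y_true=[1.0, 0.0], y_pred=[0.5, 0.5], health_scores=[1.0, 0.0])
plain = weighted_mse(y_true=[1.0, 0.0], y_pred=[0.5, 0.5], health_scores=[1.0, 1.0])
```

Both samples have the same squared error, but with a scaling factor of 2.0 the failed sample contributes three times as much to the loss as the healthy one, so gradient updates are pulled toward fitting degraded states.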

[0114] The following describes the training and parameter update process of the equipment health prediction model:

[0115] (1) Description of the dataset:

[0116] During the construction of the training dataset, a large number of vibration data samples were collected from multiple devices at different operating stages (including health, decay and failure states) to ensure that the data covers the entire life cycle of the devices.

[0117] Furthermore, the collected data is labeled. The data labeling is based on the equipment's historical maintenance records and operating status logs. Domain experts assign a health label to each sample based on actual failure events and performance indicators. The label value is a continuous value between 0 and 1, where 1 represents that the equipment is completely healthy, 0 represents that the equipment has completely failed, and intermediate values correspond to different degrees of performance degradation (such as mild wear, moderate abnormality, etc.).

[0118] The labeling categories take into account common failure modes (such as imbalance, misalignment, bearing damage, etc.) and are consistent with subsequent health prediction tasks to ensure that the labels accurately reflect the evolution of equipment status.

[0119] The dataset is divided into training, validation and test sets in chronological order for model training and evaluation. The training set contains health status trajectories of multiple samples to support cluster center trajectory extraction and prediction model learning.

[0120] (2) Explanation of the training and parameter update process of the equipment health prediction model:

[0121] The equipment health prediction model is trained by iteratively optimizing the adaptive weighted loss function. The training process first initializes all trainable parameters. The model uses the Adam optimizer to minimize the weighted mean squared error loss. The learning rate is set to 0.001, and the batch size is adjusted according to hardware resources.

[0122] In each training batch, the input vibration data is denoised and enhanced by multi-scale wavelet transform, and then sequentially passed through multi-scale convolutional feature extraction, a gated recurrent network with trajectory context enhancement, and a hidden state-trajectory temporal attention module to finally output the predicted health status.

[0123] When calculating the loss, the weights are dynamically adjusted based on the sample health distance score, and samples with poor health are given a higher loss contribution to alleviate the class imbalance problem.

[0124] Parameter updates are performed by calculating gradients through backpropagation and applying gradient clipping to prevent gradient explosion.
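The text does not say which form of gradient clipping is used. As one common choice, clipping by global L2 norm can be sketched as below; the function name, the flat-list representation of the gradients, and the threshold value are assumptions.

```python
import math


def clip_by_global_norm(grads, max_norm):
    """Rescale the gradient vector so that its L2 norm does not exceed max_norm."""
    norm = math.sqrt(sum(g * g for g in grads))
    if norm <= max_norm or norm == 0.0:
        return list(grads)  # already within the limit: leave unchanged
    scale = max_norm / norm
    return [g * scale for g in grads]


clipped = clip_by_global_norm([3.0, 4.0], max_norm=1.0)   # norm 5.0 is scaled down to 1.0
untouched = clip_by_global_norm([0.3, 0.4], max_norm=1.0)  # norm 0.5 is left as is
```

Rescaling the whole vector preserves the gradient direction and only limits the step size, which is why it is often preferred over clipping each component independently.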

[0125] During training, the loss is monitored on the validation set after each iteration. The iteration is stopped when the validation set loss stops decreasing for 10 consecutive iterations or reaches the preset maximum training iterations of 1000, to ensure that the model converges fully and does not overfit. Finally, the model parameters with the best performance on the validation set are saved for prediction.
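The stopping rule described above (a patience of 10 iterations and at most 1000 iterations) can be sketched as follows. The function operates on a precomputed sequence of validation losses to stay self-contained; in actual training each loss would come from evaluating the model after an iteration, and the parameters would be saved whenever a new best loss is found. The function name and the loss values are hypothetical.

```python
def early_stopping(val_losses, patience=10, max_iters=1000):
    """Return (best_iteration, best_loss, last_iteration_run) for a stream of validation losses."""
    best_loss, best_iter, stale = float("inf"), -1, 0
    last_iter = -1
    for it, loss in enumerate(val_losses):
        if it >= max_iters:
            break
        last_iter = it
        if loss < best_loss:
            # Improvement: remember it (this is where the model parameters would be saved).
            best_loss, best_iter, stale = loss, it, 0
        else:
            stale += 1
            if stale >= patience:
                break
    return best_iter, best_loss, last_iter


# Hypothetical validation curve: improves for three iterations, then plateaus.
losses = [1.0, 0.8, 0.7] + [0.75] * 20
best_iter, best_loss, last_iter = early_stopping(losses)
```

On this curve the best loss is reached at iteration 2, and training stops at iteration 12 after ten consecutive iterations without improvement.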

[0126] After model training, equipment health prediction based on cluster center trajectories is applied to new equipment monitoring data as follows. First, adaptive denoising and feature enhancement based on multi-scale wavelet transform are performed on the vibration data collected in real time to generate a noise-reduced and enhanced vibration data vector. Then, the cluster center trajectories stored in the pre-trained model are loaded; these trajectories are the typical health state evolution paths extracted during the training phase. In the trajectory-aware attention mechanism, the cosine similarity and the dynamic time warping distance between the input vector and each cluster center are calculated to generate attention weights, and the cluster centers are weighted and summed to obtain a trajectory context vector, which serves as a supplementary feature of the input data. Next, a gated recurrent network with multi-scale convolution and trajectory enhancement processes the enhanced vibration data, initializing the hidden state with the trajectory context vector. Multi-scale features and sequence dependencies are fused using multi-branch dilated convolution and gated recurrent units. Finally, a trajectory-enhanced sequence summary vector is generated using the hidden state-trajectory temporal attention weights and concatenated with the multi-scale features, and the predicted health value is output through regression by a fully connected layer.

[0127] The entire prediction process requires no retraining, relying solely on pre-trained model parameters and cluster center trajectories to achieve efficient and accurate real-time assessment of equipment health status and provide support for equipment maintenance decisions.

[0128] In one embodiment, a comprehensive performance evaluation and comparative analysis are conducted to verify the overall performance of the proposed device health prediction method based on cluster center trajectories in practical applications. This experiment systematically compares the performance of the proposed method with four current mainstream prediction methods in the device health prediction task using multi-dimensional performance indicators and visualization charts. These methods include convolutional neural networks and long short-term memory networks combined with attention mechanisms, traditional wavelet transform combined with support vector machines, deep autoencoders, and random forest ensemble methods. These comparative methods represent the mainstream technical routes in the current field of device health prediction. The convolutional neural network and long short-term memory network combined with attention mechanisms can capture local features and long-term dependencies of time series data. The traditional wavelet transform combined with support vector machines extracts frequency domain features using wavelet transform and then uses support vector machines for regression prediction. Deep autoencoders extract features through unsupervised learning before prediction, and the random forest ensemble method is an ensemble learning algorithm based on decision trees.

[0129] In terms of experimental configuration, all methods were trained and tested on the same equipment operation vibration dataset. This dataset contains complete lifecycle data from multiple devices, ranging from healthy to faulty states, with each sample containing 8192 data points. The dataset was divided into training, validation, and test sets in chronological order to ensure fairness in the evaluation. The model parameters in this embodiment were set as follows: the maximum decomposition scale of the multi-scale wavelet transform was 8, the number of clusters was 5, the trajectory fusion coefficient was 0.1, and the weight scaling factor was 2.0. The parameters of the comparative methods were all optimized through grid search to achieve their respective optimal performance.

[0130] The experimental results are presented in four subplots illustrating the performance of the different methods. The first subplot (as shown in Figure 2) compares the health status trends predicted by each method with the actual health status over the entire equipment operating time. The experimental data uniformly use accelerated equipment failure data, obtained through accelerated simulation of artificially constructed failure defects. The health status at the end of the failure phase is uniformly marked as 0.1 (indicating a near-scrap condition). The horizontal axis represents equipment operating time in hours, and the vertical axis represents the health status score, ranging from 0 to 1. The actual health status is drawn as a thick black solid line, the method of this embodiment as a blue solid line, and the comparative methods as dashed lines of different colors and line types. The prediction curve of the method of this embodiment is closest to the actual health status curve, especially during the rapid decline in health status in the later stages of equipment operation, where it tracks the trend of health status changes more accurately, while the other methods exhibit varying degrees of prediction deviation or lag. The figure also uses background areas of different colors to indicate health status intervals: green areas represent healthy states, yellow areas represent states requiring attention, and orange and red areas represent warning and failure states, respectively, which helps to intuitively follow the process of equipment status change.

[0131] The second subplot (as shown in Figure 3) uses a radar chart to compare the method of this embodiment with the best-performing comparative method (the convolutional neural network and long short-term memory network combined with an attention mechanism) across five key dimensions: prediction accuracy, early warning capability, stability, noise resistance, and computational efficiency. Each dimension is scored from 0 to 1, with higher scores indicating better performance. The blue area in the chart represents the method of this embodiment, while the pink area represents the comparative method. The method of this embodiment significantly outperforms the comparative method in prediction accuracy, early warning capability, stability, and noise resistance, and lags only slightly in computational efficiency while still maintaining a high level. This comparison shows that the method of this embodiment achieves significant advantages in the other key performance indicators while maintaining high computational efficiency.

[0132] The third subplot (as shown in Figure 4) is a box plot of the distribution of prediction errors, specifically the absolute values of the errors of each method. The median, upper and lower quartiles, and outliers in the box plot reflect the distribution characteristics of the errors. The method of this embodiment has the lowest box position and the shortest box length, indicating that its median prediction error is the smallest and its error distribution is the most concentrated. It also has significantly fewer outliers than the other methods, indicating the best prediction stability. In contrast, the traditional wavelet transform combined with support vector machine method has both the highest median error and the widest error distribution range, exhibiting poor prediction stability.

[0133] The fourth subplot (as shown in Figure 5) is a bar chart of the overall performance score of each method, which jointly considers prediction accuracy, early warning capability, stability, noise resistance, and computational efficiency. The bars are arranged from highest to lowest overall score, with the method of this embodiment at the top and its score significantly higher than those of the other methods, demonstrating its advantage in overall performance.

[0134] Experimental results show that the equipment health prediction method based on cluster center trajectory proposed in this embodiment is superior to the current mainstream methods in terms of key indicators such as prediction accuracy, early warning capability, and stability, providing more reliable technical support for predictive maintenance of industrial equipment.

[0135] Based on the above method embodiments, this application also provides a device for predicting equipment health based on cluster center trajectories. As shown in Figure 6, the device includes: a data acquisition module 602, used to collect vibration data during equipment operation in real time; a noise reduction and feature enhancement module 604, used to perform adaptive noise reduction and feature enhancement on the vibration data based on multi-scale wavelet transform to obtain a noise-reduced and enhanced vibration data vector; a context vector determination module 606, used to calculate the similarity weight between the noise-reduced and enhanced vibration data vector and the cluster center trajectories stored in the pre-trained model, dynamically adjust the attention distribution, and determine the trajectory context vector, wherein the cluster center trajectories are obtained by performing cluster center trajectory extraction based on dynamic time warping on the noise-reduced and enhanced vibration data vectors in the training dataset; a hidden state sequence output module 608, used to extract local multi-scale features of the noise-reduced and enhanced vibration data vector through multi-branch dilated convolution, model sequence dependencies in combination with a gated recurrent unit, and use the trajectory context vector as the initial state of the gated recurrent unit to output a hidden state sequence with enhanced perception of health evolution; and a health prediction module 610, used to calculate the temporal attention weight between the hidden state sequence and the cluster center trajectories, generate a trajectory-aware sequence summary vector, and predict the health of the equipment through fully connected layer regression in combination with the multi-scale features.

[0136] Furthermore, the aforementioned denoising and feature enhancement module 604 is used to perform multi-scale wavelet decomposition and adaptive soft thresholding denoising on the vibration data to obtain soft threshold wavelet coefficients; and to perform wavelet reconstruction and feature enhancement weighted fusion on the soft threshold wavelet coefficients to obtain a denoised and enhanced vibration data vector.
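
By way of non-limiting illustration, the decomposition, soft thresholding, and weighted reconstruction performed by module 604 may be sketched as follows. This is a minimal sketch and not the implementation of this application: the Haar wavelet, the median-based noise estimate, the universal threshold, and the per-level `enhance` weights are all assumptions chosen for brevity, and the function names are hypothetical.

```python
import numpy as np

def haar_decompose(x, levels):
    """Multi-level orthonormal Haar decomposition.
    Assumes len(x) is divisible by 2**levels. details[0] is the finest scale."""
    approx, details = np.asarray(x, dtype=float), []
    for _ in range(levels):
        even, odd = approx[0::2], approx[1::2]
        details.append((even - odd) / np.sqrt(2.0))
        approx = (even + odd) / np.sqrt(2.0)
    return approx, details

def haar_reconstruct(approx, details):
    """Inverse of haar_decompose."""
    for d in reversed(details):
        out = np.empty(2 * approx.size)
        out[0::2] = (approx + d) / np.sqrt(2.0)
        out[1::2] = (approx - d) / np.sqrt(2.0)
        approx = out
    return approx

def soft_threshold(coeffs, t):
    """Shrink coefficients toward zero by t; values with magnitude below t become 0."""
    return np.sign(coeffs) * np.maximum(np.abs(coeffs) - t, 0.0)

def denoise_and_enhance(x, levels=3, enhance=None):
    """Threshold each detail level adaptively, weight it, then reconstruct."""
    approx, details = haar_decompose(x, levels)
    if enhance is None:
        enhance = np.ones(levels)  # assumed: one enhancement weight per scale
    kept = []
    for d, w in zip(details, enhance):
        sigma = np.median(np.abs(d)) / 0.6745       # assumed robust noise estimate
        t = sigma * np.sqrt(2.0 * np.log(d.size))   # assumed universal threshold
        kept.append(w * soft_threshold(d, t))
    return haar_reconstruct(approx, kept)
```

In this sketch the threshold adapts to each scale through the noise estimate of that scale, and setting an entry of `enhance` above one emphasizes the corresponding frequency band in the reconstructed vector.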

[0137] Furthermore, the above-mentioned cluster center trajectories are obtained as follows: based on the noise-reduced and enhanced vibration data vectors in the training dataset, the dynamic time warping distance between all sample pairs is calculated to form a dynamic time warping distance matrix; based on the dynamic time warping distance matrix, hierarchical clustering is performed to extract cluster centers, and the cluster centers are arranged in order of equipment operating time to form trajectories, so that a set of cluster center trajectories is output to characterize the typical evolution path of the equipment health status.
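
The extraction procedure above may be illustrated by the following sketch, given only as an example under stated assumptions. The absolute-difference local cost, average linkage, the use of a medoid as the cluster center, and the ordering of centers by the mean timestamp of their members are assumptions not specified in this paragraph; all function names are hypothetical.

```python
import numpy as np

def dtw_distance(a, b):
    """Classic dynamic time warping distance between two 1-D sequences."""
    a, b = np.asarray(a, dtype=float), np.asarray(b, dtype=float)
    n, m = len(a), len(b)
    cost = np.full((n + 1, m + 1), np.inf)
    cost[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            step = abs(a[i - 1] - b[j - 1])
            cost[i, j] = step + min(cost[i - 1, j], cost[i, j - 1], cost[i - 1, j - 1])
    return cost[n, m]

def dtw_matrix(samples):
    """Symmetric matrix of pairwise DTW distances."""
    k = len(samples)
    dist = np.zeros((k, k))
    for i in range(k):
        for j in range(i + 1, k):
            dist[i, j] = dist[j, i] = dtw_distance(samples[i], samples[j])
    return dist

def agglomerative(dist, n_clusters):
    """Average-linkage agglomerative clustering on a precomputed distance matrix."""
    clusters = [[i] for i in range(len(dist))]
    while len(clusters) > n_clusters:
        best = (np.inf, 0, 1)
        for p in range(len(clusters)):
            for q in range(p + 1, len(clusters)):
                d = dist[np.ix_(clusters[p], clusters[q])].mean()
                if d < best[0]:
                    best = (d, p, q)
        _, p, q = best
        clusters[p] = clusters[p] + clusters[q]
        del clusters[q]
    return clusters

def cluster_center_trajectory(samples, timestamps, n_clusters):
    """Return cluster centers (medoids) ordered by mean operating time, plus their indices."""
    dist = dtw_matrix(samples)
    ordered = []
    for members in agglomerative(dist, n_clusters):
        sub = dist[np.ix_(members, members)]
        medoid = members[int(np.argmin(sub.sum(axis=1)))]  # assumed center definition
        mean_time = float(np.mean([timestamps[i] for i in members]))
        ordered.append((mean_time, medoid))
    ordered.sort()
    order = [idx for _, idx in ordered]
    return [samples[idx] for idx in order], order
```

The pairwise loop is quadratic in the number of samples and each DTW call is quadratic in sequence length, so a practical implementation would likely restrict the warping window or subsample the training set.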

[0138] Furthermore, the aforementioned context vector determination module 606 is used to calculate the cosine similarity between the noise-reduced and enhanced vibration data vector and each cluster center, and to calculate the attention weight of each cluster center by combining the average local signal-to-noise ratio and the dynamic time warping distance; based on the attention weight of each cluster center, the cluster center trajectories are weighted and summed, and adjusted by combining the feature enhancement weights to generate a trajectory context vector that integrates global information of the health status trajectory.
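
As a non-limiting illustration, the weighting performed by module 606 may be sketched as follows. The way the three quantities are combined here (cosine similarity scaled by the average local signal-to-noise ratio and divided by one plus the dynamic time warping distance, followed by a softmax) is an assumed form chosen only for illustration, as is the element-wise application of the feature enhancement weights; the function names are hypothetical.

```python
import numpy as np

def cosine_similarity(x, centers):
    """Cosine similarity between vector x (D,) and each row of centers (K, D)."""
    x = np.asarray(x, dtype=float)
    centers = np.asarray(centers, dtype=float)
    den = np.linalg.norm(centers, axis=1) * np.linalg.norm(x) + 1e-12
    return (centers @ x) / den

def softmax(z):
    """Numerically stable softmax over a 1-D array."""
    e = np.exp(z - np.max(z))
    return e / e.sum()

def trajectory_context(x, centers, dtw_dists, mean_snr, enhance_weights):
    """Attention-weighted sum of cluster centers, scaled by enhancement weights."""
    cos = cosine_similarity(x, centers)
    # Assumed scoring rule: similar, nearby centers score higher, and a
    # cleaner signal (higher SNR) sharpens the attention distribution.
    scores = mean_snr * cos / (1.0 + np.asarray(dtw_dists, dtype=float))
    weights = softmax(scores)
    context = weights @ np.asarray(centers, dtype=float)
    return np.asarray(enhance_weights, dtype=float) * context, weights
```

Under this assumed rule, a low signal-to-noise ratio flattens the attention distribution, so a noisy input draws on several typical trajectories rather than committing to one.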

[0139] Furthermore, the aforementioned hidden state sequence output module 608 is used to apply multi-branch dilated convolution to the noise-reduced and enhanced vibration data vector, adjust the threshold mean and standard deviation to adapt to vibration features at different scales, and output multi-scale feature maps that capture vibration patterns under different receptive fields, wherein each branch uses a convolution kernel with a different dilation rate. The multi-scale feature maps are concatenated along the channel dimension to generate a concatenated multi-scale feature map that integrates multi-scale spatiotemporal features, thereby combining local details and global trend information at different scales. The hidden state is initialized with the linearly transformed trajectory context vector, and the feature enhancement weights and the trajectory context vector are introduced into the gated recurrent unit. By dynamically adjusting the intensity with which the update gate and reset gate process historical information, long-term dependencies in the vibration sequence are modeled and prior knowledge of the health state trajectory is fused, and a hidden state sequence with enhanced perception of health evolution is output.
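
The data flow of module 608 may be illustrated by the following sketch, which is an example only. It covers the dilated branches, channel-wise concatenation, the linearly transformed initial state, and a standard gated recurrent unit; it does not model the threshold mean and standard deviation adjustment or the manner in which the feature enhancement weights enter the gates, since those details are not given in this paragraph. Odd kernel lengths, zero padding, and all names and parameter shapes are assumptions.

```python
import numpy as np

def dilated_conv1d(x, kernel, dilation):
    """Same-length 1-D dilated cross-correlation with zero padding.
    Assumes an odd kernel length."""
    x = np.asarray(x, dtype=float)
    k = len(kernel)
    pad = dilation * (k - 1) // 2
    xp = np.pad(x, pad)
    out = np.zeros(len(x))
    for i, w in enumerate(kernel):
        out += w * xp[i * dilation : i * dilation + len(x)]
    return out

def multi_branch_features(x, kernels, dilations):
    """One branch per dilation rate, concatenated on the channel axis: (T, n_branches)."""
    branches = [dilated_conv1d(x, k, d) for k, d in zip(kernels, dilations)]
    return np.stack(branches, axis=1)

def init_hidden(context, W_init, b_init):
    """Linear transform of the trajectory context vector into the hidden size."""
    return W_init @ context + b_init

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def gru_sequence(features, h0, params):
    """Standard GRU over features (T, C), starting from h0 (H,). Returns (T, H)."""
    Wz, Uz, Wr, Ur, Wh, Uh = params
    h, states = h0, []
    for x_t in features:
        z = sigmoid(Wz @ x_t + Uz @ h)            # update gate
        r = sigmoid(Wr @ x_t + Ur @ h)            # reset gate
        h_tilde = np.tanh(Wh @ x_t + Uh @ (r * h))
        h = (1.0 - z) * h + z * h_tilde
        states.append(h)
    return np.stack(states)
```

Because the initial state comes from the trajectory context vector rather than zeros, the first recurrent steps already carry prior knowledge of the typical health evolution paths.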

[0140] Furthermore, the aforementioned health prediction module 610 is used to calculate the cosine similarity between the hidden state at each time step and all cluster centers, and to calculate the temporal attention weight by combining the inverse of the dynamic time warping distance; the cluster center trajectories are weighted and summed based on the temporal attention weights, and then fused with the hidden state sequence through a gating mechanism to generate a trajectory-enhanced sequence summary vector to capture the health status evolution information of the entire time series; the sequence summary vector is concatenated with the max pooling result of the multi-scale convolutional feature map, and the health is predicted through a fully connected layer regression, outputting the predicted health of the device.
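
By way of example only, the computation of module 610 may be sketched as follows. The sketch assumes that the cluster centers have already been projected to the hidden size, that the similarity is multiplied by the inverse dynamic time warping distance before a softmax over centers, that the gate is a sigmoid of a linear map of the concatenated hidden state and trajectory term, and that the fused sequence is averaged over time to form the summary vector. None of these specific forms is stated in this paragraph, and all names are hypothetical.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def temporal_attention(hidden, centers, dtw_dists, eps=1e-6):
    """Per-time-step attention over cluster centers.
    hidden: (T, H), centers: (K, H), dtw_dists: (K,). Returns (T, K)."""
    hn = hidden / (np.linalg.norm(hidden, axis=1, keepdims=True) + 1e-12)
    cn = centers / (np.linalg.norm(centers, axis=1, keepdims=True) + 1e-12)
    scores = (hn @ cn.T) / (np.asarray(dtw_dists, dtype=float) + eps)
    scores = scores - scores.max(axis=1, keepdims=True)  # stabilize the softmax
    w = np.exp(scores)
    return w / w.sum(axis=1, keepdims=True)

def predict_health(hidden, centers, dtw_dists, features, Wg, w_out, b_out):
    """Gate-fuse hidden states with attended centers, pool, and regress."""
    alpha = temporal_attention(hidden, centers, dtw_dists)
    traj = alpha @ centers                                         # (T, H)
    gate = sigmoid(np.concatenate([hidden, traj], axis=1) @ Wg)    # Wg: (2H, H)
    fused = gate * hidden + (1.0 - gate) * traj
    summary = fused.mean(axis=0)      # assumed reduction over time
    pooled = features.max(axis=0)     # max pooling over time, per channel
    return float(np.concatenate([summary, pooled]) @ w_out + b_out)
```

The final line is a single fully connected regression unit; a deeper head or an output squashing function could be substituted without changing the structure of the sketch.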

[0141] Furthermore, the model loss value of the above model is calculated as follows: the health distance score is calculated based on the minimum dynamic time warping distance between the sample and all cluster centers; the weighted mean square error loss is calculated based on the health distance score and the sample prediction error to obtain the model loss value.
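
One possible form of this loss is sketched below for illustration. The normalization of the minimum distance to the range zero to one and the weight of one plus the score are assumptions; this paragraph states only that the score derives from the minimum dynamic time warping distance and that it weights the squared prediction error. The function names are hypothetical.

```python
import numpy as np

def health_distance_score(dtw_to_centers, eps=1e-12):
    """Per-sample score from the minimum DTW distance to any cluster center.
    dtw_to_centers: (N, K). Assumed normalization: divide by the batch maximum."""
    d_min = np.min(np.asarray(dtw_to_centers, dtype=float), axis=1)
    return d_min / (d_min.max() + eps)

def weighted_mse_loss(pred, target, dtw_to_centers):
    """Weighted mean square error; samples far from every center weigh more (assumed)."""
    weights = 1.0 + health_distance_score(dtw_to_centers)
    err2 = (np.asarray(pred, dtype=float) - np.asarray(target, dtype=float)) ** 2
    return float(np.sum(weights * err2) / np.sum(weights))
```

With this choice, a sample that matches no typical trajectory well contributes up to twice the weight of a sample lying on a cluster center, which pushes training toward atypical degradation patterns.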

[0142] The device provided in this application embodiment has the same implementation principle and technical effect as the aforementioned method embodiment. For the sake of brevity, any parts of the device embodiment not mentioned can be referred to the corresponding content in the aforementioned method embodiment.

[0143] This application also provides an electronic device. Figure 7 is a schematic structural diagram of the electronic device, which includes a processor 71 and a memory 70. The memory 70 stores computer-executable instructions that can be executed by the processor 71, and the processor 71 executes the computer-executable instructions to implement the above-described method.

[0144] In the embodiment illustrated in Figure 7, the electronic device further includes a bus 72 and a communication interface 73, wherein the processor 71, the communication interface 73, and the memory 70 are connected via the bus 72.

[0145] The memory 70 may include high-speed random access memory (RAM) and may also include non-volatile memory, such as at least one disk storage device. Communication between the network element of this system and at least one other network element is achieved through at least one communication interface 73 (which may be wired or wireless), and may use the Internet, a wide area network, a local area network, a metropolitan area network, etc. The bus 72 may be an ISA (Industry Standard Architecture) bus, a PCI (Peripheral Component Interconnect) bus, or an EISA (Extended Industry Standard Architecture) bus, etc. The bus 72 can be divided into an address bus, a data bus, a control bus, etc. For ease of representation, the bus is shown in Figure 7 as a single double-headed arrow, but this does not mean that there is only one bus or one type of bus.

[0146] The processor 71 may be an integrated circuit chip with signal processing capabilities. In implementation, each step of the above method can be completed by the integrated logic circuitry in the hardware of the processor 71 or by instructions in software form. The processor 71 can be a general-purpose processor, including a Central Processing Unit (CPU), a Network Processor (NP), etc.; it can also be a Digital Signal Processor (DSP), an Application Specific Integrated Circuit (ASIC), a Field-Programmable Gate Array (FPGA), or other programmable logic devices, discrete gate or transistor logic devices, or discrete hardware components. The general-purpose processor can be a microprocessor or any conventional processor. The steps of the method disclosed in the embodiments of this application can be directly embodied in the execution of a hardware decoding processor, or can be executed by a combination of hardware and software modules in the decoding processor. The software modules can reside in random access memory, flash memory, read-only memory, programmable read-only memory, electrically erasable programmable memory, registers, or other mature storage media in the art. The storage medium is located in the memory, and the processor 71 reads the information in the memory and, in conjunction with its hardware, completes the steps of the method described in the foregoing embodiment.

[0147] This application also provides a computer-readable storage medium storing computer-executable instructions. When the computer-executable instructions are called and executed by a processor, the computer-executable instructions cause the processor to implement the above-described method. For specific implementation details, please refer to the foregoing method embodiments, which will not be repeated here.

[0148] The computer program products of the methods, apparatus, and electronic devices provided in the embodiments of this application include a computer-readable storage medium storing program code. The instructions included in the program code can be used to execute the methods described in the preceding method embodiments. For specific implementations, please refer to the method embodiments, which will not be repeated here.

[0149] Unless otherwise specifically stated, the relative steps, numerical expressions, and values of the components and steps described in these embodiments do not limit the scope of this application.

[0150] If the aforementioned functions are implemented as software functional units and sold or used as independent products, they can be stored in a processor-executable, non-volatile, computer-readable storage medium. Based on this understanding, the technical solution of this application, in essence, or the part that contributes to the prior art, or a portion of the technical solution, can be embodied in the form of a software product. This computer software product is stored in a storage medium and includes several instructions to cause a computer device (which may be a personal computer, server, or network device, etc.) to execute all or part of the steps of the methods described in the various embodiments of this application. The aforementioned storage medium includes various media capable of storing program code, such as USB flash drives, portable hard drives, read-only memory (ROM), random access memory (RAM), magnetic disks, or optical disks.

[0151] In the description of this application, it should be noted that the terms "center," "upper," "lower," "left," "right," "vertical," "horizontal," "inner," and "outer," etc., indicate the orientation or positional relationship based on the orientation or positional relationship shown in the accompanying drawings. They are used only for the convenience of describing this application and simplifying the description, and do not indicate or imply that the device or element referred to must have a specific orientation, or be constructed and operated in a specific orientation. Therefore, they should not be construed as limitations on this application. Furthermore, the terms "first," "second," and "third" are used for descriptive purposes only and should not be construed as indicating or implying relative importance.

[0152] Finally, it should be noted that the above-described embodiments are merely specific implementations of this application, used to illustrate the technical solutions of this application, and not to limit them. The protection scope of this application is not limited thereto. Although this application has been described in detail with reference to the foregoing embodiments, those skilled in the art should understand that any person skilled in the art can still modify or easily conceive of changes to the technical solutions described in the foregoing embodiments, or make equivalent substitutions for some of the technical features, within the technical scope disclosed in this application. Such modifications, changes, or substitutions do not cause the essence of the corresponding technical solutions to deviate from the spirit and scope of the technical solutions of the embodiments of this application, and should all be covered within the protection scope of this application. Therefore, the protection scope of this application should be determined by the protection scope of the claims.

Claims

1. A method for predicting equipment health based on cluster center trajectories, characterized in that the method includes: acquiring vibration data in real time during equipment operation; performing adaptive denoising and feature enhancement based on multi-scale wavelet transform on the vibration data to obtain a denoised and enhanced vibration data vector; calculating the similarity weights between the denoised and enhanced vibration data vector and the cluster center trajectories stored in the pre-trained model, dynamically adjusting the attention distribution, and determining the trajectory context vector; wherein the cluster center trajectories are obtained as follows: based on the denoised and enhanced vibration data vectors in the training dataset, the dynamic time warping distance between all sample pairs is calculated to form a dynamic time warping distance matrix; based on the dynamic time warping distance matrix, hierarchical clustering is performed to extract cluster centers, and the cluster centers are arranged in order of equipment operating time to form trajectories, outputting a set of cluster center trajectories that characterize the typical evolution path of the equipment's health status; extracting local multi-scale features of the denoised and enhanced vibration data vector using multi-branch dilated convolution, modeling sequence dependencies using a gated recurrent unit (GRU), using the trajectory context vector as the initial state of the GRU, and outputting a hidden state sequence with enhanced perception of health evolution, which includes: applying multi-branch dilated convolution to the denoised and enhanced vibration data vector, adjusting the threshold mean and standard deviation to adapt to vibration features at different scales, and outputting multi-scale feature maps that capture vibration patterns under different receptive fields, wherein each branch uses a convolution kernel with a different dilation rate; concatenating the multi-scale feature maps along the channel dimension to generate a concatenated multi-scale feature map that integrates multi-scale spatiotemporal features, thereby combining local details and global trend information at different scales; initializing the hidden state with the linearly transformed trajectory context vector, introducing feature enhancement weights and the trajectory context vector into the GRU, and dynamically adjusting the intensity with which the update gate and reset gate process historical information, so as to model long-term dependencies in the vibration sequence and fuse prior knowledge of the health state trajectory, and outputting the hidden state sequence with enhanced perception of health evolution; and calculating the temporal attention weights between the hidden state sequence and the cluster center trajectories, generating a trajectory-aware sequence summary vector, and predicting the health of the equipment through fully connected layer regression in combination with the multi-scale features.

2. The method according to claim 1, characterized in that the step of performing adaptive denoising and feature enhancement based on multi-scale wavelet transform on the vibration data to obtain a denoised and enhanced vibration data vector includes: performing multi-scale wavelet decomposition and adaptive soft thresholding denoising on the vibration data to obtain soft-threshold wavelet coefficients; and performing wavelet reconstruction and feature-enhancement weighted fusion on the soft-threshold wavelet coefficients to obtain the denoised and enhanced vibration data vector.

3. The method according to claim 1, characterized in that, The steps of calculating the similarity weights between the denoised and enhanced vibration data vector and the cluster center trajectories stored in the pre-trained model, dynamically adjusting the attention distribution, and determining the trajectory context vector include: The cosine similarity between the noise-reduced and enhanced vibration data vector and each cluster center is calculated, and the attention weight of each cluster center is calculated by combining the average local signal-to-noise ratio and the dynamic time warping distance. Based on the attention weight of each cluster center, the trajectories of the cluster centers are weighted and summed, and then adjusted in conjunction with feature enhancement weights to generate a trajectory context vector that integrates global information of the health status trajectory.

4. The method according to claim 1, characterized in that the step of calculating the temporal attention weights between the hidden state sequence and the cluster center trajectories, generating a trajectory-aware sequence summary vector, and predicting the health of the equipment through fully connected layer regression in combination with the multi-scale features includes: calculating the cosine similarity between the hidden state at each time step and all cluster centers, and combining it with the inverse of the dynamic time warping distance to calculate the temporal attention weights; performing a weighted sum of the cluster center trajectories based on the temporal attention weights, and fusing the result with the hidden state sequence through a gating mechanism to generate a trajectory-enhanced sequence summary vector that captures the health status evolution information of the entire time series; and concatenating the sequence summary vector with the max pooling result of the multi-scale convolutional feature map, predicting the health through fully connected layer regression, and outputting the predicted health of the equipment.

5. The method according to claim 1, characterized in that the model loss value is calculated as follows: the health distance score is calculated based on the minimum dynamic time warping distance between the sample and all cluster centers; the weighted mean square error loss is calculated based on the health distance score and the sample prediction error to obtain the model loss value.

6. A device for predicting equipment health based on cluster center trajectories, characterized in that, The device includes: The data acquisition module is used to collect vibration data of the equipment in real time during operation; The noise reduction and feature enhancement module is used to perform adaptive noise reduction and feature enhancement on the vibration data based on multi-scale wavelet transform to obtain a noise-reduced and enhanced vibration data vector. The context vector determination module is used to calculate the similarity weight between the denoised and enhanced vibration data vector and the cluster center trajectories stored in the pre-trained model, dynamically adjust the attention distribution, and determine the trajectory context vector. The cluster center trajectories are obtained as follows: based on the denoised and enhanced vibration data vectors in the training dataset, the dynamic time-warped distance between all sample pairs is calculated to form a dynamic time-warped distance matrix; based on the dynamic time-warped distance matrix, hierarchical clustering is performed to extract cluster centers, and the formation trajectories of the cluster centers are arranged according to the equipment's operating time order, outputting a set of cluster center trajectories to characterize the typical evolution path of the equipment's health status. The hidden state sequence output module is used to extract local multi-scale features of the denoised and enhanced vibration data vector through multi-branch dilated convolution, model sequence dependencies by combining a gated recurrent unit, and use the trajectory context vector as the initial state of the gated recurrent unit to output a hidden state sequence with enhanced perception of health evolution. 
This includes: applying multi-branch dilated convolution to the denoised and enhanced vibration data vector, adjusting the threshold mean and standard deviation to adapt to vibration features at different scales, and outputting multi-scale feature maps capturing vibration patterns under different receptive fields; wherein each branch uses a convolution kernel with a different dilation rate; stitching the multi-scale feature maps along the channel dimension to generate a stitched multi-scale feature map that integrates multi-scale spatiotemporal features to integrate local details and global trend information at different scales; initializing the hidden state with the linearly transformed trajectory context vector, introducing feature enhancement weights and the trajectory context vector into the gated recurrent unit, and dynamically adjusting the processing intensity of the update gate and reset gate on historical information to achieve the modeling of long-term dependencies in the vibration sequence and the fusion of prior knowledge of the health state trajectory, outputting a hidden state sequence with enhanced perception of health evolution. The health prediction module is used to calculate the temporal attention weights of the hidden state sequence and the cluster center trajectory, generate a trajectory-aware sequence summary vector, and predict the health of the device by combining multi-scale features through fully connected layer regression.

7. An electronic device, characterized in that the electronic device includes a processor and a memory, the memory storing computer-executable instructions executable by the processor, the processor executing the computer-executable instructions to implement the method of any one of claims 1 to 5.

8. A computer-readable storage medium, characterized in that, The computer-readable storage medium stores computer-executable instructions that, when invoked and executed by a processor, cause the processor to perform the method described in any one of claims 1 to 5.