A device state detection method, apparatus, device and medium

By combining adaptive frequency band weighting and temporal attention masking with wavelet packet decomposition and energy entropy calculation, non-stationary signal features are dynamically captured, solving the problems of frequency band energy imbalance and noise coupling in vibration signal processing in traditional methods, and achieving earlier fault warning and higher classification accuracy.

CN122241197A · Pending · Publication Date: 2026-06-19 · SHANDONG ENERGY DIGITAL CLOUD TECH CO LTD

Patent Information

Authority / Receiving Office
CN · China
Patent Type
Applications(China)
Current Assignee / Owner
SHANDONG ENERGY DIGITAL CLOUD TECH CO LTD
Filing Date
2026-05-25
Publication Date
2026-06-19

Figure CN122241197A_ABST

Abstract

This invention provides a method, apparatus, device, and medium for equipment condition detection, relating to the field of equipment condition monitoring technology. The method includes: inputting vibration data of rotating components in the equipment into a multi-band feature fusion module, performing adaptive frequency band weighting and temporal attention masking to obtain weighted fusion features; inputting the weighted fusion features into a wavelet packet entropy feature mapping module, performing transient feature and frequency band energy analysis to obtain frequency band energy entropy features; inputting the frequency band energy entropy features into a frequency band adaptive convolution module, performing depthwise separable convolution operations using dynamic convolution kernels to obtain a feature map; inputting the feature map into a dual attention enhancement module, optimizing the feature map through dual-path weighting of frequency bands and channels to obtain a dual-path attention enhanced feature map; and inputting the dual-path attention enhanced feature map into a maximum margin prototype classifier to obtain the equipment condition detection result. The technical solution of this invention can improve the accuracy of condition detection.

Description

Technical Field

[0001] This invention relates to the field of equipment condition monitoring technology, and in particular to an equipment condition monitoring method, apparatus, device, and medium.

Background Technology

[0002] As industrial equipment grows in scale, critical rotating components such as bearings, gearboxes, and motor shafts become increasingly important, and vibration signal analysis, the most common health monitoring approach, has been widely used in equipment fault diagnosis. Vibration data reflects abnormal conditions that occur during equipment operation. By analyzing changes in vibration signals, potential equipment failures can be predicted in advance, so that effective maintenance measures can be taken and the downtime losses and safety hazards caused by equipment failures can be avoided.

[0003] Currently, traditional methods for detecting the vibration state of rotating machinery typically employ the Fast Fourier Transform (FFT), Short-Time Fourier Transform (STFT), or Wavelet Transform (WT) to extract time-frequency domain features, followed by classification using Support Vector Machines (SVMs), Random Forests (RFs), or Convolutional Neural Networks (CNNs). However, methods like the FFT cannot effectively handle the uneven energy distribution and noise coupling in vibration signals, making it difficult to accurately capture minute defect features. Traditional time-frequency analysis methods suffer from a conflict between time-domain and frequency-domain resolution, making it difficult to analyze transient features and the frequency band energy distribution in non-stationary signals; in the early stages of a fault in particular, they struggle to capture abnormal energy accumulation. Traditional CNNs use fixed convolution kernels and therefore cannot dynamically adapt to frequency band variations in vibration signals, resulting in poor feature extraction, particularly when the signal characteristics differ significantly across frequency bands. Existing classification methods, such as the Softmax classifier, are susceptible to intra-class variability and inter-class similarity, leading to blurred decision boundaries; class mixing can occur especially between the healthy state and early degradation, which reduces classification accuracy.

Summary of the Invention

[0004] In view of this, the purpose of this invention is to provide a method, apparatus, device, and medium for equipment status detection. By combining adaptive frequency band weighting and temporal attention masking, noise interference is effectively suppressed and the extraction of weak fault features is enhanced. By combining wavelet packet decomposition and energy entropy calculation, transient features and the frequency band energy distribution in non-stationary signals are dynamically captured. A dynamic frequency band convolution kernel is constructed, which adaptively adjusts its frequency response characteristics to follow feature changes in different frequency bands. Through the maximum margin prototype learning strategy, the clustering of same-class samples and the separation between classes are strengthened, effectively improving the accuracy and robustness of vibration data classification.

[0005] In a first aspect, embodiments of the present invention provide a device status detection method, the method comprising: acquiring vibration data of rotating components in the equipment; inputting the vibration data into the multi-band feature fusion module, and performing adaptive frequency band weighting and temporal attention masking on the vibration data to obtain weighted fusion features; inputting the weighted fusion features into the wavelet packet entropy feature mapping module, and performing transient feature and frequency band energy analysis on the weighted fusion features to obtain frequency band energy entropy features; inputting the frequency band energy entropy features into the frequency band adaptive convolution module, and performing a depthwise separable convolution operation using dynamic convolution kernels to obtain a feature map; inputting the feature map into the dual attention enhancement module, and optimizing the feature map through dual-path weighting of frequency bands and channels to obtain a dual-path attention enhanced feature map; and inputting the dual-path attention enhanced feature map into the maximum margin prototype classifier to obtain the device status detection result.

[0006] In a preferred embodiment of the present invention, inputting the vibration data into the multi-band feature fusion module and performing adaptive frequency band weighting and temporal attention masking on the vibration data to obtain weighted fusion features includes: acquiring the standard deviation data and the sharpening factor of the vibration training data at multiple frequency points; determining the adaptive frequency band weights based on the standard deviation data and the sharpening factor; performing global average pooling and linear transformation on the vibration data to generate a temporal attention mask; and determining the weighted fusion features based on the adaptive frequency band weights and the temporal attention mask.

[0007] In a preferred embodiment of the present invention, inputting the weighted fusion features into the wavelet packet entropy feature mapping module and performing transient feature and frequency band energy analysis on the weighted fusion features to obtain the frequency band energy entropy features includes: performing wavelet packet decomposition on the weighted fusion features and calculating the energy entropy corresponding to at least one decomposition layer; determining the adaptive weight corresponding to each decomposition layer based on the maximum number of decomposition layers and the attenuation coefficient of the wavelet packet decomposition; and determining the frequency band energy entropy features based on the energy entropy and the adaptive weight corresponding to each decomposition layer.

[0008] In a preferred embodiment of the present invention, inputting the frequency band energy entropy features into the frequency band adaptive convolution module and performing a depthwise separable convolution operation using dynamic convolution kernels to obtain a feature map includes: obtaining the dynamic frequency band convolution kernel corresponding to each network layer in the frequency band adaptive convolution module, wherein the frequency band adaptive convolution module is a deep convolutional neural network; and performing a depthwise separable convolution operation based on the dynamic frequency band convolution kernels and the frequency band energy entropy features to determine the feature map.

[0009] In a preferred embodiment of the present invention, inputting the feature map into the dual attention enhancement module and optimizing the feature map through dual-path weighting of frequency bands and channels to obtain the dual-path attention enhanced feature map includes: performing global average pooling on the feature map and then inputting the result into a trainable weight matrix for transformation to obtain a frequency band attention weight vector and a channel attention weight vector; and determining the dual-path attention enhanced feature map based on the frequency band attention weight vector, the channel attention weight vector, and the feature map.

[0010] In a preferred embodiment of the present invention, inputting the dual-path attention enhanced feature map into the maximum margin prototype classifier to obtain the device status detection result includes: inputting the dual-path attention enhanced feature map into the feature encoder to obtain an encoded feature vector; and determining the device status detection result based on the encoded feature vector and the category prototype vector.

[0011] In a preferred embodiment of the present invention, before acquiring the vibration data of the rotating components in the equipment, the method further includes: obtaining sample vibration data of rotating components in the equipment; and inputting the sample vibration data into the initial equipment state detection model and training that model to obtain the equipment state detection model. The equipment state detection model includes a multi-band feature fusion module, a wavelet packet entropy feature mapping module, a frequency band adaptive convolution module, a dual attention enhancement module, and a maximum margin prototype classifier.

[0012] In a second aspect, embodiments of the present invention also provide a device status detection apparatus, comprising: an input unit, used to acquire vibration data of rotating components in the equipment; a feature fusion unit, used to input the vibration data into the multi-band feature fusion module and perform adaptive frequency band weighting and temporal attention masking on the vibration data to obtain weighted fusion features; a wavelet packet processing unit, used to input the weighted fusion features into the wavelet packet entropy feature mapping module and perform transient feature and frequency band energy analysis on the weighted fusion features to obtain the frequency band energy entropy features; a convolutional unit, used to input the frequency band energy entropy features into the frequency band adaptive convolution module and perform a depthwise separable convolution operation using dynamic convolution kernels to obtain the feature map; an attention enhancement unit, used to input the feature map into the dual attention enhancement module and optimize the feature map through dual-path weighting of frequency bands and channels to obtain a dual-path attention enhanced feature map; and a classification unit, used to input the dual-path attention enhanced feature map into the maximum margin prototype classifier to obtain the device status detection result.

[0013] Thirdly, embodiments of the present invention also provide an electronic device, including a processor and a memory, wherein the memory stores computer-executable instructions that can be executed by the processor, and the processor executes the computer-executable instructions to implement the device state detection method of the first aspect described above.

[0014] Fourthly, embodiments of the present invention also provide a computer-readable storage medium storing computer-executable instructions, which, when invoked and executed by a processor, cause the processor to implement the device state detection method of the first aspect described above.

[0015] The embodiments of the present invention bring the following beneficial effects: This invention provides a device condition detection method. Through the combined processing of adaptive frequency band weighting and temporal attention masking, it achieves dual enhancement of vibration data in both the frequency and time domains, significantly improving the detection limit for minute defects such as early bearing cracks and early gear wear, and avoiding the feature masking problem caused by noise coupling with fault features in traditional methods. By combining wavelet packet decomposition with energy entropy calculation, it enables refined multi-resolution analysis of non-stationary signals, overcoming the time / frequency resolution contradiction problem of short-time Fourier transform. By calculating the energy entropy of each decomposition layer, the degree of disorder in the frequency band energy distribution is quantified, achieving earlier fault warning capabilities than traditional time-frequency analysis. Employing a dynamic frequency band convolution kernel, it can adaptively adjust the frequency response characteristics according to the actual frequency band distribution of the input signal, exhibiting stronger feature extraction capabilities and generalization performance when processing vibration data under different operating conditions or at different degradation stages. Using a maximum margin prototype learning strategy, compared with the Softmax classifier, it strengthens the clustering of similar samples and the separation between classes, effectively improving the accuracy and robustness of vibration data classification.

[0016] Other features and advantages of the invention will be set forth in the following description, or some features and advantages may be inferred from the description or determined without doubt, or may be learned by practicing the techniques described above.

[0017] To make the above-mentioned objects, features and advantages of the present invention more apparent and understandable, preferred embodiments are described below in detail with reference to the accompanying drawings.

Attached Figure Description

[0018] To more clearly illustrate the specific embodiments of the present invention or the technical solutions in the prior art, the drawings used in the description of the specific embodiments or the prior art will be briefly introduced below. Obviously, the drawings described below are some embodiments of the present invention. For those skilled in the art, other drawings can be obtained from these drawings without creative effort.

[0019] Figure 1 is a flowchart of a device status detection method provided in an embodiment of the present invention; Figure 2a is a flowchart of another device status detection method provided in an embodiment of the present invention; Figure 2b is a comparison of the classification accuracy of different methods during the equipment degradation stage; Figure 2c is a schematic diagram of the feature space distribution of the traditional method; Figure 2d is a schematic diagram of the feature space distribution of the technical solution in an embodiment of the present invention; Figure 3 is a schematic diagram of the structure of a device status detection apparatus provided in an embodiment of the present invention; Figure 4 is a schematic diagram of the structure of an electronic device provided in an embodiment of the present invention.

Detailed Implementation

[0020] To make the objectives, technical solutions, and advantages of the embodiments of the present invention clearer, the technical solutions of the present invention will be clearly and completely described below with reference to the accompanying drawings. Obviously, the described embodiments are only some embodiments of the present invention, not all embodiments. Based on the embodiments of the present invention, all other embodiments obtained by those skilled in the art without creative effort are within the scope of protection of the present invention.

[0021] As industrial equipment grows in scale, critical rotating components such as bearings, gearboxes, and motor shafts become increasingly important, and vibration data analysis, the most common health monitoring approach, has been widely used in equipment fault diagnosis. Vibration data reflects abnormal conditions that occur during equipment operation. By analyzing changes in vibration data, potential equipment failures can be predicted in advance, so that effective maintenance measures can be taken and the downtime losses and safety hazards caused by equipment failures can be avoided.

[0022] Traditional methods typically employ the Fast Fourier Transform (FFT), Short-Time Fourier Transform (STFT), or Wavelet Transform (WT) to extract time-frequency domain features, then combine them with Support Vector Machines (SVMs), Random Forests (RFs), or Convolutional Neural Networks (CNNs) for classification. However, vibration data from rotating machinery often contains multiple physical sources (such as gear meshing, bearing rolling, and shaft imbalance), with significant differences in energy distribution across different frequency bands, and strong background noise easily couples with critical fault frequency bands. The traditional FFT and fixed filtering methods struggle to adaptively suppress noise and amplify weak fault features, making it difficult to capture early, minor defects (such as bearing pitting and early gear cracks). Vibration data corresponding to different fault types and different degradation stages exhibits significant differences in dominant frequency bands and waveform morphology. Existing CNNs use convolutional kernels with fixed sizes and parameters and cannot dynamically adjust their frequency response characteristics to these variations, resulting in insufficient adaptability and generalization ability in feature extraction. Furthermore, traditional Softmax classifiers and conventional prototype learning methods lack explicit inter-class margin constraints, which easily leads to blurred decision boundaries, reduces the recognition accuracy in early degradation stages, and fails to meet the needs of refined condition monitoring throughout the entire lifecycle.

[0023] Based on this, the device status detection method provided in this embodiment of the invention effectively suppresses noise interference and enhances the extraction of weak fault features by combining adaptive frequency band weights and temporal attention masks. By combining wavelet packet decomposition and energy entropy calculation, it dynamically captures transient features and the frequency band energy distribution in non-stationary signals. It constructs dynamic frequency band convolution kernels that adaptively adjust their frequency response characteristics to follow feature changes in different frequency bands. Through the maximum margin prototype learning strategy, it strengthens the clustering of same-class samples and the separation between classes, effectively improving the accuracy and robustness of vibration data classification.

[0024] To facilitate understanding of this embodiment, a device status detection method disclosed in this embodiment of the invention will first be described in detail.

[0025] Example 1: This embodiment of the invention provides a device status detection method. Figure 1 is a flowchart of the device status detection method provided in this embodiment. As shown in Figure 1, the device status detection method may include the following steps: Step S101: Obtain vibration data of rotating components in the equipment.

[0026] Rotating components refer to mechanical parts in equipment that undergo rotational motion, typically including bearings, gearboxes, motor rotors, shaft systems, and fan blades. These components are the core of rotating machinery, and their health directly affects the reliability of the entire machine.

[0027] Specifically, vibration data of rotating components can be collected using sensors deployed on rotating equipment. For example, a triaxial accelerometer deployed on the surface of the rotating component can be used to collect vibration data, continuously capturing the time-domain vibration waveform during equipment operation at a fixed sampling frequency. Velocity sensors, displacement sensors, acoustic emission sensors, and ultrasonic sensors can also be used to collect vibration data.

[0028] During data acquisition, the sensor's original analog signal is converted into a digital signal and transmitted over an industrial fieldbus to the edge computing gateway for preliminary filtering and buffering.

[0029] Vibration data sources cover continuous vibration data of rotating components under different operating conditions (no load, rated load, overload), different speeds (low speed, medium speed, high speed), and different health conditions, ensuring complete coverage of the frequency band of equipment fault characteristics.

[0030] Step S102: Input the vibration data into the multi-band feature fusion module, perform adaptive frequency band weighting and temporal attention masking on the vibration data, and obtain weighted fusion features.

[0031] Specifically, vibration data is input into the multi-band feature fusion module, which divides the vibration data into multiple frequency sub-bands. Features of each frequency sub-band are extracted and then fused. During feature extraction, the standard deviation of each frequency sub-band is used to automatically calculate the weight coefficient of that sub-band in the frequency domain. In the time domain, a weight vector of the same length as the vibration data is generated and used as a time-domain attention mask. The weighted frequency sub-bands are then multiplied element-wise with the time-domain attention mask and summed to obtain the weighted fusion features. The division into frequency sub-bands can be achieved using methods such as the Fourier transform or wavelet packet decomposition.

[0032] Furthermore, the vibration data is input into a multi-band feature fusion module, where adaptive frequency band weighting and temporal attention masking are performed on the vibration data to obtain weighted fusion features. This includes: acquiring the standard deviation data and sharpening factor of the vibration training data at multiple frequency points; determining the adaptive frequency band weights based on the standard deviation data and the sharpening factor; performing global average pooling and linear transformation on the vibration data to generate a temporal attention mask; and determining the weighted fusion features based on the adaptive frequency band weights and the temporal attention mask.

[0033] Vibration training data refers to the vibration data used to train the multi-band feature fusion module; it is also called the training set. From the standard deviation of the training set at every frequency point, combined with a sharpening factor, an adaptive frequency band weight is calculated for each frequency point. This weight amplifies the frequency bands with high volatility and high discriminative power in the training data, expressed as:

$$w_f = \frac{\exp(\gamma \sigma_f)}{\sum_{k=1}^{F} \exp(\gamma \sigma_k)}$$

[0034] In the formula, $w_f$ represents the adaptive frequency band weight at the f-th frequency point, used to amplify high-discrimination frequency bands; $\sigma_f$ is the standard deviation of the training set at the f-th frequency point; $\sigma_k$ is the standard deviation of the training set at the k-th frequency point, characterizing the volatility of the data at that frequency point; $\gamma$ is the sharpening factor, controlling the sharpness of the weight distribution, e.g., $\gamma = 1.5$; $\exp(\cdot)$ represents the natural exponential function; $F$ is the total number of frequency points; f and k are positive integers.
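As an illustration of the weighting described above, the following Python sketch assumes the weights take a softmax form over the per-frequency standard deviations scaled by the sharpening factor. The function name `adaptive_band_weights` and the sample standard deviations are hypothetical and are not taken from the filing; only the value γ = 1.5 comes from the text.

```python
import math


def adaptive_band_weights(sigmas, gamma=1.5):
    """Softmax of gamma * sigma over all frequency points (assumed form)."""
    # Subtracting the maximum before exponentiating avoids overflow and
    # leaves the normalized weights unchanged.
    peak = max(sigmas)
    exps = [math.exp(gamma * (s - peak)) for s in sigmas]
    total = sum(exps)
    return [e / total for e in exps]


# Hypothetical training-set standard deviations at four frequency points.
sigmas = [0.2, 0.2, 1.0, 0.4]
weights = adaptive_band_weights(sigmas, gamma=1.5)
sharper = adaptive_band_weights(sigmas, gamma=3.0)
```

Under this assumed form, a larger sharpening factor concentrates more of the total weight on the most volatile frequency point, which matches the role the text assigns to γ.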

[0035] It should be noted that, in order to address the issues of uneven energy distribution and strong noise interference across multiple frequency bands in vibration data, and to avoid the Fourier transform ignoring the coupling effect between key frequency bands and noise, a sharpening factor γ is used to amplify frequency bands with high volatility and high discrimination, such as fault characteristic frequency bands, during the calculation of adaptive frequency band weights. This not only suppresses noise but also enhances the frequency domain significance of minute defects, such as early bearing cracks, and raises the detection limit of weak features, resulting in unexpected technical effects.

[0036] Global average pooling is performed on the vibration data, followed by a linear transformation through a one-dimensional convolutional layer and a ReLU activation function, to generate a temporal attention mask. This mask is used to extract temporal patterns from the signal, suppress temporal noise, and enhance the sparsity of the features, as shown below:

$$M_f = \mathrm{ReLU}\left(W \cdot \mathrm{GAP}(x_i) + b\right)$$

[0037] In the formula, $M_f$ represents the temporal attention mask; it is indexed by the f-th frequency point, but its calculation does not depend on f, so it is the same for all frequencies and is used to suppress temporal noise; $x_i$ is the vibration data of the i-th sample in the vibration training data; $\mathrm{GAP}(\cdot)$ indicates the global average pooling operation; $W$ is the weight matrix of the one-dimensional convolutional layer, used for linear transformation; $b$ is the bias term of the one-dimensional convolutional layer; $\mathrm{ReLU}(\cdot)$ indicates the rectified linear unit activation function; i is a positive integer.
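The mask generation can be sketched as follows. This is a minimal illustration under stated assumptions, not the filing's implementation: the pooling is assumed to average across sensor channels (for example the three axes of an accelerometer) so that one value per time step remains, and the kernel and bias, which would be learned during training, are hypothetical fixed values. The kernel length is assumed to be odd.

```python
def temporal_attention_mask(channels, kernel, bias=0.0):
    """ReLU(conv1d(pooled signal) + bias), one mask value per time step."""
    n = len(channels[0])
    # Assumption: pooling averages across channels, leaving a length-n series.
    pooled = [sum(ch[t] for ch in channels) / len(channels) for t in range(n)]
    half = len(kernel) // 2
    mask = []
    for t in range(n):
        acc = bias
        for j, w in enumerate(kernel):
            idx = t + j - half
            if 0 <= idx < n:  # zero padding outside the signal
                acc += w * pooled[idx]
        mask.append(max(0.0, acc))  # ReLU zeroes out low-activity time steps
    return mask


# Hypothetical two-channel recording with a single impact at t = 3.
channels = [
    [0.0, 0.0, 0.0, 3.0, 0.0, 0.0],
    [0.0, 0.0, 0.0, 1.0, 0.0, 0.0],
]
mask = temporal_attention_mask(channels, kernel=[0.25, 0.5, 0.25], bias=-0.1)
```

With the negative bias, time steps far from the impact are clipped to zero by the ReLU, so the mask is sparse and peaks at the transient.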

[0038] It should be noted that, in order to suppress temporal noise and enhance feature sparsity, global average pooling is combined with the rectified linear unit (ReLU) activation function to generate the temporal attention mask. The mask automatically focuses on impactful transient events, such as the instant a gear tooth breaks, which improves the ability to capture sudden temporal features.

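[0039] The vibration data is subjected to a Fast Fourier Transform to obtain its amplitude spectrum. Then, by combining the adaptive frequency band weights and the generated time-domain attention mask, a summation is performed over all frequency points to fuse frequency and time domain information, outputting a weighted fusion feature, represented as:

$$\Phi(x_i) = \sum_{f=1}^{F} w_f \cdot \left|\mathcal{F}(x_i)\right|_f \odot M_f$$

[0040] In the formula, $\Phi(x_i)$ indicates the weighted fusion feature of the vibration data $x_i$; $\mathcal{F}(\cdot)$ indicates the Fast Fourier Transform; $\left|\mathcal{F}(x_i)\right|_f$ represents the amplitude value at the f-th frequency point; $w_f$ is the adaptive frequency band weight at the f-th frequency point; $M_f$ is the temporal attention mask; $F$ is the total number of frequency points; $\odot$ indicates element-wise multiplication.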
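The fusion step can be illustrated with a short Python sketch. It assumes the fused feature is the sum over frequency points of weight times amplitude, applied element-wise to a mask that is identical for every frequency point, as the text states. A naive discrete Fourier transform stands in for the FFT to keep the example dependency-free; the signal, weights, and mask values are hypothetical.

```python
import cmath


def amplitude_spectrum(signal):
    """Amplitudes of a naive O(N^2) DFT; an FFT yields the same values."""
    n = len(signal)
    return [
        abs(sum(signal[t] * cmath.exp(-2j * cmath.pi * f * t / n)
                for t in range(n)))
        for f in range(n)
    ]


def weighted_fusion(signal, band_weights, mask):
    """Sum over frequency points of w_f * |X_f|, times the shared mask."""
    amps = amplitude_spectrum(signal)
    # Because the mask is the same for every frequency point, the sum over
    # frequencies collapses to one scalar that rescales the mask.
    scale = sum(w * a for w, a in zip(band_weights, amps))
    return [scale * m for m in mask]


signal = [1.0, 0.0, -1.0, 0.0]        # one cycle of a cosine, N = 4
band_weights = [0.1, 0.4, 0.1, 0.4]   # hypothetical adaptive weights
mask = [1.0, 0.0, 0.5, 0.0]           # hypothetical temporal mask
fused = weighted_fusion(signal, band_weights, mask)
```

One consequence of a frequency-independent mask is visible in the code: the frequency weighting contributes a single scale factor, while the temporal structure of the output comes entirely from the mask.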

[0041] Vibration data acquired by sensors is characterized by uneven energy distribution across multiple frequency bands and strong noise interference. Existing techniques that directly apply Fast Fourier Transform (FFT) neglect the coupling effect between key frequency bands and noise, resulting in the masking of high-discrimination frequency bands and amplification of temporal noise during feature extraction, failing to effectively preserve minute defect features. By employing a fusion method of adaptive frequency band weighting and temporal attention masking, the feature extraction accuracy of vibration data in both the frequency and time domains is effectively improved, noise interference is suppressed, and minute defect features are preserved.

[0042] Step S103: Input the weighted fusion features into the wavelet packet entropy feature mapping module, perform transient feature and frequency band energy analysis on the weighted fusion features, and obtain the frequency band energy entropy features.

[0043] The input weighted fusion features are decomposed using wavelet packet decomposition, dividing the signal into multiple frequency bands. Then, the energy proportion of each sub-band within each decomposition layer is calculated, and the energy entropy of that layer is calculated based on information entropy theory. The entropy value reflects the degree of disorder in the energy distribution of that frequency band. In the early stages of a fault, abnormal energy accumulation occurs in specific frequency bands, manifested as a sharp drop in entropy value. Adaptive layer weights are introduced to perform a weighted summation of the entropy values of each decomposition layer, ultimately outputting the frequency band energy entropy feature. This feature can effectively capture transient energy anomalies in non-stationary signals, achieving sensitive detection of early faults. The adaptive layer weights can be determined using either an attention mechanism or a mutual information mechanism.

[0044] Furthermore, the weighted fusion features are input into the wavelet packet entropy feature mapping module to perform transient feature and frequency band energy analysis on the weighted fusion features to obtain frequency band energy entropy features, including: performing wavelet packet decomposition on the weighted fusion features and calculating the energy entropy corresponding to at least one decomposition layer; determining the adaptive weights corresponding to each decomposition layer based on the maximum number of decomposition layers and the attenuation coefficient of the wavelet packet decomposition; and determining the frequency band energy entropy features based on the energy entropy and adaptive weights corresponding to each decomposition layer.

[0045] The input weighted fusion features are decomposed using wavelet packet decomposition. The energy value of each sub-band within each decomposition layer is calculated, and the energy proportion probability of each sub-band within that layer is derived from these energy values. Then, based on information entropy theory, the energy entropy of that decomposition layer is calculated from the energy proportion probabilities, thereby characterizing the degree of disorder in the energy distribution across frequency bands. The formula for calculating energy entropy is as follows:

$$H_j = -\sum_{k=1}^{K} p_{j,k} \log_2 p_{j,k}$$

[0046] In the formula, $H_j$ indicates the wavelet packet decomposition energy entropy of the j-th layer, which characterizes the degree of disorder in the energy distribution of that layer; the larger the entropy value, the more disordered the energy distribution; $K$ is the number of sub-bands in the wavelet packet decomposition; j and k are positive integers; $p_{j,k}$ indicates the energy proportion probability of the k-th sub-band in the j-th layer, which can be described by the following formula:

$$p_{j,k} = \frac{\left\| \mathrm{WPT}_{j,k}(\Phi) \right\|_2^2}{\sum_{m=1}^{K} \left\| \mathrm{WPT}_{j,m}(\Phi) \right\|_2^2}$$

[0047] In the formula, $\mathrm{WPT}_{j,k}(\cdot)$ is the wavelet packet transform operator, which outputs the decomposition coefficients of the k-th sub-band in the j-th layer; these coefficients give the energy value of that sub-band. Similarly, $\mathrm{WPT}_{j,m}(\cdot)$ outputs the decomposition coefficients of the m-th sub-band in the j-th layer; $\Phi$ is the weighted fusion feature; $\|\cdot\|_2$ represents the L2 norm; $\log_2$ represents the logarithmic function with base 2.
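A small worked example may make the entropy calculation concrete. The sketch below assumes a Haar wavelet packet tree (the filing does not name a wavelet), takes sub-band energy as the sum of squared coefficients, and computes the base-2 Shannon entropy of the energy proportions at one decomposition level. All function names and test signals are hypothetical, and the input length is assumed to be divisible by 2 raised to the level.

```python
import math


def haar_split(x):
    """One Haar analysis step: low-pass and high-pass halves of x."""
    s = math.sqrt(2.0)
    low = [(x[i] + x[i + 1]) / s for i in range(0, len(x) - 1, 2)]
    high = [(x[i] - x[i + 1]) / s for i in range(0, len(x) - 1, 2)]
    return low, high


def wavelet_packet_level(x, level):
    """Coefficient lists of all 2**level sub-bands at the given level."""
    bands = [list(x)]
    for _ in range(level):
        # Unlike a plain wavelet transform, every band is split again.
        bands = [half for band in bands for half in haar_split(band)]
    return bands


def energy_entropy(x, level):
    """Base-2 Shannon entropy of the sub-band energy proportions."""
    energies = [sum(c * c for c in band)
                for band in wavelet_packet_level(x, level)]
    total = sum(energies)
    if total == 0.0:
        return 0.0  # an all-zero signal carries no energy to distribute
    probs = [e / total for e in energies]
    return -sum(p * math.log2(p) for p in probs if p > 0.0)


concentrated = energy_entropy([1.0] * 8, level=2)        # energy in one band
dispersed = energy_entropy([1.0] + [0.0] * 7, level=2)   # energy in all bands
```

In this toy case the constant signal places all of its energy in a single sub-band and yields an entropy of zero, while the impulse spreads energy evenly over the four sub-bands and reaches the maximum of two bits, which matches the reading that higher entropy means more dispersed energy.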

[0048] It should be noted that, in order to overcome the time/frequency resolution contradiction in traditional time-frequency analysis and capture the transient characteristics of non-stationary signals, the wavelet packet decomposition energy entropy is used to quantify the disorder of the frequency band energy distribution; a high entropy value indicates that the energy is dispersed.

[0049] Based on the maximum number of decomposition layers and the attenuation coefficient of wavelet packet decomposition, adaptive weights are calculated for each decomposition layer to enhance the importance of features from deeper decomposition layers with higher resolution and suppress the influence of features from shallower decomposition layers with lower resolution. The adaptive weights can be calculated using the following formula:

[0050] In the formula, $w_j$ denotes the adaptive weight of the $j$-th decomposition layer, which is used to enhance the features of high-resolution layers; $j$ and $i$ are indices of the wavelet packet decomposition layers; $J$ is the maximum number of decomposition layers; $\gamma$ is the attenuation coefficient, which controls the attenuation rate of the weight as the number of layers increases, for example, ....

[0051] The wavelet packet decomposition energy entropy calculated at each decomposition layer is multiplied by its corresponding adaptive weight. The weighted results of all layers are then summed to generate the output feature of the feature mapping function, i.e., the frequency band energy entropy feature, which can be calculated using the following formula:

$$E = \sum_{j=1}^{J} w_j H_j$$

[0052] In the formula, $E$ represents the frequency band energy entropy feature; $w_j$ and $H_j$ are the adaptive weight and the energy entropy of the $j$-th decomposition layer, respectively; $J$ is the maximum number of decomposition layers.

[0053] It should be noted that using the adaptive weights to calculate the output feature of the feature mapping function enhances the importance of the deep decomposition layers. The entropy-weighted calculation highlights the abnormal concentration of frequency band energy in early faults, such as the sudden drop in the energy entropy of a specific frequency band in the early stage of wear, thus achieving earlier fault warning, which is an unexpected technical effect.
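The weighting and summation of the layer entropies can be sketched as follows. The exact weight formula is not legible in this text, so the normalized exponential form used in `adaptive_layer_weights` is only an assumption chosen to match the stated behaviour (deeper layers weighted more strongly, controlled by an attenuation coefficient); the function names and the parameter `gamma` are hypothetical.

```python
import numpy as np


def adaptive_layer_weights(max_level, gamma):
    """ASSUMED form: w_j proportional to exp(-gamma * (J - j)), normalized to sum to 1."""
    levels = np.arange(1, max_level + 1)
    raw = np.exp(-gamma * (max_level - levels))  # the deepest layer gets the largest raw weight
    return raw / raw.sum()


def band_energy_entropy_feature(layer_entropies, gamma):
    """Weighted sum of per-layer energy entropies (layer 1 first, deepest layer last)."""
    h = np.asarray(layer_entropies, dtype=float)
    w = adaptive_layer_weights(len(h), gamma)
    return float(np.dot(w, h))
```

Under this assumed form, `gamma = 0` reduces to a plain average over layers, and larger values of `gamma` concentrate the weight on the deepest layer.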

[0054] The non-stationary nature of vibration data can lead to a contradiction between time-domain and frequency-domain resolution in time-frequency analysis methods such as short-time Fourier transform. Specifically, increasing time resolution reduces frequency resolution, and vice versa, making it difficult to effectively capture transient features and frequency band energy distribution characteristics in non-stationary vibration data. By combining wavelet packet decomposition with energy entropy calculation, the transient features and frequency band energy distribution in non-stationary signals can be dynamically captured, thus resolving the resolution contradiction problem in time-frequency analysis.

[0055] Step S104: Input the frequency band energy entropy feature into the frequency band adaptive convolution module, and perform a depthwise separable convolution operation using a dynamic convolution kernel to obtain the feature map.

[0056] Dynamic convolutional kernels can be constructed using low-rank decomposition strategies (such as singular value decomposition), which include a trainable frequency band weight vector to control the response intensity of the kernel to each frequency band; they can also be constructed using conditional convolution and other methods. Depthwise separable convolution refers to decomposing standard convolution into depthwise convolution (channel-wise convolution) and pointwise convolution (1×1 convolution), significantly reducing the number of parameters while maintaining expressive power.

[0057] Furthermore, the frequency band energy entropy feature is input into the frequency band adaptive convolution module, and depthwise separable convolution operation is performed using a dynamic convolution kernel to obtain a feature map, including: obtaining the dynamic frequency band convolution kernel corresponding to each network layer in the frequency band adaptive convolution module; wherein, the frequency band adaptive convolution module is a deep convolutional neural network; and performing depthwise separable convolution operation based on the dynamic frequency band convolution kernel and the frequency band energy entropy feature to determine the feature map.

[0058] A low-rank decomposition strategy is used to construct a dynamic frequency band convolution kernel. This kernel is built from a left singular vector matrix, a right singular vector matrix, and a frequency band weight vector through matrix operations. The frequency band weight vector controls the response intensity of the convolution kernel to different frequency bands, enabling the kernel to adaptively adjust its frequency response characteristics according to the frequency band. Specifically, the convolution kernel can be constructed using the following formula:

$$K^{(l)} = U^{(l)} \, \mathrm{diag}(\alpha) \, \left(V^{(l)}\right)^{\top}$$

[0059] In the formula, $K^{(l)}$ represents the dynamic frequency band convolution kernel of the $l$-th layer of the deep convolutional neural network; $U^{(l)}$ represents the left singular vector matrix of the $l$-th layer, which is a trainable parameter used for the low-rank decomposition; $V^{(l)}$ represents the right singular vector matrix of the $l$-th layer, which is a trainable parameter; $l$ is a positive integer; $\mathrm{diag}(\cdot)$ constructs a diagonal matrix, and $\mathrm{diag}(\alpha)$ is the diagonal matrix whose diagonal elements are the entries of the vector $\alpha$; $\alpha$ is the frequency band weight vector, a trainable parameter that controls the response intensity of different frequency bands; $\left(V^{(l)}\right)^{\top}$ is the transpose of $V^{(l)}$.

[0060] It should be noted that a dynamic frequency band convolution kernel is used so that the convolution kernel can dynamically adapt to the frequency-varying characteristics of vibration data; the frequency band weight vector controls the response intensity of different frequency bands.
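The low-rank construction of the kernel from a left matrix, a right matrix, and a band weight vector can be sketched in NumPy as below. The shapes and the function name `dynamic_band_kernel` are assumptions for illustration; in a real model the three inputs would be trainable parameters.

```python
import numpy as np


def dynamic_band_kernel(U, V, alpha):
    """Build K = U @ diag(alpha) @ V.T without materializing diag(alpha).

    U: (m, r) left matrix, V: (n, r) right matrix, alpha: (r,) band weight vector.
    """
    U = np.asarray(U, dtype=float)
    V = np.asarray(V, dtype=float)
    alpha = np.asarray(alpha, dtype=float)
    if U.shape[1] != alpha.shape[0] or V.shape[1] != alpha.shape[0]:
        raise ValueError("U and V must each have one column per entry of alpha")
    # Scaling the columns of U by alpha is equivalent to U @ diag(alpha).
    return (U * alpha) @ V.T
```

Each entry of `alpha` scales one rank-one component of the kernel, so setting an entry to zero removes that component's response entirely; the kernel has rank at most `r`.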

[0061] The generated dynamic frequency band convolution kernel is applied in a depthwise separable convolution operation, which processes the feature map output by the previous layer and outputs the feature map of the current layer. By separating spatial and channel convolution, the number of parameters is significantly reduced while maintaining frequency band adaptability. This can be represented as:

$$F^{(l)} = \mathrm{DSConv}^{(l)}\left(F^{(l-1)};\, K^{(l)}\right) + b^{(l)}$$

[0062] In the formula, $F^{(l)}$ represents the feature map output by the $l$-th layer of the deep convolutional neural network; $\mathrm{DSConv}^{(l)}(\cdot)$ represents the depthwise separable convolution operation of the $l$-th layer; $K^{(l)}$ is the dynamic frequency band convolution kernel of the $l$-th layer; $F^{(l-1)}$ is the feature map output by the $(l-1)$-th layer, and when $l = 1$, $F^{(0)}$ is the input feature of the first layer, namely the frequency band energy entropy feature; $b^{(l)}$ is the bias vector of the $l$-th layer.
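For illustration, a one-dimensional depthwise separable convolution (a per-channel convolution followed by a 1×1 pointwise convolution and a bias) can be sketched as follows. How the dynamic frequency band kernel is split between the depthwise and pointwise stages is not specified in this text, so the sketch takes generic kernel arrays as inputs; the function name and shapes are assumptions.

```python
import numpy as np


def depthwise_separable_conv1d(x, depthwise_k, pointwise_w, bias):
    """x: (C_in, T); depthwise_k: (C_in, k); pointwise_w: (C_out, C_in); bias: (C_out,)."""
    c_in, t = x.shape
    k = depthwise_k.shape[1]
    out_len = t - k + 1  # "valid" convolution with stride 1
    dw = np.empty((c_in, out_len))
    for c in range(c_in):
        # Depthwise stage: each channel is filtered by its own kernel
        # (cross-correlation, as is conventional in deep learning).
        dw[c] = np.correlate(x[c], depthwise_k[c], mode="valid")
    # Pointwise stage: a 1x1 convolution mixes channels at every position.
    return pointwise_w @ dw + bias[:, None]
```

The parameter count is `C_in * k + C_out * C_in` (plus bias) instead of `C_out * C_in * k` for a standard convolution, which is where the reduction mentioned above comes from.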

[0063] Because vibration data exhibits significantly different characteristic patterns across different frequency bands, the fixed convolutional kernels of conventional convolutional neural networks cannot dynamically adjust their frequency response characteristics, making it difficult to adapt to the frequency-varying features of vibration data. This results in insufficient ability to capture frequency-varying features, affecting the accuracy and adaptability of feature extraction. This invention constructs dynamic frequency band convolutional kernels using a low-rank decomposition strategy, enabling the convolutional neural network to adaptively adjust its frequency response characteristics, thereby improving the feature extraction capability of vibration data.

[0064] Step S105: Input the feature map into the dual attention enhancement module, and optimize the feature map by frequency band and channel dual path weights to obtain the dual path attention enhancement feature map.

[0065] Specifically, the dual-attention enhancement module recalibrates the feature map's weights along both the "band dimension" and the "channel dimension" to achieve adaptive feature optimization. Specifically, it calculates the importance weight of each band on the frequency axis (or sub-band axis), amplifying the contribution of fault-related bands and suppressing redundant or noisy bands. On the channel dimension, it calculates the importance weight of each feature channel, enhancing semantically rich channels and suppressing irrelevant channels. After fusing the band attention weights and channel attention weights through operations such as broadcasting and addition, the optimized feature map is obtained by element-wise weighting, resulting in a more discriminative and robust feature map.

[0066] Furthermore, the feature map is input into the dual attention enhancement module, and the feature map is optimized by dual path weights of frequency band and channel to obtain a dual-path attention enhancement feature map, including: performing global average pooling on the feature map and inputting it into a trainable weight matrix for transformation to obtain a frequency band attention weight vector and a channel attention weight vector; and determining the dual-path attention enhancement feature map based on the frequency band attention weight vector, the channel attention weight vector and the feature map.

[0067] Global average pooling is performed on the feature map output by the frequency band adaptive convolution module to compress the spatial dimension and extract global features. The pooling result is then input into two different trainable weight matrices for transformation, and each of the two transformation results is normalized with the Softmax function to generate the frequency band attention weight vector and the channel attention weight vector, which can be calculated by the following formulas:

[0068]

$$a_b = \mathrm{Softmax}\left(W_b^{\top}\, \mathrm{GAP}\left(F^{(L)}\right)\right), \qquad a_c = \mathrm{Softmax}\left(W_c^{\top}\, \mathrm{GAP}\left(F^{(L)}\right)\right)$$

[0069] In the formula, $F^{(L)}$ is the feature map output by the frequency band adaptive convolution module; $L$ is the index of the last convolutional layer, i.e., the last layer of the frequency band adaptive convolution module; $\mathrm{GAP}(\cdot)$ is the global average pooling operation, which compresses spatial dimensions to extract global features; $W_b$ is the trainable weight matrix of the frequency band attention layer; $W_c$ is the trainable weight matrix of the channel attention layer; $\mathrm{Softmax}(\cdot)$ is the Softmax function, which maps the weights to a probability distribution; $a_b$ is the frequency band attention weight vector; $a_c$ is the channel attention weight vector; $W_b^{\top}$ and $W_c^{\top}$ are the transposes of $W_b$ and $W_c$.

[0070] The frequency band attention weight vector and the channel attention weight vector are broadcast and added together, and the sum is then mapped to activation coefficients using the Sigmoid activation function. Next, the activation coefficients are multiplied element-wise with the feature map output by the final convolutional layer, resulting in a feature map enhanced by dual-path attention. This can be calculated using the following formula:

$$\tilde{F} = \sigma\left(a_b \oplus a_c\right) \odot F^{(L)}$$

[0071] Here, $\oplus$ is the broadcast addition operation, in which the frequency band attention weight vector $a_b$ and the channel attention weight vector $a_c$ are each expanded to a matrix of the same size and then added element by element; $C$ is the number of channels in the feature map; $\sigma(\cdot)$ is the Sigmoid activation function; $\odot$ is element-wise multiplication; $F^{(L)}$ is the feature map output by the final convolutional layer; $\tilde{F}$ is the dual-path attention enhanced feature map.

[0072] It should be noted that, in order to achieve collaborative optimization of frequency band response and channel semantic information, the frequency band attention weight vector and the channel attention weight vector explicitly distinguish the contributions of frequency bands and channels, and the broadcast addition operation is used in calculating the dual-path attention enhanced feature map, which improves the robustness of the model under changing load conditions.
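The dual-path weighting steps above (global average pooling, two linear transforms with Softmax, broadcast addition, Sigmoid, element-wise multiplication) can be sketched in NumPy as follows. The feature map layout `(channels, bands, time)` and the weight matrix shapes are assumptions made for this illustration, as are all function and parameter names.

```python
import numpy as np


def softmax(v):
    e = np.exp(v - np.max(v))  # subtract the max for numerical stability
    return e / e.sum()


def sigmoid(v):
    return 1.0 / (1.0 + np.exp(-v))


def dual_attention(feat, w_band, w_chan):
    """feat: (C, B, T) feature map; w_band: (C, B); w_chan: (C, C) (assumed shapes)."""
    pooled = feat.mean(axis=(1, 2))        # global average pooling -> (C,)
    a_band = softmax(w_band.T @ pooled)    # band attention weights -> (B,)
    a_chan = softmax(w_chan.T @ pooled)    # channel attention weights -> (C,)
    # Broadcast addition: expand both vectors to a (C, B) matrix and add.
    gate = sigmoid(a_chan[:, None] + a_band[None, :])
    # Element-wise weighting, broadcast over the time axis.
    return gate[:, :, None] * feat, a_band, a_chan
```

Because both attention vectors are Softmax outputs, their broadcast sum lies between 0 and 2, so the Sigmoid gate stays between 0.5 and about 0.88; the mechanism rescales features rather than zeroing them out.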

[0073] In existing technologies, the attention mechanisms employed do not explicitly distinguish the correlation between frequency band information and channel features in vibration data processing. This results in a failure to coordinate the optimization of response features of key frequency bands and semantic information of important channels, leading to ambiguity in the contribution of frequency bands and channels within the feature map. This invention, through a dual-path attention mechanism, explicitly distinguishes the correlation between frequency band and channel features, coordinates the optimization of frequency band response features and channel semantic information, and improves the representational capability of vibration data feature maps.

[0074] Step S106: Input the dual-path attention-enhanced feature map into the maximum margin prototype classifier to obtain the device status detection result.

[0075] The maximum margin prototype classifier maintains a trainable class prototype vector (cluster center) for each classification label and enforces that features of samples of the same class are close to their class prototype vectors, and that prototype vectors of different classes maintain a minimum margin, thereby obtaining a clear and robust decision boundary. The equipment status detection result refers to the classification label output by the maximum margin prototype classifier, used to describe the degradation stage of rotating parts. For example, the equipment status detection result includes healthy status, early degradation, intermediate degradation, and severe failure. Specifically, the dual-path attention-enhanced feature map is input into the maximum margin prototype classifier, which calculates the distance between the dual-path attention-enhanced feature map and each class prototype vector, and then uses the classification label corresponding to the prototype vector with the smallest distance as the equipment status detection result. The distance between the dual-path attention-enhanced feature map and each prototype vector can be Euclidean distance, cosine distance, etc.

[0076] Furthermore, the dual-path attention-enhanced feature map is input into the maximum margin prototype classifier to obtain the device status detection result, including: inputting the dual-path attention-enhanced feature map into the feature encoder to obtain the encoded feature vector; and determining the device status detection result based on the encoded feature vector and the category prototype vector.

[0077] The feature encoder can be a network structure consisting of global average pooling followed by a fully connected layer. The dual-path attention enhanced feature map is input into the feature encoder to obtain the encoded feature vector. The squared Euclidean distance between the encoded feature vector and each trainable class prototype vector is calculated to characterize the similarity between the vibration data and each class prototype vector, expressed as:

$$d(z_i, c_k) = \lVert z_i - c_k \rVert_2^2$$

[0078] Here, $z_i$ is the encoded feature vector of the $i$-th vibration data, obtained by the feature encoder from the dual-path attention enhanced feature map $\tilde{F}$, for example by performing global average pooling on $\tilde{F}$ and then applying a fully connected layer that outputs the encoded feature vector; $c_k$ is the prototype vector of the $k$-th class, a trainable parameter learned during training through gradient descent; $d(\cdot,\cdot)$ represents the distance metric function.

[0079] The classification label corresponding to the class prototype vector with the highest similarity (smallest squared Euclidean distance) is used as the device status detection result.
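The nearest-prototype decision described above can be sketched as follows. The label strings follow the four degradation stages named in this description; the function names and the toy prototype values in the usage note are hypothetical.

```python
import numpy as np

STAGES = ["healthy", "early degradation", "intermediate degradation", "severe failure"]


def squared_distances(z, prototypes):
    """Squared Euclidean distance from one encoded vector z (D,) to each prototype (K, D)."""
    diff = prototypes - z  # broadcasting gives a (K, D) array
    return np.sum(diff * diff, axis=1)


def classify(z, prototypes, labels=STAGES):
    """Return the label of the closest prototype and all squared distances."""
    d = squared_distances(np.asarray(z, dtype=float), np.asarray(prototypes, dtype=float))
    return labels[int(np.argmin(d))], d
```

For example, with two-dimensional prototypes `[[0, 0], [1, 0], [0, 1], [1, 1]]`, the encoded vector `[0.9, 0.1]` is closest to the second prototype and is therefore assigned the second label.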

[0080] Vibration data exhibits significant intra-class variability, and conventional Softmax classifiers are sensitive to intra-class differences, leading to blurred decision boundaries. Furthermore, conventional prototype learning lacks explicit inter-class margin constraints, making it susceptible to interference from inter-class similarity in vibration data classification tasks. By employing a maximum margin prototype learning strategy, which optimizes the classification loss function, the clustering of similar samples is strengthened, ensuring inter-class separability and improving the accuracy and robustness of vibration data classification.

[0081] During training, the class prototype vectors are continuously adjusted by optimizing the classification loss function, so that the encoded feature vectors of samples of the same class converge towards their class prototype vector while a certain distance is maintained between prototype vectors of different classes. Specifically, a classification loss function is constructed based on the calculated prototype distance (the distance between prototype vectors of different classes). The constraint imposed by the classification loss function is that, for each sample, the distance to its true class prototype vector should be as small as possible and smaller than its distance to every incorrect class prototype vector. Therefore, by optimizing the classification loss function, the features of samples of the same class are forced to cluster around their corresponding prototypes, achieving intra-class compactness, while a minimum distance between prototypes of different classes is ensured, achieving inter-class separation. The classification loss function can be expressed by the following formula:

[0082] In the formula, $y_i$ is the true category of the $i$-th sample; $c_{y_i}$ is the prototype vector corresponding to the true category $y_i$ of the $i$-th sample; $\mathcal{L}_{\mathrm{cls}}$ is the classification loss function; $\tau$ is a distance scaling factor that controls the intensity of intra-class aggregation, such as ...; $\delta$ is the inter-class margin constant, which enforces the minimum separation distance of the decision boundary, such as ... 5.

[0083] It should be noted that, in order to solve the problems of large intra-class fluctuations and blurred inter-class boundaries, the classification loss function controls the minimum inter-class margin through the inter-class margin constant δ, forcing samples of the same class to cluster. At the same time, the setting of the inter-class margin constant δ can also alleviate the problem of cross-device vibration feature drift and improve the generalization of the model.
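One way to realize a margin constraint of this kind is a hinge-style loss, sketched below. The exact loss formula is not legible in this text, so the form used here (a scaled pull towards the true prototype plus a hinge penalty for every other prototype that is not at least δ farther away) is an assumption, and the default values of `tau` and `delta` are hypothetical.

```python
import numpy as np


def margin_prototype_loss(z, prototypes, y, tau=1.0, delta=1.0):
    """ASSUMED hinge form, not the formula of the source text.

    loss = tau * d_true + sum over k != y of max(0, delta + d_true - d_k)
    z: (D,) encoded vector; prototypes: (K, D); y: index of the true class.
    """
    d = np.sum((prototypes - z) ** 2, axis=1)  # squared distance to every prototype
    d_true = d[y]
    others = np.delete(d, y)                   # distances to the incorrect prototypes
    hinge = np.maximum(0.0, delta + d_true - others)
    return float(tau * d_true + hinge.sum())
```

Under this assumed form the loss is zero only when the sample coincides with its own prototype and every other prototype is at least δ farther away in squared distance.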

[0084] In imbalanced vibration data scenarios, conventional loss functions treat all samples equally, resulting in insufficient learning of difficult samples with low confidence. Furthermore, traditional weighting methods require preset fixed weights, making it difficult to adapt to the dynamic distribution characteristics of vibration data.

[0085] This invention addresses the class imbalance problem and improves the learning performance of the classifier by dynamically calculating the loss weights of difficult samples. The specific steps are as follows: the dot products of the sample's encoded feature vector with all class prototype vectors are processed by the Softmax function to obtain a probability distribution, from which the prediction confidence of each sample is calculated. The prediction confidence is defined as the probability that the sample is predicted to be its most likely class, which is the maximum value output by the Softmax function, expressed as:

$$p_i = \max_{k} \frac{\exp\left(z_i^{\top} c_k\right)}{\sum_{m} \exp\left(z_i^{\top} c_m\right)}$$

[0086] In the formula, $z_i$ is the encoded feature vector of the $i$-th sample and $c_k$ is the prototype vector of the $k$-th class; $\max_k$ indicates iterating through all categories $k$ and taking the maximum value; $p_i$ represents the prediction confidence for the $i$-th sample, and a smaller value indicates a more difficult prediction.

[0087] Based on the calculated prediction confidence of each sample, a dynamic loss weight is constructed. Specifically, the dynamic loss weight consists of a smoothing coefficient and a dynamic adjustment term. The dynamic adjustment term uses a hard sample sharpening factor to exponentially transform and normalize the difficulty of the samples. The resulting dynamic loss weight is used to dynamically increase the contribution of low-confidence samples in the loss function. The dynamic loss weight can be expressed by the following formula:

[0088] In the formula, $\beta$ is the smoothing coefficient, which ensures a basic weight for every sample, for example, ...; $p_j$ is the prediction confidence for the $j$-th sample; $\lambda$ is the hard sample sharpening factor, which controls the steepness of the weight distribution, such as ...; $N$ is the total number of samples in the batch; $\omega_i$ is the dynamic loss weight for the $i$-th sample.

[0089] It should be noted that, in order to adaptively resolve the class imbalance problem, dynamic loss weights are used to dynamically increase the loss weight of difficult samples. Combined with the hard sample sharpening factor λ, which controls the steepness of the weight distribution, this makes it possible to automatically discover fault edge samples, such as composite fault samples.

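[0090] The classification loss of each sample is multiplied by its corresponding dynamic loss weight, and the weighted losses of all samples in the current training batch are then averaged. This average is used as the dynamic hard sample mining loss, expressed as:

$$\mathcal{L}_{\mathrm{DHM}} = \frac{1}{N} \sum_{i=1}^{N} \omega_i \, \ell_i$$

[0091] In the formula, $\ell_i$ is the classification loss for the $i$-th sample; $\omega_i$ is the dynamic loss weight for the $i$-th sample; $N$ is the total number of samples in the batch; $\mathcal{L}_{\mathrm{DHM}}$ is the dynamic hard sample mining loss function.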
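The three steps above (prediction confidence, dynamic loss weight, weighted batch average) can be sketched as follows. The confidence and the final average follow the description directly. The weight formula is not legible in this text, so the form in `dynamic_loss_weights` (a base weight plus a normalized exponential of the sample difficulty) is only an assumption, and the default values of `beta` and `lam` are hypothetical.

```python
import numpy as np


def prediction_confidence(z, prototypes):
    """z: (N, D) encoded vectors; prototypes: (K, D). Returns the max Softmax probability per sample."""
    logits = z @ prototypes.T                            # dot product with every prototype
    logits = logits - logits.max(axis=1, keepdims=True)  # numerical stability
    probs = np.exp(logits)
    probs /= probs.sum(axis=1, keepdims=True)
    return probs.max(axis=1)


def dynamic_loss_weights(conf, beta=0.5, lam=2.0):
    """ASSUMED form: beta + (1 - beta) * N * softmax(lam * (1 - conf))."""
    hard = np.exp(lam * (1.0 - conf))  # larger for low-confidence (difficult) samples
    return beta + (1.0 - beta) * len(conf) * hard / hard.sum()


def hard_mining_loss(per_sample_loss, weights):
    """Batch average of the weighted per-sample classification losses."""
    return float(np.mean(weights * per_sample_loss))
```

With this assumed form the weights average to 1 over the batch, so the overall loss scale is preserved while low-confidence samples contribute more.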

[0092] This invention provides a device condition detection method that, through the joint processing of adaptive frequency band weighting and temporal attention masking, achieves dual enhancement of vibration data in both the frequency and time domains. This significantly improves the detection limit for minute defects such as early bearing cracks and early gear wear, avoiding the feature masking problem caused by noise coupling with fault features in traditional methods. By combining wavelet packet decomposition with energy entropy calculation, it enables refined multi-resolution analysis of non-stationary signals, overcoming the time / frequency resolution contradiction problem of short-time Fourier transform. By calculating the energy entropy of each decomposition layer, the degree of disorder in the frequency band energy distribution is quantified, achieving earlier fault warning capabilities than traditional time-frequency analysis. Employing a dynamic frequency band convolution kernel, it adaptively adjusts the frequency response characteristics according to the actual frequency band distribution of the input signal, exhibiting stronger feature extraction capabilities and generalization performance when processing vibration data under different operating conditions or at different degradation stages. Using a maximum margin prototype learning strategy, compared to the Softmax classifier, it strengthens the clustering of similar samples and the separation between classes, effectively improving the accuracy and robustness of vibration data classification.

[0093] Example 2: This invention also provides another device status detection method, which is implemented on the basis of the method in the above embodiments and focuses on the specific implementation of the training process of the device status detection model.

[0094] Figure 2a is a flowchart of another device status detection method provided in an embodiment of the present invention. As shown in Figure 2a, the device status detection method may include the following steps: Step S201: Obtain sample vibration data of rotating parts in the device.

[0095] Specifically, vibration data collected from rotating components of the equipment within historical time periods can be retrieved from the database and categorized to form sample vibration data. The categorization labels for the sample vibration data are based on the equipment's full lifecycle monitoring records and are manually determined by domain experts using spectral analysis, time-domain indicators, and maintenance records to classify the degradation stage of the equipment part corresponding to each sample vibration data.

[0096] The labeled classification tags (equipment status detection results) are divided into four degradation stages: healthy state (uniform vibration energy distribution, no abnormal frequency bands), early degradation (small increase in energy in specific frequency bands, sparse time-domain impact signals), mid-term degradation (significant increase in energy at fault characteristic frequencies, and manifestation of time-domain periodic impacts), and severe fault (surge in broadband energy, dense impact signals accompanied by harmonics).

[0097] The labeling process must be linked to operating condition parameters collected in the same batch to ensure that the degradation stage labels strictly correspond to the actual operating status of the equipment. Operating condition parameters are used to describe the operating status of the equipment under different operating conditions, including but not limited to: no-load, rated load, overload; low speed, medium speed, high speed; steady-state operation and start-up and shutdown processes. Fully covering different operating conditions can improve the generalization performance of the model in real-world complex operating environments.

[0098] Step S202: Input the sample vibration data into the initial equipment state detection model, train the initial equipment state detection model, and obtain the equipment state detection model.

[0099] The device status detection model includes a multi-band feature fusion module, a wavelet packet entropy feature mapping module, a frequency band adaptive convolution module, a dual attention enhancement module, and a maximum margin prototype classifier.

[0100] In this embodiment of the invention, the initial device state detection model is constructed based on a deep convolutional neural network. In existing technologies, the gradient scales of different network layers differ significantly during the training of deep convolutional neural networks, which easily leads to gradient explosion or vanishing problems. Furthermore, traditional normalization methods use globally fixed thresholds, which are difficult to adapt to the dynamic gradient distribution characteristics of each layer, exacerbating training instability in vibration data processing. This embodiment of the invention solves the gradient explosion and vanishing problems during the training of deep convolutional neural networks by tracking the gradient scale of each layer and performing normalization and clipping, ensuring the stability of the training process.

[0101] Specifically, for each layer of the deep convolutional neural network, the L2 norm of its parameter gradient vector is calculated. Then, the historical gradient magnitudes of the layer are smoothed using an exponential moving average operator to obtain a dynamic baseline value for the gradient scale of that layer. This allows the typical magnitude of the gradient at that layer to be tracked in real time. The dynamic baseline value can be calculated using the following formula:

$$\bar{g}_l^{(t)} = \mathrm{EMA}\left(\bar{g}_l^{(t-1)},\, \lVert g_l^{(t)} \rVert_2\right) = \rho\, \bar{g}_l^{(t-1)} + (1-\rho)\, \lVert g_l^{(t)} \rVert_2$$

[0102] In the formula, $\bar{g}_l^{(t)}$ is the dynamic baseline value for the gradient scale of the $l$-th layer of the deep convolutional neural network at the $t$-th iteration; $\mathrm{EMA}(\cdot)$ is the exponential moving average operator, calculated as shown on the right-hand side of the formula; $t$ represents the number of iterations; $\bar{g}_l^{(t-1)}$ is the dynamic baseline value for the gradient scale of the $l$-th layer at the $(t-1)$-th iteration, with the initial dynamic baseline value set to ...; $g_l^{(t)}$ is the gradient vector of the parameters of the $l$-th layer; $\rho$ is the smoothing factor, which controls the strength of the historical gradient-scale memory, such as ....

[0103] For each layer of the deep convolutional neural network, its parameter gradient vector is first normalized. The normalized gradient vector is then multiplied by the minimum of the gradient norm and the product of the dynamic baseline value of the gradient scale of that layer and the clipping threshold coefficient, which gives the clipped gradient vector of that layer. This ensures that the gradient update magnitude of each layer is adapted to its historical scale baseline and limits its maximum offset, thereby stabilizing the training of deep networks. The clipped gradient vector can be calculated using the following formula:

$$\hat{g}_l^{(t)} = \frac{g_l^{(t)}}{\lVert g_l^{(t)} \rVert_2} \cdot \min\left(\lVert g_l^{(t)} \rVert_2,\; \kappa\, \bar{g}_l^{(t)}\right)$$

[0104] In the formula, $\hat{g}_l^{(t)}$ is the clipped gradient vector of the $l$-th layer of the deep convolutional neural network; $g_l^{(t)}$ is the gradient vector of the parameters of the $l$-th layer; $\min(\cdot)$ indicates the minimum value operation; $\kappa$ is the clipping threshold coefficient, which controls the maximum allowable gradient magnitude offset, such as ...; $\bar{g}_l^{(t)}$ is the dynamic baseline value for the gradient scale of the $l$-th layer.

[0105] It should be noted that, to address the gradient explosion/vanishing problem in deep networks, the gradient vector is clipped dynamically on the basis of the dynamic baseline value, which avoids a sensitive dependence on batch size.
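The per-layer baseline tracking and clipping can be sketched as below. The sketch assumes a standard exponential moving average of the gradient norm and clips the norm to a multiple of the baseline; the default values of `rho` and `kappa` are hypothetical, as are the function names.

```python
import numpy as np


def update_baseline(prev_baseline, grad, rho=0.9):
    """Exponential moving average of the gradient L2 norm of one layer."""
    return rho * prev_baseline + (1.0 - rho) * float(np.linalg.norm(grad))


def clip_to_baseline(grad, baseline, kappa=2.0, eps=1e-12):
    """Rescale grad so that its norm does not exceed kappa * baseline."""
    grad = np.asarray(grad, dtype=float)
    norm = float(np.linalg.norm(grad))
    if norm < eps:
        return np.zeros_like(grad)  # nothing to normalize for a zero gradient
    # Unit direction times the allowed magnitude.
    return grad / norm * min(norm, kappa * baseline)
```

A gradient whose norm is already below `kappa * baseline` passes through unchanged; a larger one keeps its direction but is shrunk to that limit.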

[0106] Traditional optimizers struggle to adapt to the dynamic gradient distribution of vibration data feature extraction networks, especially with gradient clipping. Furthermore, fixed learning rate mechanisms are prone to getting trapped in local optima during later training phases. This invention addresses this by fusing historical gradient directions with the current clipped gradient, designing an adaptive momentum acceleration mechanism, and employing a gradient magnitude stabilization factor to control the update step size. This approach accelerates convergence while ensuring training stability.

[0107] The gradient vector of the current iteration is fused with the adaptive momentum vector of previous iterations using an exponential moving average. The fusion weights are adjusted based on a gradient magnitude stabilization factor, so that the optimization process is biased towards the historical direction to maintain stability when the gradient magnitude is large, and biased towards the current direction to accelerate convergence when the gradient magnitude is small. The adaptive momentum vector can be calculated using the following formula:

[0108]

[0109] In the formula, $m_l^{(t)}$ is the adaptive momentum vector of the $l$-th layer of the deep convolutional neural network at the $t$-th iteration; $m_l^{(t-1)}$ is the adaptive momentum vector of the $l$-th layer at the $(t-1)$-th iteration, with the momentum vector initialized to 0; $s_l$ is the gradient magnitude stabilization factor of the $l$-th layer, which maps the relative magnitude of the gradient into the ... interval through a sigmoid function; the stabilization sensitivity coefficient controls the sensitivity of the stabilization factor to the gradient magnitude, for example, ...; $l$ is a positive integer; $t$ is a positive integer.

[0110] It should be noted that, in order to achieve stable training while accelerating convergence, the gradient magnitude stabilization factor is used in calculating the adaptive momentum vector: when the gradient magnitude is large, the update is biased towards the historical direction, which keeps training stable; when the gradient magnitude is small, it is biased towards the current direction, which accelerates convergence. The gradient magnitude stabilization factor enables the optimizer to remain stable under non-stationary vibration conditions such as speed fluctuations.
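The behaviour described above can be sketched as follows. The formulas are not legible in this text, so the specific stabilization factor used here (a sigmoid of the gradient norm relative to the layer baseline) and the convex combination of history and current gradient are assumptions; `eta` stands in for the stabilization sensitivity coefficient with a hypothetical default.

```python
import numpy as np


def stabilization_factor(grad_norm, baseline, eta=1.0, eps=1e-12):
    """ASSUMED form: sigmoid(eta * (grad_norm / baseline - 1)), a value between 0 and 1."""
    rel = grad_norm / (baseline + eps)  # relative gradient magnitude
    return 1.0 / (1.0 + np.exp(-eta * (rel - 1.0)))


def update_momentum(prev_m, clipped_grad, grad_norm, baseline, eta=1.0):
    """Blend the historical momentum with the current clipped gradient."""
    s = stabilization_factor(grad_norm, baseline, eta)
    # Large s (large gradient) favours the history; small s favours the current gradient.
    return s * prev_m + (1.0 - s) * clipped_grad, s
```

Under this assumed form the factor is 0.5 when the gradient norm equals its baseline, rises towards 1 for unusually large gradients, and falls below 0.5 for small ones.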

[0111] Based on the adaptive momentum vector and the global loss magnitude, the parameter update step size is calculated, where the update direction is determined by the momentum vector after Nesterov acceleration correction, and the step size magnitude is constrained by both the adaptive learning rate and the loss magnitude stabilization term. The parameter update step size can be calculated using the following formula:

[0112] In the formula, the quantities are: the parameter update step size of the $l$-th layer of the deep convolutional neural network; the global base learning rate, which controls the overall update magnitude, for example, ...; the Nesterov acceleration factor, which corrects the momentum direction, such as ...; the loss smoothing factor, such as ...; and the numerical stability constant, which prevents division-by-zero errors, such as ....

[0113] The calculated parameter update step size is applied to the trainable parameters of the deep convolutional neural network, expressed as:

[0114] In the formula, the two quantities are the parameters of the $l$-th layer of the deep convolutional neural network after the update (at the next iteration) and before the update (at the current iteration).

[0115] During the training of the initial device status detection model, the classification accuracy on the test set is continuously monitored. When the accuracy improves by less than 0.1% over 20 consecutive training cycles, the device status detection model is considered to have converged. In addition, a maximum number of training cycles, such as 400, is set as a forced termination threshold to prevent overfitting.
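The stopping rule above can be sketched as a small helper. It assumes that accuracy is recorded once per training cycle as a fraction in [0, 1], and it interprets "0.1%" as an absolute improvement of 0.001 in the best accuracy over the last 20 cycles compared with the best accuracy before them; that interpretation and the function name are assumptions.

```python
def should_stop(acc_history, patience=20, min_delta=0.001, max_epochs=400):
    """Decide whether training should stop.

    acc_history: list of accuracies in [0, 1], one per completed training cycle.
    """
    n = len(acc_history)
    if n >= max_epochs:
        return True    # forced termination threshold reached
    if n <= patience:
        return False   # not enough history to judge convergence
    best_before = max(acc_history[:-patience])
    best_recent = max(acc_history[-patience:])
    # Converged if the last `patience` cycles improved by less than min_delta.
    return (best_recent - best_before) < min_delta
```

A training loop would append the latest accuracy after each cycle and break as soon as `should_stop` returns `True`.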

[0116] Step S203: Obtain vibration data of rotating components in the equipment.

[0117] Step S204: Input the vibration data into the multi-band feature fusion module, perform adaptive frequency band weighting and temporal attention masking on the vibration data, and obtain weighted fusion features.

[0118] Step S205: Input the weighted fusion features into the wavelet packet entropy feature mapping module, perform transient feature and frequency band energy analysis on the weighted fusion features, and obtain the frequency band energy entropy features.

[0119] Step S206: Input the frequency band energy entropy features into the frequency band adaptive convolution module, and perform a depthwise separable convolution operation using dynamic convolution kernels to obtain the feature map.

[0120] Step S207: Input the feature map into the dual attention enhancement module, and optimize the feature map through dual-path weighting of frequency bands and channels to obtain the dual-path attention enhanced feature map.

[0121] Step S208: Input the dual-path attention-enhanced feature map into the maximum margin prototype classifier to obtain the device status detection result.

[0122] In one possible implementation, the trained equipment condition detection model is deployed on an online monitoring system, and newly collected vibration data is input into the trained equipment condition detection model, which then outputs monitoring results.

[0123] The output of the equipment condition detection model is based on the squared Euclidean distance between the encoded feature vector and the prototype vectors. Specifically, the detection result output by the model corresponds directly to the degradation stage of the monitored part of the equipment: healthy state, early degradation, intermediate degradation, or severe failure. These four categories are identified during the training phase through prototype vectors; each prototype vector represents the cluster center of one degradation stage in the feature space.

[0124] For the vibration data of a new input sample, the feature encoder extracts the encoded feature vector, and the squared Euclidean distance between this vector and each category prototype vector is calculated. The category corresponding to the prototype vector with the smallest distance is taken as the predicted category. For example, the output category is "early degradation" if the distance to the early degradation prototype vector is the smallest.

[0125] The online monitoring system triggers an early warning when it detects intermediate degradation or a more severe stage (i.e., intermediate degradation or severe failure).
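The nearest-prototype decision and the warning rule can be illustrated with the short sketch below. The names `classify` and `needs_warning`, the ordering of the stages, and the two-dimensional toy prototypes are assumptions made for illustration; in the embodiment the prototype vectors come from training and the encoded feature vector comes from the feature encoder.

```python
import numpy as np

# Stage order is an assumption; it must match the row order of the prototype matrix.
STAGES = ("healthy", "early degradation", "intermediate degradation", "severe failure")
WARNING_STAGES = {"intermediate degradation", "severe failure"}


def classify(z, prototypes):
    """Return (stage, squared distances) for an encoded feature vector z.

    z: array of shape (d,); prototypes: array of shape (4, d), one row per stage.
    """
    z = np.asarray(z, dtype=float)
    prototypes = np.asarray(prototypes, dtype=float)
    d2 = np.sum((prototypes - z) ** 2, axis=1)   # squared Euclidean distances
    return STAGES[int(np.argmin(d2))], d2


def needs_warning(stage):
    """Early warning for intermediate degradation or a more severe stage."""
    return stage in WARNING_STAGES


# Toy prototypes on a line, purely for illustration.
toy_prototypes = [[0.0, 0.0], [1.0, 0.0], [2.0, 0.0], [3.0, 0.0]]
stage, _ = classify([1.1, 0.0], toy_prototypes)
print(stage, needs_warning(stage))
```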

[0126] To verify the ability of the technical solution of this invention to identify different stages of equipment degradation, it was compared with four conventional methods: traditional convolutional neural networks, support vector machines, random forests, and long short-term memory networks. Figure 2b compares the classification accuracy of the different methods at each equipment degradation stage. As shown in Figure 2b, the technical solution of this invention demonstrates significant advantages in the most challenging early degradation stage. This indicates that the multi-band feature fusion module amplifies weak fault features through adaptive frequency band weighting, while the temporal attention mask effectively suppresses noise interference. The wavelet packet entropy feature mapping module captures the abnormal energy aggregation that is characteristic of early faults by quantifying the degree of disorder in the frequency band energy. In contrast, traditional methods are markedly inadequate at identifying early degradation because they cannot resolve multi-band energy imbalance and noise coupling.

[0127] Furthermore, the working principle of the maximum margin prototype classification mechanism is analyzed by visualizing the distribution of the high-dimensional feature space. Figure 2c is a schematic diagram of the feature space distribution of traditional methods. As shown in Figure 2c, the feature points of traditional methods exhibit severe class mixing, especially in the overlapping edge regions between healthy and early degradation samples, which leads to distorted and complex decision boundaries and reflects the sensitivity of conventional classifiers to intra-class fluctuations. Figure 2d is a schematic diagram of the feature space distribution of the technical solution in an embodiment of the present invention. As shown in Figure 2d, in the feature space of this technical solution, each of the four degradation stages forms a compact cluster. There is a clear margin between the healthy state and the early degradation category, and intermediate degradation and severe failure are linearly separable. This highly ordered distribution verifies the effectiveness of the maximum margin prototype learning strategy. By strengthening intra-class clustering and inter-class separation, the identification of early degradation features is significantly improved, providing a reliable basis for accurate classification of equipment status.

[0128] The device status detection method provided in this invention addresses gradient explosion and gradient vanishing during the training of deep convolutional neural networks by adapting multi-scale gradient processing to the dynamic gradient distribution characteristics of the different network layers, thereby improving the stability of the training process. The adaptive momentum acceleration mechanism dynamically adjusts the update strategy through the gradient magnitude stabilization factor, enabling the model to explore rapidly in flat regions and update conservatively in steep regions, thus improving training efficiency.

[0129] Example 3. Corresponding to the above method embodiments, this invention provides a device status detection apparatus. Figure 3 is a schematic structural diagram of a device status detection apparatus provided in an embodiment of the present invention. As shown in Figure 3, the device status detection apparatus may include: an input unit 301, configured to acquire vibration data of rotating parts in the equipment; a feature fusion unit 302, configured to input the vibration data into the multi-band feature fusion module and perform adaptive frequency band weighting and temporal attention masking on the vibration data to obtain weighted fusion features; a wavelet packet processing unit 303, configured to input the weighted fusion features into the wavelet packet entropy feature mapping module and perform transient feature and frequency band energy analysis on the weighted fusion features to obtain frequency band energy entropy features; a convolution unit 304, configured to input the frequency band energy entropy features into the frequency band adaptive convolution module and perform a depthwise separable convolution operation using dynamic convolution kernels to obtain a feature map; an attention enhancement unit 305, configured to input the feature map into the dual attention enhancement module and optimize the feature map through dual-path weighting of frequency bands and channels to obtain a dual-path attention enhanced feature map; and a classification unit 306, configured to input the dual-path attention enhanced feature map into the maximum margin prototype classifier to obtain the device status detection result.

[0130] The equipment condition detection device provided in this invention enhances vibration data in both the frequency and time domains through the joint processing of adaptive frequency band weighting and temporal attention masking. This significantly improves the detection sensitivity for minute defects such as early bearing cracks and early gear wear, and avoids the feature masking caused by noise coupling with fault features in traditional methods. Combining wavelet packet decomposition with energy entropy calculation enables refined multi-resolution analysis of non-stationary signals and overcomes the trade-off between time resolution and frequency resolution in the short-time Fourier transform. Calculating the energy entropy of each decomposition layer quantifies the degree of disorder in the frequency band energy distribution, which allows faults to be warned of earlier than with traditional time-frequency analysis. The dynamic frequency band convolution kernel adaptively adjusts its frequency response according to the actual frequency band distribution of the input signal, giving stronger feature extraction and generalization when processing vibration data under different operating conditions or at different degradation stages. Compared with a Softmax classifier, the maximum margin prototype learning strategy strengthens intra-class clustering and inter-class separation, effectively improving the accuracy and robustness of vibration data classification.

[0131] In some embodiments, the feature fusion unit 302 is further configured to: acquire the standard deviation data and the sharpening factor of vibration training data at multiple frequency points; determine the adaptive frequency band weights based on the standard deviation data and the sharpening factor; apply global average pooling and a linear transformation to the vibration data to generate a temporal attention mask; and determine the weighted fusion features based on the adaptive frequency band weights and the temporal attention mask.
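One way these four operations could fit together is sketched below. The functional forms are assumptions made for illustration: the band weights are taken as a softmax of the per-band standard deviation scaled by the sharpening factor, the pooling is taken over the band axis, and the mask is passed through a sigmoid. The names `band_weights`, `temporal_mask`, and `fuse` are hypothetical.

```python
import numpy as np


def band_weights(sigma, gamma):
    """Assumed form: softmax over bands of gamma * standard deviation."""
    logits = gamma * np.asarray(sigma, dtype=float)
    logits -= logits.max()              # subtract the maximum for numerical stability
    w = np.exp(logits)
    return w / w.sum()


def temporal_mask(x, W, b):
    """Global average pooling over bands, a linear transform, then a sigmoid (assumed)."""
    pooled = x.mean(axis=0)             # shape (T,)
    return 1.0 / (1.0 + np.exp(-(W @ pooled + b)))


def fuse(x, sigma, gamma, W, b):
    """Weight each band and each time step of x, where x has shape (bands, T)."""
    w = band_weights(sigma, gamma)      # shape (bands,)
    m = temporal_mask(x, W, b)          # shape (T,)
    return w[:, None] * x * m[None, :]
```

In this sketch a larger sharpening factor concentrates the weight on the bands with the largest standard deviation, while a sharpening factor of zero gives uniform weights.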

[0132] In some embodiments, the wavelet packet processing unit 303 is further configured to: perform wavelet packet decomposition on the weighted fusion features and calculate the energy entropy corresponding to at least one decomposition layer; determine the adaptive weight corresponding to each decomposition layer based on the maximum number of decomposition layers and the attenuation coefficient of the wavelet packet decomposition; and determine the frequency band energy entropy features based on the energy entropy and the adaptive weight corresponding to each decomposition layer.
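A minimal sketch of per-layer energy entropy with layer weights follows. It makes several assumptions: a Haar wavelet is used because it needs no external library, the energy entropy of a layer is the Shannon entropy of the normalized node energies, and the layer weight decays exponentially with the distance from the deepest layer. The function names and the default values of `max_level` and `decay` are hypothetical.

```python
import numpy as np


def haar_packet_level(nodes):
    """Split every node into Haar approximation and detail coefficients."""
    out = []
    for c in nodes:
        even, odd = c[0::2], c[1::2]
        out.append((even + odd) / np.sqrt(2.0))   # low-pass half
        out.append((even - odd) / np.sqrt(2.0))   # high-pass half
    return out


def energy_entropy(nodes, eps=1e-12):
    """Shannon entropy of the normalized energy distribution over the nodes."""
    e = np.array([np.sum(c ** 2) for c in nodes])
    p = e / (e.sum() + eps)
    return float(-np.sum(p * np.log(p + eps)))


def band_energy_entropy_features(signal, max_level=3, decay=0.5):
    """Return (weighted entropies, layer weights).

    The signal length must be divisible by 2 ** max_level.
    """
    nodes = [np.asarray(signal, dtype=float)]
    entropies = []
    for _ in range(max_level):
        nodes = haar_packet_level(nodes)
        entropies.append(energy_entropy(nodes))
    levels = np.arange(1, max_level + 1)
    # Assumed weighting: exponential decay away from the deepest layer.
    weights = np.exp(-decay * (max_level - levels))
    weights /= weights.sum()
    return weights * np.array(entropies), weights
```

For a constant signal all energy stays in the low-pass node at every layer, so the entropies are close to zero; for a broadband signal the energy spreads across nodes and the entropy of layer j approaches its upper bound of log(2^j).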

[0133] In some embodiments, the convolution unit 304 is further configured to: obtain the dynamic frequency band convolution kernel corresponding to each network layer in the frequency band adaptive convolution module, wherein the frequency band adaptive convolution module is a deep convolutional neural network; and perform a depthwise separable convolution operation based on the dynamic frequency band convolution kernels and the frequency band energy entropy features to determine the feature map.
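The structure of a depthwise separable convolution with input-dependent kernels can be sketched in one dimension as follows. The depthwise and pointwise stages are the standard construction; the way the kernels are made "dynamic" here (scaling each channel's base kernel by that channel's relative energy) is purely an illustrative assumption, and the names `dynamic_kernels` and `depthwise_separable_conv1d` are hypothetical.

```python
import numpy as np


def dynamic_kernels(base_kernels, x):
    """Scale each channel's base kernel by its relative input energy (assumed form)."""
    energy = np.mean(x ** 2, axis=1)                 # shape (C,)
    gate = energy / (energy.sum() + 1e-12)           # relative band energy
    return base_kernels * (1.0 + gate)[:, None]


def depthwise_separable_conv1d(x, depth_kernels, point_weights):
    """x: (C, L); depth_kernels: (C, K) with odd K; point_weights: (C_out, C)."""
    C, _ = x.shape
    pad = depth_kernels.shape[1] // 2
    xp = np.pad(x, ((0, 0), (pad, pad)))             # zero padding keeps length L
    # Depthwise stage: every channel is filtered by its own kernel.
    depth = np.stack([
        np.correlate(xp[c], depth_kernels[c], mode="valid") for c in range(C)
    ])
    # Pointwise stage: a 1x1 convolution mixes channels at each position.
    return point_weights @ depth
```

Splitting the operation into a per-channel filter and a 1x1 channel mixer uses C*K + C_out*C weights instead of the C_out*C*K weights of a full convolution, which is the usual motivation for this structure.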

[0134] In some embodiments, the attention enhancement unit 305 is further configured to: apply global average pooling to the feature map and input the result into a trainable weight matrix for transformation to obtain a frequency band attention weight vector and a channel attention weight vector; and determine the dual-path attention enhanced feature map based on the frequency band attention weight vector, the channel attention weight vector, and the feature map.
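The dual-path weighting can be illustrated with the sketch below. It assumes a feature map of shape (channels, bands, time), one trainable matrix per path, a sigmoid to bound the attention weights, and a multiplicative combination of the two paths; none of these details are stated in the text, and the name `dual_path_attention` is hypothetical.

```python
import numpy as np


def sigmoid(v):
    return 1.0 / (1.0 + np.exp(-v))


def dual_path_attention(fmap, W_band, W_chan):
    """fmap: (C, B, T); W_band: (B, B); W_chan: (C, C).

    Returns (enhanced feature map, band attention, channel attention).
    """
    band_desc = fmap.mean(axis=(0, 2))        # global average pooling per band, (B,)
    chan_desc = fmap.mean(axis=(1, 2))        # global average pooling per channel, (C,)
    a_band = sigmoid(W_band @ band_desc)      # frequency band attention weight vector
    a_chan = sigmoid(W_chan @ chan_desc)      # channel attention weight vector
    enhanced = fmap * a_chan[:, None, None] * a_band[None, :, None]
    return enhanced, a_band, a_chan
```

With all-zero weight matrices both attention vectors equal 0.5, so the output is the input scaled by 0.25; trained matrices would instead emphasize the informative bands and channels.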

[0135] In some embodiments, the classification unit 306 is further configured to: input the dual-path attention enhanced feature map into the feature encoder to obtain the encoded feature vector; and determine the device status detection result based on the encoded feature vector and the category prototype vectors.

[0136] In some embodiments, the device further includes: a sample data acquisition unit, configured to acquire sample vibration data of rotating parts in the equipment; and a training unit, configured to input the sample vibration data into the initial equipment state detection model and train the initial equipment state detection model to obtain the equipment state detection model, wherein the equipment state detection model includes a multi-band feature fusion module, a wavelet packet entropy feature mapping module, a frequency band adaptive convolution module, a dual attention enhancement module, and a maximum margin prototype classifier.

[0137] The device provided in this embodiment of the invention has the same implementation principle and technical effect as the aforementioned method embodiment. For the sake of brevity, for any parts not mentioned in the device embodiment, refer to the corresponding content in the aforementioned method embodiment.

[0138] Example 4. This invention also provides an electronic device for running the above-described device status detection method. Figure 4 is a schematic structural diagram of the electronic device, which includes a memory 400 and a processor 401. The memory 400 stores one or more computer instructions, which are executed by the processor 401 to implement the aforementioned device status detection method.

[0139] Furthermore, the electronic device shown in Figure 4 also includes a bus 402 and a communication interface 403. The processor 401, the communication interface 403, and the memory 400 are connected via the bus 402.

[0140] The memory 400 may include high-speed random access memory (RAM) and may also include non-volatile memory, such as at least one disk storage device. The communication connection between this system network element and at least one other network element is implemented through at least one communication interface 403 (which may be wired or wireless), and may use the Internet, a wide area network, a local area network, a metropolitan area network, etc. The bus 402 may be an ISA bus, a PCI bus, an EISA bus, etc. The bus may be divided into an address bus, a data bus, a control bus, etc. For ease of representation, only one double-headed arrow is used in Figure 4, but this does not mean that there is only one bus or one type of bus.

[0141] Processor 401 may be an integrated circuit chip with signal processing capabilities. In implementation, each step of the above method can be completed by the integrated logic circuitry in the hardware of processor 401 or by instructions in software form. Processor 401 can be a general-purpose processor, including a Central Processing Unit (CPU), a Network Processor (NP), etc.; it can also be a Digital Signal Processor (DSP), an Application Specific Integrated Circuit (ASIC), a Field-Programmable Gate Array (FPGA), or other programmable logic devices, discrete gate or transistor logic devices, or discrete hardware components. It can implement or execute the methods, steps, and logic block diagrams disclosed in the embodiments of this invention. The general-purpose processor can be a microprocessor or any conventional processor. The steps of the methods disclosed in the embodiments of this invention can be directly manifested as execution by a hardware decoding processor, or execution by a combination of hardware and software modules in the decoding processor. The software module can reside in a readily available storage medium in the art, such as random access memory, flash memory, read-only memory, programmable read-only memory, electrically erasable programmable memory, or registers. This storage medium is located in memory 400, and processor 401 reads information from memory 400 and, in conjunction with its hardware, completes the steps of the method described in the foregoing embodiments.

[0142] This invention also provides a computer-readable storage medium storing computer-executable instructions. When these computer-executable instructions are called and executed by a processor, they cause the processor to implement the aforementioned device status detection method. For specific implementation details, please refer to the method embodiments, which will not be repeated here.

[0143] The computer-readable storage medium includes various media capable of storing program code, such as USB flash drives, portable hard drives, read-only memory (ROM), random access memory (RAM), magnetic disks, or optical disks.

Claims

1. A method for detecting equipment status, characterized in that the method includes: acquiring vibration data of rotating parts in the equipment; inputting the vibration data into a multi-band feature fusion module, and performing adaptive frequency band weighting and temporal attention masking on the vibration data to obtain weighted fusion features; inputting the weighted fusion features into a wavelet packet entropy feature mapping module, and performing transient feature and frequency band energy analysis on the weighted fusion features to obtain frequency band energy entropy features; inputting the frequency band energy entropy features into a frequency band adaptive convolution module, and performing a depthwise separable convolution operation using dynamic convolution kernels to obtain a feature map; inputting the feature map into a dual attention enhancement module, and optimizing the feature map through dual-path weighting of frequency bands and channels to obtain a dual-path attention enhanced feature map; and inputting the dual-path attention enhanced feature map into a maximum margin prototype classifier to obtain the equipment status detection result.

2. The method according to claim 1, characterized in that inputting the vibration data into the multi-band feature fusion module, and performing adaptive frequency band weighting and temporal attention masking on the vibration data to obtain weighted fusion features, includes: acquiring the standard deviation data and the sharpening factor of vibration training data at multiple frequency points; determining the adaptive frequency band weights based on the standard deviation data and the sharpening factor; applying global average pooling and a linear transformation to the vibration data to generate a temporal attention mask; and determining the weighted fusion features based on the adaptive frequency band weights and the temporal attention mask.

3. The method according to claim 1, characterized in that inputting the weighted fusion features into the wavelet packet entropy feature mapping module, and performing transient feature and frequency band energy analysis on the weighted fusion features to obtain frequency band energy entropy features, includes: performing wavelet packet decomposition on the weighted fusion features and calculating the energy entropy corresponding to at least one decomposition layer; determining the adaptive weight corresponding to each decomposition layer based on the maximum number of decomposition layers and the attenuation coefficient of the wavelet packet decomposition; and determining the frequency band energy entropy features based on the energy entropy and the adaptive weight corresponding to each decomposition layer.

4. The method according to claim 1, characterized in that inputting the frequency band energy entropy features into the frequency band adaptive convolution module, and performing a depthwise separable convolution operation using dynamic convolution kernels to obtain a feature map, includes: obtaining the dynamic frequency band convolution kernel corresponding to each network layer in the frequency band adaptive convolution module, wherein the frequency band adaptive convolution module is a deep convolutional neural network; and performing a depthwise separable convolution operation based on the dynamic frequency band convolution kernels and the frequency band energy entropy features to determine the feature map.

5. The method according to claim 1, characterized in that inputting the feature map into the dual attention enhancement module, and optimizing the feature map through dual-path weighting of frequency bands and channels to obtain a dual-path attention enhanced feature map, includes: applying global average pooling to the feature map and inputting the result into a trainable weight matrix for transformation to obtain a frequency band attention weight vector and a channel attention weight vector; and determining the dual-path attention enhanced feature map based on the frequency band attention weight vector, the channel attention weight vector, and the feature map.

6. The method according to claim 1, characterized in that inputting the dual-path attention enhanced feature map into the maximum margin prototype classifier to obtain the equipment status detection result includes: inputting the dual-path attention enhanced feature map into a feature encoder to obtain an encoded feature vector; and determining the equipment status detection result based on the encoded feature vector and the category prototype vectors.

7. The method according to claim 1, characterized in that, before acquiring the vibration data of rotating parts in the equipment, the method further includes: obtaining sample vibration data of rotating parts in the equipment; and inputting the sample vibration data into an initial equipment state detection model and training the initial equipment state detection model to obtain the equipment state detection model, wherein the equipment state detection model includes a multi-band feature fusion module, a wavelet packet entropy feature mapping module, a frequency band adaptive convolution module, a dual attention enhancement module, and a maximum margin prototype classifier.

8. A device for detecting equipment status, characterized in that the device includes: an input unit, configured to acquire vibration data of rotating parts in the equipment; a feature fusion unit, configured to input the vibration data into a multi-band feature fusion module and perform adaptive frequency band weighting and temporal attention masking on the vibration data to obtain weighted fusion features; a wavelet packet processing unit, configured to input the weighted fusion features into a wavelet packet entropy feature mapping module and perform transient feature and frequency band energy analysis on the weighted fusion features to obtain frequency band energy entropy features; a convolution unit, configured to input the frequency band energy entropy features into a frequency band adaptive convolution module and perform a depthwise separable convolution operation using dynamic convolution kernels to obtain a feature map; an attention enhancement unit, configured to input the feature map into a dual attention enhancement module and optimize the feature map through dual-path weighting of frequency bands and channels to obtain a dual-path attention enhanced feature map; and a classification unit, configured to input the dual-path attention enhanced feature map into a maximum margin prototype classifier to obtain the equipment status detection result.

9. An electronic device, characterized in that the device includes a processor and a memory, the memory storing computer-executable instructions that can be executed by the processor, and the processor executing the computer-executable instructions to implement the device status detection method according to any one of claims 1 to 7.

10. A computer-readable storage medium, characterized in that the computer-readable storage medium stores computer-executable instructions which, when invoked and executed by a processor, cause the processor to implement the device status detection method according to any one of claims 1 to 7.