A bearing fault diagnosis method based on working condition decoupling

By adopting a bearing fault diagnosis method that decouples from operating conditions, and combining operating condition attribute coding and a multi-scale cascaded attention module layer, the problem of low accuracy and reliability in bearing fault diagnosis in existing technologies is solved, and more accurate fault identification and classification are achieved.

CN117109922B · Active · Publication Date: 2026-06-23 · BEIJING JINGHANG COMPUTING & COMM RES INST

Patent Information

Authority / Receiving Office: CN · China
Patent Type: Patents(China)
Current Assignee / Owner: BEIJING JINGHANG COMPUTING & COMM RES INST
Filing Date: 2023-08-28
Publication Date: 2026-06-23

AI Technical Summary

Technical Problem

Existing bearing fault diagnosis methods fail to effectively integrate operating condition information, resulting in low diagnostic accuracy and reliability.

Method used

A bearing fault diagnosis method based on working condition decoupling is adopted. By acquiring bearing vibration signal data and inputting it into a trained fault diagnosis model, feature extraction and classification are performed using a working condition attribute encoding layer and a multi-scale cascaded attention module layer. The model is then trained using a quadruplet loss function to achieve working condition decoupling and feature extraction.

Benefits of technology

It improves the accuracy and reliability of bearing fault diagnosis, can better identify and distinguish different fault categories, reduces interference under varying operating conditions, and enhances fault diagnosis performance.



Abstract

The application relates to a bearing fault diagnosis method based on working condition decoupling, and belongs to the fault diagnosis technology, and solves the problem that in the prior art, working condition information and fault data are not combined during bearing fault diagnosis, so that the diagnosis precision and reliability are not high. The method comprises the following steps: acquiring bearing vibration signal data; wherein the bearing vibration signal data comprises bearing working condition attributes and bearing vibration data; inputting the bearing vibration signal data into a trained bearing fault diagnosis model to obtain a bearing fault category; wherein the bearing fault diagnosis model comprises a data input layer, a feature extraction layer and a fault classifier tail. The working condition data and the fault data are combined, the relationship between the working condition parameters and the faults is analyzed, and the accuracy and reliability of the diagnosis are further improved.

Description

Technical Field

[0001] This invention relates to the field of bearing fault diagnosis technology, and in particular to a bearing fault diagnosis method and system based on operating condition decoupling.

Background Technology

[0002] Bearings, as a crucial component in modern machinery, are widely used. With the rapid development of modern industrial technology, machine tools are becoming increasingly precise, automated, and large-scale. Failures in bearings, which play a vital role in these machines, can impact their operation. This can range from increasing product defect rates and reducing precision to causing machine damage, downtime, delays in repair and maintenance, resulting in significant economic losses, and even endangering worker safety. Therefore, developing a method for rapid and accurate bearing fault diagnosis in various production scenarios is of great significance.

[0003] Since bearing vibration signals contain a large amount of bearing condition information, bearing fault diagnosis methods based on vibration signals have attracted widespread attention from researchers.

[0004] Existing bearing fault diagnosis methods collect data including operating condition information, but do not combine it with vibration data to diagnose faults, resulting in low accuracy and reliability of existing fault diagnosis methods.

Summary of the Invention

[0005] Based on the above analysis, the embodiments of the present invention aim to provide a bearing fault diagnosis method based on operating condition decoupling, in order to solve the problem that the accuracy and reliability of existing fault diagnosis methods are not high.

[0006] The objective of this invention is mainly achieved through the following technical solutions:

[0007] This invention provides a bearing fault diagnosis method based on operating condition decoupling, comprising:

[0008] Acquire bearing vibration signal data; wherein, the bearing vibration signal data includes bearing operating condition attributes and bearing vibration data;

[0009] The bearing vibration signal data is input into a trained bearing fault diagnosis model to obtain the bearing fault category; wherein, the bearing fault diagnosis model includes a data input layer, a feature extraction layer, and a fault classifier tail.

[0010] Furthermore, the bearing vibration signal data is input into the trained bearing fault diagnosis model to obtain the bearing fault category, including:

[0011] The bearing vibration signal data is received by the data input layer, and the data features are fused and calculated by the feature extraction layer to extract the features;

[0012] The features extracted by the feature extraction layer are passed through the softmax function of the fault classifier tail to output the fault category.

[0013] Furthermore, the bearing vibration signal data undergoes feature fusion calculation through the feature extraction layer to extract features, including:

[0014] The bearing vibration signal data is processed by the data fusion layer of the feature extraction layer to fuse the bearing operating condition attributes with the bearing vibration data to obtain feature-encoded data.

[0015] The feature-encoded data is scale-segmented and then fused and output after passing through several sequentially connected multi-scale cascaded attention modules of the feature extraction layer.

[0016] Furthermore, the bearing vibration signal data is fused with the bearing operating condition attributes and bearing vibration data through the data fusion layer to obtain feature-coded data, including:

[0017] The bearing vibration signal data is encoded by the bearing condition attribute encoding layer of the data fusion layer to obtain the bearing condition attribute encoding value.

[0018] The bearing vibration signal data is encoded and embedded by the vibration data embedding layer of the data fusion layer to obtain the vibration data embedding value.

[0019] The bearing condition attribute encoding value and the vibration data embedding value are multiplied element-wise to obtain the vibration data encoding value, which the position embedding layer of the feature extraction layer then adds element-wise to the position weight matrix to obtain the feature-encoded data. The position weight matrix is calculated from the position vector in the bearing vibration signal data using sine and cosine functions.

[0020] Furthermore, the feature-encoded data is scale-segmented and then fused and output after passing through several sequentially connected multi-scale cascaded attention modules, including:

[0021] The feature-encoded data is divided into two sets of half-scale feature transformation data and one set of original-scale feature transformation data by the multi-scale segmentation layer of the multi-scale cascaded attention module.

[0022] The first group of half-scale feature transformation data is self-attention calculated in the first head of the cascaded attention module layer of the multi-scale cascaded attention module to obtain the calculation result o1;

[0023] The calculation result o1 and the second group of half-scale feature transformation data are added element-wise in the second head of the cascaded attention module layer of the multi-scale cascaded attention module, and self-attention is then calculated on the sum to obtain the calculation result o2.

[0024] The calculation result o1 is concatenated with the calculation result o2 to obtain the calculation result o3; the calculation result o3 and the original-scale feature transformation data are added element-wise in the third head of the cascaded attention module layer of the multi-scale cascaded attention module, and self-attention is then calculated on the sum to obtain the calculation result o4.

[0025] The calculation results o1 and o2 are concatenated in the splicing mapping layer of the cascaded attention module layer of the multi-scale cascaded attention module, and the result is added element-wise to the calculation result o4 to obtain the output data of the cascaded attention module layer.

[0026] Furthermore, the bearing fault diagnosis model is trained using the following method:

[0027] Step 1: Obtain bearing vibration signal training data at several time points to construct four time series training datasets of the same length; wherein, the training data includes bearing operating condition attributes, bearing vibration data, and bearing fault category labels;

[0028] Step 2: Construct quadruples based on the four time-series training datasets of the same length;

[0029] Step 3: The data input layer receives the quadruple sample pairs and performs preliminary training on the feature extractor model to obtain the pre-trained feature extractor model; wherein, the feature extractor model includes a feature extraction layer and a feature extractor tail;

[0030] Step 4: The data input layer receives the quadruple sample pairs and simultaneously trains the feature extractor model and bearing fault diagnosis model trained in Step 3 to obtain the trained feature extraction layer; wherein, the bearing fault diagnosis model includes a data input layer, a feature extraction layer shared with the feature extractor model, and a fault classifier tail;

[0031] Step 5: The data input layer receives at least one time series training dataset to train the fault classifier tail of the bearing fault diagnosis model trained in Step 4, and obtains the trained fault classifier tail; based on the data input layer, the feature extraction layer trained in Step 4 and the fault classifier tail trained in Step 5, the trained bearing fault diagnosis model is obtained.

[0032] Furthermore, the acquisition of bearing vibration signal training data at several time points constructs four time-series training datasets of equal length, including:

[0033] (1) Construct a time series training dataset by combining bearing vibration signal training data of the same fault category but different working conditions;

[0034] (2) Construct another set of time series training datasets from the bearing vibration signal training data of the same fault category and the same working condition;

[0035] (3) Construct another set of time series training datasets from the bearing vibration signal training data of different fault categories belonging to different working conditions;

[0036] (4) Construct another set of time series training datasets by combining the bearing vibration signal training data of different fault categories that belong to the same working condition.

[0037] Furthermore, the construction of the quadruple sample pairs includes the following steps:

[0038] S101. Randomly select one of the four time series training datasets as anchor point a;

[0039] S102. Select another time series training dataset of vibration data of the same fault category as anchor point a from the four time series training datasets as positive sample p;

[0040] S103. Select one time series training dataset of vibration data with a different fault category from anchor point a from the four time series training datasets as negative sample n1.

[0041] S104. From the four time series training datasets, select another time series training dataset of vibration data with a different fault category than anchor point a as negative sample n2.

[0042] Furthermore, the data input layer receives the quadruplet sample pairs to perform preliminary training on the feature extractor model, and obtains the pre-trained feature extractor model, including: loading the quadruplet sample pairs, training the feature extractor model using the quadruplet loss function, updating the feature extraction layer parameters and the feature extractor tail parameters using gradient backpropagation, and saving the feature extraction layer parameters and the feature extractor tail parameters after training.

[0043] The data input layer receives the quadruplet sample pairs and trains the feature extractor model and the bearing fault diagnosis model simultaneously, including: loading the quadruplet sample pairs, training the feature extractor model and the bearing fault diagnosis model simultaneously using a weighted combination of the quadruplet loss function and the multi-classification loss function, updating the feature extraction layer parameters, the tail parameters of the feature extractor and the tail parameters of the fault classifier using gradient backpropagation, and saving the feature extraction layer parameters, the tail parameters of the feature extractor and the tail parameters of the fault classifier after training.

[0044] The data input layer receives at least one time-series training dataset to train the fault classifier tail of the bearing fault diagnosis model, including: loading the training dataset, fixing the weight parameters of the feature extraction layer, training the fault classifier tail of the bearing fault diagnosis model using a multi-class loss function, updating the fault classifier tail parameters using gradient backpropagation, saving the fault classifier tail parameters after training, and obtaining the final bearing fault diagnosis model based on the data input layer, the feature extraction layer, and the fault classifier tail.
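As an illustrative summary only, the three training stages described above (Steps 3 to 5) can be written down as a small schedule that records which loss each stage uses and which parameter groups it updates. The dictionary keys and the helper name `not_updated` are hypothetical labels chosen for this sketch, not names from the patent.

```python
# Parameter groups named in the text; the identifiers themselves are hypothetical.
ALL_PARTS = {"feature_extraction_layer", "feature_extractor_tail", "fault_classifier_tail"}

# One entry per training stage (Steps 3, 4 and 5 of the method).
STAGES = [
    {"step": 3, "data": "quadruplet sample pairs", "loss": "quadruplet",
     "updated": {"feature_extraction_layer", "feature_extractor_tail"}},
    {"step": 4, "data": "quadruplet sample pairs",
     "loss": "weighted quadruplet + multi-class",
     "updated": set(ALL_PARTS)},
    {"step": 5, "data": "time series training dataset", "loss": "multi-class",
     "updated": {"fault_classifier_tail"}},
]

def not_updated(stage):
    """Parameter groups that are frozen or simply unused in the given stage."""
    return ALL_PARTS - stage["updated"]

for stage in STAGES:
    print(stage["step"], stage["loss"], sorted(stage["updated"]), sorted(not_updated(stage)))
```

In the last stage the feature extraction layer is held fixed, so only the fault classifier tail changes.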

[0045] Furthermore, the quadruple loss function is calculated using the following formula:

[0046] $L_q = (d_{a,p} - d_{a,n1} + \alpha)_+ + (d_{a,p} - d_{n1,n2} + \beta)_+$

[0047] Where $d_{a,p}$ is the distance between anchor point a and positive sample p in the quadruple sample pair; $d_{a,n1}$ is the distance between anchor point a and negative sample n1 in the quadruple sample pair; $d_{n1,n2}$ is the distance between negative sample n1 and negative sample n2 in the quadruple sample pair; α and β are hyperparameters;
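The loss above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions: the trailing "+" is read as the hinge $(x)_+ = \max(x, 0)$, the distance is taken to be Euclidean (the text only says "distance"), and the default values of `alpha` and `beta` are arbitrary placeholders.

```python
import math

def euclidean(u, v):
    """Euclidean distance between two equal-length feature vectors (assumed metric)."""
    return math.sqrt(sum((ui - vi) ** 2 for ui, vi in zip(u, v)))

def quadruplet_loss(a, p, n1, n2, alpha=1.0, beta=0.5):
    """L_q = (d_ap - d_an1 + alpha)_+ + (d_ap - d_n1n2 + beta)_+ with (x)_+ = max(x, 0)."""
    d_ap = euclidean(a, p)        # anchor to positive
    d_an1 = euclidean(a, n1)      # anchor to first negative
    d_n1n2 = euclidean(n1, n2)    # first negative to second negative
    first = max(d_ap - d_an1 + alpha, 0.0)
    second = max(d_ap - d_n1n2 + beta, 0.0)
    return first + second
```

The loss is zero once the anchor is closer to the positive than to n1 by at least α and the anchor-positive distance is smaller than the n1-n2 distance by at least β.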

[0048] The multi-class loss function includes the softmax activation function and the cross-entropy loss function.
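For illustration, the multi-class loss and its weighted combination with the quadruplet loss (used when the two models are trained simultaneously) might look as follows. The text does not say how the two terms are weighted, so the convex combination and the parameter name `weight` are assumptions of this sketch.

```python
import math

def softmax(logits):
    """Numerically stable softmax over a list of class scores."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

def cross_entropy(logits, label):
    """Cross-entropy of the softmax output against an integer class label."""
    return -math.log(softmax(logits)[label])

def joint_loss(quadruplet_term, logits, label, weight=0.5):
    """Assumed form: weight * quadruplet loss + (1 - weight) * multi-class loss."""
    return weight * quadruplet_term + (1.0 - weight) * cross_entropy(logits, label)
```

Here `quadruplet_term` is the already computed quadruplet loss value for the same batch, passed in as a plain number.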

[0049] Compared with the prior art, the present invention can achieve at least one of the following beneficial effects:

[0050] 1. The technical solution of this invention can integrate operating condition information into vibration data. By introducing an operating condition attribute encoding layer, it achieves the effect of operating condition decoupling. This reduces interference from vibration data under varying operating conditions, allowing the model to focus more on fault-related features in the vibration data. Through operating condition decoupling, the fault diagnosis method can more accurately extract and identify fault features, improving the accuracy and reliability of fault diagnosis.

[0051] 2. The technical solution of this invention improves the multi-head attention module in the Transformer structure and introduces a multi-scale cascaded attention module layer. This enables the fault diagnosis method to better extract key features. The multi-scale cascaded attention module layer, through a multi-head attention mechanism and cascaded design, can capture features at different scales and achieve multi-level interaction and information transmission during feature extraction. This enhanced feature extraction capability helps to more accurately distinguish different fault categories and improve the performance of fault diagnosis.

[0052] 3. The technical solution of this invention introduces a quadruplet loss function for training the fault diagnosis model. Through self-supervised learning, the domain-invariant representation of the samples is learned by comparing intra-class and inter-class distances. This improves the discriminative power of the features, enabling the fault diagnosis model to better distinguish between different categories of vibration data.

[0053] 4. The technical solution of this invention employs a three-stage training process in the fault diagnosis model training. The feature extraction model and the fault diagnosis model are trained separately, gradually improving the feature representation capability and fault diagnosis performance of the fault diagnosis model. Through this training method, the fault diagnosis model can better learn the characteristics of vibration data and apply them to accurate fault classification.

[0054] In this invention, the above-described technical solutions can be combined with each other to achieve more preferred combinations. Other features and advantages of this invention will be set forth in the following description, and some advantages may become apparent from the description or be learned by practicing the invention. The objects and other advantages of this invention can be realized and obtained from what is particularly pointed out in the description and drawings.

Attached Figure Description

[0055] The accompanying drawings are for illustrative purposes only and are not intended to limit the invention. Throughout the drawings, the same reference numerals denote the same parts.

[0056] Figure 1 is a flowchart of a bearing fault diagnosis method based on operating condition decoupling in an embodiment of the present invention.

[0057] Figure 2 is a schematic diagram of a method for constructing a bearing fault diagnosis model according to an embodiment of the present invention.

[0058] Figure 3 is a schematic diagram of the bearing fault diagnosis model in an embodiment of the present invention.

Detailed Implementation

[0059] Preferred embodiments of the present invention will now be described in detail with reference to the accompanying drawings, which form part of this application and are used together with the embodiments of the present invention to illustrate the principles of the present invention, but are not intended to limit the scope of the present invention.

[0060] A specific embodiment of the present invention discloses a bearing fault diagnosis method based on operating condition decoupling which, as shown in Figure 1, includes the following steps:

[0061] Step S1: Obtain bearing vibration signal data; wherein, the bearing vibration signal data includes bearing operating condition attributes and bearing vibration data.

[0062] Bearing operating condition attributes include normal operation, speed-up operation, and operation.

[0063] Bearing vibration data refers to bearing vibration acceleration data.

[0064] Step S2: Input the bearing vibration signal data into the trained bearing fault diagnosis model to obtain the bearing fault category; wherein, the bearing fault diagnosis model includes a data input layer, a feature extraction layer, and a fault classifier tail.

[0065] Furthermore, the bearing vibration signal data is input into the trained bearing fault diagnosis model to obtain the bearing fault category, including:

[0066] The bearing vibration signal data is received by the data input layer, and the data features are fused and calculated by the feature extraction layer to extract features. The feature extraction layer includes a data fusion layer and a multi-scale cascaded attention module.

[0067] The features extracted by the feature extraction layer are passed through the softmax function of the fault classifier tail to output the fault category.

[0068] Specifically, the bearing vibration signal data undergoes feature fusion calculation through the feature extraction layer to extract features, including:

[0069] The bearing vibration signal data is processed by the data fusion layer of the feature extraction layer to obtain feature-encoded data.

[0070] Specifically, the data fusion layer includes a working condition attribute encoding layer, a vibration data embedding layer, and a position embedding layer.

[0071] The bearing vibration signal data is encoded by the bearing condition attribute encoding layer of the data fusion layer to obtain the bearing condition attribute encoding value.

[0072] Specifically, the operating condition attribute encoding layer includes a fully connected layer and an activation layer, which are used to encode the bearing operating condition attributes in the bearing vibration signal data to obtain the bearing operating condition attribute encoded values; by learning the representation of the operating condition attributes, the model can better understand the changes in vibration data under different operating conditions.

[0073] The bearing vibration signal data is encoded and embedded by the vibration data embedding layer of the data fusion layer to obtain the vibration data embedding value.

[0074] Specifically, the vibration data embedding layer includes a fully connected layer and an activation layer, which are used to encode and embed the bearing vibration data in the bearing vibration signal data to obtain the vibration data embedding value; this can transform the original vibration data into a higher-dimensional representation, which helps the model extract richer features.

[0075] The bearing condition attribute encoding value and the vibration data embedding value are multiplied element-wise to obtain the vibration data encoding value, which the position embedding layer of the feature extraction layer then adds element-wise to the position weight matrix to obtain the feature-encoded data. The position weight matrix is calculated from the position vector in the bearing vibration signal data using sine and cosine functions.

[0076] Specifically, the position embedding layer is the position embedding layer of the Transformer model. Adding the position weight matrix helps the model capture the temporal information of the vibration data and preserves the position information during feature extraction.
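The data fusion layer described above can be sketched as follows. This is an illustration under several assumptions that the text does not fix: the operating condition is a one-hot vector shared by all time points, both encoders are a single fully connected layer with ReLU, weights are random rather than trained, and the position weight matrix is supplied by the caller (zeros are used here purely as a placeholder). All function and variable names are hypothetical.

```python
import random

def dense_relu(x, weights, bias):
    """Fully connected layer followed by ReLU: relu(W x + b)."""
    return [max(0.0, sum(w * xi for w, xi in zip(row, x)) + b)
            for row, b in zip(weights, bias)]

def random_layer(n_out, n_in, rng):
    """Untrained placeholder weights for a layer mapping n_in -> n_out."""
    weights = [[rng.uniform(-1.0, 1.0) for _ in range(n_in)] for _ in range(n_out)]
    return weights, [0.0] * n_out

def fuse(condition, vibration, pos_matrix, cond_layer, vib_layer):
    """Return T x d_model feature-encoded data.

    condition:  operating condition attribute vector (one per sample, assumed)
    vibration:  list of T vibration vectors, one per time point
    pos_matrix: T x d_model position weight matrix
    """
    cond_code = dense_relu(condition, *cond_layer)          # condition attribute encoding value
    fused = []
    for x_t, pe_t in zip(vibration, pos_matrix):
        embed = dense_relu(x_t, *vib_layer)                  # vibration data embedding value
        encoded = [c * e for c, e in zip(cond_code, embed)]  # element-wise product
        fused.append([v + p for v, p in zip(encoded, pe_t)]) # element-wise add of position weights
    return fused

rng = random.Random(0)
d_model, T, n_cond, n_vib = 8, 3, 3, 16
cond_layer = random_layer(d_model, n_cond, rng)
vib_layer = random_layer(d_model, n_vib, rng)
condition = [1.0, 0.0, 0.0]                                  # e.g. "normal operation"
vibration = [[rng.gauss(0.0, 1.0) for _ in range(n_vib)] for _ in range(T)]
pos_matrix = [[0.0] * d_model for _ in range(T)]             # placeholder only
features = fuse(condition, vibration, pos_matrix, cond_layer, vib_layer)
```

The element-wise product lets the condition encoding act as a gate on each embedding dimension, which is one way to read how the condition attribute modulates the vibration features.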

[0077] It should be noted that the position weight matrix of the Transformer model is calculated as follows:

[0078] $PE_{(pos,\,2i)} = \sin\left(pos / 10000^{2i/d_{model}}\right)$

[0079] $PE_{(pos,\,2i+1)} = \cos\left(pos / 10000^{2i/d_{model}}\right)$

[0080] Where PE is the position vector; pos is the position of the pos-th time point in the time series, starting from 0; $d_{model}$ is the dimension of the vibration data embedding values; i is the vector dimension index, $i \in [0, d_{model}/2]$.
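As a minimal sketch, the position weight matrix can be computed with the standard sinusoidal position encoding of the Transformer model, which is what the definitions of PE, pos, i and the embedding dimension point to. The function name `position_weight_matrix` is hypothetical, and the constant 10000 is the value used in the standard Transformer formulation rather than one stated in this text.

```python
import math

def position_weight_matrix(num_positions, d_model):
    """Sinusoidal position encoding: even columns use sine, odd columns use cosine."""
    pe = [[0.0] * d_model for _ in range(num_positions)]
    for pos in range(num_positions):          # pos starts from 0
        for k in range(0, d_model, 2):        # k = 2i
            angle = pos / (10000 ** (k / d_model))
            pe[pos][k] = math.sin(angle)
            if k + 1 < d_model:
                pe[pos][k + 1] = math.cos(angle)
    return pe

pe = position_weight_matrix(3, 4)
```

Each row of `pe` corresponds to one time point and has the same width as the vibration data embedding, so it can be added element-wise to the encoded vibration data.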

[0081] The feature-encoded data is processed by a multi-scale segmentation layer and a cascaded attention module layer of several sequentially connected multi-scale cascaded attention modules of the feature extraction layer to obtain the output data of the feature extraction layer.

[0082] Specifically, the feature-encoded data is divided into two sets of half-scale feature transformation data and one set of original-scale feature transformation data by the multi-scale segmentation layer;

[0083] Furthermore, the multi-scale segmentation layer includes a combination of convolutional layers, activation layers, and pooling layers. Segmenting transformed data at different scales can help the model extract features at different levels and achieve non-linear transformation of the data.
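One way to picture the multi-scale segmentation layer is three parallel branches of convolution, activation and pooling, two of which halve the feature dimension. The sketch below is illustrative only: the kernels are fixed placeholder values rather than learned weights, max pooling is an arbitrary choice of pooling, and all function names are hypothetical.

```python
def conv1d_same(x, kernel):
    """1-D convolution with zero padding so the output length equals len(x) (odd kernels)."""
    half = len(kernel) // 2
    padded = [0.0] * half + list(x) + [0.0] * half
    return [sum(k * padded[i + j] for j, k in enumerate(kernel)) for i in range(len(x))]

def relu(x):
    return [max(0.0, v) for v in x]

def max_pool(x, size):
    """Non-overlapping max pooling; size=1 leaves the length unchanged."""
    return [max(x[i:i + size]) for i in range(0, len(x) - size + 1, size)]

def branch(x, kernel, pool):
    """Convolution -> activation -> pooling for one time point."""
    return max_pool(relu(conv1d_same(x, kernel)), pool)

def multi_scale_split(rows):
    """Return (x1, x2, x3): two half-scale outputs and one original-scale output."""
    x1 = [branch(r, [0.25, 0.5, 0.25], 2) for r in rows]   # half scale, kernel size 3
    x2 = [branch(r, [0.2] * 5, 2) for r in rows]           # half scale, kernel size 5
    x3 = [branch(r, [1.0], 1) for r in rows]               # original scale
    return x1, x2, x3

rows = [[float(i + t) for i in range(8)] for t in range(3)]  # toy input of size (3, 8)
x1, x2, x3 = multi_scale_split(rows)
```

With an input of size (3, 8), the two half-scale outputs have size (3, 4) and the original-scale output keeps size (3, 8).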

[0084] Furthermore, the cascaded attention module layer includes a three-head attention module and a splicing mapping layer. The first group of half-scale feature transformation data undergoes self-attention calculation in the first head of the cascaded attention module layer to obtain the calculation result o1.

[0085] The calculation result o1 and the second group of half-scale feature transformation data are added element-wise in the second head of the cascaded attention module layer, and self-attention is then calculated on the sum to obtain the calculation result o2;

[0086] The calculation result o1 is concatenated with the calculation result o2 to obtain the calculation result o3; the calculation result o3 is added element-wise to the original-scale feature transformation data in the third head of the cascaded attention module layer, and self-attention is then calculated on the sum to obtain the calculation result o4.

[0087] The calculation result o1 and the calculation result o2 are concatenated in the splicing mapping layer of the cascaded attention module layer, and the result is added element-wise to the calculation result o4 to obtain the output data of the cascaded attention module layer.

[0088] For example, first, let the size of the input data be (3, 4096), where 3 represents that there are 3 time points in the input data, and 4096 represents the data dimension of each time point.

[0089] Then, as shown in the figure, three calculations are performed simultaneously on the data in the multi-scale segmentation layer, using different convolution kernels and pooling kernels to obtain three calculation results. The data size of the first two results is (3, 2048), and the size of the third result is (3, 4096). The three calculation results are represented by x1, x2, and x3.

[0090] Next, x1, x2, and x3 are processed in the cascaded attention module layer according to the calculation process in the model structure diagram. Self-attention is calculated on x1 to obtain o1, with size (3, 2048); o1 and x2 are added element-wise, and self-attention is calculated on the sum to obtain o2, with size (3, 2048); o1 and o2 are then concatenated to obtain o3, with size (3, 4096), which is added element-wise to x3, and self-attention is calculated on the sum to obtain o4, with size (3, 4096).

[0091] Finally, o1 and o2 are concatenated and the result is added element-wise to o4 to obtain the final output o, with a size of (3, 4096).
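The cascade can be traced with a small sketch that follows the o1 to o4 naming of paragraphs [0084] to [0087]. It is a simplification under stated assumptions: each head uses scaled dot-product self-attention with Q = K = V equal to its input (no learned projections), the splicing mapping layer is taken as plain concatenation, and the toy sizes (3, 4) and (3, 8) stand in for (3, 2048) and (3, 4096).

```python
import math

def self_attention(x):
    """Scaled dot-product self-attention with Q = K = V = x (assumed, no projections)."""
    d = len(x[0])
    out = []
    for q in x:
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d) for k in x]
        m = max(scores)
        weights = [math.exp(s - m) for s in scores]
        z = sum(weights)
        out.append([sum(w / z * v[c] for w, v in zip(weights, x)) for c in range(d)])
    return out

def add(a, b):
    """Element-wise sum of two equally sized matrices."""
    return [[u + v for u, v in zip(ra, rb)] for ra, rb in zip(a, b)]

def concat(a, b):
    """Concatenate two matrices along the feature dimension."""
    return [ra + rb for ra, rb in zip(a, b)]

def cascaded_attention(x1, x2, x3):
    o1 = self_attention(x1)              # head 1: first half-scale input
    o2 = self_attention(add(o1, x2))     # head 2: o1 + second half-scale input
    o3 = concat(o1, o2)                  # back to the original scale
    o4 = self_attention(add(o3, x3))     # head 3: o3 + original-scale input
    spliced = concat(o1, o2)             # splicing mapping taken as concatenation (assumption)
    return add(spliced, o4)

x1 = [[0.1 * (r + c) for c in range(4)] for r in range(3)]
x2 = [[0.2 * (r - c) for c in range(4)] for r in range(3)]
x3 = [[0.05 * (r * c) for c in range(8)] for r in range(3)]
o = cascaded_attention(x1, x2, x3)
```

The output `o` has the same size as the original-scale input, so several such modules can be connected in sequence.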

[0092] It's worth noting that by cascading the attention modules, the output of one head attention module is fused with the input of the next, enabling interaction and information transfer between data at different scales. This cascading design helps the model better capture multi-scale features and improves its feature extraction capabilities.

[0093] Furthermore, the bearing fault diagnosis model is trained by the method shown in Figure 2:

[0094] Specifically, as shown in Figure 3, the feature extractor model includes a feature extraction layer and a feature extractor tail; the bearing fault diagnosis model includes a data input layer, a feature extraction layer shared with the feature extractor model, and a fault classifier tail.

[0095] Step 1: Obtain bearing vibration signal training data at several time points to construct four time series training datasets of the same length; wherein, the training data includes bearing operating condition attributes, bearing vibration data, and bearing fault category labels;

[0096] Specifically, to improve model training effectiveness, vibration data needs to be organized according to certain rules to construct the quadruplet sample pairs used for model training. To ensure the diversity and coverage of the sample pairs, the time series training dataset is constructed according to the following rules:

[0097] (1) Construct a time series training dataset by combining bearing vibration signal training data of the same fault category but different working conditions;

[0098] Specifically, in fault diagnosis, bearings may exhibit different vibration characteristics under different operating conditions. Therefore, to enable the model to generalize, it is necessary to ensure that data samples of the same category are covered under different operating conditions. By introducing different operating conditions, the model can learn the commonalities and differences of vibration data of a specific category, thereby improving fault classification.

[0099] (2) Construct another set of time series training datasets from the bearing vibration signal training data of the same fault category and the same working condition;

[0100] Specifically, the model can learn the consistent characteristics of vibration data of the same category under the same operating conditions. By constructing a sample set, the model can capture the variation patterns of vibration data of the same category under the same operating conditions, further improving the accuracy of fault diagnosis.

[0101] (3) Construct another set of time series training datasets from the bearing vibration signal training data of different fault categories belonging to different working conditions;

[0102] Specifically, the model needs to learn the differences between different categories of vibration data under various operating conditions. By combining different categories of data with different operating conditions, the model can distinguish the characteristic differences between different fault categories, thereby achieving accurate fault classification.

[0103] (4) Construct another set of time series training datasets from the bearing vibration signal training data of different fault categories belonging to the same working condition;

[0104] Specifically, vibration data from different fault categories may exhibit similar characteristics under the same operating conditions. Therefore, in order for the model to accurately identify these similar characteristics, it is necessary to construct data for different fault categories, ensuring that they all originate from the same operating conditions. This allows the model to learn to distinguish subtle differences between different fault categories, thereby improving the robustness of fault diagnosis.
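The four construction rules above can be illustrated by grouping labelled samples according to how their fault category and working condition compare with a reference. Grouping relative to a single reference sample is only one possible reading of rules (1) to (4), and the record layout, label strings and function name are hypothetical.

```python
from collections import defaultdict

def group_by_relation(records, ref_fault, ref_condition):
    """Split (fault, condition, series) records into the four rule groups.

    Keys pair "same_fault"/"diff_fault" with "same_condition"/"diff_condition",
    each relative to the reference fault category and working condition.
    """
    groups = defaultdict(list)
    for fault, condition, series in records:
        key = ("same_fault" if fault == ref_fault else "diff_fault",
               "same_condition" if condition == ref_condition else "diff_condition")
        groups[key].append(series)
    return dict(groups)

records = [
    ("inner", "normal", [0.1, 0.2]),
    ("inner", "speed_up", [0.3, 0.4]),
    ("outer", "normal", [0.5, 0.6]),
    ("outer", "speed_up", [0.7, 0.8]),
]
groups = group_by_relation(records, ref_fault="inner", ref_condition="normal")
```

With one sample in each of the four groups, the toy data covers rule (1) through rule (4) exactly once.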

[0105] Step 2: Construct quadruples based on the four time-series training datasets of the same length;

[0106] Specifically, the purpose of constructing quadruplet sample pairs is to perform contrastive learning, thereby bringing similar time series closer together in the embedding space and dissimilar time series further apart. This contrastive learning method performs well in tasks such as anomaly detection and similarity matching of vibration data.

[0107] The basic principle for constructing quadruple sample pairs is that each pair consists of four vibration time series in four roles: Anchor, Positive, Negative1, and Negative2. Anchor and Positive belong to the same fault category, while Anchor, Negative1, and Negative2 represent data from different fault categories. Constructing sample pairs helps the model learn the characteristics of vibration data and enables comparison and classification during fault diagnosis. This allows the model to learn the similarities and differences between time series, thus performing better in tasks related to vibration data.

[0108] Specifically, the construction includes the following steps:

[0109] S101. Randomly select one of the four time series training datasets as anchor point a;

[0110] It should be noted that the anchor point is the basis for constructing sample pairs, and all sample pairs are based on anchor point a.

[0111] S102. Select another time series training dataset of vibration data of the same fault category as anchor point a from the four time series training datasets as positive sample p;

[0112] It should be noted that positive sample p and anchor point a belong to the same fault category or have similar vibration characteristics.

[0113] S103. Select one time series training dataset of vibration data with a different fault category from anchor point a from the four time series training datasets as negative sample n1.

[0114] It should be noted that the negative sample n1 and the anchor point a come from different fault categories, which are used to help the bearing fault diagnosis model distinguish the differences between different categories.

[0115] S104. From the four time series training datasets, select another time series training dataset of vibration data with a different fault category than anchor point a as negative sample n2.

[0116] It should be noted that the negative sample n2, like the anchor point a and the negative sample n1, comes from a different fault category and is also used to increase the model's ability to distinguish between different fault categories.
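
Steps S101 to S104 can be sketched as follows. This is a minimal illustration, not the implementation of the invention: the function name build_quadruplet, the layout of the pool as a list of (series, fault_label) tuples, and the requirement that n1 and n2 be distinct samples are assumptions made for the example.

```python
import random

def build_quadruplet(samples, rng):
    """Return indices (anchor, positive, negative1, negative2).

    samples: list of (series, fault_label) tuples (hypothetical layout).
    rng: a random.Random instance, passed in so the draw is reproducible.
    """
    # S101: pick the anchor at random.
    anchor_idx = rng.randrange(len(samples))
    anchor_label = samples[anchor_idx][1]

    # Candidates sharing the anchor's fault category (excluding the anchor itself).
    same = [i for i, (_, y) in enumerate(samples)
            if y == anchor_label and i != anchor_idx]
    # Candidates from any other fault category.
    diff = [i for i, (_, y) in enumerate(samples) if y != anchor_label]
    if not same or len(diff) < 2:
        raise ValueError("need one more same-category sample and two other-category samples")

    # S102: positive sample p from the anchor's category.
    positive_idx = rng.choice(same)
    # S103 and S104: two distinct negatives, both from categories other than the anchor's.
    negative1_idx, negative2_idx = rng.sample(diff, 2)
    return anchor_idx, positive_idx, negative1_idx, negative2_idx
```

The sketch only enforces what S103 and S104 state, namely that both negatives differ in fault category from the anchor; whether n1 and n2 must also differ from each other in category is not specified here and is left open.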

[0117] Step 3: The data input layer receives the quadruple sample pairs and performs preliminary training on the feature extractor model to obtain the pre-trained feature extractor model.

[0118] Specifically, the quadruplet sample pairs are loaded, the feature extractor model is trained using the quadruplet loss function, the parameters of the feature extraction layer and the tail parameters of the feature extractor are updated using gradient backpropagation, and the parameters of the feature extraction layer and the tail parameters of the feature extractor are saved after training. The purpose of this step is to enable the feature extractor to extract features that are independent of the operating conditions from the vibration data.

[0119] Furthermore, the goal of the quadruplet loss function is to achieve better feature discriminativeness by minimizing the feature distance between similar sample pairs and maximizing the feature distance between dissimilar sample pairs. This allows the model to better cluster similar samples together and separate dissimilar samples in the feature space. By introducing the quadruplet loss function, the tail of the feature extractor can improve the discriminativeness of features, thereby providing more accurate and reliable feature representations for subsequent fault identification.

[0120] Furthermore, the quadruplet loss function utilizes similarity learning to improve the model's ability to extract domain-invariant features. The quadruplet loss considers both intra-class and inter-class distances of samples, and its calculation formula is as follows:

[0121] $L_q = (d_{a,p} - d_{a,n1} + \alpha)_+ + (d_{a,p} - d_{n1,n2} + \beta)_+$

[0122] Where $d_{a,p}$ is the distance between anchor point a and positive sample p in the quadruplet sample pair; $d_{a,n1}$ is the distance between anchor point a and negative sample n1 in the quadruplet sample pair; $d_{n1,n2}$ is the distance between negative sample n1 and negative sample n2 in the quadruplet sample pair; α and β are hyperparameters.

[0123] It should be noted that the first term, $(d_{a,p} - d_{a,n1} + \alpha)_+$, where $(x)_+ = \max(x, 0)$, measures the distance difference between the positive and negative sample pairs. It is zero only when the distance between the anchor point and negative sample n1 exceeds the distance between the anchor point and the positive sample by at least α. The second term, $(d_{a,p} - d_{n1,n2} + \beta)_+$, measures the relative distance between sample pairs. It is zero only when the distance between the two negative samples exceeds the distance between the anchor point and the positive sample by at least β, which enlarges the separation between different categories while keeping the margin of the first term.

[0124] This loss function can facilitate closer proximity between positive samples of the same class and the anchor point, while simultaneously increasing the distance between positive and negative samples, thereby enhancing sample classification performance in metric learning. The hyperparameters α and β can be adjusted to control the distance difference between positive and negative samples, as well as the relative distance between pairs of positive and negative samples, thus achieving better model training and feature representation learning.
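
The formula in [0121] can be evaluated directly, as in the following minimal sketch. The Euclidean distance and the default values of alpha and beta are assumptions for illustration; the description does not fix the distance metric or the hyperparameter values.

```python
import math

def euclidean(u, v):
    """Euclidean distance between two feature vectors (assumed metric)."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(u, v)))

def quadruplet_loss(a, p, n1, n2, alpha=1.0, beta=0.5):
    """L_q = (d_ap - d_an1 + alpha)_+ + (d_ap - d_n1n2 + beta)_+."""
    d_ap = euclidean(a, p)      # anchor to positive
    d_an1 = euclidean(a, n1)    # anchor to negative n1
    d_n1n2 = euclidean(n1, n2)  # negative n1 to negative n2
    term1 = max(d_ap - d_an1 + alpha, 0.0)
    term2 = max(d_ap - d_n1n2 + beta, 0.0)
    return term1 + term2
```

When the positive sample coincides with the anchor and both negative-related distances exceed the margins, both terms are clipped to zero and the quadruplet contributes no gradient.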

[0125] Step 4: The data input layer receives the quadruple sample pairs and simultaneously trains the feature extractor model and bearing fault diagnosis model that were initially trained in Step 3 to obtain the trained feature extraction layer.

[0126] Specifically, the quadruplet sample pairs are loaded, and the feature extractor model and the bearing fault diagnosis model are trained simultaneously using a weighted combination of the quadruplet loss function and the multi-classification loss function. The parameters of the feature extraction layer, the tail parameters of the feature extractor, and the tail parameters of the fault classifier are updated using gradient backpropagation. After training, the parameters of the feature extraction layer, the tail parameters of the feature extractor, and the tail parameters of the fault classifier are saved.

[0127] Furthermore, at the fault classifier tail, the fault category of the sample is predicted from the features extracted by the feature extraction layer. The Softmax function outputs the probability value of each category, and the multi-class loss is then calculated using cross-entropy.

[0128] It should be noted that the goal of the multi-class loss function is to enable the model to accurately map the input features to the corresponding fault categories. By minimizing the classification error, the model can learn the feature differences between different fault categories and achieve accurate fault identification.
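
The multi-class loss of [0127] and the weighted combination used in Step 4 can be sketched as follows. The convex weighting with a single parameter weight is an assumption; the description states only that the two losses are combined with weights and does not give their form or values.

```python
import math

def softmax(logits):
    """Numerically stable softmax over a list of class scores."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

def cross_entropy(logits, target_index):
    """Negative log-probability assigned to the true fault category."""
    return -math.log(softmax(logits)[target_index])

def combined_loss(quadruplet_loss_value, logits, target_index, weight=0.5):
    """Weighted sum of the quadruplet loss and the multi-class loss.

    weight is a hypothetical hyperparameter in [0, 1].
    """
    return (weight * quadruplet_loss_value
            + (1.0 - weight) * cross_entropy(logits, target_index))
```

Setting weight to 1.0 reduces the sketch to the Step 3 objective (quadruplet loss only), and setting it to 0.0 reduces it to the Step 5 objective (multi-class loss only).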

[0129] Specifically, the design of the classifier tail combines feature extraction and classification tasks to achieve end-to-end fault diagnosis. Through the collaborative work of the feature extraction layer and the classifier tail, the model can transform vibration data into meaningful feature representations and perform accurate fault classification.

[0130] Step 5: The data input layer receives at least one time series training dataset to train the fault classifier tail of the bearing fault diagnosis model trained in Step 4, and obtains the trained fault classifier tail; based on the data input layer, the feature extraction layer trained in Step 4 and the fault classifier tail trained in Step 5, the trained bearing fault diagnosis model is obtained.

[0131] Specifically, the training dataset is loaded, the weight parameters of the feature extraction layer are fixed, the tail of the fault classifier of the bearing fault diagnosis model is trained using a multi-class loss function, the tail parameters of the fault classifier are updated using gradient backpropagation, and the tail parameters of the fault classifier are saved after training.

[0132] The final bearing fault diagnosis model is obtained based on the data input layer, the feature extraction layer, and the fault classifier tail.
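
The parameter groups updated in Steps 3 to 5 can be summarized in a small schedule, sketched below. The names STAGES and apply_update, the learning rate, and the plain gradient step are assumptions for illustration; only the choice of losses and of trainable groups per step is taken from the description.

```python
# Which loss is used and which parameter groups are updated in each step.
STAGES = {
    3: {"loss": ("quadruplet",),
        "trainable": {"feature_extraction_layer", "feature_extractor_tail"}},
    4: {"loss": ("quadruplet", "multi_class"),
        "trainable": {"feature_extraction_layer", "feature_extractor_tail",
                      "fault_classifier_tail"}},
    5: {"loss": ("multi_class",),
        "trainable": {"fault_classifier_tail"}},
}

def apply_update(params, grads, step, lr=0.1):
    """One gradient step that leaves the frozen groups of this step untouched.

    params, grads: dicts mapping a group name to a list of floats.
    """
    trainable = STAGES[step]["trainable"]
    updated = {}
    for name, values in params.items():
        if name in trainable:
            updated[name] = [w - lr * g for w, g in zip(values, grads[name])]
        else:
            updated[name] = list(values)  # frozen: copied unchanged
    return updated
```

In Step 5 the feature extraction layer is fixed, so only the fault classifier tail changes; the feature extractor tail is not part of the final diagnosis model and is therefore treated as frozen in this sketch.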

[0133] Introducing operating condition information allows it to be integrated into vibration data, achieving operating condition decoupling. This reduces interference from vibration data under varying operating conditions, enabling the model to focus more on fault-related features. Through operating condition decoupling, the model can more accurately extract and identify fault features, improving the accuracy and reliability of fault diagnosis.
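
As a rough illustration of how the operating condition attributes enter the vibration data, the fusion recited in claim 4 (element-wise multiplication of the condition encoding and the vibration embedding, followed by addition of a sine/cosine position matrix) might look like the sketch below. The exact sine/cosine formula is not given in the text, so the standard Transformer positional encoding is assumed, and the condition encoding is assumed to be a single vector shared by all time steps.

```python
import math

def sinusoidal_positions(seq_len, dim):
    """Position weight matrix: sine on even columns, cosine on odd columns.

    Uses the standard Transformer formula (an assumption, see lead-in).
    """
    table = []
    for pos in range(seq_len):
        row = []
        for i in range(dim):
            angle = pos / (10000 ** (2 * (i // 2) / dim))
            row.append(math.sin(angle) if i % 2 == 0 else math.cos(angle))
        table.append(row)
    return table

def fuse(condition_code, vibration_embedding):
    """Multiply the condition encoding into each time step, then add positions.

    condition_code: list of dim floats (hypothetical shape).
    vibration_embedding: seq_len rows of dim floats.
    """
    seq_len, dim = len(vibration_embedding), len(vibration_embedding[0])
    pos = sinusoidal_positions(seq_len, dim)
    return [[condition_code[j] * vibration_embedding[t][j] + pos[t][j]
             for j in range(dim)]
            for t in range(seq_len)]
```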

[0134] Meanwhile, the improved Transformer architecture incorporates a multi-head attention module and introduces a multi-scale cascaded attention module layer, enabling the model to better extract key features. The multi-scale cascaded attention module layer, through multi-head attention mechanisms and cascaded design, captures features at different scales and achieves multi-level interaction and information transfer during feature extraction, helping to more accurately distinguish different fault categories and improve fault diagnosis performance.
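
The data flow of one multi-scale cascaded attention module, as recited in claim 5, can be traced with the simplified sketch below. Several simplifications are assumptions: the two half-scale groups are taken as the two halves of the feature dimension, every head uses identity query/key/value projections, the third head is assumed to apply self-attention to the sum of o3 and the original-scale data, and the mapping in the splicing mapping layer is replaced by the identity.

```python
import math

def self_attention(x):
    """Scaled dot-product self-attention with identity Q/K/V projections."""
    dim = len(x[0])
    out = []
    for q in x:
        scores = [sum(a * b for a, b in zip(q, k)) / math.sqrt(dim) for k in x]
        m = max(scores)
        weights = [math.exp(s - m) for s in scores]
        total = sum(weights)
        out.append([sum(w / total * v[j] for w, v in zip(weights, x))
                    for j in range(dim)])
    return out

def add(x, y):
    """Element-wise sum of two equally shaped matrices."""
    return [[a + b for a, b in zip(r, s)] for r, s in zip(x, y)]

def concat(x, y):
    """Concatenate two matrices along the feature dimension."""
    return [r + s for r, s in zip(x, y)]

def cascaded_attention(x):
    """x: seq_len rows, each with an even number of features."""
    half = len(x[0]) // 2
    h1 = [row[:half] for row in x]   # first half-scale group
    h2 = [row[half:] for row in x]   # second half-scale group
    o1 = self_attention(h1)          # first head
    o2 = self_attention(add(o1, h2)) # second head sees o1 + second group
    o3 = concat(o1, o2)
    o4 = self_attention(add(o3, x))  # third head sees o3 + original scale
    return add(concat(o1, o2), o4)   # spliced heads plus o4
```

The cascade means the second head already sees the output of the first, and the third head sees both, which is how information passes between scales inside one module.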

[0135] During the training of the bearing fault diagnosis model, a quadruplet loss function is introduced to improve the discriminative power of features, enabling the model to better distinguish different categories of vibration data. The three-stage training process (Steps 3 to 5) first pre-trains the feature extractor model, then trains the feature extractor model and the bearing fault diagnosis model jointly, and finally trains the fault classifier tail with the feature extraction layer fixed, gradually improving the model's feature representation ability and fault diagnosis performance. Through this training method, the model can better learn the features of vibration data and apply them to accurate fault classification, thereby improving the accuracy and reliability of bearing fault diagnosis.

[0136] The above description is only a preferred embodiment of the present invention, but the scope of protection of the present invention is not limited thereto. Any changes or substitutions that can be easily conceived by those skilled in the art within the scope of the technology disclosed in the present invention should be included within the scope of protection of the present invention.

Claims

1. A bearing fault diagnosis method based on operating condition decoupling, characterized in that it comprises: acquiring bearing vibration signal data, wherein the bearing vibration signal data includes bearing operating condition attributes and bearing vibration data; inputting the bearing vibration signal data into a trained bearing fault diagnosis model to obtain the bearing fault category, wherein the bearing fault diagnosis model includes a data input layer, a feature extraction layer, and a fault classifier tail; the bearing fault diagnosis model is trained using the following method: Step 1: Obtain bearing vibration signal training data at several time points to construct four time series training datasets of the same length; wherein, the training data includes bearing operating condition attributes, bearing vibration data, and bearing fault category labels; Step 2: Construct quadruplet sample pairs based on the four time series training datasets of the same length; Step 3: The data input layer receives the quadruplet sample pairs and performs preliminary training on the feature extractor model to obtain the pre-trained feature extractor model; wherein, the feature extractor model includes a feature extraction layer and a feature extractor tail; obtaining the pre-trained feature extractor model includes: loading the quadruplet sample pairs, training the feature extractor model using the quadruplet loss function, updating the parameters of the feature extraction layer and the parameters of the feature extractor tail using gradient backpropagation, and saving the parameters of the feature extraction layer and the parameters of the feature extractor tail after training; Step 4: The data input layer receives the quadruplet sample pairs and simultaneously trains the feature extractor model pre-trained in Step 3 and the bearing fault diagnosis model to obtain a trained feature extraction layer.
The bearing fault diagnosis model includes a data input layer, a feature extraction layer shared with the feature extractor model, and a fault classifier tail. Obtaining the trained feature extraction layer includes: loading the quadruple sample pairs; simultaneously training the feature extractor model and the bearing fault diagnosis model using a weighted combination of the quadruple loss function and the multi-classification loss function; updating the feature extraction layer parameters, the feature extractor tail parameters, and the fault classifier tail parameters using gradient backpropagation; and saving the feature extraction layer parameters, the feature extractor tail parameters, and the fault classifier tail parameters after training. Step 5: The data input layer receives at least one time-series training dataset to train the fault classifier tail of the bearing fault diagnosis model trained in Step 4, obtaining a trained fault classifier tail; based on the data input layer, the feature extraction layer trained in Step 4, and the fault classifier tail trained in Step 5, a trained bearing fault diagnosis model is obtained; obtaining the trained fault classifier tail includes: loading the training dataset, fixing the weight parameters of the feature extraction layer, training the fault classifier tail of the bearing fault diagnosis model using a multi-class loss function, updating the fault classifier tail parameters using gradient backpropagation, saving the fault classifier tail parameters after training, and obtaining the final bearing fault diagnosis model based on the data input layer, the feature extraction layer, and the fault classifier tail.

2. The method according to claim 1, characterized in that inputting the bearing vibration signal data into the trained bearing fault diagnosis model to obtain the bearing fault category includes: The bearing vibration signal data is received by the data input layer, and the data features are fused and calculated by the feature extraction layer to extract the features; The features extracted by the feature extraction layer are passed through the softmax function at the fault classifier tail to output the fault category.

3. The method according to claim 2, characterized in that, The bearing vibration signal data is processed by the feature extraction layer to fuse and calculate data features, including: The bearing vibration signal data is processed by the data fusion layer of the feature extraction layer to fuse the bearing operating condition attributes with the bearing vibration data to obtain feature-encoded data. The feature-encoded data is scale-segmented and then fused and output after passing through several sequentially connected multi-scale cascaded attention modules of the feature extraction layer.

4. The method according to claim 3, characterized in that fusing the bearing operating condition attributes with the bearing vibration data through the data fusion layer to obtain feature-encoded data includes: The bearing vibration signal data is encoded by the bearing condition attribute encoding layer of the data fusion layer to obtain the bearing condition attribute encoding value. The bearing vibration signal data is encoded and embedded by the vibration data embedding layer of the data fusion layer to obtain the vibration data embedding value. The bearing condition attribute encoding value and the vibration data embedding value are multiplied element-wise to obtain the vibration data encoding value, which is added element-wise to the position weight matrix in the position embedding layer of the feature extraction layer to obtain the feature-encoded data. The position weight matrix is calculated using sine and cosine functions from the position vector in the bearing vibration signal data.

5. The method according to any one of claims 3 or 4, characterized in that, The feature-encoded data is scale-segmented and then fused and output after passing through several sequentially connected multi-scale cascaded attention modules, including: The feature-encoded data is divided into two groups of half-scale feature transformation data and one group of original-scale feature transformation data through the multi-scale separation layer of the multi-scale cascaded attention module. Self-attention is calculated on the first group of half-scale feature transformation data in the first head of the cascaded attention module layer of the multi-scale cascaded attention module to obtain the calculation result o1; The calculation result o1 and the second group of half-scale feature transformation data are added element-wise, and self-attention is calculated on the sum in the second head of the cascaded attention module layer of the multi-scale cascaded attention module to obtain the calculation result o2. The calculation result o1 is concatenated with the calculation result o2 to obtain the calculation result o3; the calculation result o3 and the original-scale feature transformation data are added element-wise in the third head of the cascaded attention module layer of the multi-scale cascaded attention module to obtain the calculation result o4. The calculation results o1 and o2 are concatenated in the splicing mapping layer of the cascaded attention module layer of the multi-scale cascaded attention module, and then added element-wise to the calculation result o4 to obtain the output data of the cascaded attention module layer.

6. The method according to claim 1, characterized in that, The acquisition of bearing vibration signal training data at several time points constructs four time series training datasets of equal length, including: (1) Construct a time series training dataset by combining bearing vibration signal training data of the same fault category under different working conditions; (2) Construct another set of time series training datasets from the bearing vibration signal training data of the same fault category and the same working condition; (3) Construct another set of time series training datasets from the bearing vibration signal training data of different fault categories belonging to different working conditions; (4) Construct another set of time series training datasets from the bearing vibration signal training data of different fault categories belonging to the same working condition.

7. The method according to claim 6, characterized in that, The construction of quadruple sample pairs includes the following steps: S101. Randomly select one of the four time series training datasets as anchor point a; S102. Select another time series training dataset of vibration data of the same fault category as anchor point a from the four time series training datasets as positive sample p; S103. Select one time series training dataset of vibration data with a different fault category from anchor point a from the four time series training datasets as negative sample n1. S104. From the four time series training datasets, select another time series training dataset of vibration data with a different fault category than anchor point a as negative sample n2.

8. The method according to claim 7, characterized in that, The formula for calculating the quadruplet loss function is as follows: $L_q = (d_{a,p} - d_{a,n1} + \alpha)_+ + (d_{a,p} - d_{n1,n2} + \beta)_+$; wherein $d_{a,p}$ is the distance between anchor point a and positive sample p in the quadruplet sample pair; $d_{a,n1}$ is the distance between anchor point a and negative sample n1 in the quadruplet sample pair; $d_{n1,n2}$ is the distance between negative sample n1 and negative sample n2 in the quadruplet sample pair; α and β are hyperparameters; The multi-class loss function includes the softmax activation function and the cross-entropy loss function.