A multi-modal bearing fault intelligent diagnosis method based on transfer learning

By employing transfer learning in rolling bearing fault diagnosis, integrating time-domain and frequency-domain signal features, and dynamically adjusting the distribution adaptation, the problem of insufficient model generalization is solved, and high-precision cross-operating-condition fault diagnosis is achieved.

CN116401603B · Active · Publication Date: 2026-06-16 · HARBIN ENG UNIV


Patent Information

Authority / Receiving Office: CN · China
Patent Type: Patents(China)
Current Assignee / Owner: HARBIN ENG UNIV
Filing Date: 2023-04-26
Publication Date: 2026-06-16

Application Information

IPC: G06F18/2415; G06F18/10; G06F18/213; G06F18/214; G06F18/21; G06F18/25; G06N3/0455; G06N3/047; G06N3/096; G01M13/045

CPC: G06F18/2415; G06F18/10; G06F18/213; G06F18/214; G06F18/217; G06F18/253; G06N3/0455; G06N3/047

AI Tagging

Application Domain

Machine part testing; Biological models

AI Technical Summary

Technical Problem

Existing deep learning models lack generalization ability in rolling bearing fault diagnosis, making it difficult to effectively diagnose across different operating conditions, resulting in decreased diagnostic accuracy.

Method used

A multimodal intelligent bearing fault diagnosis method based on transfer learning is adopted. Time-domain and frequency-domain signal features are fused through an attention mechanism, and the marginal and conditional distributions of the source domain and target domain are aligned by a dynamic joint distribution adaptive module to enhance the generalization performance of the model across working conditions.

Benefits of technology

It achieves high-precision fault diagnosis under different working conditions, and improves the model's cross-domain adaptability and diagnostic accuracy.


Abstract

The application belongs to the field of mechanical equipment fault diagnosis, and particularly relates to a multi-modal bearing fault intelligent diagnosis method based on transfer learning. The method includes collecting original vibration signals under different working conditions as source domain data and target domain data respectively, and obtaining information from two observation angles, the time domain and the frequency domain, through preprocessing as the multi-modal input of the model. A deep transfer network model is constructed: multi-angle representation features of the same-source data are deeply mined through a multi-modal information fusion network based on an attention mechanism; the diagnostic performance for fault categories is ensured through a label classifier and labeled source domain data; the marginal distribution and conditional distribution of the source domain and target domain data are adapted through a domain discriminator and a subclass measurement module, respectively; and the weights of the two distributions in the transfer process are dynamically adjusted, finally forming dynamic joint distribution adaptation. Domain-invariant features are found to improve the generalization ability of the model on target domain data and to improve the cross-domain intelligent fault diagnosis accuracy of mechanical equipment.


Description

Technical Field

[0001] This invention belongs to the field of mechanical equipment fault diagnosis, specifically relating to a multimodal intelligent bearing fault diagnosis method based on transfer learning.

Background Technology

[0002] Industrial systems are gradually moving towards intelligent manufacturing, and mechanical equipment of all kinds is becoming more automated and complex. As a key component of the transmission device of large industrial equipment, rolling bearings are closely tied to the normal operation of the equipment. Research on intelligent bearing fault diagnosis is of great significance for ensuring production and avoiding accidents.

[0003] Currently, research on mechanical fault diagnosis largely focuses on signal feature selection and diagnostic classification. Excessive vibration is often the main factor leading to transmission device failures, and modeling and analyzing vibration signals collected by sensors has become one of the primary bases for operational status diagnosis. In signal analysis and processing-based diagnostic methods, the time domain and frequency domain are two perspectives for observing the implicit patterns within the raw vibration data, each with different sensitivities to fault modes. Time domain features reflect the change of signal amplitude over time, while frequency domain features study the distribution of signal energy across different frequency bands. For fault diagnosis tasks, multimodal information from the same data can uncover more signal features.

[0004] Deep learning models achieve good diagnostic results based on the premise that the training and test sets come from the same distribution. However, rotating machinery such as rolling bearings has certain unique characteristics. In actual industrial scenarios, the diversity of working conditions and equipment models often leads to training and test sets coming from different distributions. Furthermore, collecting sufficient labeled data to support deep learning model training under all conditions is challenging. In such cases, insufficient model generalization results in decreased diagnostic accuracy. Therefore, improving fault feature extraction capabilities and reducing the differences in feature distribution caused by factors such as working conditions, enabling effective cross-domain diagnosis across different working conditions, are two key challenges facing intelligent fault diagnosis.

Summary of the Invention

[0005] The problem this invention aims to solve is: to comprehensively reflect the information in vibration signals and complete fault diagnosis tasks across operating conditions, a multimodal intelligent bearing fault diagnosis method based on transfer learning is proposed, applying the concept of transfer learning to bearing fault diagnosis. On the one hand, this method fuses information from the time and frequency domains of the vibration signal based on an attention mechanism to comprehensively extract features; on the other hand, it dynamically adapts the marginal and conditional distributions of the source and target domains to enhance the model's generalization performance in cross-operating-condition scenarios.

[0006] To address the aforementioned issues, a multimodal intelligent bearing fault diagnosis method based on transfer learning is provided. The implementation method includes the following steps:

[0007] Step 1: Collect raw data and preprocess it:

[0008] Raw vibration signals under different working conditions were collected as source domain data and target domain data, respectively. The dataset was expanded by sliding window, and two-dimensional waveforms were plotted from the two observation angles in the time domain and frequency domain. Then, the source domain samples and target domain samples were divided into training set and test set in an 8:2 ratio.
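
The sliding-window expansion and the 8:2 split described above can be sketched as follows. This is a minimal illustration only: the window length, stride, shuffling, and the helper names `sliding_windows` and `split_train_test` are assumptions, since the text does not specify them.

```python
import random

def sliding_windows(signal, window, stride):
    """Cut a 1-D signal into overlapping segments of length `window`."""
    return [signal[i:i + window]
            for i in range(0, len(signal) - window + 1, stride)]

def split_train_test(samples, train_ratio=0.8, seed=0):
    """Shuffle the samples and split them into training and test subsets."""
    shuffled = list(samples)
    random.Random(seed).shuffle(shuffled)
    cut = int(len(shuffled) * train_ratio)
    return shuffled[:cut], shuffled[cut:]

# Placeholder for one raw vibration record (hypothetical values).
signal = [float(i) for i in range(100)]
segments = sliding_windows(signal, window=20, stride=10)  # assumed sizes
train, test = split_train_test(segments)                  # 8:2 ratio
```

A stride smaller than the window makes neighbouring segments overlap, which is what expands the dataset; the same procedure would be applied separately to the source domain and the target domain records.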

[0009] Step 2: Construct a deep transfer learning network model:

[0010] The deep transfer learning network model consists of three main parts: a feature extraction module, a dynamic joint distribution adaptive module, and a classification module. The feature extraction module F comprises two operations: feature extraction and multimodal information fusion. Building upon the self-attention mechanism, which focuses on the global temporal characteristics of vibration signals, it further uses a cross-attention mechanism to fuse, at the feature level, multimodal information from the same source data in the time and frequency domains, fully extracting the multi-angle features described in step 1. The classification module C uses a fully connected layer for supervised training on the source domain data from step 1, obtaining the predicted classification result through the Softmax function, and thus the classification loss. The dynamic joint distribution adaptive module includes a domain discriminator D_w and a subclass measurement module. Based on the adversarial concept, D_w implicitly aligns the marginal probability distributions of the global data in the source and target domains to obtain the global domain adversarial loss; the subclass measurement module uses the Local Maximum Mean Discrepancy (LMMD) algorithm to calculate the distance between data within each class, explicitly measuring the difference between the conditional probability distributions of the data, to obtain the local difference loss; according to the data characteristics of different domains, a balance factor μ is introduced to assign different weights to marginal adaptation and conditional adaptation, forming a dynamic joint distribution adaptive method;
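
The feature-level fusion by cross-attention can be illustrated with a single-head NumPy sketch. It assumes toy sizes, random projection matrices, mean pooling, and concatenation of the two attended outputs; none of these details are stated in the text, and the names `cross_attention`, `time_tokens`, and `freq_tokens` are hypothetical.

```python
import numpy as np

def softmax(x, axis=-1):
    x = x - x.max(axis=axis, keepdims=True)  # subtract max for stability
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def cross_attention(query_tokens, context_tokens, w_q, w_k, w_v):
    """Single-head cross-attention: queries come from one modality,
    keys and values come from the other modality."""
    q = query_tokens @ w_q
    k = context_tokens @ w_k
    v = context_tokens @ w_v
    scores = q @ k.T / np.sqrt(q.shape[-1])
    return softmax(scores) @ v

rng = np.random.default_rng(0)
dim = 16  # toy embedding size, chosen only for illustration
time_tokens = rng.normal(size=(5, dim))  # hypothetical time-domain tokens
freq_tokens = rng.normal(size=(7, dim))  # hypothetical frequency-domain tokens
w_q, w_k, w_v = (rng.normal(size=(dim, dim)) / np.sqrt(dim) for _ in range(3))

time_attends_freq = cross_attention(time_tokens, freq_tokens, w_q, w_k, w_v)
freq_attends_time = cross_attention(freq_tokens, time_tokens, w_q, w_k, w_v)
# Assumed fusion: pool each attended sequence and concatenate (size 2 * dim).
fused = np.concatenate([time_attends_freq.mean(axis=0),
                        freq_attends_time.mean(axis=0)])
```

Each modality queries the other, so the fused vector carries time-domain features weighted by their relevance to the spectrum and vice versa.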

[0011] Step 3: Train the deep transfer network model:

[0012] The training data of the source domain and target domain in step 1 are input into the deep transfer network constructed in step 2. The sum of the classification loss, global adversarial loss and local difference loss in step 2 is used as the final loss of the network. The gradient descent algorithm is used for iterative training to optimize the parameters and obtain the trained transfer diagnostic model.

[0013] Step 4: Test the deep transfer network model:

[0014] Input the test set data of the target domain in step 1 into the deep transfer diagnostic model trained in step 3 to test the diagnostic performance of the model.

[0015] In step 1, the original vibration signal is the time domain data, and the frequency domain data is obtained by performing a fast Fourier transform on the original data; each is then plotted as a two-dimensional image.
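
As a rough illustration of the frequency-domain step, the sketch below implements a recursive radix-2 FFT and takes a one-sided amplitude spectrum of one segment. The segment length (a power of two), the amplitude scaling, the 12 kHz sampling rate (taken from the embodiment's CWRU data), and the synthetic test tone are assumptions made for the example; the plotting of the two-dimensional image is omitted.

```python
import cmath
import math

def fft(x):
    """Recursive radix-2 Cooley-Tukey FFT; len(x) must be a power of two."""
    n = len(x)
    if n == 1:
        return [complex(x[0])]
    even = fft(x[0::2])
    odd = fft(x[1::2])
    out = [0j] * n
    for k in range(n // 2):
        twiddle = cmath.exp(-2j * cmath.pi * k / n) * odd[k]
        out[k] = even[k] + twiddle
        out[k + n // 2] = even[k] - twiddle
    return out

def magnitude_spectrum(segment):
    """One-sided amplitude spectrum of a real-valued segment."""
    n = len(segment)
    return [abs(v) / n for v in fft(segment)[: n // 2]]

fs = 12000     # sampling rate in Hz (12 kHz data, as in the embodiment)
n = 1024       # assumed segment length
tone_bin = 64  # hypothetical test tone placed exactly on bin 64
segment = [math.sin(2 * math.pi * tone_bin * i / n) for i in range(n)]
spectrum = magnitude_spectrum(segment)
peak_bin = max(range(len(spectrum)), key=spectrum.__getitem__)
peak_hz = peak_bin * fs / n  # frequency of the strongest component
```

In practice a library routine such as a real-input FFT would normally replace the hand-written `fft`; the recursive version is shown only to keep the example dependency-free.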

[0016] In step 2, the feature extraction module F uses Vision Transformer as the backbone network for feature extraction and a cross-attention mechanism as the core component for multimodal fusion to enhance the correlation between key information; the domain discriminator D_w uses WGAN-GP to avoid the gradient vanishing problem and to fit the optimal Wasserstein distance between the source and target domains, which helps the feature extraction module F learn domain-invariant features.

[0017] The classification loss L_class uses the cross-entropy loss function to calculate the loss between the true labels of the source domain data samples and the labels predicted by the deep transfer network. The expression is as follows:

[0018] In the formula, N and C represent the number of samples and the number of categories, respectively;
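
For orientation, a cross-entropy loss over N samples and C categories can be computed as in the sketch below. It assumes the commonly used averaged form, L_class = -(1/N) Σ_i Σ_c y_ic log p_ic with one-hot labels y and Softmax outputs p; whether the patent's expression uses exactly this normalization is not confirmed by the text.

```python
import math

def softmax(logits):
    """Numerically stable Softmax over one sample's class scores."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

def cross_entropy(batch_logits, labels):
    """Mean cross-entropy over N samples.

    batch_logits: N lists of C class scores.
    labels: N integer class indices in [0, C), i.e. one-hot targets.
    """
    n = len(batch_logits)
    loss = 0.0
    for logits, y in zip(batch_logits, labels):
        loss -= math.log(softmax(logits)[y])
    return loss / n

# With uniform scores over C = 4 classes the loss equals log(4).
uniform_loss = cross_entropy([[0.0, 0.0, 0.0, 0.0]], [2])
# A confident, correct prediction gives a loss close to zero.
confident_loss = cross_entropy([[10.0, 0.0, 0.0]], [0])
```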

[0019] The global domain adversarial loss L_margin uses a loss function expressed as follows:

[0020] In the formula, x_s and x_t represent the source domain data and the target domain data, respectively; L_wd represents the loss of the domain discriminator D_w; and L_grad represents the gradient penalty;

[0021] The expression for L_wd is:

[0022] The expression for L_grad is:

[0023] In the above formulas, n_s and n_t represent the number of samples in the source and target domains, respectively, and y_i represents the domain label; the domain label for the source domain data is set to 0, and the domain label for the target domain data is set to 1.

[0024] The local difference loss L_lmmd uses a loss function expressed as follows:

[0025]

[0026] In the above formula, w represents the weight of each sample belonging to class c, and k represents the kernel function that maps the data to a high-dimensional space;
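
A kernel-based LMMD estimate consistent with the roles of w and k described above can be sketched as follows. This is an assumption-laden illustration, not the patent's expression: it uses a single Gaussian kernel with a fixed bandwidth, normalizes the per-class weights so they sum to one within each domain, averages over classes, and takes class probabilities (for example Softmax pseudo-labels) for the unlabeled target domain.

```python
import numpy as np

def gaussian_kernel(a, b, sigma=1.0):
    """RBF kernel matrix between the rows of a and the rows of b."""
    sq = np.sum(a**2, 1)[:, None] + np.sum(b**2, 1)[None, :] - 2.0 * a @ b.T
    return np.exp(-np.maximum(sq, 0.0) / (2.0 * sigma**2))

def class_weights(probs):
    """Per-class sample weights w: normalise each class column to sum to 1."""
    col_sum = probs.sum(axis=0, keepdims=True)
    return np.divide(probs, col_sum, out=np.zeros_like(probs),
                     where=col_sum > 0)

def lmmd(xs, ys_onehot, xt, yt_prob, sigma=1.0):
    """Class-wise squared MMD between source and target features."""
    ws, wt = class_weights(ys_onehot), class_weights(yt_prob)
    k_ss = gaussian_kernel(xs, xs, sigma)
    k_tt = gaussian_kernel(xt, xt, sigma)
    k_st = gaussian_kernel(xs, xt, sigma)
    num_classes = ws.shape[1]
    total = 0.0
    for c in range(num_classes):
        a, b = ws[:, c], wt[:, c]
        total += a @ k_ss @ a + b @ k_tt @ b - 2.0 * a @ k_st @ b
    return float(total / num_classes)

rng = np.random.default_rng(0)
xs = rng.normal(size=(6, 3))                # toy source features
labels = np.eye(3)[[0, 0, 1, 1, 2, 2]]      # one-hot labels, 3 classes
same_domain = lmmd(xs, labels, xs, labels)          # identical features
shifted_domain = lmmd(xs, labels, xs + 3.0, labels) # shifted target
```

Identical feature sets give a value of zero, and the value grows as the class-conditional features of the two domains move apart, which is the behaviour the local difference loss relies on.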

[0027] The expression for the balance factor μ is: In the formula, d_M represents the marginal difference between the source and target domains: the labels for source domain features are defined as 0 and the labels for target domain features are defined as 1, an SVM classifier using a linear kernel function is constructed to distinguish whether an input sample comes from the source or target domain, and d_M is the probability that the classifier judges correctly; d_C represents the conditional difference between the domains, d_C = LMMD(D_s, D_t);
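
The expression for μ is not reproduced in this text, so the sketch below should be read as an assumption rather than the patent's formula. It uses a form found in the dynamic distribution adaptation literature, μ = 1 - d_M / (d_M + d_C), which shrinks μ (the weight of conditional adaptation) when the marginal difference dominates. The function name `balance_factor` and the clipping to [0, 1] are hypothetical.

```python
def balance_factor(d_m, d_c, eps=1e-12):
    """Assumed form: mu = 1 - d_M / (d_M + d_C), clipped to [0, 1].

    d_m: marginal difference, e.g. the accuracy of a linear SVM that
         separates source features (label 0) from target features (label 1).
    d_c: conditional difference, e.g. an LMMD value between the domains.
    """
    mu = 1.0 - d_m / (d_m + d_c + eps)  # eps guards against division by zero
    return min(1.0, max(0.0, mu))

mu_equal = balance_factor(0.5, 0.5)              # equal differences
mu_marginal_dominant = balance_factor(0.9, 0.1)  # marginal gap dominates
```

Under this assumed form, a larger d_M pushes more weight onto marginal adaptation through (1 - μ), and a larger d_C pushes more weight onto conditional adaptation through μ.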

[0028] The overall optimization objective of the deep transfer network is:

[0029]

[0030] In the above formula, L_class represents the classification loss, L_margin represents the global domain adversarial loss, and L_lmmd represents the local difference loss; (1-μ) and μ represent the weights of the respective parts; and θ_f, θ_dw and θ_c represent the network parameters of the feature extraction module, the domain discriminator, and the classification module, respectively.
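
The weighted combination of the three losses can be sketched as below. The sketch assumes that (1-μ) weights L_margin and μ weights L_lmmd, following the order in which the terms and weights are listed, and it omits the adversarial min-max handling between the feature extractor and the domain discriminator, which the text does not spell out. The function name `total_loss` is hypothetical.

```python
def total_loss(l_class, l_margin, l_lmmd, mu):
    """Assumed combination: L_class + (1 - mu) * L_margin + mu * L_lmmd."""
    if not 0.0 <= mu <= 1.0:
        raise ValueError("mu must lie in [0, 1]")
    return l_class + (1.0 - mu) * l_margin + mu * l_lmmd

# Toy values: with mu = 0.25 the marginal term carries three times the
# weight of the conditional term.
example = total_loss(l_class=1.0, l_margin=2.0, l_lmmd=4.0, mu=0.25)
```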

[0031] In step 3, the processed training data of the source domain and the target domain are simultaneously input into the deep transfer network. During training, both the source domain and the target domain data pass through the feature extraction module F to obtain fused features, so the feature extraction network structure and weights are shared by the two domains. By optimizing the objective in step 2, the marginal probability distributions and conditional probability distributions of the source domain and target domain data are dynamically aligned, which enhances the transfer effect and improves the model's generalization ability on the target domain data.

[0032] The beneficial effects of this invention are:

[0033] (1) In order to make full use of the multi-angle characterization information of vibration signals, the method of the present invention takes the time domain and frequency domain as multi-modal information input, focuses on the temporal information in vibration signals through the self-attention mechanism in Vision Transformer, and further uses the cross-attention mechanism to perform feature fusion across the modalities in order to fully mine the effective fault information contained in the signal.

[0034] (2) The method of this invention uses an implicit adversarial approach and an explicit difference measurement approach to simultaneously complete the marginal distribution adaptation and conditional distribution adaptation of the source domain and the target domain. Furthermore, during the transfer process, the contribution of each part of the joint adaptation is dynamically adjusted according to the characteristics of the data domains, forming a cross-domain fault diagnosis method based on dynamic joint distribution adaptation, which can better learn fault diagnosis knowledge and effectively complete the cross-domain fault diagnosis task for bearings.

Attached Figure Description

[0035] Figure 1 is a diagram showing the overall structure of the deep transfer network model constructed in this invention;

[0036] Figure 2 is a structural diagram of the feature extraction module in the deep transfer network model constructed in this invention;

[0037] Figure 3 is the t-SNE feature dimensionality reduction distribution diagram of DANN on task D→A in this embodiment of the invention;

[0038] Figure 4 is the t-SNE feature dimensionality reduction distribution diagram of DAN on task D→A in this embodiment of the invention;

[0039] Figure 5 is the t-SNE feature dimensionality reduction distribution diagram of the method of the present invention on task D→A in this embodiment of the invention.

Detailed Implementation

[0040] The present invention will be further explained and described below with reference to the accompanying drawings and specific embodiments.

[0041] To verify the feasibility of the proposed model, experiments were conducted on the bearing failure dataset from Case Western Reserve University (CWRU).

[0042] Example: This invention provides a multimodal intelligent bearing fault diagnosis method based on transfer learning, comprising the following steps:

[0043] Step 1: Collect raw data and preprocess it:

[0044] Raw vibration signals under different operating conditions were collected as source domain data and target domain data, respectively. In this example, the CWRU bearing fault dataset was used. The dataset was expanded using a sliding window method, and two-dimensional waveforms were plotted from both the time and frequency domains. Then, the source domain samples and target domain samples were divided into training and test sets in an 8:2 ratio. Vibration data of the drive-end bearing sampled at 12 kHz from the CWRU dataset were used. Each of the four operating conditions included one normal state and nine fault states. The detailed data is shown in Table 1, which describes the CWRU data.

[0045]

[0046] Step 2: Construct a deep transfer learning network model:

[0047] As shown in Figure 1, the deep transfer learning network model consists of three main parts: a feature extraction module, a dynamic joint distribution adaptive module, and a classification module. The structure of the feature extraction module F is shown in Figure 2; it consists of two parts: feature extraction and multimodal information fusion. Building upon the self-attention mechanism, which focuses on the global temporal characteristics of the vibration signal, a cross-attention mechanism is further used to fuse, at the feature level, multimodal information from the same source data in the time and frequency domains. This fully extracts the multi-angle features described in step 1. The feature extractor, Vision Transformer, has 12 self-attention heads and 12 stacked encoder layers, and the patch size for image segmentation is set to 16, resulting in a mapping dimension of 768. The Transformer encoder depth in the cross-attention mechanism is set to 2, and the number of self-attention heads is 8. The classification module C uses a fully connected layer for supervised training on the source domain data from step 1, obtaining the predicted classification result through the Softmax function, and thus the classification loss. The number of neurons in the fully connected layer is set to 1536, and the Softmax function maps to the 10 categories of the dataset. The dynamic joint distribution adaptive module includes a domain discriminator D_w and a subclass measurement module. Based on the adversarial approach, D_w implicitly aligns the marginal probability distributions of global data in the source and target domains to obtain the global adversarial loss; it consists of two fully connected layers with 1536 and 512 neurons respectively, with the ReLU activation function in between. Since the optimal Wasserstein distance needs to be fitted, the output layer has only 1 neuron. The subclass measurement module uses the Local Maximum Mean Discrepancy (LMMD) algorithm to calculate the distance between data within each class, explicitly measuring the difference between the conditional probability distributions of the data, to obtain the local difference loss. According to the data characteristics of different domains, a balance factor μ is introduced to assign different weights to marginal adaptation and conditional adaptation, forming a dynamic joint distribution adaptive loss.
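
The hyperparameters stated for this embodiment can be gathered into one configuration object, as in the sketch below. The class and field names are hypothetical; the values are those given in the paragraph. The two derived quantities are added only as consistency checks, and the second one assumes three-channel (RGB) input images, which the text does not state.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class EmbodimentConfig:
    # Vision Transformer feature extractor
    vit_heads: int = 12
    vit_layers: int = 12
    patch_size: int = 16
    embed_dim: int = 768
    # Cross-attention fusion
    fusion_depth: int = 2
    fusion_heads: int = 8
    # Classification module C
    classifier_width: int = 1536
    num_classes: int = 10
    # Domain discriminator D_w: two fully connected layers, then 1 output
    discriminator_layers: tuple = (1536, 512)
    discriminator_out: int = 1

cfg = EmbodimentConfig()
head_dim = cfg.embed_dim // cfg.vit_heads   # per-head width in the ViT
patch_vector_len = 3 * cfg.patch_size ** 2  # flattened RGB patch (assumed)
```

The classifier width of 1536 is twice the 768-dimensional embedding, which would be consistent with concatenating one feature vector per modality, although the text does not say how the fused feature is assembled.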

[0048] Step 3: Train the deep transfer network model:

[0049] The source domain labeled data and the target domain unlabeled data from step 1 are used as the training set to train the network model constructed in step 2. Based on the overall optimization objective of the network in step 2, the gradient descent algorithm is used to iteratively train and optimize the parameters of each part to obtain the trained transfer diagnostic model.

[0050] Step 4: Test the deep transfer network model:

[0051] The remaining unlabeled data in the target domain from step 1 is used as test set data and input into the deep transfer diagnostic model trained in step 3 to test the model's diagnostic performance.

[0052] Experimental verification:

[0053] This experiment aims to verify the cross-domain fault diagnosis accuracy of the present invention under different operating conditions. Data from different operating conditions in the CWRU dataset were used as the source and target domains and paired to form 12 transfer tasks, such as A→B, where A is the source domain and B is the target domain. The experiment also compares a source-only baseline, which uses no transfer method, with the deep transfer methods DANN and DAN, in order to assess the advantages of the proposed method. The training batch size was set to 8, the number of epochs to 30, and the learning rate to 0.001. The Adam algorithm was used to optimize the parameters during training. The experimental results are shown in Table 3, which presents the results of different models on multiple transfer tasks in the CWRU dataset.
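
The count of 12 transfer tasks follows from pairing four operating conditions as ordered source and target domains (4 × 3 = 12). The sketch below enumerates them together with the stated training settings; the condition labels A to D are inferred from the task names used in the text (A→B, D→A), and the dictionary keys are hypothetical.

```python
from itertools import permutations

conditions = ["A", "B", "C", "D"]  # four operating conditions (labels inferred)
# Every ordered pair of distinct conditions is one source→target task.
tasks = [f"{src}→{tgt}" for src, tgt in permutations(conditions, 2)]

# Training settings as stated in the experiment description.
train_settings = {
    "batch_size": 8,
    "epochs": 30,
    "learning_rate": 0.001,
    "optimizer": "Adam",
}
```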

[0054]

[0055] Experimental results show that the method proposed in this invention achieved an average diagnostic accuracy of 96.2% across the 12 transfer tasks on the CWRU dataset, and achieved better prediction accuracy than the other methods in multiple tasks, indicating that the proposed method has good cross-domain diagnostic capabilities.

[0056] To visually demonstrate the diagnostic performance of the proposed method, taking the task D→A as an example, t-SNE is used to visualize the dimensionality-reduced features of the latter three methods at the fully connected layer. The result for the deep transfer method DANN is shown in Figure 3, the result for the deep transfer method DAN is shown in Figure 4, and the result for the method of the present invention is shown in Figure 5. In the figures, dots represent source domain features, crosses represent target domain features, and each color represents a category.

[0057] As shown in the figure, the three models can effectively identify and distinguish source domain features. However, the first two models are not clear enough in distinguishing target domain features for certain categories, such as blue, purple, and black, exhibiting some feature overlap and blurred category boundaries. While the method of this invention also has classification errors in the pink and green categories, the boundaries between each category are clearer, and the overlap between source and target domain features is high for each category. This indicates that the method of this invention not only narrows the overall distance between the source and target domains, but also makes the features of the two domains more clustered for the same category, with more obvious inter-class differences, thus achieving better diagnostic performance on target domain data.

[0058] This invention relates to a multimodal intelligent bearing fault diagnosis method based on transfer learning, aiming to solve cross-domain fault diagnosis problems using unsupervised domain adaptive methods. In the feature extraction stage, multimodal feature fusion in the time and frequency domains is completed. By employing adversarial and difference measurement approaches, the marginal and conditional distributions between the source and target domains are simultaneously adapted. Considering the characteristics of different transfer task data, the weights of different distributions are adjusted during adaptation, forming a dynamic joint distribution adaptive method. This allows the method to better learn fault diagnosis knowledge and improve transfer performance.

Claims

1. A multimodal intelligent bearing fault diagnosis method based on transfer learning, characterized in that the method includes the following steps: Step 1: Collect raw data and preprocess it: raw vibration signals under different working conditions are collected as source domain data and target domain data, respectively; the dataset is expanded by a sliding window, and two-dimensional waveforms are plotted from the two observation angles of the time domain and the frequency domain; then the source domain samples and target domain samples are divided into a training set and a test set in an 8:2 ratio. Step 2: Construct a deep transfer learning network model: the deep transfer learning network model consists of three main parts: a feature extraction module, a dynamic joint distribution adaptive module, and a classification module; the feature extraction module F comprises two parts, feature extraction and multimodal information fusion, and fuses, at the feature level, multimodal information from the time and frequency domains of data from the same source to obtain fused features; the classification module C classifies the fused features using a fully connected layer, and the predicted classification result is obtained through a Softmax function, yielding a classification loss; the dynamic joint distribution adaptive module includes a domain discriminator D_w and a subclass measurement module; based on the adversarial approach, D_w aligns the marginal probability distributions of global data in the source and target domains to obtain the global adversarial loss; the subclass measurement module uses the Local Maximum Mean Discrepancy (LMMD) algorithm to calculate the conditional probability distribution and obtain the local difference loss; according to the data characteristics of different domains, a balance factor μ is introduced to assign different weights to marginal adaptation and conditional adaptation, forming a dynamic joint distribution adaptive mechanism; in the expression for the balance factor μ, d_M represents the marginal difference between the source and target domains: the labels for source domain features are defined as 0 and the labels for target domain features are defined as 1, an SVM classifier using a linear kernel function is constructed to distinguish whether an input sample comes from the source or target domain, and d_M is the probability that the classifier judges correctly; d_C represents the conditional difference between the domains; in the formula for the overall optimization objective of the deep transfer network, L_class represents the classification loss, L_margin represents the global domain adversarial loss, L_lmmd represents the local difference loss, (1-μ) and μ represent the weights of the respective parts, and θ_f, θ_dw and θ_c represent the network parameters of the feature extraction module, the domain discriminator, and the classification module, respectively. Step 3: Train the deep transfer network model: the training data of the source domain and target domain in step 1 are input into the deep transfer network constructed in step 2; the sum of the classification loss, global adversarial loss and local difference loss in step 2 is used as the final loss of the network model; the gradient descent algorithm is used for iterative training to optimize the parameters and obtain the trained transfer diagnostic model. Step 4: Test the deep transfer network model: the test set data of the target domain in step 1 are input into the deep transfer diagnostic model trained in step 3 to test the diagnostic performance.

2. The intelligent diagnosis method for multimodal bearing faults based on transfer learning according to claim 1, characterized in that the original vibration signal is the time domain data, and the frequency domain data is obtained by performing a fast Fourier transform on the original data; each is then plotted as a two-dimensional image.

3. The intelligent diagnosis method for multimodal bearing faults based on transfer learning according to claim 1, characterized in that, in step 2, the feature extraction module F uses the Vision Transformer as the backbone network for feature extraction and a cross-attention mechanism as the core component for multimodal fusion to enhance the correlation between key information; the domain discriminator D_w uses WGAN-GP to avoid the gradient vanishing problem and to fit the optimal Wasserstein distance between the source and target domains, which helps the feature extraction module F learn domain-invariant features; the classification loss L_class uses the cross-entropy loss function to calculate the loss between the true labels of the source domain data samples and the labels predicted by the deep transfer network, and in its expression N and C represent the number of samples and the number of categories, respectively; in the expression for the global domain adversarial loss L_margin, n_s and n_t represent the number of samples in the source and target domains, respectively, y_i represents the domain label, the domain label for the source domain data being set to 0 and that for the target domain data being set to 1, L_wd represents the loss of the domain discriminator D_w, and L_grad represents the gradient penalty; in the expression for the local difference loss L_lmmd, w represents the weight of each sample belonging to class c, and k represents a kernel function that maps the data to a high-dimensional space.

4. The intelligent diagnosis method for multimodal bearing faults based on transfer learning according to claim 1, characterized in that, in step 3, the processed training data from the source and target domains are simultaneously input into the deep transfer learning network; during training, both the source and target domain data pass through the feature extraction module F to obtain fused features, so the feature extraction network structure and weights are shared by the two domains; by optimizing the objective in step 2, the marginal probability distributions and conditional probability distributions of the source domain and target domain data are dynamically aligned, thereby enhancing the transfer effect and improving the model's generalization ability on the target domain data.