A bearing fault diagnosis method based on deep transfer learning under varying operating conditions
By combining deep transfer learning, multi-kernel maximum mean discrepancy (MK-MMD) measurement, and a balanced distribution domain adaptation method based on the fusion marginal criterion, the method addresses the problems of insufficient training samples and distribution differences in bearing fault diagnosis. It achieves high-accuracy fault diagnosis and is applicable to bearing fault identification in rotating machinery.
Patent Information
- Authority / Receiving Office
- CN · China
- Patent Type
- Patents(China)
- Current Assignee / Owner
- ANHUI UNIV
- Filing Date
- 2023-11-30
- Publication Date
- 2026-06-30
Figure CN117629635B_ABST
Abstract
Description
Technical Field
[0001] This invention relates to the field of bearing fault diagnosis, and specifically to a bearing fault diagnosis method based on deep transfer learning under varying operating conditions.
Background Technology
[0002] In recent years, with the rapid development of artificial intelligence technologies such as data mining, deep learning, and transfer learning, more and more researchers at home and abroad have conducted research on bearing fault diagnosis based on artificial intelligence and achieved fruitful results.
[0003] In reference 1 [Zhao Xiaoqiang, Zhang Yuchun. Fault diagnosis method for rolling bearings based on dual-path parallel multi-scale ResNet [J]. Vibration and Shock, 2023, 42(03):199-208.], Zhao Xiaoqiang et al. addressed the problem that traditional machine learning methods cannot adaptively extract fault feature information of bearings under complex and variable working conditions. Based on residual neural networks, they designed a dual-path parallel multi-scale residual neural network with an attention mechanism, which improved the network's ability to adaptively extract fault features and achieved a higher fault diagnosis accuracy.
[0004] In reference 2 [Wang Zheng, Wen Chuanbo, Dong Yifan. A method for fault diagnosis of rolling bearings based on wavelet transform and involution convolutional neural network [J]. Bearing, 2022(11):61-67.], Wang Zheng et al. proposed a method for fault diagnosis of rolling bearings based on involution convolutional neural network. By performing secondary feature mining on the pixel points of the time-frequency feature map through involution, the accuracy of fault diagnosis is effectively improved.
[0005] In reference 3 [Pan Xiaobo, Ge Kunpeng, Dong Fei. Intelligent fault diagnosis of hoist bearings based on feature transfer learning [J]. Industrial and Mining Automation, 2022, 48(09):1-7+32.], Pan Xiaobo et al. proposed a cross-domain fault diagnosis method for bearings based on feature transfer learning. By performing transfer feature selection on a high-dimensional deep feature set, a feature subset more beneficial to cross-domain fault diagnosis is obtained. Then, by balancing distribution adaptation, the distribution difference between the source domain and the target domain samples is reduced, thereby improving the cross-domain fault diagnosis performance of the fault diagnosis model.
[0006] In reference 4 [Mao W, Feng W, Liu Y, et al. A new deep auto-encoder method with fusing discriminant information for bearing fault diagnosis [J]. Mechanical Systems and Signal Processing, 2021, 150:107233.], Mao et al. designed and fused a new structural discriminant information loss function based on a deep autoencoder, which effectively improved the fault identification accuracy and the model's generalization ability.
[0007] In reference 5 [Hu Q, Si X, Qin A, et al. Balanced adaptation regularization based transfer learning for unsupervised cross-domain fault diagnosis[J].IEEE Sensors Journal,2022,22(12):12139-12151.], Hu et al. proposed a new balanced adaptive regularization method that comprehensively considers the relationship between conditional probability distribution and marginal probability distribution. This method effectively reduces the distribution difference while improving the model's adaptive ability, thereby improving the accuracy of cross-domain fault diagnosis.
[0008] Although the abundant research results in recent years have effectively improved the fault diagnosis performance of bearings under varying operating conditions, the above-mentioned literature methods and similar fault diagnosis methods still face important challenges when applied to bearing fault diagnosis in actual industrial scenarios: (1) Deep learning-based models often require a large number of training samples with complete fault categories to achieve ideal fault diagnosis performance under varying operating conditions, but it is difficult to obtain sufficient and complete fault samples in actual industrial scenarios; (2) Most artificial intelligence-based fault diagnosis models do not fully consider the performance reduction of fault diagnosis models caused by imbalance of training samples; (3) The difference in sample distribution caused by the varying operating conditions of equipment makes it difficult for fault diagnosis models trained with limited samples to achieve ideal fault diagnosis accuracy.
[0009] To address the difficulty of obtaining training samples and the failure of the aforementioned literature methods to consider imbalance and differences in training samples, the prior art disclosed in the February 2023 issue of the journal *Vibration, Testing & Diagnosis*, entitled "Rolling Bearing Fault Diagnosis Based on One-Dimensional CNN Transfer Learning," presents a bearing fault diagnosis method based on CNN transfer learning. The main content and steps are as follows: First, a one-dimensional convolutional neural network (CNN) model capable of directly processing bearing vibration signals is constructed and pre-trained using source domain data. Second, the maximum mean discrepancy (MMD) is used to measure the feature distribution distance between the source domain (labeled fault data under known operating conditions) and the target domain (unlabeled fault data under other operating conditions) at each layer of the pre-trained CNN model. This distance is used as the criterion for judging whether the features of each layer of the CNN can be transferred. Layers whose features cannot be transferred are re-initialized to complete the model, resulting in a transfer CNN model. Finally, the CNN model is retrained using a small amount of labeled target domain data to obtain a cross-domain fault diagnosis model that performs pattern recognition and classification on the target domain data, thereby diagnosing bearing faults under varying operating conditions.
[0010] Although the rolling bearing fault diagnosis method based on one-dimensional CNN transfer learning published in the aforementioned journal paper solves the problem of difficulty in obtaining training samples in bearing fault diagnosis and takes into account the imbalance and differences in training samples, the following problems still exist:
[0011] (1) The fault diagnosis model is constructed by combining the original vibration signal of the bearing with a one-dimensional convolutional neural network. This does not give full play to the powerful ability of convolutional neural networks to extract deep features from two-dimensional image data, resulting in insufficient adaptive extraction of bearing fault features with high separability and domain adaptability.
[0012] (2) In the process of transfer learning in convolutional neural networks, the marginal probability distribution MMD is used alone to measure the distribution distance between the source domain and the target domain. This does not fully consider the difference between the distributions of the source domain and the target domain, and it does not take into account the conditional probability distribution distance.
[0013] (3) The difficulty of obtaining sufficient and complete fault samples of equipment in actual industrial scenarios was not fully considered, nor was the poor performance and weak generalization ability of deep learning models caused by imbalance of training samples.
Summary of the Invention
[0014] This invention provides a bearing fault diagnosis method based on deep transfer learning, in order to solve the problems of existing rolling bearing fault diagnosis methods based on one-dimensional CNN transfer learning published in existing journal papers.
[0015] To achieve the above objectives, the technical solution adopted by the present invention is as follows:
[0016] A bearing fault diagnosis method based on deep transfer learning under varying operating conditions includes the following steps:
[0017] Step 1: Obtain the fault vibration signal data of the bearing under various known operating conditions and with corresponding fault category labels as the labeled source domain, and obtain the fault vibration signal data of the bearing under other operating conditions different from the known operating conditions without fault category labels as the unlabeled target domain.
[0018] Step 2: Obtain the two-dimensional time-frequency diagram of the source domain and the two-dimensional time-frequency diagram of the target domain as described in Step 1;
[0019] Step 3: Generate a convolutional neural network (CNN) model. Use the two-dimensional time-frequency map of the source domain obtained in Step 2 to pre-train the CNN model to obtain a pre-trained CNN model. Use the data output by the fully connected layer of the CNN model during the pre-training process as the labeled source domain deep feature set.
[0020] Step 4: Based on the two-dimensional time-frequency diagrams of the bearings in the source and target domains under normal conditions without faults or defects, and using the multi-kernel maximum mean discrepancy (MK-MMD) as the metric, each layer of the pre-trained CNN model obtained in Step 3 is transferred and adapted to the target domain to obtain a transfer CNN model suitable for the target domain.
[0021] Step 5: Select some two-dimensional time-frequency maps under certain fault conditions from the two-dimensional time-frequency maps of the target domain to fine-tune the transfer CNN model obtained in Step 4. The trained transfer CNN model will adaptively mine deep features from the remaining two-dimensional time-frequency maps of the target domain to obtain an unlabeled target domain deep feature set.
[0022] Step 6: Using the balanced distribution domain adaptation method based on the fusion marginal criterion, the source domain deep feature set obtained in Step 3 and the target domain deep feature set obtained in Step 5 are subjected to domain adaptation processing to reduce the distribution difference between the source domain deep feature set and the target domain deep feature set, so as to obtain a new source domain deep feature set with labels and a new target domain deep feature set without labels.
[0023] Step 7: Use the labeled new source domain deep feature set obtained in Step 6 to train a machine learning classifier, and then use the trained machine learning classifier to predict the fault category label of the unlabeled new target domain deep feature set, thereby realizing the fault diagnosis of bearings under varying working conditions.
[0024] Further, in Step 2, wavelet transform is used to process the source domain and the target domain respectively to obtain the two-dimensional time-frequency diagram of the source domain and the two-dimensional time-frequency diagram of the target domain.
[0025] Further, in Step 4, the two-dimensional time-frequency diagrams of the bearings in the source domain under normal conditions without faults or defects, and the two-dimensional time-frequency diagrams of the bearings in the target domain under normal conditions without faults or defects, are respectively input into the pre-trained CNN model. The multi-kernel maximum mean discrepancy (MK-MMD) is used to measure, at each layer of the pre-trained CNN model, the marginal probability distribution distance and the conditional probability distribution distance between the feature data output for the source domain diagrams and the feature data output for the target domain diagrams. The sum of the marginal probability distribution distance and the conditional probability distribution distance is used as the transfer judgment index data that determines whether the corresponding layer of the pre-trained CNN model can undergo parameter transfer. Based on the calculated transfer judgment index data of each layer, the parameter transfer of each layer of the pre-trained CNN model is completed, and a transfer CNN model suitable for the target domain is obtained.
[0026] Furthermore, the calculated transfer judgment index data for each layer of the pre-trained CNN model is compared with a preset threshold. When the transfer judgment index data for a certain layer of the pre-trained CNN model is less than the preset threshold, it is determined that the parameters of the corresponding layer of the pre-trained CNN model can be transferred to the transfer CNN model. When the transfer judgment index data for a certain layer is greater than the preset threshold, it is determined that the parameters of the corresponding layer of the pre-trained CNN model cannot be transferred to the transfer CNN model; in that case, the corresponding layer of the transfer CNN model is completed by initialization. The parameter transfer of each layer of the pre-trained CNN model is thereby completed.
[0027] Furthermore, the balanced distribution domain adaptation method based on the fusion marginal criterion described in Step 6 achieves balanced distribution alignment of the data in the source domain deep feature set and the target domain deep feature set through the balanced distribution adaptation method, then fuses the maximum marginal criterion matrix for local neighborhood preservation and introduces a dynamic balance adaptive factor, thereby reducing the distribution difference between the source domain deep feature set and the target domain deep feature set.
[0028] Furthermore, the machine learning classifier mentioned in step 7 is a K-nearest neighbor classifier, or a support vector machine, or a random forest classifier.
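As a minimal illustration of the classification stage, the sketch below implements a K-nearest neighbor classifier with NumPy. It assumes the adapted source domain deep features and their labels are available as arrays; the function name `knn_predict`, the Euclidean distance, and the tie-breaking rule are illustrative assumptions and are not specified in the source.

```python
import numpy as np

def knn_predict(train_x, train_y, test_x, k=1):
    """Predict labels for test_x by majority vote among the k nearest training samples."""
    train_x = np.asarray(train_x, dtype=float)
    test_x = np.asarray(test_x, dtype=float)
    train_y = np.asarray(train_y)
    # Pairwise squared Euclidean distances, shape (n_test, n_train).
    d2 = ((test_x[:, None, :] - train_x[None, :, :]) ** 2).sum(axis=2)
    nearest = np.argsort(d2, axis=1)[:, :k]
    preds = []
    for row in nearest:
        labels, counts = np.unique(train_y[row], return_counts=True)
        # Majority vote; ties resolve to the smallest label (an assumption).
        preds.append(labels[np.argmax(counts)])
    return np.array(preds)
```

Here `train_x` and `train_y` would play the role of the new labeled source domain deep feature set, and `test_x` the new unlabeled target domain deep feature set.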
[0029] This invention employs a CNN model to extract deep features. First, the deep feature set of the source domain is obtained by pre-training the CNN model using the two-dimensional time-frequency diagrams of the source domain. Then, the distances between the marginal probability distributions and between the conditional probability distributions at each layer of the pre-trained CNN model are measured using the multi-kernel maximum mean discrepancy (MK-MMD). Based on these distances, transfer judgment index data is constructed, thereby completing the transfer of the pre-trained CNN model and obtaining a transfer CNN model applicable to the target domain for acquiring the deep feature set of the target domain. This approach fully leverages the powerful ability of convolutional neural networks to mine deep features from two-dimensional image data, overcoming the insufficient adaptive extraction of bearing fault features with high separability and domain adaptability in existing bearing fault diagnosis methods based on one-dimensional CNN transfer learning. Furthermore, this invention uses two probability distribution distances based on MK-MMD to measure the distribution difference between the source and target domains, overcoming the incomplete measurement of that difference in bearing fault diagnosis methods based on one-dimensional CNN transfer learning.
[0030] To further improve cross-domain fault diagnosis performance, this invention proposes a balanced distribution domain adaptation method based on the fusion marginal criterion, which performs domain adaptation processing on the source and target domain deep feature sets. Building on the balanced distribution alignment of the data in the two deep feature sets, the method takes class label information and the preservation of neighborhood relationships between samples into account during the adaptive dynamic distribution alignment process, fuses the maximum marginal criterion matrix for local neighborhood preservation, and introduces a dynamic balance adaptive factor. This reduces the distribution difference between the source and target domain deep feature sets while improving the separability of the feature data, and enhances the classification performance of the constructed cross-domain fault diagnosis model. Therefore, this invention also overcomes the weak cross-domain fault diagnosis performance of bearing fault diagnosis methods based on one-dimensional CNN transfer learning when training samples are insufficient.
[0031] This invention employs a CNN model to extract deep features. First, the deep feature set of the source domain is obtained by pre-training the CNN model using the two-dimensional time-frequency diagrams of the source domain. Then, the distances between the marginal probability distributions and between the conditional probability distributions at each layer of the pre-trained CNN model are measured using the multi-kernel maximum mean discrepancy. Based on these distances, transfer judgment index data is constructed, thereby completing the transfer of the pre-trained CNN model and obtaining a transfer CNN model applicable to the target domain for acquiring its deep feature set. This method comprehensively measures and considers the distribution differences between the source and target domains during the transfer learning process of the deep network model, resulting in a transfer CNN model with excellent capability for extracting the deep features of the target domain.
[0032] In this invention, the balanced distribution domain adaptation method based on the fusion marginal criterion is adopted to perform domain adaptation processing on the source domain deep feature set and the target domain deep feature set. This reduces the distribution differences between the domains while improving the separability of the feature data. When the resulting source domain deep feature set is used as training samples, it effectively improves the performance and generalization ability of the machine learning classifier, thereby giving the classifier better fault diagnosis performance.
[0033] In summary, compared with the prior art, the beneficial effects of the present invention are as follows:
[0034] (1) The present invention can achieve ideal cross-domain fault diagnosis performance. Based on the actual bearing fault dataset, balanced and unbalanced training sample conditions are set respectively. The maximum fault diagnosis accuracy can reach 100%, and the average fault diagnosis accuracy can reach 99% and 98% or more respectively.
[0035] (2) Compared with the fault diagnosis models constructed by classic deep learning and transfer learning methods, the present invention can achieve significantly better cross-domain fault diagnosis performance and has the potential to be applied to the identification of bearing fault status in industrial scenarios.
Attached Figure Description
[0036] Figure 1 is a schematic diagram of the method in an embodiment of the present invention.
[0037] Figure 2 is a flowchart illustrating the training, transfer learning, and deep feature extraction process of the convolutional neural network (CNN) model in this embodiment of the invention.
Detailed Implementation
[0038] To enable those skilled in the art to better understand the present invention, the embodiments will be described in detail below with reference to the accompanying drawings and examples. This will allow for a full understanding of how the present invention uses technical means to solve technical problems and achieve corresponding technical effects, and to facilitate its implementation. The embodiments of the present invention and the various features within them can be combined with each other without conflict, and all resulting technical solutions are within the protection scope of the present invention.
[0039] Obviously, the described embodiments are only some, not all, of the embodiments of the present invention. All other embodiments obtained by those skilled in the art based on the embodiments of the present invention without inventive effort should fall within the scope of protection of the present invention.
[0040] It should be noted that the terms "comprising" and "having" and any variations thereof in the specification, claims, and accompanying drawings of this invention are intended to cover non-exclusive inclusion.
[0041] As shown in Figure 1, this embodiment discloses a bearing fault diagnosis method based on deep transfer learning, including the following steps:
[0042] Step 1: Acquire the fault vibration signal data of the bearing, with corresponding fault category labels, under various known operating conditions (operating condition 1 in Figure 1) as the labeled source domain, and acquire the fault vibration signal data of the bearing, without category labels, under other operating conditions different from the known operating conditions (operating condition 2 in Figure 1) as the unlabeled target domain. The operating condition refers to information such as the speed and load of the bearing during operation.
[0043] Step 2: Perform wavelet transform on the source domain obtained in Step 1 to obtain a two-dimensional time-frequency plot of the source domain, and perform wavelet transform on the target domain obtained in Step 1 to obtain a two-dimensional time-frequency plot of the target domain.
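The source does not name the wavelet or its parameters. The following is a minimal NumPy sketch of one common choice, a continuous wavelet transform with a complex Morlet wavelet, that turns a one-dimensional vibration signal into a two-dimensional time-frequency magnitude map. The function name `morlet_cwt`, the center parameter `w0`, and the window of four scale lengths are assumptions made for illustration.

```python
import numpy as np

def morlet_cwt(signal, scales, fs, w0=6.0):
    """Return |CWT| of a 1-D signal as an array of shape (len(scales), len(signal)).

    scales are given in seconds; fs is the sampling rate in Hz.
    """
    signal = np.asarray(signal, dtype=float)
    n = signal.size
    tfmap = np.empty((len(scales), n))
    for row, s in enumerate(scales):
        # Sample the scaled Morlet wavelet on a window of +/- 4 scale lengths,
        # capped so the wavelet is never longer than the signal.
        half = int(min(4 * s * fs, (n - 1) // 2))
        t = np.arange(-half, half + 1) / fs
        wavelet = np.exp(1j * w0 * t / s) * np.exp(-0.5 * (t / s) ** 2) / np.sqrt(s)
        # Correlation with the wavelet = convolution with its reversed conjugate.
        coef = np.convolve(signal, np.conj(wavelet)[::-1], mode="same") / fs
        tfmap[row] = np.abs(coef)
    return tfmap
```

With this parameterization a scale `s` responds most strongly near the frequency `w0 / (2 * pi * s)`, so a set of scales can be chosen to cover the frequency band of interest before the map is resized and fed to the CNN.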
[0044] Step 3: As shown in Figure 2, a convolutional neural network (CNN) model is generated. The parameters of each layer of the CNN model in this embodiment are shown in Table 1:
[0045] Table 1 CNN Model Parameter Table
[0046]
[0047] The two-dimensional time-frequency graph of the source domain obtained in step 2 is input into the CNN model to pre-train the CNN model, resulting in a pre-trained CNN model. During the pre-training process, the data output from the two fully connected layers of the CNN model are used to construct a labeled source domain deep feature set.
[0048] Step 4: Step 3 yields a CNN model pre-trained on the source domain time-frequency diagrams. Because the data distributions of the source and target domains differ, this pre-trained CNN model cannot be applied directly to the target domain. Therefore, this embodiment uses the multi-kernel maximum mean discrepancy (MK-MMD) to measure the differences in the marginal and conditional probability distributions of the output feature data of the source and target domains at each layer of the pre-trained CNN model. Taking full account of these distribution differences, each layer of the pre-trained CNN model obtained in Step 3 is transferred and adapted to the target domain, thus obtaining a transfer CNN model suitable for the target domain. As shown in Figure 2, the process is as follows:
[0049] (4.1) Input the two-dimensional time-frequency diagrams of the source domain and the target domain into the pre-trained CNN model respectively. Use the multi-kernel MMD to measure, at each convolutional layer and each fully connected layer of the pre-trained CNN model, the marginal probability distribution distance and the conditional probability distribution distance between the two-dimensional time-frequency diagram data of the source domain and that of the target domain. The multi-kernel MMD of each layer of the pre-trained CNN model is shown in formula (1):
[0050]
[0051] In formula (1):
[0052] This represents the distance between the marginal probability distributions of the source and target domains in the output feature data of each layer in the pre-trained CNN model.
[0053] This represents the distance between the conditional probability distributions of the source and target domains in the output feature data of each layer in the pre-trained CNN model.
[0054] MK-MMD denotes the multi-kernel MMD, which is the sum of the marginal probability distribution distance and the conditional probability distribution distance.
[0055] $Z_S$ represents the two-dimensional time-frequency diagram data of the source domain, and $Z_T$ represents the two-dimensional time-frequency diagram data of the target domain. $z_i$ is a source domain sample, and $z_j$ is a target domain sample. $n_S$ is the number of samples in the source domain, and $n_T$ is the number of samples in the target domain.
[0056] $\phi(\cdot)$ is a mapping function used to map data into the reproducing kernel Hilbert space $H_k$.
[0057] c represents the category to which the sample belongs, C represents the total number of categories of the sample, and c∈[1,C].
[0058] Let be the number of samples belonging to class c in the source domain. This represents the number of samples belonging to class c in the target domain.
[0059] For the c-th class sample in the source domain, For the c-th class sample in the target domain.
[0060] $H_k$ is the reproducing kernel Hilbert space associated with the multi-kernel $k$. The expression of $k$ is shown in formula (2):
[0061]
[0062] In formula (2): $K$ is the total number of kernels, and each component kernel is a Gaussian kernel with bandwidth $\theta_j$. $Z_i^S$ represents the i-th sample in the source domain, and $Z_i^T$ represents the i-th sample in the target domain.
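Formulas (1) and (2) are not reproduced in this text. Based only on the symbol definitions given for them, one plausible reading is the standard squared-norm form of MMD in the reproducing kernel Hilbert space, sketched below. The names $D_M$, $D_C$, $n_S^c$, $n_T^c$, $Z_S^c$, $Z_T^c$, $k_{\theta_j}$ and the kernel weights $w_j$ are assumptions introduced here for illustration, and the exact form in the patent may differ.

```latex
\begin{aligned}
\text{MK-MMD}(Z_S, Z_T) &= D_M + D_C, \\
D_M &= \Big\| \frac{1}{n_S} \sum_{z_i \in Z_S} \phi(z_i)
            - \frac{1}{n_T} \sum_{z_j \in Z_T} \phi(z_j) \Big\|_{H_k}^{2}, \\
D_C &= \sum_{c=1}^{C} \Big\| \frac{1}{n_S^{c}} \sum_{z_i \in Z_S^{c}} \phi(z_i)
            - \frac{1}{n_T^{c}} \sum_{z_j \in Z_T^{c}} \phi(z_j) \Big\|_{H_k}^{2}, \\
k(Z_i^S, Z_i^T) &= \sum_{j=1}^{K} w_j \, k_{\theta_j}(Z_i^S, Z_i^T),
\qquad k_{\theta_j}(u, v) = \exp\!\Big( -\frac{\| u - v \|^{2}}{2 \theta_j^{2}} \Big).
\end{aligned}
```

In this reading, $D_M$ compares the overall feature means of the two domains, $D_C$ compares them class by class, and the weights would satisfy $w_j \ge 0$ and $\sum_j w_j = 1$.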
[0063] When calculating formula (1), the category label of the target domain data needs to be used. Since the target domain data does not have a category label in this embodiment, this embodiment uses a pre-trained base classifier of the source domain data to predict the category label of the target domain data.
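The per-layer index described in step (4.1) can be sketched numerically as follows. This is a minimal NumPy sketch that assumes the usual biased kernel estimate of the squared MMD, equal weights over the Gaussian kernels, and illustrative bandwidth values; the function names are hypothetical, and `y_t_pseudo` stands for the target labels predicted by the base classifier trained on the source domain data.

```python
import numpy as np

def multi_kernel(a, b, bandwidths):
    """Average of Gaussian kernels over the given bandwidths (equal weights assumed)."""
    sq = np.sum(a**2, axis=1)[:, None] + np.sum(b**2, axis=1)[None, :] - 2.0 * a @ b.T
    sq = np.maximum(sq, 0.0)  # guard against tiny negative values from round-off
    return np.mean([np.exp(-sq / (2.0 * bw**2)) for bw in bandwidths], axis=0)

def mmd2(a, b, bandwidths):
    """Biased estimate of the squared MMD between sample sets a and b (rows are samples)."""
    return (multi_kernel(a, a, bandwidths).mean()
            + multi_kernel(b, b, bandwidths).mean()
            - 2.0 * multi_kernel(a, b, bandwidths).mean())

def transfer_index(feat_s, y_s, feat_t, y_t_pseudo, bandwidths=(0.5, 1.0, 2.0, 4.0)):
    """Marginal MK-MMD plus the sum of class-conditional MK-MMDs for one layer."""
    marginal = mmd2(feat_s, feat_t, bandwidths)
    conditional = 0.0
    for c in np.unique(y_s):
        s_c, t_c = feat_s[y_s == c], feat_t[y_t_pseudo == c]
        if len(s_c) and len(t_c):  # skip classes absent from the pseudo-labels
            conditional += mmd2(s_c, t_c, bandwidths)
    return marginal + conditional
```

In use, `feat_s` and `feat_t` would be the flattened outputs of one convolutional or fully connected layer for the source and target diagrams, and the function would be called once per layer.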
[0064] (4.2) Using the multi-kernel MMD values of each layer of the pre-trained CNN model obtained in step (4.1) (i.e., the MK-MMD obtained in formula (1)) as transfer judgment index data, the calculated transfer judgment index data of each layer of the pre-trained CNN model is compared with the preset threshold to determine whether each convolutional layer and fully connected layer in the pre-trained CNN model can perform parameter transfer. The judgment process is as follows:
[0065] When the transfer evaluation index data of a certain layer of the pre-trained CNN model is less than the preset threshold, it is determined that the parameters of the corresponding layer of the pre-trained CNN model can be transferred to the transfer CNN model.
[0066] When the transfer evaluation index of a certain layer of the CNN model is greater than the preset threshold, it is determined that the parameters of the corresponding layer of the pre-trained CNN model cannot be transferred to the transfer CNN model. At this time, the network parameters of that layer are randomly initialized and the corresponding layer of the transfer CNN model is completed.
[0067] Thus, the parameters of each layer of the pre-trained CNN model are transferred to the transfer CNN model, resulting in a transfer CNN model suitable for the target domain.
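The threshold rule of step (4.2) reduces to a per-layer decision, sketched below in plain Python. The layer names, index values, and threshold in the usage note are hypothetical, and the handling of an index exactly equal to the threshold is an assumption, since the source only covers the "less than" and "greater than" cases.

```python
def plan_layer_transfer(index_by_layer, threshold):
    """Map each layer name to 'transfer' or 'reinitialize' from its index value."""
    plan = {}
    for layer, value in index_by_layer.items():
        # Below the threshold: the layer's source and target features are close,
        # so its pre-trained parameters are reused.
        # Otherwise (including the tie case, an assumption): re-initialize randomly.
        plan[layer] = "transfer" if value < threshold else "reinitialize"
    return plan
```

For example, `plan_layer_transfer({"conv1": 0.1, "fc2": 0.9}, threshold=0.5)` marks `conv1` for transfer and `fc2` for random re-initialization.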
[0068] Step 5: As shown in Figure 2, select the two-dimensional time-frequency diagrams of some fault states from the two-dimensional time-frequency diagrams of the target domain to fine-tune the transfer CNN model obtained in step 4, and input the remaining two-dimensional time-frequency diagrams of the target domain into the trained transfer CNN model. The trained transfer CNN model adaptively mines, from the unlabeled two-dimensional time-frequency diagrams of the target domain, deep features that are more beneficial to fault mode recognition and classification, thereby obtaining the unlabeled target domain deep feature set.
[0069] In this embodiment, for the two-dimensional time-frequency maps of the target domain, 10% of the total number of two-dimensional time-frequency maps of the target domain can be selected for fine-tuning training of the transfer CNN model, and the two-dimensional time-frequency maps used for fine-tuning training are all selected from the two-dimensional time-frequency maps under the fault state; the remaining 90% of the two-dimensional time-frequency maps of the target domain are used as input to the trained transfer CNN model.
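The 10% / 90% split can be sketched as below. This is a minimal NumPy sketch assuming a boolean flag per target domain diagram that marks whether it was recorded under a fault state; the function name, the random selection, and the seed are illustrative assumptions.

```python
import numpy as np

def split_target_maps(is_fault, finetune_fraction=0.10, seed=0):
    """Return (finetune_idx, remaining_idx) over the target domain diagrams."""
    is_fault = np.asarray(is_fault, dtype=bool)
    n_total = is_fault.size
    # The fraction is taken over ALL target diagrams, as in the embodiment.
    n_finetune = int(round(finetune_fraction * n_total))
    fault_idx = np.flatnonzero(is_fault)
    if n_finetune > fault_idx.size:
        raise ValueError("not enough fault-state diagrams for the requested fraction")
    rng = np.random.default_rng(seed)
    # Fine-tuning diagrams are drawn only from the fault-state diagrams.
    finetune_idx = np.sort(rng.choice(fault_idx, size=n_finetune, replace=False))
    remaining_idx = np.setdiff1d(np.arange(n_total), finetune_idx)
    return finetune_idx, remaining_idx
```

The first index array selects the diagrams used for fine-tuning, and the second selects the diagrams passed through the trained transfer CNN model for deep feature extraction.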
[0070] Step 6: Although the powerful feature extraction capability of convolutional neural networks can obtain deep features containing fault information, the complex and variable working conditions of bearings still lead to distribution differences between the deep feature data in the source and target domains. If the fault pattern recognition and classification model is trained based solely on the extracted deep feature set, it is still difficult to achieve ideal fault diagnosis performance. In addition, insufficient training data and imbalance between classes will further weaken the generalization ability of the trained fault diagnosis model, making it difficult to obtain a high fault diagnosis accuracy.
[0071] To effectively improve the separability of feature data while reducing the distribution difference between the source domain deep feature set and the target domain deep feature set, this embodiment adopts the Balance distribution adaptation integrating Marginal Criteria (BDAIMC) method. The labeled source domain deep feature set obtained in step 3 and the unlabeled target domain deep feature set obtained in step 5 are subjected to domain adaptation processing to reduce the distribution difference between the two deep feature sets. This achieves adaptive dynamic balanced distribution adaptation while improving data separability, and yields a new labeled source domain deep feature set and a new unlabeled target domain deep feature set.
[0072] In this embodiment, the balanced distribution domain adaptation method based on the fusion marginal criterion (BDAIMC) builds on the Balanced Distribution Adaptation (BDA) method, which achieves balanced distribution alignment of the data in the source domain deep feature set and the target domain deep feature set. During the adaptive dynamic distribution alignment process, BDAIMC considers class label information and the preservation of neighborhood relationships between samples, fuses the maximum marginal criterion matrix for local neighborhood preservation, and introduces a dynamic balance adaptive factor. This reduces the distribution difference between the source domain and target domain deep feature sets while improving the separability of the feature data, and enhances the classification performance of the constructed cross-domain fault diagnosis model.
[0073] The specific explanation of BDAIMC in this embodiment is as follows:
[0074] In this embodiment, the labeled source domain depth feature set obtained in step 3 and the unlabeled target domain depth feature set obtained in step 5 are used as input data for the Balanced Distribution Domain Adaptation Method BDAIMC (hereinafter referred to as BDAIMC) based on the fusion marginal criterion. The maximum marginal criterion matrix that preserves the local neighborhood is used as the input data. It incorporates a dynamic distribution alignment target and introduces a dynamic balance adaptive factor to achieve adaptive adjustment of the proportions of MPD and CPD.
[0075] The labeled source domain deep feature set is $D_{S\_label} = \{X_S, Y_S\}$, where $X_S$ is the source domain feature data, $x_i$ is the i-th source domain feature data sample, $m$ is the number of samples, and $d$ is the dimension of the feature data; $Y_S$ is the category label set of $X_S$, whose k-th element is the category label of the k-th source domain feature data sample.
[0076] The unlabeled target domain deep feature set is $D_T = \{X_T\}$, where $X_T$ is the target domain feature data and $x_j$ is the j-th target domain data sample.
[0077] The goal of BDAIMC is to learn, from the labeled source domain deep feature set $D_{S\_label}$ and the unlabeled target domain data $D_T$, a mapping transformation matrix $V$ such that the difference between the data distributions of $X_S$ and $X_T$ is minimized after transformation by $V$. The optimization objective is shown in formula (3):
[0078]
[0079] In formula (3): β is the introduced dynamic balance adaptive factor, with value range β ∈ [0, 1].
[0080] a is the adjustment factor of the maximum marginal criterion matrix $M_n$, with a value between 0 and 1.
[0081] tr(·) denotes the trace of a matrix.
[0082] The Frobenius-norm term is the regularization term, and μ is its trade-off parameter.
[0083] $X$ is the data matrix of the source and target domain deep feature sets, i.e., $X = [X_S, X_T]$; $X^{\mathsf T}$ is the transpose of $X$.
[0084] $H_0$ is the centering matrix, and $I$ is the identity matrix.
[0085] $J_M(D_{S\_label}, D_T)$ represents the marginal probability distribution distance between the source and target domain deep feature sets, and $J_C(D_{S\_label}, D_T)$ represents the conditional probability distribution distance between the source and target domain deep feature sets.
[0086] The marginal probability distribution distance $J_M(D_{S\_label}, D_T)$ is given by formula (4):
[0087]
[0088] In formula (4): $n_S$ is the number of samples in the source domain, and $n_T$ is the number of samples in the target domain.
[0089] $x_i$ is a source domain sample, and $x_j$ is a target domain sample.
[0090] $U^{\mathsf T}$ is the transpose of the mapping transformation matrix $U$.
[0091] φ(·) is the mapping function that maps the data into the reproducing kernel Hilbert space H.
[0092] $L_0$ is the MMD matrix of the marginal probability distribution, and its expression is shown in formula (5):
[0093]
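For orientation, in the standard MMD formulation that BDA builds on, the marginal distance and its MMD matrix are usually written as below. This is a sketch from the general domain adaptation literature using the symbols defined for formula (4); it is an assumption that formulas (4) and (5) take exactly this form.

```latex
J_M(D_{S\_label}, D_T)
  = \left\| \frac{1}{n_S}\sum_{i=1}^{n_S} U^{\mathsf T}\phi(x_i)
          - \frac{1}{n_T}\sum_{j=1}^{n_T} U^{\mathsf T}\phi(x_j) \right\|_{\mathcal H}^{2},
\qquad
(L_0)_{ij} =
\begin{cases}
  \dfrac{1}{n_S^{2}},     & x_i, x_j \in D_{S\_label},\\[4pt]
  \dfrac{1}{n_T^{2}},     & x_i, x_j \in D_T,\\[4pt]
  -\dfrac{1}{n_S n_T},    & \text{otherwise.}
\end{cases}
```

When φ is the identity map and the columns of $X$ are the samples, the squared norm above equals $\mathrm{tr}(U^{\mathsf T} X L_0 X^{\mathsf T} U)$, which is why the distance can be expressed through a trace and the matrix $L_0$.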
[0094] The conditional probability distribution distance $J_C(D_{S\_label}, D_T)$ is given by formula (6):
[0095]
[0096] In formula (6): c is the category to which a sample belongs, C is the total number of sample categories, and c ∈ {1, ..., C}.
[0097] The two class-wise counts are the number of samples belonging to class c in the source domain and the number of samples belonging to class c in the target domain.
[0098] The two class-wise sample sets are the class-c samples in the source domain and the class-c samples in the target domain.
[0099] Computing formula (6) requires class labels for the target domain samples, which are not available. To this end, a base classifier pre-trained on the source domain samples is used to predict the class labels of the target domain samples.
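The label prediction step can be sketched as follows. The text does not name the base classifier, so this example assumes a simple nearest-centroid rule; the function names and the toy feature values are hypothetical and serve only as an illustration.

```python
from math import dist


def class_centroids(features, labels):
    """Average the source feature vectors of each class."""
    sums, counts = {}, {}
    for x, y in zip(features, labels):
        acc = sums.setdefault(y, [0.0] * len(x))
        for k, value in enumerate(x):
            acc[k] += value
        counts[y] = counts.get(y, 0) + 1
    return {y: [v / counts[y] for v in acc] for y, acc in sums.items()}


def predict_target_labels(centroids, target_features):
    """Give each target sample the label of its closest source centroid."""
    return [min(centroids, key=lambda y: dist(centroids[y], x))
            for x in target_features]


# Toy data (assumed): two classes of labeled source features, two target samples.
source_x = [[0.0, 0.1], [0.2, 0.0], [1.0, 0.9], [0.8, 1.1]]
source_y = [0, 0, 1, 1]
target_x = [[0.3, 0.2], [1.2, 1.0]]

predicted = predict_target_labels(class_centroids(source_x, source_y), target_x)
```

The predicted labels only stand in for the missing target labels when the class-wise distance is computed, so any classifier trained on the source features could take the place of the centroid rule.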
[0100] $L_c$ is the MMD matrix of the conditional probability distribution, and its expression is shown in formula (7):
[0101]
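The construction of the two MMD matrices can be sketched in plain Python as below. The sketch follows the usual BDA convention, which is an assumption here: entries of $L_0$ depend only on which domain each sample comes from, $L_c$ applies the same construction to the class-c samples, and the marginal and conditional parts are weighted by (1 − β) and β. The function names are hypothetical.

```python
def mmd_marginal_matrix(n_s, n_t):
    """L0: 1/n_s^2 for source pairs, 1/n_t^2 for target pairs, -1/(n_s*n_t) for mixed pairs."""
    e = [1.0 / n_s] * n_s + [-1.0 / n_t] * n_t
    return [[a * b for b in e] for a in e]


def mmd_conditional_matrix(source_labels, target_labels, c):
    """Lc: the same construction restricted to class-c samples; all other entries are 0."""
    n_s = len(source_labels)
    n_sc = sum(1 for y in source_labels if y == c)
    n_tc = sum(1 for y in target_labels if y == c)
    e = [0.0] * (n_s + len(target_labels))
    for idx, y in enumerate(list(source_labels) + list(target_labels)):
        if y != c:
            continue
        if idx < n_s:
            e[idx] = 1.0 / n_sc
        elif n_tc:
            e[idx] = -1.0 / n_tc
    return [[a * b for b in e] for a in e]


def balanced_mmd_matrix(source_labels, target_labels, beta):
    """Assumed BDA-style weighting: (1 - beta) * L0 + beta * sum over classes of Lc."""
    n_s, n_t = len(source_labels), len(target_labels)
    total = [[(1.0 - beta) * v for v in row] for row in mmd_marginal_matrix(n_s, n_t)]
    for c in sorted(set(source_labels)):
        l_c = mmd_conditional_matrix(source_labels, target_labels, c)
        for i, row in enumerate(l_c):
            for j, v in enumerate(row):
                total[i][j] += beta * v
    return total


# Three labeled source samples and two target samples with predicted labels.
balanced = balanced_mmd_matrix([0, 0, 1], [0, 1], beta=0.5)
```

With β = 0 only the marginal term remains, and with β = 1 only the class-wise terms remain, which is the sense in which β balances the two distributions.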
[0102] The maximum marginal criterion matrix $M_n$ is given by formula (8):
[0103]
[0104] Formula (8) involves the within-class scatter matrix and the inter-class scatter matrix of the data, both of which preserve local neighborhood relationships, and I is the identity matrix. The expressions of the two scatter matrices are shown in formulas (9) and (10), respectively:
[0105]
[0106]
[0107] In formulas (9) and (10), the two weight matrices are derived from the weight matrices in the linear discriminant criterion, a classic dimensionality reduction method, and their expressions are shown in formulas (11) and (12):
[0108]
[0109]
[0110] In formulas (11) and (12): m is the number of samples, c is the category to which a sample belongs, and $m_l$ is the number of samples belonging to class c. j ∈ Nst(i) indicates that the j-th sample is a nearest neighbor of the i-th sample. Introducing this neighborhood constraint preserves the neighborhood relationships between samples, which helps improve the separability of multimodal data samples.
[0111] $F_{ij}$ is the adjustment parameter of the two weight matrices. Its value changes according to the distance between samples of different classes, which increases the inter-class divergence and decreases the intra-class divergence, making the data of each class more compact while enlarging the distance between classes.
[0112] Therefore, by applying the balanced distribution domain adaptation method based on the fusion marginal criterion (BDAIMC) to the labeled source domain deep feature set obtained in step 3 and the unlabeled target domain deep feature set obtained in step 5, a new labeled source domain deep feature set $D'_{S\_label} = \{P_S, Y_S\}$ and a new unlabeled target domain deep feature set $D'_T = \{P_T\}$ are obtained, where $P_S$ is the source domain deep feature data, $Y_S$ is the set of category labels corresponding to the source domain deep feature data, and $P_T$ is the target domain deep feature data.
[0113] Step 7: The new labeled source domain deep feature set $D'_{S\_label}$ obtained in step 6 is used to train a machine learning classifier. In this embodiment, the machine learning classifier is a K-nearest neighbor classifier, a support vector machine, or a random forest classifier.
[0114] Then, the new unlabeled target domain deep feature set $D'_T$ is input to the trained machine learning classifier for identification and classification. The trained classifier predicts the fault category labels of the new unlabeled target domain deep feature set, thereby obtaining the fault diagnosis result of the target domain data and realizing fault diagnosis of bearings under variable working conditions.
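Step 7 can be illustrated with the K-nearest neighbor option. The sketch below is a minimal standard-library version, not the classifier configuration of the embodiment; the feature values, the fault class names, and the choice k = 3 are assumptions made for the example.

```python
from collections import Counter
from math import dist


def knn_predict(train_x, train_y, query_x, k=3):
    """Label each query by majority vote among its k nearest training samples."""
    predictions = []
    for q in query_x:
        nearest = sorted(range(len(train_x)), key=lambda i: dist(train_x[i], q))[:k]
        votes = Counter(train_y[i] for i in nearest)
        predictions.append(votes.most_common(1)[0][0])
    return predictions


# Hypothetical adapted features: P_S with labels Y_S, and unlabeled P_T.
p_s = [[0.0, 0.0], [0.1, 0.2], [0.2, 0.1], [1.0, 1.0], [0.9, 1.1], [1.1, 0.9]]
y_s = ["normal", "normal", "normal", "inner_race", "inner_race", "inner_race"]
p_t = [[0.15, 0.05], [0.95, 1.05]]

diagnosis = knn_predict(p_s, y_s, p_t, k=3)
```

Because the classifier is trained only on the adapted source features and applied to the adapted target features, its accuracy depends on how well the adaptation in step 6 has aligned the two feature sets.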
[0115] The preferred embodiments of the present invention have been described in detail above with reference to the accompanying drawings. These embodiments are merely descriptions of preferred embodiments and are not intended to limit the scope or concept of the invention. The specific technical features described in the above embodiments can be combined in any suitable manner without contradiction. Such combinations, as long as they do not violate the spirit of the present invention, should also be considered as part of this disclosure. To avoid unnecessary repetition, the present invention will not further describe the various possible combinations.
[0116] This invention is not limited to the specific details of the above embodiments. Within the scope of the technical concept of this invention and without departing from the design idea of this invention, all modifications and improvements made by those skilled in the art to the technical solutions of this invention should fall within the protection scope of this invention. The technical content for which protection is sought in this invention has been fully described in the claims.
Claims
1. A bearing fault diagnosis method based on deep transfer learning under varying operating conditions, characterized in that it includes the following steps:
Step 1: Obtain fault vibration signal data of the bearing under various known operating conditions, with corresponding fault category labels, as the labeled source domain, and obtain fault vibration signal data of the bearing under other operating conditions different from the known operating conditions, without fault category labels, as the unlabeled target domain.
Step 2: Obtain the two-dimensional time-frequency diagram of the source domain and the two-dimensional time-frequency diagram of the target domain described in Step 1.
Step 3: Construct a convolutional neural network (CNN) model. Use the two-dimensional time-frequency diagram of the source domain obtained in Step 2 to pre-train the CNN model to obtain a pre-trained CNN model. Use the data output by the fully connected layer of the CNN model during pre-training as the labeled source domain deep feature set.
Step 4: Based on the two-dimensional time-frequency diagrams of the bearings in the source and target domains under normal conditions without faults or defects, and using the multi-kernel maximum mean discrepancy (MK-MMD) as the metric, transfer and adapt the layers of the pre-trained CNN model obtained in Step 3 to the target domain to obtain a transfer CNN model suitable for the target domain.
Step 5: Select some two-dimensional time-frequency diagrams under certain fault conditions from the two-dimensional time-frequency diagrams of the target domain to fine-tune the transfer CNN model obtained in Step 4. The fine-tuned transfer CNN model then adaptively mines deep features from the remaining two-dimensional time-frequency diagrams of the target domain to obtain an unlabeled target domain deep feature set.
Step 6: Using the balanced distribution domain adaptation method based on the fusion marginal criterion, subject the source domain deep feature set obtained in Step 3 and the target domain deep feature set obtained in Step 5 to domain adaptation processing to reduce the distribution difference between the source domain deep feature set and the target domain deep feature set, so as to obtain a new labeled source domain deep feature set and a new unlabeled target domain deep feature set.
Step 7: Use the new labeled source domain deep feature set obtained in Step 6 to train a machine learning classifier, and then use the trained machine learning classifier to predict the fault category labels of the new unlabeled target domain deep feature set, thereby realizing fault diagnosis of bearings under varying operating conditions.
The balanced distribution domain adaptation method based on the fusion marginal criterion described in Step 6 achieves balanced distribution alignment of the data in the source domain deep feature set and the target domain deep feature set through the balanced distribution adaptation method. It then fuses the local-neighborhood-preserving maximum marginal criterion matrix and introduces a dynamic balance adaptive factor, thereby reducing the distribution difference between the source domain deep feature set and the target domain deep feature set.
2. The bearing fault diagnosis method based on deep transfer learning according to claim 1, characterized in that, in step 2, wavelet transform is applied to the source domain and the target domain respectively to obtain the two-dimensional time-frequency diagram of the source domain and the two-dimensional time-frequency diagram of the target domain.
3. The bearing fault diagnosis method based on deep transfer learning according to claim 1, characterized in that, in step 4, the two-dimensional time-frequency diagrams of the bearings in the source domain under normal conditions with no faults or defects, and the two-dimensional time-frequency diagrams of the bearings in the target domain under normal conditions with no faults or defects, are respectively input into the pre-trained CNN model. The multi-kernel maximum mean discrepancy (MK-MMD) is used to measure, at each layer of the pre-trained CNN model, the marginal probability distribution distance and the conditional probability distribution distance between the feature data output for the two-dimensional time-frequency diagrams of the source domain and of the target domain. The sum of the marginal probability distribution distance and the conditional probability distribution distance is used as the transfer judgment index data for deciding whether the parameters of the corresponding layer of the pre-trained CNN model can be transferred. Based on the calculated transfer judgment index data of each layer of the CNN model, the parameter transfer of each layer of the pre-trained CNN model is completed, and a transfer CNN model suitable for the target domain is obtained.
4. The bearing fault diagnosis method based on deep transfer learning according to claim 3, characterized in that the calculated transfer judgment index data for each layer of the pre-trained CNN model is compared with a preset threshold. When the transfer judgment index data for a layer of the pre-trained CNN model is less than the preset threshold, it is determined that the parameters of that layer of the pre-trained CNN model can be transferred to the transfer CNN model. When the transfer judgment index data for a layer of the pre-trained CNN model is greater than the preset threshold, it is determined that the parameters of that layer of the pre-trained CNN model cannot be transferred to the transfer CNN model. In that case, the corresponding layer of the transfer CNN model is set by initialization, thereby completing the parameter transfer of each layer of the pre-trained CNN model.
5. The bearing fault diagnosis method based on deep transfer learning according to claim 1, characterized in that the machine learning classifier mentioned in step 7 is a K-nearest neighbor classifier, a support vector machine, or a random forest classifier.
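The layer-wise transfer decision described in claims 3 and 4 can be sketched as follows. The layer names, the distance values, and the threshold are hypothetical, and treating an index exactly equal to the threshold as non-transferable is an assumption, since the claims only cover the "less than" and "greater than" cases.

```python
def transfer_index(marginal_distance, conditional_distance):
    """Transfer judgment index: sum of the marginal and conditional distances of a layer."""
    return marginal_distance + conditional_distance


def plan_layer_transfer(layer_indices, threshold):
    """Decide per layer: copy pre-trained parameters, or initialize the layer anew."""
    return {
        name: "transfer" if index < threshold else "initialize"
        for name, index in layer_indices.items()
    }


# Hypothetical MK-MMD distances (marginal, conditional) measured at each layer.
indices = {
    "conv1": transfer_index(0.02, 0.03),
    "conv2": transfer_index(0.05, 0.10),
    "fc1": transfer_index(0.40, 0.50),
}
plan = plan_layer_transfer(indices, threshold=0.2)
```

In this toy case the two convolutional layers fall below the threshold and keep their pre-trained parameters, while the fully connected layer exceeds it and is initialized.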