Cross-device transfer diagnosis method and system for train traction motor bearing condition detection
By initializing the first-layer weights of a neural network with wavelet weights and introducing a smoothing factor, a smoothing-enhanced wavelet kernel is designed to address insufficient model robustness in cross-device transfer diagnostics, achieving higher cross-machine diagnostic accuracy and domain transferability.
Patent Information
- Authority / Receiving Office
- CN · China
- Patent Type
- Patents(China)
- Current Assignee / Owner
- BEIJING JIAOTONG UNIV
- Filing Date
- 2024-01-25
- Publication Date
- 2026-06-23
AI Technical Summary
Existing algorithms for cross-device transfer diagnosis suffer from insufficient model robustness, rely heavily on data quality and quantity, and have high computational complexity, and ordinary wavelet kernels cannot effectively handle complex cross-machine transfer scenarios. Furthermore, whether wavelet weighting can reduce intra-domain differences has not been explored.
A two-stream convolutional network is designed in which the first-layer weights are initialized with wavelet weights, and smoothing and scaling factors are combined to form a smoothing-enhanced wavelet kernel. The network is trained with the backpropagation algorithm to reduce the inter-domain probability distribution bias, and it outputs the fault category through Softmax.
It improves the effectiveness and accuracy of cross-machine diagnostics, enhances domain transferability, reduces the burden of selecting signal processing algorithm parameters, and improves the robustness of data-driven models.
Figure CN118013331B_ABST
Description
Technical Field
[0001] This invention relates to the field of safety inspection technology for rail transit infrastructure, specifically to a cross-equipment migration diagnosis method and system for high-speed train traction motor bearing condition detection based on vibration acceleration, signal processing, and transfer learning.
Background Technology
[0002] With the development of the Industrial Internet of Things (IIoT), the manufacturability, integration, and precision of the mechanical rotating systems of high-speed train traction motors are constantly improving. However, their complexity, nonlinearity, and uncertainty are also increasing significantly, making fault diagnosis of motor bearings a major challenge. During long-term operation, rotating machinery is affected by material degradation, load, temperature, and humidity, which makes critical components prone to failure, reduces factory efficiency, and can cause personnel injuries or environmental pollution. Monitoring the condition of rotating machinery is therefore of great importance. However, because equipment manufacturing and operating scenarios differ, the probability distributions of the training data (source domain) and test data (target domain) inevitably deviate, rendering many existing algorithms ineffective in real-world scenarios such as traction motor systems. In this industrial setting, the deviation poses a severe challenge to the extrapolation capability of the models, especially for cross-machine fault diagnosis tasks.
[0003] A series of studies have been conducted on cross-device transfer diagnostics. In terms of fine-tuning, Luo et al. developed an improved stacked autoencoder, and Zhang et al., inspired by few-shot learning, fine-tuned a high-performance feature encoder. For long-tailed datasets, Li et al. prioritized imbalanced source-domain samples to monitor fan failures, and Liu et al. proposed a metadata-based residual transfer network to address target-domain imbalance. Han et al., drawing on domain-adversarial approaches, proposed a multi-domain discriminator network. Among mapping-based methods, many scholars have designed or applied statistical measures such as ensemble weighted maximum mean discrepancy (MMD), multi-kernel local MMD, embedded joint MMD, and maximum mean square discrepancy. Besides reducing the distributional bias between the source and target domains, some statistical measures also consider discriminative features within and between classes. Furthermore, some scholars have integrated hybrid learning strategies, including few-shot learning, meta-learning, metric learning, contrastive learning, causal learning, imbalanced learning, and multi-source domain adaptation. In addition, Jang et al. introduced a domain interpolation adaptive network, and Wan et al. developed a multi-level domain adaptive network. In summary, cross-machine transfer diagnostics incorporates state-of-the-art training strategies, providing stronger diagnostic performance for complex and more specific cross-machine diagnostic tasks.
[0004] However, most purely data-driven algorithms can weaken model robustness and exacerbate dependence on data quality and quantity. Meanwhile, the statistical metrics employed by many algorithms increase computational complexity and introduce additional hyperparameters, which often determine algorithm performance; however, searching for appropriate hyperparameters is typically time-consuming and laborious. Therefore, a method combining signal processing and Domain Adaptation Networks (DANs) has been introduced into cross-machine transfer diagnostics. Kim et al. argued that data preprocessing can reduce distributional bias between different datasets, helping to save training time and improve accuracy. Therefore, they used signal processing techniques to transform different datasets into a common pattern space. However, the aforementioned algorithms employ a two-stage strategy, and optimizing the hyperparameters of signal processing remains a significant challenge.
[0005] Wavelet transform, as a primary and reliable signal processing method, has broad application prospects when combined with domain adaptive networks. Some literature uses wavelet scattering modules to construct time-scattering convolutional networks for cross-domain diagnosis rather than cross-machine diagnosis. Yue et al. proposed the Multiscale Wavelet Prototypical Network (MWPN), which combines wavelet kernel convolution with a few-shot learning strategy to solve cross-component diagnosis problems. Shang et al. designed a Denoising Fault-Aware Wavelet Network (DFAWNet) and investigated its advantages in cross-speed diagnosis. In summary, Laplace and Morlet wavelets demonstrate strong performance in wavelet-based interpretable diagnosis.
[0006] While some studies have utilized interpretable wavelet techniques, such as combining multiple wavelet kernels or introducing improved wavelet kernels, these efforts have primarily focused on fault detection tasks. Ordinary wavelet kernels cannot handle complex cross-machine transfer scenarios, and existing research has not explored whether wavelet weighting can reduce intra-domain variability.
Summary of the Invention
[0007] The purpose of this invention is to provide a method and system for cross-equipment migration diagnosis of the condition of traction motor bearings in high-speed trains, so as to solve at least one of the technical problems existing in the background art.
[0008] To achieve the above objectives, the present invention adopts the following technical solution:
[0009] In a first aspect, the present invention provides a method for cross-equipment migration diagnosis of train traction motor bearing condition detection, comprising:
[0010] Acquire vibration signals from the bearings of the traction motor system;
[0011] The acquired vibration signals of the traction motor system bearings are processed using a pre-trained diagnostic model to obtain bearing fault damage detection results. The training of the diagnostic model includes: acquiring vibration signal samples from the traction motor system bearings; initializing the weights of the first layer of the neural network to wavelet weights; inputting the source and target domains into the network respectively, calculating the classification loss of the source domain and the metric difference loss between the source and target domains; training the network using the backpropagation algorithm, stopping training when the maximum number of iterations is reached, and saving the relevant parameters of the model.
[0012] Optionally, in training the network, the first-layer weights are specific to the source and target domains, with no weight sharing; the dual-stream bottleneck layer uses smoothing-enhanced wavelet weights without weight sharing to transform the data from the source and target domains into a common feature representation space. The intermediate bottleneck and classifier share weights. Using a WDCNN with batch normalization and ReLU activation functions, the extracted features are fed into a metric function to achieve implicit domain alignment. Finally, Softmax is applied to output the fault category.
[0013] Optionally, initializing the first-layer weights of the neural network with the improved wavelet weights includes: establishing the correlation between time and filter length, which gives the convolution kernel size a physically interpretable temporal meaning; aligning the scaling and translation factors and tying them to the output channels, which strengthens their relevance to the convolution kernel; introducing a smoothing factor to form a smoothed and enhanced wavelet kernel; and using the Sigmoid function to correct the exponential (base-e) components of the wavelet basis function.
[0014] Optionally, the cross-machine migration wavelet weights can be initialized to reduce the inter-domain probability distribution bias in different mechanical fault data.
[0015] For the first convolutional layer, the response $y_1$ is: $y_1 = W_1 x + b_1$
[0016] where $W_1$ is the weight and $b_1$ is the bias;
[0017] Replace the original weights with improved wavelet weights:
[0018] The forward propagation mechanism is as follows:
[0019]
[0020]
[0021] In the formula, l represents the l-th layer.
[0022] Optionally, the backpropagation mechanism can be represented as:
[0023]
[0024]
[0025] where $\alpha$ represents the learning rate, $\theta_1$ represents the parameters to be updated in the first layer, and $L_{total}$ denotes the total loss;
[0026] The cross-entropy loss $L_{cls}$ is used to guide the parameter updates:
[0027]
[0028] Where p(k) is the predicted distribution and q(k) is the actual distribution.
[0029] Optionally, the final loss function is expressed as:
[0030] $L_{total} = L_{cls} + \lambda L_{metric}$
[0031] where $L_{metric}$ is the domain difference metric loss, and $\lambda$ is a trade-off coefficient that adjusts the proportion of the corresponding loss term during backpropagation.
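The combined objective can be illustrated numerically. The following is a minimal sketch, not the patented implementation: the text does not fix a particular metric function at this point, so a squared distance between feature means (a linear-kernel maximum mean discrepancy) stands in for $L_{metric}$, and the function names, feature values, and the value of $\lambda$ are hypothetical.

```python
def mean_discrepancy(source_feats, target_feats):
    """Assumed stand-in for L_metric: squared distance between feature means."""
    dim = len(source_feats[0])
    total = 0.0
    for d in range(dim):
        mu_s = sum(f[d] for f in source_feats) / len(source_feats)
        mu_t = sum(f[d] for f in target_feats) / len(target_feats)
        total += (mu_s - mu_t) ** 2
    return total


def total_loss(l_cls, l_metric, lam):
    """L_total = L_cls + lambda * L_metric."""
    return l_cls + lam * l_metric


# Hypothetical two-dimensional features from each domain.
source_feats = [[1.0, 2.0], [3.0, 4.0]]
target_feats = [[1.0, 1.0], [1.0, 3.0]]

l_metric = mean_discrepancy(source_feats, target_feats)
loss = total_loss(l_cls=0.4, l_metric=l_metric, lam=0.5)
```

A larger $\lambda$ puts more weight on aligning the two domains, while a smaller one favors fitting the labeled source data.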
[0032]
[0033] In a second aspect, the present invention provides a cross-equipment migration diagnostic system for train traction motor bearing condition detection, comprising:
[0034] The acquisition module is used to acquire the vibration signal of the bearing in the traction motor system;
[0035] The diagnostic module is used to process the acquired vibration signals of the traction motor system bearings using a pre-trained diagnostic model to obtain bearing fault damage detection results. The training of the diagnostic model includes: acquiring vibration signal samples from the traction motor system bearings; initializing the weights of the first layer of the neural network to wavelet weights; inputting the source and target domains into the network respectively, calculating the classification loss of the source domain and the metric difference loss between the source and target domains; training the network using the backpropagation algorithm, stopping training when the maximum number of iterations is reached, and saving the relevant parameters of the model.
[0036] In a third aspect, the present invention provides a non-transitory computer-readable storage medium for storing computer instructions, which, when executed by a processor, implement the cross-device migration diagnosis method for train traction motor bearing condition detection as described in the first aspect.
[0037] In a fourth aspect, the present invention provides a computer device including a memory and a processor, wherein the processor and the memory communicate with each other, the memory stores program instructions that can be executed by the processor, and the processor calls the program instructions to execute the cross-device migration diagnosis method for train traction motor bearing condition detection as described in the first aspect.
[0038] In a fifth aspect, the present invention provides an electronic device, comprising: a processor, a memory, and a computer program; wherein the processor is connected to the memory, the computer program is stored in the memory, and when the electronic device is running, the processor executes the computer program stored in the memory to cause the electronic device to execute instructions for implementing the cross-device migration diagnostic method for train traction motor bearing condition detection as described in the first aspect.
[0039] Terminology Explanation:
[0040] Source domain: the dataset or domain on which the model is pre-trained.
[0041] Target domain: the new dataset or domain to which the model will be applied.
[0042] Cross-device diagnostics: transferring knowledge from the annotated data accumulated on device A to the unannotated data on device B in order to assess the operational health of device B.
[0043] Transferability: the ability to transfer a model learned in a source domain to a target domain.
[0044] Discriminability: the ability of the extracted features to be separated by a classifier.
[0045] Vanishing gradient: in neural networks, earlier hidden layers learn more slowly than later hidden layers because the gradients reaching them shrink during backpropagation. As a result, classification accuracy can decrease as the number of hidden layers increases. This phenomenon is called the vanishing gradient problem.
[0046] Weight initialization: The essence of deep learning model training is updating the weights. However, this cannot be done at the beginning of training; each parameter needs an initial value. After weight initialization, the neural network can iteratively update the weight parameters w to achieve better performance.
[0047] Backpropagation algorithm: a common method used in conjunction with optimization methods (such as gradient descent) to train artificial neural networks.
[0048] Two-stage strategy: first signal processing, then neural network training.
[0049] Explainable machine learning: In machine learning tasks, simply identifying a machine learning model that optimizes predictive performance is far from sufficient. Interpretability and reliability are hallmarks of a high-performing model.
[0050] Input channel: the number of two-dimensional information maps that are input.
[0051] The beneficial effects of this invention are as follows: The wavelet domain adaptive network based on physical information integrates interpretable wavelet knowledge into a dual-stream convolutional layer with independent weights to handle cross-machine diagnostic tasks. Optimized Laplace or Morlet wavelet weights with rich information are used to update the weights of the first layer of the CNN. The scale and translation factors with specific physical interpretations are constrained by the convolutional kernel parameters, while a smoothing auxiliary scale factor is considered to ensure consistency with the neural network weights. This improves domain transferability and enhances the effectiveness and accuracy of cross-machine diagnostics.
[0052] The advantages of additional aspects of the invention will be set forth more clearly in the following description or will be learned by practice of the invention.
Attached Figure Description
[0053] To more clearly illustrate the technical solutions of the embodiments of the present invention, the drawings used in the following description of the embodiments will be briefly introduced. Obviously, the drawings described below are only some embodiments of the present invention. For those skilled in the art, other drawings can be obtained based on these drawings without creative effort.
[0054] Figure 1 is a schematic diagram illustrating the source domain and target domain transformation according to an embodiment of the present invention.
[0055] Figure 2 is a schematic diagram of the wavelet domain adaptive network structure based on physical information according to an embodiment of the present invention.
[0056] Figure 3 is a schematic diagram of the results of widely used weight initialization methods on the rotor-gear integrated fault test platform according to an embodiment of the present invention.
[0057] Figure 4 is a schematic diagram of the wavelet transform results for the high-speed train traction motor bearing platform according to an embodiment of the present invention.
[0058] Figure 5 is a schematic diagram of the fault diagnosis algorithm flow according to an embodiment of the present invention.
[0059] Figure 6 is a schematic diagram of the wavelet weighting performance on typical indicators.
[0060] Figure 7 shows the A-distance, a typical indicator, for task C→F3 according to an embodiment of the present invention.
[0061] Figure 8 is a schematic diagram of the SWK initialization results before and after training according to an embodiment of the present invention.
[0062] Figure 9 is a schematic diagram of the cumulative frequency band for SWK initialization according to an embodiment of the present invention.
[0063] Figure 10 illustrates the kurtosis values and associated frequency bands of the signal under different states: normal state (optimal center frequency 39.06 kHz, bandwidth 3.13 kHz); inner ring fault (optimal center frequency 41.67 kHz, bandwidth 16.67 kHz); outer ring fault (optimal center frequency 39.58 kHz, bandwidth 4.17 kHz); and ball fault (optimal center frequency 46.88 kHz, bandwidth 6.25 kHz).
Detailed Implementation
[0064] Embodiments of the present invention are described in detail below, examples of which are shown in the accompanying drawings, wherein the same or similar reference numerals denote the same or similar elements or elements having the same or similar functions throughout. The embodiments described below with reference to the accompanying drawings are exemplary and are only used to explain the present invention, and should not be construed as limiting the present invention.
[0065] It will be understood by those skilled in the art that, unless otherwise defined, all terms used herein (including technical and scientific terms) have the same meaning as commonly understood by one of ordinary skill in the art to which this invention pertains.
[0066] It should also be understood that terms such as those defined in general dictionaries should be understood to have meanings consistent with their meanings in the context of the prior art, and should not be interpreted in an idealized or overly formal sense unless specifically defined as such herein.
[0067] Those skilled in the art will understand that, unless specifically stated otherwise, the singular forms “a,” “an,” “said,” and “the” used herein may also include the plural forms. It should be further understood that the term “comprising” as used in this specification means the presence of the stated features, integers, steps, operations, elements, and/or components, but does not exclude the presence or addition of one or more other features, integers, steps, operations, elements, and/or groups thereof.
[0068] In the description of this specification, references to terms such as "one embodiment," "some embodiments," "example," "specific example," or "some examples," etc., indicate that a specific feature, structure, material, or characteristic described in connection with that embodiment or example is included in at least one embodiment or example of the present invention. Furthermore, the specific features, structures, materials, or characteristics described may be combined in any suitable manner in one or more embodiments or examples. Moreover, without contradiction, those skilled in the art can combine and integrate the different embodiments or examples described in this specification, as well as the features of those different embodiments or examples.
[0069] To facilitate understanding of the present invention, the present invention will be further explained and described below with reference to the accompanying drawings and specific embodiments. However, the specific embodiments do not constitute a limitation on the embodiments of the present invention.
[0070] Those skilled in the art should understand that the accompanying drawings are merely schematic diagrams of embodiments, and the components in the drawings are not necessarily essential for implementing the present invention.
[0071] This invention provides a cross-device transfer diagnosis method for train traction motor bearing condition detection. It rests on the assumption that interpretable weight initialization based on signal mechanisms can improve domain transferability and promote the learning of domain-invariant features. By exploring the potential value of sound first-layer weight initialization in deep models, it extends previous work to cross-machine diagnostics. The invention emphasizes a new paradigm: a wide first-layer kernel initialized with physically interpretable weights can effectively promote cross-machine diagnostics. From an application perspective, interpretable wavelet weights are used for cross-machine diagnostics of high-speed train traction motors. Existing public datasets, custom test-bench datasets, simulation-based datasets, and traction motor platform datasets provide a foundation for solving unsupervised cross-machine diagnostics. This invention is the first to apply wavelet weight initialization, which is interpretable and consistent with the other weights, to the more challenging cross-machine diagnostic scenario. Its main contribution is a physics-based cross-machine diagnostic network that uses optimized wavelet weights to initialize the first-layer weights, improving domain transferability from a signal processing perspective. This combination draws on the reliability of signal processing to improve the robustness of the data-driven model, while using the expressive power of the data-driven model to ease parameter selection in signal processing algorithms. To this end, a smoothed wavelet kernel is designed as the wavelet weight, and a smoothing factor is introduced to adjust the consistency between the scaling factor *s* and the translation factor *u*, thereby addressing the vanishing gradient problem.
Then, the kernel size and the number of output channels are tied to the parameters *s*, *u*, and time *t*, giving the convolution parameters a wavelet-based physical meaning and extracting more informative features. To account for differing sampling frequencies, a dual-stream architecture with non-shared weights is designed. Through the optimized non-shared wavelet weights, data from different machines are transformed into a common space to promote domain transferability. Finally, a transfer learning fault diagnosis paradigm is obtained whose first layer has a wide kernel and optimized wavelet weight initialization, which enhances domain transferability and in turn promotes cross-machine diagnosis.
[0072] Embodiment 1
[0073] In this embodiment 1, a cross-device migration diagnostic system for train traction motor bearing condition detection is first provided, including: an acquisition module for acquiring vibration signals of the traction motor system bearing; and a diagnostic module for processing the acquired vibration signals of the traction motor system bearing using a pre-trained diagnostic model to obtain bearing fault damage detection results. The training of the diagnostic model includes: collecting vibration signal samples from the traction motor system bearing; initializing the weights of the first layer of the neural network to wavelet weights, i.e., initializing the weights of the first layer using wavelet weight initialization; inputting the source domain and target domain into the network respectively, calculating the classification loss of the source domain and the metric difference loss between the source domain and the target domain; training the network using the backpropagation algorithm, stopping the training when the maximum number of iterations is reached, and saving the relevant parameters of the model.
[0074] Using the above system, a cross-equipment transfer diagnosis method for train traction motor bearing condition detection can be realized, including: acquiring vibration signals of the traction motor system bearings; processing the acquired vibration signals of the traction motor system bearings using a pre-trained diagnostic model to obtain bearing fault damage detection results; wherein, training the diagnostic model includes: collecting vibration signal samples from the traction motor system bearings; initializing the weights of the first layer of the neural network to wavelet weights, i.e., initializing the weights of the first layer using wavelet weight initialization; inputting the source domain and target domain into the network respectively, calculating the classification loss of the source domain and the metric difference loss between the source domain and the target domain; training the network through the backpropagation algorithm, and stopping the training when the maximum number of iterations is reached, and saving the relevant parameters of the model.
[0075] The specific training algorithm for the network is as follows: Vibration signal samples are collected from the bearings and other equipment of the traction motor system. The weights of the first layer of the neural network are initialized using wavelet weights. The source and target domains are input into the network respectively, and the classification loss of the source domain and the metric difference loss between the source and target domains are calculated. The network is trained using the backpropagation algorithm. Training stops when the maximum number of iterations is reached, and the relevant parameters of the model are saved. Samples from the traction motor are input into the target domain model to test the model's performance.
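The training procedure above can be sketched as a toy loop. This is only an illustration under strong simplifying assumptions: a two-parameter logistic classifier replaces the network, a squared difference of mean outputs stands in for the metric difference loss, finite differences replace backpropagation, and all names, data values, and hyperparameters are hypothetical.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def total_loss(params, source, target, lam):
    """Source classification loss plus lam times an assumed mean-difference metric loss."""
    w, b = params
    l_cls = 0.0
    for x, y in source:
        p = sigmoid(w * x + b)
        l_cls -= y * math.log(p) + (1 - y) * math.log(1.0 - p)
    l_cls /= len(source)
    mu_s = sum(w * x + b for x, _ in source) / len(source)
    mu_t = sum(w * x + b for x in target) / len(target)
    return l_cls + lam * (mu_s - mu_t) ** 2


def numeric_grad(f, params, h=1e-6):
    """Central finite differences; a real network would use backpropagation."""
    grad = []
    for i in range(len(params)):
        up = list(params)
        up[i] += h
        down = list(params)
        down[i] -= h
        grad.append((f(up) - f(down)) / (2.0 * h))
    return grad


def train(source, target, lam=0.1, lr=0.5, max_iters=200):
    params = [0.0, 0.0]  # a real model would start from wavelet-initialized weights

    def loss_fn(p):
        return total_loss(p, source, target, lam)

    history = []
    for _ in range(max_iters):  # stop when the maximum number of iterations is reached
        history.append(loss_fn(params))
        grad = numeric_grad(loss_fn, params)
        params = [p - lr * g for p, g in zip(params, grad)]
    return {"params": params, "history": history}  # the saved model parameters


source = [(-2.0, 0), (-1.0, 0), (1.0, 1), (2.0, 1)]  # labeled source-domain samples
target = [-1.0, 0.0, 1.0, 2.0]                       # unlabeled target-domain samples
result = train(source, target)
```

Only the source samples carry labels; the target samples influence training solely through the metric term, which mirrors the unsupervised setting described in this embodiment.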
[0076] In this embodiment, interpretable wavelet weights are used to perform cross-machine diagnostics on the traction motor bearings of high-speed trains. Annotating large-scale traction motor bearing data is expensive and time-consuming, but there is public, simulated, and self-collected data, which provides a foundation for solving unsupervised cross-machine diagnostics.
[0077] As shown in Figure 1, wavelet-weighted filters are used to extract transferable features from the source domain and map them to the unsupervised target domain. The source domain dataset includes three sources: public, self-collected, and synthetic. Each data source follows its own distribution and has a corresponding label space. Similarly, the target domain is collected from the traction motors of high-speed trains and follows its own distribution.
[0078] The WIDAN framework is shown in Figure 2. The first-layer weights are specific to the source and target domains, with no weight sharing. The dual-stream bottleneck layer uses smoothing-enhanced wavelet weights without weight sharing to transform the data from both the source and target domains into a common feature representation space.
[0079] The intermediate bottleneck and classifier share weights. Using a WDCNN with batch normalization and ReLU activation, the extracted features are fed into a metric function to achieve implicit domain alignment. Finally, Softmax is applied to output the fault category. During testing, the test set of the target domain is input into the trained model to evaluate the wavelet weight initialization performance.
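The weight-sharing pattern described above can be sketched in a few lines. The snippet is a toy illustration and not the WIDAN or WDCNN architecture itself: a single convolution with ReLU and global averaging stands in for the feature extractor, batch normalization is omitted, and all weights, sizes, and names are hypothetical.

```python
import math


def conv1d_valid(signal, kernel):
    """1-D cross-correlation without padding (the usual CNN 'convolution')."""
    k = len(kernel)
    return [sum(signal[i + j] * kernel[j] for j in range(k))
            for i in range(len(signal) - k + 1)]


def softmax(scores):
    """Turn class scores into probabilities that sum to 1."""
    m = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(v - m) for v in scores]
    total = sum(exps)
    return [e / total for e in exps]


def forward(signal, first_layer_kernels, shared_classifier):
    """Domain-specific first layer -> ReLU + global average -> shared linear classifier."""
    feats = []
    for kernel in first_layer_kernels:
        response = conv1d_valid(signal, kernel)
        relu = [max(0.0, r) for r in response]
        feats.append(sum(relu) / len(relu))
    scores = [sum(w * f for w, f in zip(row, feats)) for row in shared_classifier]
    return softmax(scores)


# Hypothetical toy weights: two first layers (not shared), one classifier (shared).
source_kernels = [[1.0, 0.0, -1.0], [0.5, 0.5, 0.5]]
target_kernels = [[0.8, 0.0, -0.8], [0.4, 0.4, 0.4]]
shared_classifier = [[1.0, -1.0], [-1.0, 1.0], [0.5, 0.5], [0.0, 1.0]]  # 4 classes

signal = [math.sin(0.3 * n) for n in range(32)]
p_source = forward(signal, source_kernels, shared_classifier)
p_target = forward(signal, target_kernels, shared_classifier)
```

Each domain passes through its own first layer, while everything after it is common to both streams, so the same signal yields different class probabilities depending on which stream processes it.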
[0080] The design principle of the improved wavelet filter is as follows:
[0081] Assume the number of input channels is $N_i$, the number of output channels is $N_k$, the sampling frequency is $f$, and the kernel size is $K$. As shown in Figure 3, Kaiming and Xavier initialization have several key features.
[0082] Condition I: The unit axis of the filter corresponds to a single value.
[0083] Condition II: Initial values are distributed within (-1.0, 1.0).
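Condition II can be checked against the standard uniform variants of these two schemes. The sketch below computes their sampling bounds for a hypothetical first layer; the layer sizes are assumptions chosen for illustration, and the fan-in and fan-out follow the usual convolutional convention of channels multiplied by kernel size.

```python
import math


def xavier_uniform_bound(fan_in, fan_out):
    """Xavier (Glorot) uniform draws from U(-b, b) with b = sqrt(6 / (fan_in + fan_out))."""
    return math.sqrt(6.0 / (fan_in + fan_out))


def kaiming_uniform_bound(fan_in):
    """Kaiming (He) uniform for ReLU draws from U(-b, b) with b = sqrt(6 / fan_in)."""
    return math.sqrt(6.0 / fan_in)


# Hypothetical first layer: 1 input channel, 64 output channels, kernel size 300.
n_in, n_out, kernel_size = 1, 64, 300
fan_in = n_in * kernel_size
fan_out = n_out * kernel_size

xavier_b = xavier_uniform_bound(fan_in, fan_out)
kaiming_b = kaiming_uniform_bound(fan_in)
```

Both bounds fall well inside (-1.0, 1.0) for a wide first-layer kernel, which is the order of magnitude that wavelet weights need to match in order to stay consistent with the other layers.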
[0084] SWK (a module in WIDAN) is a more general variant of WCK (a module in WaveKernelNet) designed to address the limitations of WCK. As shown in Table 1, WCK empirically sets its parameters to fixed values, which not only ignores differences in sampling frequencies across different datasets but also cuts off the correlation with the convolutional kernel. Therefore, SWK addresses these limitations, achieving a series of groundbreaking improvements.
[0085] Take Figure 4 as an example, with $N_k = 64$ and $K = 300$. As shown in Figure 4a and Figure 4d (from WCK), the value range of the Laplace wavelet is $(-10^6, 10^6)$. First, this does not satisfy Condition II and cannot guarantee that the order of magnitude of the first-layer values is consistent with the weights of the other layers. Second, the Morlet wavelet takes values close to 0 at both the front and back ends, which does not satisfy Condition I, so sufficient features cannot be extracted.
[0086] First, for $t \in (0, K-1)$ with step $= K$, one time step represents one unit of the convolution kernel. This establishes the correlation between time $t$ and filter length $K$, yielding Figure 4b and Figure 4e. These operations give the convolution kernel size a physically interpretable temporal meaning. However, Conditions I and II are still not satisfied, because the Laplace and Morlet values approach 0 at a considerable number of time locations, which hinders the extraction of feature representations, while the value range of the Laplace wavelet fluctuates widely. More seriously, this leads to vanishing gradients.
[0087] To address this issue, the inconsistency between $s \in (1, 10)$ and $u \in (1, 10)$ is considered. From the output-channel perspective, $s \in (1, 10)$ is therefore updated to $s \in (0, N_k)$. In this way, the scaling and translation factors are aligned and further tied to the output channels, strengthening their relevance to the convolution kernel. Through these operations, the convolution channels acquire an interpretable meaning in terms of $s$ and $u$.
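Tying the scale and translation factors to the output channels can be sketched as follows. This is a hedged illustration rather than the SWK definition: it uses one common real-valued Morlet form, pairs $s_i$ with $u_i$ per channel, starts both at 1 to avoid division by zero, and omits the smoothing factor and Sigmoid correction; the function names are hypothetical.

```python
import math


def morlet(t, u, s, c=1.0):
    """One common real Morlet form, shifted by u and scaled by s."""
    p = (t - u) / s
    return c * math.exp(-0.5 * p * p) * math.cos(5.0 * p)


def wavelet_bank(n_out, kernel_size):
    """Build n_out kernels of length kernel_size.

    Assumption: channel i uses s_i = u_i = i (i = 1..n_out), and the time
    axis takes the integer positions t = 0..kernel_size-1 of the kernel.
    """
    ts = list(range(kernel_size))
    bank = []
    for i in range(1, n_out + 1):
        s = u = float(i)  # starts at 1 to avoid division by zero
        bank.append([morlet(t, u, s) for t in ts])
    return bank


bank = wavelet_bank(n_out=64, kernel_size=300)
```

Each row of `bank` is one candidate first-layer kernel, so the number of output channels fixes how many $(s, u)$ pairs exist and the kernel size fixes the time axis.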
[0088] The basic wavelet dictionary, corresponding to $u_i$ and $s_i$ ($i = 0, 1, 2, \ldots, N_k$), is then:
[0089]
[0090] where $s \in (0, N_k)$ with step $= K$, and $u \in (0, N_k)$ with step $= K$. However, because the wavelet transform scale is inappropriate, the amplitude variation of the Laplace wavelet dictionary is too large, leading to vanishing gradients. Therefore, a smoothing factor $\zeta$ is introduced to form a smoothed and enhanced wavelet kernel, as shown in the following equation; the value of $\zeta$ is a critical determinant of feature transferability.
[0091]
[0092] At the same time, the exponential (base-$e$) components of the wavelet basis functions are corrected with the Sigmoid function, which restricts them to $(0, 1)$ and yields $\psi_{u,s}(t) \in (-1, 1)$. This operation, similar to data normalization, improves the consistency between the wavelet weights and neural network weight initialization. Through this series of improvements, Conditions I and II are satisfied while wavelet interpretability is retained.
[0093] The Laplace wavelet and the Morlet wavelet are shown in Figure 4c and Figure 4f, respectively:
[0094]
[0095]
[0096] where $\xi$ is the viscous damping ratio, $A$ is the wavelet normalization function, $t$ is the time parameter, and $C$ is the normalization constant.
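The two equations are not reproduced in the text above. For orientation only, the base forms of these wavelets that are commonly used in wavelet-kernel networks are given below. They are the standard unsmoothed forms, stated here as an assumption, and do not include the smoothing factor $\zeta$ or the Sigmoid correction of this embodiment; the frequency $f$ and time shift $\tau$ are symbols introduced for illustration.

```latex
\psi_{\mathrm{Laplace}}(t) = A \exp\!\left(-\frac{\xi}{\sqrt{1-\xi^{2}}}\, 2\pi f\,(t-\tau)\right) \sin\!\big(2\pi f\,(t-\tau)\big),
\qquad
\psi_{\mathrm{Morlet}}(t) = C \exp\!\left(-\frac{t^{2}}{2}\right) \cos(5t)
```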
[0097] Table 1 highlights the main differences between SWK and WCK. ζ is an important parameter for adjusting the smoothness of the scaling factor, and in turn, it affects the wavelet energy distribution.
[0098] Table 1. Parameter comparison of WCK and SWK
[0099]
[0100] The initialization of wavelet weights for cross-machine transfer is as follows:
[0101] Based on this understanding, a robust weight initialization method is provided to mitigate the inter-domain probability distribution bias in different mechanical fault data to some extent.
[0102] For the first convolutional layer, the response y1 is:
[0103] y1 = W1x + b1
[0104] Where W1 is the weight and b1 is the bias.
[0105] The original weights are replaced with the improved wavelet weights. The forward propagation mechanism can then be represented as:
[0106]
[0107]
[0108] In the formula, l represents the l-th layer. Specifically,
[0109] The optimization objective analysis is as follows:
[0110] Furthermore, as shown in the formula for the forward propagation mechanism, the weights of a deep neural network are all influenced by the weights of the first layer, highlighting the importance of proper weight initialization in the first layer. According to transfer learning theory, minimizing inter-domain differences helps improve transferability; therefore, proper initialization contributes to improved domain transferability. The backpropagation mechanism can be expressed as:
[0111]
[0112]
[0113] Where α represents the learning rate, θ1 represents the parameters to be updated in the first layer, and L_total is the total loss.
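The update equations themselves are not reproduced in this text. As a hedged reading that is consistent with the symbols defined above, a standard gradient-descent step on the first-layer parameters would take the following form; this is the textbook rule and is not necessarily the exact expression used in the original filing.

```latex
\theta_1 \leftarrow \theta_1 - \alpha \, \frac{\partial L_{total}}{\partial \theta_1}
```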
[0114] As Table 1 shows, another key difference between SWK and WCK is that WCK has two learnable parameters, s and u, whereas this method learns the weights ψ_{u,s}(t) directly. The source domain dataset comes from laboratory or simulated scenarios. Cross-entropy loss is used to guide parameter updates; the cross-entropy loss L_cls is:
[0115] Where p(k) is the predicted distribution and q(k) is the actual distribution.
[0116] The key to this embodiment is not designing more precise statistical measures, but rather using any typical statistical measure, such as Maximum Mean Discrepancy (MMD), Sliced Wasserstein Discrepancy (SWD), or CORAL. The final loss function can be expressed as: L_total = L_cls + λL_metric, where L_metric is the loss measuring the domain difference and λ is a trade-off coefficient that adjusts the proportion of the corresponding loss term during backpropagation.
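The combined objective can be sketched as follows. The cross-entropy term uses the standard form −Σ q(k) log p(k), with p(k) the predicted and q(k) the actual distribution. The metric term is illustrated with the simplest MMD variant (a linear kernel, that is, the squared distance between feature means), chosen here only for brevity. All function names and the toy inputs are hypothetical.

```python
import math


def cross_entropy(q, p, eps=1e-12):
    """Standard cross-entropy: -sum_k q(k) * log p(k)."""
    return -sum(qk * math.log(pk + eps) for qk, pk in zip(q, p))


def linear_mmd(source_feats, target_feats):
    """Squared distance between feature means (MMD with a linear kernel)."""
    dim = len(source_feats[0])
    mean_s = [sum(f[d] for f in source_feats) / len(source_feats) for d in range(dim)]
    mean_t = [sum(f[d] for f in target_feats) / len(target_feats) for d in range(dim)]
    return sum((a - b) ** 2 for a, b in zip(mean_s, mean_t))


def total_loss(q, p, source_feats, target_feats, lam):
    """L_total = L_cls + lam * L_metric."""
    return cross_entropy(q, p) + lam * linear_mmd(source_feats, target_feats)


# Toy example: one labeled source sample and two features per domain.
loss = total_loss(
    q=[1.0, 0.0],
    p=[0.5, 0.5],
    source_feats=[[0.0, 0.0], [2.0, 2.0]],
    target_feats=[[1.0, 1.0], [3.0, 3.0]],
    lam=0.5,
)
```

Any other discrepancy measure named above, such as SWD or CORAL, could be substituted for `linear_mmd` without changing the structure of `total_loss`.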
[0117]
[0118] In this embodiment, the simulation examples and result analysis are provided as follows:
[0119] The datasets used are shown in Table 2:
[0120] Table 2. Detailed description of the datasets
[0121]
[0122] The purpose of this embodiment is to solve the diagnostic problem of traction motors in high-speed trains; therefore, only F is used as the target domain. Each dataset is divided into samples of length 1024 using sliding sampling, for a total of 480K samples, of which 240K are from the source and target domains. A test set of 240K is drawn from the additional target domain (K is the number of classes). Multiple transfer tasks are constructed from A, B, C, D, and E to F0, F1, F2, and F3.
[0123] A→F0 denotes the task with A as the source domain and F0 as the target domain. All models are implemented in PyTorch 1.12.0 on an NVIDIA Tesla V100 32GB GPU. WIDAN has 48,324 parameters.
[0124] WIDAN was evaluated against several typical approaches: BN-WDCNN with no transfer learning strategy; typical methods based on statistical metrics: DCORAL, DJDA, and DSAN; adversarial methods: DANN; state-of-the-art methods for cross-machine diagnostics: DDNTL and CK-CNN; and interpretable models for this scenario, including WaveletKernelNet, DFAWNet, SincNet, GTFENet, and GaborNet.
[0125] All algorithms were configured strictly according to the parameters specified in the respective papers to ensure a fair comparison. Other parameters were set empirically, as shown in Table 3. All experiments were run five times, and the mean and standard deviation are reported. Note that the sampling frequency of dataset B is not publicly disclosed; therefore, Morlet wavelet weights were used whenever B served as either the source or the target domain.
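Sliding sampling of this kind can be sketched as below. The window length of 1024 comes from the text; the stride is not stated, so the default of 512 is an assumption, and the function name is hypothetical.

```python
def sliding_windows(signal, length=1024, stride=512):
    """Split a 1-D sequence into fixed-length windows taken every `stride` points.

    The stride default is an assumption; the source only states the length.
    Trailing points that do not fill a whole window are dropped.
    """
    if length <= 0 or stride <= 0:
        raise ValueError("length and stride must be positive")
    last_start = len(signal) - length
    return [signal[i:i + length] for i in range(0, last_start + 1, stride)]


# Small illustration with a short sequence instead of a real vibration record.
windows = sliding_windows(list(range(10)), length=4, stride=2)
```

A stride smaller than the window length yields overlapping samples, which is the usual way to obtain many samples from a limited vibration record.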
[0126] Table 3. Basic Hyperparameters of WIDAN
[0127]
[0128] BN-CNN indicates the absence of a two-stream module, metric loss, and wavelet weight initialization. Tables 4–7 show that without transfer learning, it is impossible to monitor the operating status of traction motor bearings from multiple data sources. In contrast, wavelet initialization significantly improves accuracy to over 99%, demonstrating that wavelet initialization and the two-stream transform layer can significantly reduce domain variability.
[0129] Table 4. Performance comparison with other methods on F0 (%)
[0130]
[0131]
[0132] Table 5. Performance comparison with other methods on F1 (%)
[0133]
[0134] Table 6. Performance comparison with other methods on F2 (%)
[0135]
[0136] Table 7. Performance comparison with other methods on F3 (%)
[0137]
[0138] For the methods based on statistical metrics, relying solely on the metric may lead to negative transfer because the distributions of datasets from different machines are significantly skewed, resulting in performance inferior to BN-CNN. The effectiveness of purely data-driven methods depends on the given task; they prove robust on some tasks and perform poorly on others. Wavelet initialization is a flexible data preprocessing method. Unlike general data preprocessing, wavelet initialization is integrated into the end-to-end WIDAN.
[0139] As shown in Figure 6 (task: A→F0), the effect of several widely used metrics was evaluated. It can be inferred that the choice of statistical metric has a decisive impact on performance, but even for poorly performing metrics WIDAN improves performance significantly, by at least 10%. The results show that initializing wavelet weights does indeed reduce inter-domain variability.
[0140] The initialization of wavelet weights ensures the consistency of the neural network weight distribution while also maintaining interpretability. Table 8 shows that experiments with WIDAN significantly improve performance and are less sensitive to various datasets than models that prioritize interpretability alone. Although WIDAN performs worse than some models on certain tasks, it exhibits excellent stability and robustness, achieving an average accuracy improvement of 11.69% over SincNet and 37.94% over DFAWNet.
[0141] Table 8. Performance comparison of backbone network guided by signaling mechanism on F2 (%)
[0142]
[0143] In Table 9, B and C use Morlet and Laplace wavelets respectively, with ζ = 0.5. a21 and b21 add a Sigmoid on the basis of a2 (Figure 4b) and b2 (Figure 4e). (w/o) denotes random initialization, (0.3) denotes the proportion of the test set, and a1 and b1 denote purely interpretable wavelets. The recognition accuracy reflects the necessity of each improvement step.
[0144] Table 9. Ablation experiment performance of SWK
[0145]
[0146] On the one hand, comparing a3 with a (w/o) and b3 with b (w/o) shows that wavelet weight initialization can improve performance by more than 30% compared to random initialization. Purely physically interpretable models may suffer performance degradation or even training failure due to gradient update issues. On the other hand, SWK is more time-consuming than WCK, but less time-consuming than random initialization, achieving a trade-off between time and accuracy.
[0147] The smoothing factor ζ and the scaling factor s control the energy concentration of the wavelet, but ζ must be estimated from the data, and there is currently no effective estimation method. In this embodiment, the performance for ζ∈[-0.5,0.5] is shown in Table 10, obtained by fixing the other hyperparameters at their optimal values. For task C→F3, robustness is achieved when ζ = -0.3, 0.1, 0.5. Clearly, choosing a suitable ζ is crucial: an inappropriate value leads to negative transfer and disrupts the similarity distribution of the data. Furthermore, different tasks correspond to different smoothing factors ζ, and their accurate estimation will be a focus of future research. Apart from ζ, all other parameters are fixed.
[0148] Table 10. Accuracy under different values of ζ for task C→F3
[0149]
[0150] A-distance is widely used to measure the similarity between two domains and is defined as A(D_s, D_t) = 2(1 - 2err(h)), where err(h) is the generalization error of a binary classifier trained to distinguish the source domain from the target domain. Regarding accuracy, Figure 6 shows that wavelet weight initialization has an enhancing effect for various statistical metrics. Figure 7 quantitatively demonstrates that it reduces inter-domain differences and promotes inter-domain similarity.
[0151] To ensure the reproducibility of the reported results, the performance was validated on three public datasets. As shown in Table 11, WIDAN exhibits better generalization ability and better robustness to different datasets. However, the performance is somewhat worse than in the previous results (A, B, C, D, E → F0, F1, F2, F3), indicating that the method requires high-quality datasets. Furthermore, high-quality source domain data can improve the accuracy of transfer learning.
[0152] Table 11. Performance comparison with other methods on datasets A, B, and C
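The A-distance formula above is simple enough to check numerically. The sketch below takes the domain classifier's error as an input; how that classifier is trained is outside its scope, and the function name is hypothetical.

```python
def a_distance(err):
    """A-distance 2 * (1 - 2 * err) from a domain classifier's error rate."""
    if not 0.0 <= err <= 1.0:
        raise ValueError("err must be an error rate in [0, 1]")
    return 2.0 * (1.0 - 2.0 * err)


# err = 0.5: the classifier cannot tell the domains apart.
indistinguishable = a_distance(0.5)
# err = 0.0: the domains are perfectly separable.
separable = a_distance(0.0)
```

An error of 0.5 gives a distance of 0 (the domains look alike to the classifier), while an error of 0 gives the maximum distance of 2, so a smaller A-distance indicates better-aligned features.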
[0153]
[0154] As shown in Figure 8 and Figure 9, WIDAN establishes two-stream modules with different weights, facilitating the conversion of different data into a common pattern space. The proposed SWK initialization helps guide the search for optimal weights, thereby improving domain transferability. The trends of the weights before and after training are generally similar.
[0155] In transfer learning analysis, the accuracy on the target domain is paramount. Therefore, focusing on the target domain, the cumulative frequency band initialized by SWK is determined. The center frequency range is 1.56 kHz to 39.06 kHz. Figure 10 shows that the calculated center frequency lies within the frequency bands of the normal, inner-race, and outer-race conditions and is also close to the band of the rolling element. Therefore, guided by reasonable initial weights, the optimal frequency band can be identified.
[0156] Example 2
[0157] This embodiment 2 provides a non-transitory computer-readable storage medium for storing computer instructions. When executed by a processor, the computer instructions implement the cross-device migration diagnosis method for train traction motor bearing condition detection as described in embodiment 1. The method includes:
[0158] Acquire vibration signals from the bearings of the traction motor system;
[0159] The acquired vibration signals of the traction motor system bearings are processed using a pre-trained diagnostic model to obtain bearing fault damage detection results. The training of the diagnostic model includes: acquiring vibration signal samples from the traction motor system bearings; initializing the weights of the first layer of the neural network to wavelet weights; inputting the source and target domains into the network respectively, calculating the classification loss of the source domain and the metric difference loss between the source and target domains; training the network using the backpropagation algorithm, stopping training when the maximum number of iterations is reached, and saving the relevant parameters of the model.
[0160] Example 3
[0161] This embodiment 3 provides a computer device, including a memory and a processor, wherein the processor and the memory communicate with each other, and the memory stores program instructions that can be executed by the processor. The processor calls the program instructions to execute the cross-device migration diagnosis method for train traction motor bearing condition detection as described in embodiment 1. The method includes:
[0162] Acquire vibration signals from the bearings of the traction motor system;
[0163] The acquired vibration signals of the traction motor system bearings are processed using a pre-trained diagnostic model to obtain bearing fault damage detection results. The training of the diagnostic model includes: acquiring vibration signal samples from the traction motor system bearings; initializing the weights of the first layer of the neural network to wavelet weights; inputting the source and target domains into the network respectively, calculating the classification loss of the source domain and the metric difference loss between the source and target domains; training the network using the backpropagation algorithm, stopping training when the maximum number of iterations is reached, and saving the relevant parameters of the model.
[0164] Example 4
[0165] This embodiment 4 provides an electronic device, including: a processor, a memory, and a computer program; wherein, the processor is connected to the memory, and the computer program is stored in the memory. When the electronic device is running, the processor executes the computer program stored in the memory to cause the electronic device to execute instructions to implement the cross-device migration diagnosis method for train traction motor bearing condition detection as described in embodiment 1. The method includes:
[0166] Acquire vibration signals from the bearings of the traction motor system;
[0167] The acquired vibration signals of the traction motor system bearings are processed using a pre-trained diagnostic model to obtain bearing fault damage detection results. The training of the diagnostic model includes: acquiring vibration signal samples from the traction motor system bearings; initializing the weights of the first layer of the neural network to wavelet weights; inputting the source and target domains into the network respectively, calculating the classification loss of the source domain and the metric difference loss between the source and target domains; training the network using the backpropagation algorithm, stopping training when the maximum number of iterations is reached, and saving the relevant parameters of the model.
[0168] In summary, existing research in the field of traction motor system monitoring rarely considers interpretable methods and systems based on transfer learning, which affects the accuracy of identification. The cross-device transfer diagnosis method and system for train traction motor bearing condition detection described in this invention provides an end-to-end, concise, and high-performance Physics-informed Wavelet Domain Adaptation Network (WIDAN), instead of designing domain difference statistics and complex network architectures. It integrates interpretable wavelet knowledge into dual-stream convolutional layers with independent weights to address the highly challenging cross-machine diagnostic tasks. Specifically, the first-layer weights of the CNN are updated using optimized Laplace or Morlet wavelet weights rich in information. Scale and translation factors with specific physical interpretations are constrained by the convolutional kernel parameters, while a smoothing auxiliary scale factor is considered to ensure consistency with the neural network weights. Overall evaluation confirms that WIDAN outperforms state-of-the-art models in multiple tasks. The results show that a wide first-layer kernel initialized with optimized wavelet weights can improve domain transferability and further effectively promote cross-machine diagnosis.
[0169] Those skilled in the art will understand that embodiments of the present invention can be provided as methods, systems, or computer program products. Therefore, the present invention can take the form of a completely hardware embodiment, a completely software embodiment, or an embodiment combining software and hardware aspects. Furthermore, the present invention can take the form of a computer program product embodied on one or more computer-usable storage media (including, but not limited to, disk storage, CD-ROM, optical storage, etc.) containing computer-usable program code.
[0170] This invention is described with reference to flowchart illustrations and/or block diagrams of methods, apparatus (systems), and computer program products according to embodiments of the invention. It will be understood that each block of the flowchart illustrations and/or block diagrams, and combinations of blocks in the flowchart illustrations and/or block diagrams, can be implemented by computer program instructions. These computer program instructions can be provided to a processor of a general-purpose computer, special-purpose computer, embedded processor, or other programmable data processing apparatus to produce a machine, such that the instructions, when executed by the processor of the computer or other programmable data processing apparatus, produce a device for implementing the functions specified in one or more flows of the flowchart and/or one or more blocks of the block diagram.
[0171] These computer program instructions may also be stored in a computer-readable storage medium that can direct a computer or other programmable data processing device to function in a particular manner, such that the instructions stored in the computer-readable storage medium produce an article of manufacture including instruction means that implement the functions specified in one or more flows of the flowchart and/or one or more blocks of the block diagram.
[0172] These computer program instructions may also be loaded onto a computer or other programmable data processing equipment, so that a series of operational steps are performed on it to produce a computer-implemented process, whereby the instructions executed on the computer or other programmable equipment provide steps for implementing the functions specified in one or more flows of the flowchart and/or one or more blocks of the block diagram.
[0173] While the specific embodiments of the present invention have been described above in conjunction with the accompanying drawings, this is not intended to limit the scope of protection of the present invention. Those skilled in the art should understand that, based on the technical solutions disclosed in the present invention, various modifications or variations that can be made by those skilled in the art without creative effort should be included within the scope of protection of the present invention.
Claims
1. A method for cross-equipment migration diagnosis of train traction motor bearing condition detection, characterized in that it comprises: acquiring vibration signals from the bearings of the traction motor system; and processing the acquired vibration signals of the traction motor system bearings using a pre-trained diagnostic model to obtain bearing fault damage detection results; wherein the training of the diagnostic model includes: acquiring vibration signal samples from the traction motor system bearings; initializing the weights of the first layer of the neural network to wavelet weights; inputting the source and target domains into the network respectively, and calculating the classification loss of the source domain and the metric difference loss between the source and target domains; training the network using the backpropagation algorithm, stopping training when the maximum number of iterations is reached, and saving the relevant parameters of the model; in the training network, the weights of the first layer are specific to the source and target domains, with no weight sharing; the two-stream bottleneck layer uses the non-weight-sharing smoothed wavelet weights to transform the data from the source and target domains into a common feature representation space; the intermediate bottleneck and classifier share weights, and the extracted features are fed into a metric function using WDCNN with batch normalization and ReLU activation functions to achieve implicit domain alignment; finally, Softmax is applied to output the fault category; initializing the weights of the first layer of the neural network with improved wavelet weights includes: establishing the correlation between time and filter length, so that the convolution kernel size is physically interpretable in time; aligning the scaling factors and translation factors and further correlating them with the output channels, strengthening the correlation with the convolution kernel; introducing a smoothing factor to form a smoothed wavelet kernel, and simultaneously using the Sigmoid function to correct the exponential components, with base e, of the wavelet basis functions; the cross-machine migration wavelet weights are initialized to reduce the inter-domain probability distribution bias in different mechanical fault data; for the first convolutional layer, the response y1 is: y1 = W1x + b1; where W1 is the weight and b1 is the bias; the original weights are replaced with the improved wavelet weights: ; the forward propagation mechanism is as follows: ; ; where l denotes the l-th layer; the backpropagation mechanism is expressed as: ; where α is the learning rate, θ1 denotes the parameters to be updated in the first layer, and L_total is the total loss; the cross-entropy loss L_cls is used to guide the parameter updates: ; where p(k) is the predicted distribution and q(k) is the actual distribution.
2. The method for cross-equipment migration diagnosis of train traction motor bearing condition detection according to claim 1, characterized in that the final loss function is expressed as: L_total = L_cls + λL_metric; where L_metric is the loss measuring the domain difference, and λ is the trade-off coefficient that adjusts the proportion of the corresponding loss term during backpropagation: .
3. A cross-equipment migration diagnostic system for train traction motor bearing condition detection, implementing the method as described in claim 1 or 2, characterized in that it comprises: an acquisition module, used to acquire the vibration signal of the bearing in the traction motor system; and a diagnostic module, used to process the acquired vibration signals of the traction motor system bearings using a pre-trained diagnostic model to obtain bearing fault damage detection results; wherein the training of the diagnostic model includes: acquiring vibration signal samples from the traction motor system bearings; initializing the weights of the first layer of the neural network to wavelet weights; inputting the source and target domains into the network respectively, and calculating the classification loss of the source domain and the metric difference loss between the source and target domains; training the network using the backpropagation algorithm, stopping training when the maximum number of iterations is reached, and saving the relevant parameters of the model.
4. A non-transitory computer-readable storage medium, characterized in that, The non-transitory computer-readable storage medium is used to store computer instructions, which, when executed by a processor, implement the cross-device migration diagnosis method for train traction motor bearing condition detection as described in claim 1 or 2.
5. A computer device, characterized in that it comprises a memory and a processor, which communicate with each other; the memory stores program instructions that can be executed by the processor, and the processor calls the program instructions to execute the cross-device migration diagnostic method for train traction motor bearing condition detection as described in claim 1 or 2.
6. An electronic device, characterized in that it comprises a processor, a memory, and a computer program; wherein the processor is connected to the memory, and the computer program is stored in the memory; when the electronic device is running, the processor executes the computer program stored in the memory to cause the electronic device to execute instructions for implementing the cross-device migration diagnostic method for train traction motor bearing condition detection as described in claim 1 or 2.