A Method and System for Onboard Fault Diagnosis of External Gear Pumps Based on Multi-Teacher Knowledge
By employing a multi-teacher knowledge distillation method, and combining knowledge from physical, temporal, and spatial teacher modules, the problem of limited computational resources and data sources in fault diagnosis of airborne rotating components was solved. This enabled efficient and reliable fault monitoring under complex operating conditions, improving diagnostic accuracy and robustness.
Patent Information
- Authority / Receiving Office
- CN · China
- Patent Type
- Patents(China)
- Current Assignee / Owner
- HEFEI UNIV OF TECH
- Filing Date
- 2025-09-08
- Publication Date
- 2026-06-30
AI Technical Summary
Existing technologies for fault diagnosis of airborne rotating components suffer from limitations in computational resources and data sources. In particular, the limitations of a single data source in multimodal data integration result in insufficient performance and flexibility of the diagnostic model, making it difficult to provide efficient and reliable fault monitoring under complex operating conditions.
A multi-teacher knowledge distillation method is adopted, which combines physical teacher modules, temporal teacher modules and spatial teacher modules, and uses pressure signals and vibration signals for fault diagnosis. The student module is trained to achieve efficient and reliable fault monitoring in environments with limited computing resources.
Under conditions of missing data and limited resources, it improves the accuracy and robustness of fault diagnosis, enables stable fault identification under complex working conditions, adapts to modal missingness and deployment limitations, and reduces model size and computational cost.
Description
Technical Field
[0001] This invention relates to the fields of artificial intelligence and fault signal diagnosis technology, and in particular to an airborne fault diagnosis method and system for external gear pumps based on multi-teacher knowledge.
Background Technology
[0002] Gear pumps play a crucial role in aircraft fuel systems. As the core component for ensuring consistent fuel flow, they maintain a stable fuel supply under various operating conditions. A gear pump malfunction can degrade engine performance and even jeopardize flight safety. If a gear pump stops functioning, the fuel supply may be interrupted, resulting in thrust loss or engine shutdown. Pump failure can also cause fuel leaks or pressure fluctuations. Therefore, rapid and accurate fault detection for gear pumps is essential.
[0003] Currently, most fault diagnosis methods for rotating machinery rely on a single data type (usually vibration signals), resulting in a severe single-source dependency. Recently, researchers have begun employing multimodal data methods, effectively overcoming the limitations of fault diagnosis techniques dependent on a single data type. Multimodal data integrates information from various sources, including vibration, temperature, pressure, electrical signals, and acoustic emissions.
[0004] Multimodal methods can collect data from different sources simultaneously, but real-world airborne environments often rely solely on pressure monitoring. Restricting the input to a single data source reduces the performance and flexibility of diagnostic models trained on multimodal inputs, significantly diminishing their effectiveness in real-world scenarios.
[0005] Developing reliable multimodal models that maintain good performance even with incomplete data has become a major challenge in fault diagnosis research. In recent years, knowledge distillation has emerged as an effective tool for knowledge transfer. This technique was initially used to compress and transfer knowledge from complex, high-performance teacher models to more compact and efficient student models. By transferring knowledge from teacher models trained on complete modal data to student models that can only acquire partial modalities, the distillation process can be applied to modality-deficient scenarios. Even with limited modal information, student networks can still indirectly absorb rich information through the distillation process. Knowledge distillation has proven effective in handling cross-modal tasks such as emotion recognition, but systematic research on condition monitoring and fault diagnosis of industrial equipment with modal deficiencies in aerospace scenarios remains insufficient. Furthermore, existing methods often rely on a single teacher model, limiting knowledge transfer across diverse representation spaces; especially in the field of airborne gear pump diagnosis, where computational resources are severely limited, existing model structures struggle to meet the requirements.
Summary of the Invention
[0006] To overcome the limitations of computational resources and data sources in the diagnosis of airborne rotating components in the prior art, this invention proposes an airborne fault diagnosis method for external gear pumps based on multi-teacher knowledge, which can provide efficient and reliable fault monitoring under complex operating conditions.
[0007] This invention proposes an airborne fault diagnosis method for external gear pumps based on multi-teacher knowledge. In the dataset {pressure signal, vibration signal; fault label}, the student module is trained by distilling knowledge through a multi-teacher network, and then the student module is used to predict the fault category of the pressure signal.
[0008] The multi-teacher network includes at least two of the following: a physical teacher module, a temporal teacher module, and a spatial teacher module. The physical teacher module adds the characteristic coefficients obtained by encoding the pressure signal and the characteristic coefficients obtained by frequency domain extraction, and then diagnoses the fault category. The temporal teacher module extracts the temporal fusion features of the pressure signal and the vibration signal, and predicts the fault category based on the fusion feature sequence. The spatial teacher module diagnoses the fault category based on the spatial feature distribution of the pressure signal and the vibration signal.
[0009] During model training, the loss function is the sum of the loss functions of each teacher module and the knowledge distillation loss.
[0010] Preferably, the physics teacher module includes:
[0011] The physical guidance unit extracts frequency domain feature coefficients based on the frequency domain change value of the pressure signal;
[0012] An encoder, which generates the encoded feature coefficients of the pressure signal;
[0013] The fault feature coefficients are obtained by superimposing the frequency domain feature coefficients and the coding feature coefficients in terms of dimensions;
[0014] The first classifier predicts classification based on fault feature coefficients.
[0015] Preferably, the physics teacher module also includes a signal reconstruction unit, which reconstructs the pressure signal based on the fault feature coefficients obtained by superimposing frequency domain feature coefficients and coding feature coefficients; the loss function of the physics teacher unit is the sum of signal reconstruction loss and classification loss;
[0016] The pressure signal is reconstructed from the fault feature coefficients;
[0017] where the fault feature coefficients are the prediction coefficients of the n-th frequency component after frequency-domain transformation of P(t); t is time; I and J are the numbers of low-frequency and high-frequency components after the frequency-domain transformation, respectively; $g_n$ and $\zeta_n$ denote the position in the spectrum and the phase angle of the corresponding frequency component; ω is the meshing frequency; and $c_0$ is the offset term.
[0018] Preferably, the time-series teacher module diagnoses fault labels based on pressure signals and vibration signals; firstly, the pressure signal and vibration signal are divided into N equal segments and time-aligned; a multi-head attention mechanism is used to fuse the time-aligned signal segments to generate a fused feature sequence of length N; and the fault category is predicted based on the fused feature sequence.
[0019] Preferably, the time-series teacher module also predicts the next fusion feature of each subsequence based on the subsequences on the fusion feature sequence, generating a reconstructed feature sequence corresponding to the fusion feature sequence; predicts the fault category based on the reconstructed feature sequence; the reconstructed feature sequence lacks the first feature value relative to the fusion feature sequence; the loss function of the time-series teacher module is the sum of the signal reconstruction loss and the classification loss; the signal reconstruction loss uses the mean square error of the fusion feature sequence and the reconstructed feature sequence.
[0020] Preferably, a GPT network is used to generate a reconstructed feature sequence based on the fused feature sequence.
[0021] Preferably, the knowledge distillation loss $L_{KD}$ is calculated as follows:
[0022]
[0023] where M is the set of teacher modules; for teacher module m∈M, the classifier outputs a tensor value for each category c, $\sigma_{m,c}$ is the probability of category c in the category probability distribution, and $w_m$ is the weight of the teacher module; the category cross-entropy loss of teacher module m also enters the calculation; and C is the total number of fault categories.
[0024] Preferably, the student module consists of two convolutional units, an adaptive average pooling layer, and a fourth classifier connected in sequence. The convolutional unit consists of a one-dimensional convolutional layer, a batch normalization layer, an activation layer, and a max pooling layer connected in sequence.
[0025] Preferably, the dataset is constructed as follows: First, the vibration and pressure signals of the gear pump are collected, and the time axis of the sampled signals is corrected using a resampling method based on Fourier transform; then, the corrected vibration and pressure signals are time-aligned and windowed, and the mean of the first observation window is used as the reference value. The corrected vibration and pressure signals are then subtracted from the corresponding reference values to form dataset samples.
[0026] The present invention proposes an on-board fault diagnosis system for an external gear pump based on multi-teacher knowledge, comprising a memory and a processor. The memory stores a computer program, and the processor is connected to the memory. The processor is used to execute the computer program to realize the on-board fault diagnosis method for the external gear pump based on multi-teacher knowledge.
[0027] The advantages of this invention are:
[0028] (1) This invention proposes an airborne fault diagnosis method for external gear pumps based on multi-teacher knowledge. This method introduces multiple teacher models: a temporal teacher for modeling time-series patterns, a spatial teacher for focusing on spatial features, and a physical teacher for characterizing the working mechanism of the gear pump. By distilling the knowledge of these three teacher models, the student model enhances its ability to process complex information. Given the strict limitations on computing resources in airborne deployment, this invention employs a knowledge distillation strategy to transfer the diagnostic capabilities of the teacher models to a smaller, more efficient student network. This method significantly reduces model size and computational cost while maintaining diagnostic accuracy.
[0029] (2) This invention constructs a fault diagnosis framework that integrates multi-spatial knowledge, adapts to modal deficiencies, and meets deployment efficiency requirements, in order to serve the practical needs of airborne fault diagnosis and address the challenges of missing modalities and deployment limitations. Structurally, this framework integrates physics-based modeling and deep learning; at the algorithmic level, it combines multi-teacher knowledge transfer and lightweight student model training. Its core lies in exploiting complete modal data from a ground-based laboratory environment through multiple teacher models, and then transferring the knowledge to a lightweight student network, so as to solve the fault diagnosis problem posed by limited modalities and insufficient computing resources in airborne scenarios.
[0030] (3) This invention demonstrates higher diagnostic accuracy and robustness under conditions of data deficiency and resource constraints, verifying its application potential in practical engineering scenarios such as aerospace. The physics teacher module proposed in this invention is particularly suitable for pressure pulsation signals, and the physics-driven multi-teacher strategy has good versatility.
Attached Figure Description
[0031] Figure 1 is a structural diagram of the onboard fault diagnosis model for an external gear pump based on multi-teacher knowledge proposed in this invention;
[0032] Figure 2 is a flowchart of the onboard fault diagnosis method for an external gear pump based on multi-teacher knowledge proposed in this invention;
[0033] Figure 3 is a structural diagram of the physics teacher module proposed in this invention;
[0034] Figure 4 is a visualization of the features of different models under operating condition 1 in the embodiment.
Detailed Implementation
[0035] The technical solutions of the embodiments of the present invention will be clearly and completely described below with reference to the accompanying drawings. Obviously, the described embodiments are only some embodiments of the present invention, and not all embodiments. Based on the embodiments of the present invention, all other embodiments obtained by those of ordinary skill in the art without creative effort are within the scope of protection of the present invention.
[0036] Referring to Figures 1 and 2, this embodiment proposes an onboard fault diagnosis model for external gear pumps based on multi-teacher knowledge. A student model is obtained by training under a multi-teacher model, and it then predicts the type of onboard fault of the external gear pump from pressure signals.
[0037] The multi-teacher diagnostic model constructed in this embodiment includes a student module and at least two teacher modules selected from the physics teacher module, the spatial teacher module, and the temporal teacher module.
[0038] The physics teacher module performs a Fast Fourier Transform (FFT) on the measured pressure signal P(t) to convert it to the frequency domain. It then applies a physics-guided masking operation that preserves the frequency components of interest, retaining the feature coefficients within the band of interest. Next, it uses the nonlinear mapping capability of a neural network to map the pressure signal onto encoded features. Finally, it adds the feature coefficients obtained from the physics-guided mask to those obtained from the encoder and uses the sum as the combined feature coefficients for signal reconstruction.
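The masking step above can be illustrated with a minimal sketch. The source does not specify the form of the mask, so the version below assumes (hypothetically) that the components of interest are the first few harmonics of the meshing frequency, each kept within a small tolerance; the function names, the tolerance, and the use of a naive DFT in place of an FFT are all illustrative choices, not the patent's implementation.

```python
import cmath
import math

def dft(signal):
    """Naive O(N^2) discrete Fourier transform; adequate for a short demo."""
    n = len(signal)
    return [sum(signal[k] * cmath.exp(-2j * math.pi * b * k / n) for k in range(n))
            for b in range(n)]

def harmonic_mask(n_samples, fs, mesh_freq, n_harmonics, tol_hz):
    """Return 1.0 for bins within tol_hz of a meshing harmonic, else 0.0.

    ASSUMPTION: 'components of interest' = harmonics of the meshing frequency.
    """
    mask = []
    for b in range(n_samples // 2 + 1):
        freq = b * fs / n_samples
        keep = any(abs(freq - h * mesh_freq) <= tol_hz
                   for h in range(1, n_harmonics + 1))
        mask.append(1.0 if keep else 0.0)
    return mask

def masked_amplitudes(signal, fs, mesh_freq, n_harmonics=3, tol_hz=0.5):
    """One-sided amplitude spectrum with all bins outside the mask zeroed."""
    n = len(signal)
    spectrum = dft(signal)
    mask = harmonic_mask(n, fs, mesh_freq, n_harmonics, tol_hz)
    # 2/n scaling makes a unit-amplitude sine read as 1.0 (interior bins only).
    return [mask[b] * 2.0 * abs(spectrum[b]) / n for b in range(n // 2 + 1)]

# Toy signal: an 8 Hz "meshing" tone plus an unrelated 13 Hz tone.
fs, n = 64.0, 64
sig = [math.sin(2 * math.pi * 8 * k / fs) + 0.5 * math.sin(2 * math.pi * 13 * k / fs)
       for k in range(n)]
amps = masked_amplitudes(sig, fs, mesh_freq=8.0)
```

In this toy case the 8 Hz component survives the mask while the 13 Hz component is zeroed, which is the behaviour the paragraph describes for the band of interest.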
[0039] Referring to Figure 3, the physics teacher module includes an encoder, a physical guidance unit, a signal reconstruction unit, and a first classifier.
[0040] The physical guidance unit maps the input pressure signal P(t) into the characteristic coefficients $\{a_i, b_j \mid 1 \le i \le I,\ 1 \le j \le J\}$; I and J are empirical values representing the number of low-frequency components and the number of high-frequency components after the frequency-domain transformation of P(t), respectively; the mapping model is:
[0041]
[0042] where $f_1(t)$ and $f_2(t)$ are the low-frequency and high-frequency components of the frequency-domain transform of P(t), respectively; $a_i$ and $\alpha_i$ are the amplitude and phase angle of the i-th low-frequency component; $b_j$, $c_j$, $d_j$, and $\beta_j$ are the amplitude, amplitude modulation coefficient, frequency modulation coefficient, and phase angle of the j-th high-frequency component; $a_i$, $b_j$, $c_j$, and $d_j$ are learnable coefficients; ω is the meshing frequency; l is the number of gear teeth; t is time; $J_m$ denotes the Bessel function; B(t) is the high-frequency component concentrated at jω; and C(t) is the high-frequency component concentrated at (j-1)ω or (j+1)ω.
[0043] Guided by physical information about the frequency-domain variation of the pressure signal P(t), the physical guidance unit extracts the frequency-domain characteristic coefficients, i.e., the characteristic coefficients $\{a_i, b_j \mid 1 \le i \le I,\ 1 \le j \le J\}$;
[0044] The encoder encodes the pressure signal P(t) to obtain the encoded feature coefficients. Specifically, the encoder can use a ResNet50 network.
[0045] The frequency domain feature coefficients and the coding feature coefficients are superimposed to obtain the fault feature coefficients. The signal reconstruction unit reconstructs the pressure signal from the fault feature coefficients; the first classifier predicts the fault category from the fault feature coefficients;
[0046] The reconstructed signal of P(t) is expressed in terms of the fault feature coefficients as follows:
[0047]
[0048] where the fault feature coefficients are the prediction coefficients for the n-th frequency component of the frequency-domain transform of P(t), comprising the predicted values of $a_i$ and the predicted values of $b_j$;
[0049]
[0050] where F(·) represents the Fourier transform; R(·) represents coefficient decimation; φ(·) represents the encoder, which can specifically use a ResNet50 network; $g_n$ and $\zeta_n$ denote the position in the spectrum and the phase angle of the corresponding frequency component; ω is the meshing frequency; and $c_0$ is the offset term.
[0051] The loss function of the physics teacher module is the sum of the signal reconstruction loss and the classification loss. The signal reconstruction loss is the squared L2 norm of the difference between the original input signal P(t) and the reconstructed signal; the classification loss is the cross-entropy loss between the predicted label output by the first classifier and the true label.
[0052] The temporal teacher module includes: a pressure encoding unit, a vibration encoding unit, a multi-head attention unit, a GPT decoding unit, and a second classifier.
[0053] The pressure encoding unit divides the pressure signal P(t) into N equal segments $p_n$: $P(t) = \{p_n \mid 1 \le n \le N\}$;
[0054] The vibration encoding unit divides the vibration signal V(t) into N equal segments $v_n$: $V(t) = \{v_n \mid 1 \le n \le N\}$;
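As a minimal sketch of the segmentation just described, the snippet below splits two equally long signals into N equal segments and pairs them index by index. The helper names and N = 8 are illustrative assumptions; only the sample length of 5120 comes from the embodiment. Alignment is assumed to mean that both signals share the same sampling clock.

```python
def split_equal(signal, n_segments):
    """Split a sequence into n_segments equal-length pieces, dropping any remainder."""
    seg_len = len(signal) // n_segments
    if seg_len == 0:
        raise ValueError("signal is shorter than the number of segments")
    return [signal[i * seg_len:(i + 1) * seg_len] for i in range(n_segments)]

def aligned_pairs(pressure, vibration, n_segments):
    """Return [(p_1, v_1), ..., (p_N, v_N)] for two signals on the same clock."""
    if len(pressure) != len(vibration):
        raise ValueError("signals must cover the same samples to be time-aligned")
    return list(zip(split_equal(pressure, n_segments),
                    split_equal(vibration, n_segments)))

# Synthetic stand-ins for one pressure/vibration sample of length 5120.
pressure = list(range(5120))
vibration = [-x for x in range(5120)]
pairs = aligned_pairs(pressure, vibration, n_segments=8)  # N = 8 is an assumption
```

Each element of `pairs` is one time-aligned pair $(p_n, v_n)$ of 640 points that a fusion stage could consume.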
[0055] The multi-head attention unit employs a multi-head cross-attention mechanism to process each temporally aligned sample pair $\{p_n, v_n\}$, performing multimodal feature extraction and fusion to generate the fused feature $f_n$. The second classifier classifies based on the fused feature sequence $\{f_n \mid 1 \le n \le N\}$.
[0056] The GPT decoding unit predicts the next (m-th) feature from the fused feature subsequence $\{f_1, f_2, \dots, f_{m-1}\}$, for 2≤m≤N.
[0057] The loss function of the time-series teacher module is the sum of the signal reconstruction loss and the classification loss. The signal reconstruction loss is the mean squared error between the fused feature sequence $\{f_2, f_3, \dots, f_N\}$ and the reconstructed feature sequence; the classification loss is the cross-entropy loss between the predicted label output by the second classifier and the true label.
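The next-feature reconstruction loss can be sketched as follows. The predictor here is a deliberately trivial stand-in (it repeats the last feature) rather than a GPT network, and all names are hypothetical; the point is only the index alignment between the targets $f_2, \dots, f_N$ and the predictions made from the prefixes $f_1, \dots, f_{m-1}$.

```python
def mse(seq_a, seq_b):
    """Mean squared error over two equally shaped lists of feature vectors."""
    total, count = 0.0, 0
    for feat_a, feat_b in zip(seq_a, seq_b):
        for a, b in zip(feat_a, feat_b):
            total += (a - b) ** 2
            count += 1
    return total / count

def reconstruction_loss(fused, predict_next):
    """Compare f_2..f_N with predictions made from the prefixes f_1..f_{m-1}."""
    targets = fused[1:]  # the reconstructed sequence has no first element
    preds = [predict_next(fused[:m]) for m in range(1, len(fused))]
    return mse(targets, preds)

def repeat_last(prefix):
    """Hypothetical stand-in predictor: guess the next feature equals the last one."""
    return prefix[-1]

fused = [[0.0, 0.0], [1.0, 1.0], [1.0, 3.0]]  # toy fused sequence, N = 3
loss = reconstruction_loss(fused, repeat_last)
```

For the toy sequence the squared errors are 1, 1, 0, and 4 over four values, so the loss is 1.5. In the teacher module this term would be added to the classification loss.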
[0058] The spatial teacher module consists of a spatial transformation network (STEM Network) and a third classifier. The STEM Network processes pressure and vibration signals to extract spatial features; the third classifier predicts the classification based on these spatial features. The loss function of the spatial teacher module is the cross-entropy loss between the predicted label output by the third classifier and the true label.
[0059] The student module consists of two convolutional units, an adaptive average pooling layer (AdaptiveAvgPool1d), and a fourth classifier connected sequentially. Each convolutional unit comprises a sequentially connected one-dimensional convolutional layer (Conv1d), a batch normalization layer (BatchNorm1d), an activation layer, and a max pooling layer (MaxPool1d); the activation layer can specifically use the ReLU activation function. The input to the student module passes through the two successive convolutional units, which extract temporal features. The batch normalization and max pooling layers together compress and refine the features. Finally, adaptive average pooling yields the final low-dimensional features for classification prediction. This ensures effective representation learning while also improving computational efficiency. The loss $L_S$ of the student module is the cross-entropy loss between the predicted label output by the fourth classifier and the true label.
[0060] The first, second, third, and fourth classifiers have the same structure. They first generate tensor values for each class based on the input data, and then activate the tensor value vectors to obtain the class probability distribution. In practice, the Softmax function can be used to activate the tensor value vectors.
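For illustration, the Softmax activation mentioned above can be written in a few lines. This is a generic, numerically stable formulation, not code from the patent; the input values are arbitrary.

```python
import math

def softmax(logits):
    """Turn per-class tensor values (logits) into a probability distribution."""
    peak = max(logits)                      # subtracting the max avoids overflow
    exps = [math.exp(z - peak) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

probs = softmax([2.0, 1.0, 0.0])            # arbitrary example logits
predicted_class = probs.index(max(probs))   # index of the most probable class
```

The outputs sum to 1 and preserve the ordering of the input tensor values, so the predicted category is the one with the largest tensor value.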
[0061] Let M be the set of teacher modules in the multi-teacher diagnostic model. For teacher module m∈M, the classifier outputs a tensor value for each category c; the corresponding probability of category c in the category probability distribution is denoted $\sigma_{m,c}$; the weight of teacher module m is denoted $w_m$; and the knowledge distillation loss is denoted $L_{KD}$;
[0062]
[0063] where C is the total number of fault categories; the category cross-entropy loss of teacher module m also enters the calculation; and $y_c$ is the binary indicator for category c in the true label, i.e., $y_c = 1$ if the true label is category c and $y_c = 0$ otherwise.
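The formula itself is not reproduced in this text, so the following is only a generic sketch of a weighted multi-teacher distillation loss built from the quantities defined above, not the patent's formula. Two assumptions are made and should be read as such: the weight $w_m$ is taken as a softmax over the negative cross-entropy losses of the teachers (so a teacher that fits the true label better counts more), and the per-teacher term is the cross-entropy between the teacher's probabilities and the student's probabilities.

```python
import math

def softmax(values):
    peak = max(values)
    exps = [math.exp(v - peak) for v in values]
    total = sum(exps)
    return [e / total for e in exps]

def cross_entropy(probs, true_class):
    """Cross-entropy against a one-hot true label (y_c = 1 only for true_class)."""
    return -math.log(probs[true_class])

def multi_teacher_kd(student_logits, teacher_logits, true_class):
    """Weighted sum over teachers of CE(teacher probabilities, student probabilities)."""
    student_probs = softmax(student_logits)
    names = sorted(teacher_logits)
    probs = {m: softmax(teacher_logits[m]) for m in names}
    ce = {m: cross_entropy(probs[m], true_class) for m in names}
    # ASSUMPTION: w_m = softmax(-CE_m); the source does not state how w_m is set.
    weights = dict(zip(names, softmax([-ce[m] for m in names])))
    loss = 0.0
    for m in names:
        soft_ce = -sum(p * math.log(q) for p, q in zip(probs[m], student_probs))
        loss += weights[m] * soft_ce
    return loss, weights

loss, weights = multi_teacher_kd(
    student_logits=[1.0, 0.0, 0.0],
    teacher_logits={"physics": [3.0, 0.0, 0.0], "temporal": [0.0, 3.0, 0.0]},
    true_class=0,
)
```

In this toy call the physics teacher agrees with the true label and therefore receives the larger weight. Other weighting schemes (fixed or learned weights, for example) would fit the same definitions equally well.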
[0064] In this embodiment, the multi-teacher diagnostic model is first trained on a ground dataset {pressure signal, vibration signal; fault label} until convergence. The student module is then extracted as the student model to predict the airborne fault category of the external gear pump; that is, the student model predicts the fault category from the airborne pressure signal.
[0065] Referring to Figure 2, the training steps for the multi-teacher diagnostic model are as follows:
[0066] S1. Initialize the multi-teacher diagnostic model;
[0067] S2. Extract training samples from the ground dataset and iterate the multi-teacher diagnostic model. Then extract validation samples and feed them to the multi-teacher diagnostic model to calculate the loss function $L_{total}$, which is the sum of the loss functions of all teacher modules and the knowledge distillation loss $L_{KD}$;
[0068] S3. Update the multi-teacher diagnostic model by backpropagating the loss function $L_{total}$;
[0069] S4. Repeat steps S2 and S3 until the multi-teacher diagnostic model converges, then extract the student module as the student model.
[0069] S4. Repeat steps S2 and S3 until the multi-teacher diagnostic model converges, then extract the student module as the student model.
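Steps S1 to S4 can be outlined as a framework-free loop. This is a sketch only: `step` and `validate` are hypothetical callables standing in for a real training iteration and validation pass, and convergence is assumed to mean "no improvement of the validation loss for `patience` rounds", which the source does not specify.

```python
def total_loss(teacher_losses, kd_loss):
    """L_total = sum of the teacher-module losses + knowledge distillation loss."""
    return sum(teacher_losses.values()) + kd_loss

def train_until_converged(step, validate, max_rounds=100, patience=3, min_delta=1e-4):
    """Repeat S2/S3 until the validation loss stops improving (assumed criterion)."""
    best, stale, rounds = float("inf"), 0, 0
    for rounds in range(1, max_rounds + 1):
        step()                       # S2/S3: one update of the multi-teacher model
        loss = validate()            # S2: L_total on validation samples
        if loss < best - min_delta:
            best, stale = loss, 0
        else:
            stale += 1
        if stale >= patience:        # S4: treat as converged
            break
    return rounds, best

# Toy run with a scripted sequence of validation losses.
fake_losses = iter([1.0, 0.5, 0.4, 0.4, 0.4, 0.4, 0.3])
rounds, best = train_until_converged(step=lambda: None,
                                     validate=lambda: next(fake_losses))
example_total = total_loss({"physics": 0.5, "temporal": 0.25, "spatial": 0.25}, 0.5)
```

With the scripted losses the loop stops after six rounds with a best loss of 0.4. In a real setup the student module would be extracted at that point.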
[0070] The following specific embodiments are used to verify the various experimental models provided by the present invention. The experimental models are student models trained using different multi-teacher diagnostic models.
[0071] In this embodiment, different operating conditions are simulated using a fuel gear pump experimental platform. Data samples {pressure signal, vibration signal} are collected using an external gear pump fault testing device and multiple sensors, and fault labels are manually added to form a dataset {pressure signal, vibration signal; fault label}. In this embodiment, the fixed length of the data sample is 5120.
[0072] During the experiment, the data acquisition channel was configured in ADC mode, with an input voltage range of ±5V, a sampling frequency of 20.48kHz, and a maximum recording time of 60 seconds. Each fault condition was repeated three times, with each acquisition lasting at least 30 seconds. Considering that environmental noise and changes in operating conditions might affect the data acquisition process, the pressure pulsation signal was preprocessed before the experiment to ensure data reliability. The preprocessing included the following two steps:
[0073] Correction for speed fluctuations: A resampling method based on Fourier transform is used to calibrate and stabilize the signal, and the time axis is corrected;
[0074] Drift Compensation: A mean-based correction method is introduced to correct drift in the pressure signal. Specifically, the mean of the first batch of observed samples is used as a benchmark, and subsequent data collected are retained after subtracting this benchmark value. Through this correction, the offset effect caused by drift is successfully eliminated, resulting in more consistent and accurate measurement results.
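The mean-based drift compensation can be sketched in a few lines. The function name, the window length, and the sample values are illustrative assumptions; the source only states that the mean of the first batch of observations serves as the benchmark to be subtracted.

```python
def drift_compensate(signal, window_len):
    """Subtract the mean of the first window_len samples from every sample."""
    if not 0 < window_len <= len(signal):
        raise ValueError("window_len must be between 1 and len(signal)")
    reference = sum(signal[:window_len]) / window_len
    return [x - reference for x in signal], reference

raw = [10.0, 12.0, 11.0, 13.0, 14.0, 12.0]   # made-up pressure readings
corrected, reference = drift_compensate(raw, window_len=2)
```

Here the reference is 11.0 and the corrected series becomes [-1.0, 1.0, 0.0, 2.0, 3.0, 1.0], so a constant offset in the raw readings no longer affects the samples.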
[0075] In this embodiment, the dataset covers eight different operating states, i.e., real labels, including one normal state, five single fault scenarios, and two composite fault scenarios.
[0076] The 8 real labels and 3 working conditions in this embodiment are detailed in Tables 1 and 2.
[0077] Table 1: Fault Categories
[0078]
[0079]
[0080] Table 2: Three Operating Conditions
[0081]
| Operating condition | Gear speed | Gear operating frequency |
| --- | --- | --- |
| Operating Condition 1 | 600 r/min | 10 Hz |
| Operating Condition 2 | 1050 r/min | 17.5 Hz |
| Operating Condition 3 | 1500 r/min | 25 Hz |
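The frequencies in Table 2 follow from dividing the gear speed by 60. As a small worked check, the sketch below also combines them with the sampling frequency (20.48 kHz) and sample length (5120 points) stated in the embodiment to show how many shaft revolutions one sample spans; the function and variable names are illustrative.

```python
SAMPLING_HZ = 20480.0   # 20.48 kHz, as stated in the embodiment
SAMPLE_LEN = 5120       # fixed sample length, as stated in the embodiment

def shaft_frequency_hz(speed_rpm):
    """Rotation frequency in Hz for a speed given in revolutions per minute."""
    return speed_rpm / 60.0

window_seconds = SAMPLE_LEN / SAMPLING_HZ   # duration of one sample

def revolutions_per_sample(speed_rpm):
    """Number of shaft revolutions covered by one data sample."""
    return shaft_frequency_hz(speed_rpm) * window_seconds

for rpm in (600, 1050, 1500):
    print(rpm, shaft_frequency_hz(rpm), revolutions_per_sample(rpm))
```

Each sample lasts 0.25 s, so it covers 2.5, 4.375, and 6.25 revolutions at the three operating conditions, respectively.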
[0082] In this embodiment, the collected dataset was divided into a training set, a validation set, and a test set in a 6:2:2 ratio. The training set contained 576 samples, while the validation and test sets each contained 192 samples. The learning rate was set to 0.001, and an early stopping mechanism was introduced to prevent overfitting. The Adam optimizer was used during training.
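The stated set sizes imply 960 samples in total (576 + 192 + 192). A minimal sketch of a 6:2:2 split is given below; the shuffling, the fixed seed, and the function name are assumptions, since the source does not say how samples were assigned or whether the split was stratified by fault category.

```python
import random

def split_622(samples, seed=0):
    """Shuffle and split into training, validation, and test sets at 6:2:2."""
    items = list(samples)
    random.Random(seed).shuffle(items)   # seeded for reproducibility (assumption)
    n_train = len(items) * 6 // 10
    n_val = len(items) * 2 // 10
    train = items[:n_train]
    val = items[n_train:n_train + n_val]
    test = items[n_train + n_val:]
    return train, val, test

train_set, val_set, test_set = split_622(range(960))
```

With 960 samples this yields 576, 192, and 192 items, matching the sizes reported above.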
[0083] In this embodiment, four experimental models and four comparative models were trained and tested. The models were trained on the training and validation sets, and then used to predict fault categories from pressure signals alone on the test set.
[0084] The four experimental models in this embodiment are as follows:
[0085] Experimental Model 1: The multi-teacher diagnostic model includes only the temporal teacher module and the spatial teacher module; during training, the total loss function $L_{total}$ is calculated according to formulas (10-1) to (13-1):
[0086]
[0087] Experimental Model 2: The multi-teacher diagnostic model includes only the physics teacher module and the spatial teacher module; during training, the total loss function $L_{total}$ is calculated according to formulas (10-2) to (13-2):
[0088]
[0089]
[0090] Experimental Model 3: The multi-teacher diagnostic model includes only the temporal teacher module and the physics teacher module; during training, the total loss function $L_{total}$ is calculated according to formulas (10-3) to (13-3):
[0091]
[0092] Experimental Model 4: The multi-teacher diagnostic model includes the physics teacher module, the temporal teacher module, and the spatial teacher module; during training, the total loss function $L_{total}$ is calculated according to formulas (10-4) to (13-4) and formula (14):
[0093]
[0094] where $y_c$ is the binary indicator for category c in the true label, i.e., $y_c = 1$ if the true label is category c and $y_c = 0$ otherwise; the physics teacher module outputs a tensor value and a probability $\sigma_{p,c}$ for category c and has its own cross-entropy loss; the time-series teacher module outputs a tensor value and a probability $\sigma_{t,c}$ for category c and has its own cross-entropy loss; and the spatial teacher module outputs a tensor value and a probability $\sigma_{s,c}$ for category c and has its own cross-entropy loss.
[0095] Specifically:
[0096]
[0097]
[0098] In this embodiment, experimental models 1-4 are all trained on the ground dataset, and then the student module is extracted as the experimental model for testing on the airborne dataset.
[0099] In this embodiment, the comparison model uses DAFT network, MMTM network, MedFuse and independent student module.
[0100] In this embodiment, the test results for each model are shown in Table 3.
[0101] Table 3: Model Test Accuracy under Various Operating Conditions
[0102]
| Model abbreviation | Operating Condition 1 | Operating Condition 2 | Operating Condition 3 |
| --- | --- | --- | --- |
| DAFT (Comparison Model 1) | 0.828 | 0.989 | 0.849 |
| MMTM (Comparison Model 2) | 0.878 | 0.919 | 0.732 |
| MedFuse (Comparison Model 3) | 0.708 | 0.776 | 0.789 |
| S (Comparison Model 4) | 0.750 | 0.682 | 0.833 |
| NOphy (Experimental Model 1) | 0.850 | 0.979 | 0.843 |
| NOtem (Experimental Model 2) | 0.850 | 0.974 | 0.875 |
| NOspa (Experimental Model 3) | 0.891 | 0.875 | 0.713 |
| ALL (Experimental Model 4) | 0.979 | 1 | 0.948 |
[0103] Table 3 shows that in the configuration using only the student model S, although the model receives the same pressure signal input as the physics teacher, it cannot draw on the multimodal information in the vibration signal, resulting in low overall accuracy. Conversely, comparing experimental models 1-4 with comparison model 4 (S) shows that the student model guided by knowledge distilled from the teacher network generally performs markedly better across the three operating conditions; the one exception is NOspa under operating condition 3 (0.713 versus 0.833). These results indicate that the student model struggles to extract effective multimodal features independently and is prone to failure in complex environments without distillation support, failing to meet the robustness and accuracy requirements of practical applications.
[0104] When the spatial teacher module is removed, the student model can still benefit from the physical and temporal teachers, acquiring multimodal knowledge. However, under condition 3, the accuracy drops significantly. This highlights the unique role of the spatial teacher in capturing local spatial structures in vibration and pressure signals; its absence severely impacts the model's ability to identify complex fault modes.
[0105] When comparing the complete model (ALL) with a configuration where any teacher network is removed, a consistent decrease in diagnostic accuracy is observed. This observation underscores the necessity of the proposed multi-teacher structure, which jointly utilizes physical, temporal, and spatial knowledge. By distilling the knowledge of the temporal, spatial, and physical teachers into a student model that uses only pressure signals, the proposed framework significantly reduces performance fluctuations across the three operating conditions. This demonstrates that the method not only compensates for missing modalities but also achieves stable and robust fault identification across multiple operating conditions.
[0106] In this embodiment, t-SNE is used to reduce the dimensionality of, and visualize, the test sample features extracted by each model. Figure 4 shows the fault features learned under operating condition 1. It can be seen from Figure 4 that in the NOphy and NOtem results some fault features overlap; with only the student network S, the overlap is more severe, with only six fault categories partially distinguished and an incomplete feature representation. This indicates that the features extracted by that model cannot fully characterize the fault state, and that its performance is insufficient without distillation support. However, the fault feature discrimination ability of the student model and of all experimental models is better than that of DAFT, MMTM, and MedFuse.
[0107] The ALL model can clearly distinguish eight types of fault states and form clear clusters in the feature space, demonstrating good classification performance.
[0108] Of course, those skilled in the art will recognize that the present invention is not limited to the details of the exemplary embodiments described above, but also includes the same or similar structures that can be implemented in other specific forms without departing from the spirit or essential characteristics of the invention. Therefore, the embodiments should be considered illustrative and non-limiting in all respects, and the scope of the invention is defined by the appended claims rather than the foregoing description. Thus, all variations falling within the meaning and scope of equivalents of the claims are intended to be included within the present invention. No reference numerals in the claims should be construed as limiting the scope of the claims.
[0109] Furthermore, it should be understood that although this specification describes embodiments, not every embodiment contains only one independent technical solution. This narrative style is merely for clarity. Those skilled in the art should consider the specification as a whole, and the technical solutions in each embodiment can also be appropriately combined to form other embodiments that can be understood by those skilled in the art.
[0110] The technologies, shapes, and structures not described in detail in this invention are all known technologies.
Claims
1. A method for onboard fault diagnosis of an external gear pump based on multi-teacher knowledge, characterized in that: using a dataset {pressure signal, vibration signal; fault label}, a student module is trained by knowledge distillation from a multi-teacher network, and the trained student module is then used to predict the fault category from the pressure signal; the multi-teacher network includes at least two of the following: a physical teacher module, a temporal teacher module, and a spatial teacher module; the physical teacher module adds the feature coefficients obtained by encoding the pressure signal to the feature coefficients obtained by frequency-domain extraction, and then diagnoses the fault category; the temporal teacher module extracts temporal fusion features of the pressure signal and the vibration signal, and predicts the fault category based on the fused feature sequence; the spatial teacher module diagnoses the fault category based on the spatial feature distribution of the pressure signal and the vibration signal; during model training, the loss function is the sum of the loss functions of the teacher modules and the knowledge distillation loss;
the physical teacher module includes: a physical guidance unit, which extracts frequency-domain feature coefficients based on the frequency-domain variation of the pressure signal; an encoder, which generates the encoded feature coefficients of the pressure signal, the fault feature coefficients being obtained by superimposing the frequency-domain feature coefficients and the encoded feature coefficients along their dimensions; and a first classifier, which predicts the class based on the fault feature coefficients;
the temporal teacher module diagnoses fault labels based on the pressure and vibration signals: first, the pressure and vibration signals are each divided into N equal segments and time-aligned; a multi-head attention mechanism then fuses the time-aligned signal segments to generate a fused feature sequence of length N; and the fault category is predicted based on the fused feature sequence;
the spatial teacher module consists of a spatial transformation network and a third classifier; the spatial transformation network processes the pressure and vibration signals to extract spatial features, and the third classifier predicts the class based on the spatial features.
2. The onboard fault diagnosis method for external gear pumps based on multi-teacher knowledge as described in claim 1, characterized in that: the physical teacher module also includes a signal reconstruction unit, which reconstructs the pressure signal from the fault feature coefficients obtained by superimposing the frequency-domain feature coefficients and the encoded feature coefficients; the loss function of the physical teacher module is the sum of the signal reconstruction loss and the classification loss; the reconstructed pressure signal is defined by a reconstruction formula in which: one symbol denotes the fault feature coefficients and another denotes the prediction coefficient of the n-th frequency component after frequency-domain transformation of the pressure signal P(t); t is time; I and J are the numbers of low-frequency and high-frequency components after the frequency-domain transformation, respectively; each prediction coefficient is associated with the position of its frequency component in the frequency spectrum and with a phase angle; ω is the meshing frequency; and a further term denotes the offset.
3. The onboard fault diagnosis method for external gear pumps based on multi-teacher knowledge as described in claim 1, characterized in that: for each subsequence of the fused feature sequence, the temporal teacher module also predicts the next fused feature, thereby generating a reconstructed feature sequence corresponding to the fused feature sequence; the fault category is predicted based on the reconstructed feature sequence; relative to the fused feature sequence, the reconstructed feature sequence lacks the first feature value; the loss function of the temporal teacher module is the sum of the signal reconstruction loss and the classification loss; and the signal reconstruction loss is the mean square error between the fused feature sequence and the reconstructed feature sequence.
4. The onboard fault diagnosis method for external gear pumps based on multi-teacher knowledge as described in claim 3, characterized in that: a GPT network is used to generate the reconstructed feature sequence from the fused feature sequence.
5. The onboard fault diagnosis method for external gear pumps based on multi-teacher knowledge as described in claim 1, characterized in that: the knowledge distillation loss is calculated by a formula in which: M denotes the set of teacher modules; for each teacher module m∈M, the formula uses the tensor value of the classifier output corresponding to category c, the probability value of category c in the category probability distribution, the weight w_m of teacher module m, and the category cross-entropy loss of teacher module m; and C is the total number of fault categories.
6. The onboard fault diagnosis method for external gear pumps based on multi-teacher knowledge as described in claim 1, characterized in that: the student module consists of two convolutional units, an adaptive average pooling layer, and a fourth classifier connected in sequence; each convolutional unit consists of a one-dimensional convolutional layer, a batch normalization layer, an activation layer, and a max pooling layer connected in sequence.
7. The onboard fault diagnosis method for external gear pumps based on multi-teacher knowledge as described in any one of claims 1-6, characterized in that: the dataset is constructed as follows: first, vibration and pressure signals of the gear pump are collected, and the time axis of the sampled signals is corrected using a resampling method based on the Fourier transform; then, the corrected vibration and pressure signals are time-aligned and windowed; the mean of the first observation window is used as the reference value, and the differences between the corrected vibration and pressure signals and their corresponding reference values form the dataset samples.
8. An onboard fault diagnosis system for an external gear pump based on multi-teacher knowledge, characterized in that, It includes a memory and a processor, wherein the memory stores a computer program, the processor is connected to the memory, and the processor is used to execute the computer program to implement the on-board fault diagnosis method for external gear pumps based on multi-teacher knowledge as described in any one of claims 1-7.
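The windowing and baseline-removal part of the dataset construction recited in claim 7 can be sketched as follows. This is a minimal illustration under stated assumptions: the Fourier-based resampling step is omitted, time alignment is reduced to truncating both signals to a common length, windows are non-overlapping, and the baseline is removed as signal minus reference. The function names and the toy values are hypothetical.

```python
def make_windows(signal, window_size):
    """Split a signal into consecutive non-overlapping windows.

    Any trailing samples that do not fill a whole window are dropped.
    """
    count = len(signal) // window_size
    return [signal[i * window_size:(i + 1) * window_size] for i in range(count)]


def build_samples(pressure, vibration, window_size):
    """Return (pressure_window, vibration_window) pairs with the baseline removed.

    Assumption: both signals are already resampled to the same rate, so
    alignment is approximated by truncating to the shorter length.
    """
    n = min(len(pressure), len(vibration))
    p_windows = make_windows(pressure[:n], window_size)
    v_windows = make_windows(vibration[:n], window_size)
    if not p_windows:
        return []

    # Reference value of each channel: the mean of its first observation window.
    p_ref = sum(p_windows[0]) / window_size
    v_ref = sum(v_windows[0]) / window_size

    # Assumption: the sample is the signal minus its reference value.
    return [
        ([x - p_ref for x in pw], [x - v_ref for x in vw])
        for pw, vw in zip(p_windows, v_windows)
    ]


# Toy signals for illustration only; real signals would be sensor recordings.
pressure = [1.0, 3.0, 5.0, 7.0]
vibration = [2.0, 2.0, 4.0, 4.0, 9.0]
samples = build_samples(pressure, vibration, window_size=2)
print(len(samples))
```

Each resulting pair would then be stored with its fault label to form one {pressure signal, vibration signal; fault label} entry.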