A fault diagnosis method, control device, terminal and storage medium
By constructing a multi-level domain adversarial convolutional neural network (MDACNN), the problem of the difference in the distribution of training and testing data for deep learning models under different working conditions was solved, and the model trained on source domain data was able to effectively classify faults on target domain data.
Patent Information
- Authority / Receiving Office
- CN · China
- Patent Type
- Patents(China)
- Current Assignee / Owner
- BEIJING INST OF TECH
- Filing Date
- 2022-08-31
- Publication Date
- 2026-06-19
AI Technical Summary
The failure of deep learning models due to differences in the distribution of training and testing data under different operating conditions makes it difficult to achieve effective domain transfer.
A multi-level domain adversarial convolutional neural network (MDACNN) is constructed, which includes a multi-level feature extractor, a multi-level fault classifier, and a multi-level domain discriminator. Adversarial learning is carried out by alternately optimizing training samples to reduce the difference in feature distribution between the source domain and the target domain.
This improves the accuracy of fault diagnosis under different operating conditions and ensures that the model trained on source domain data can be effectively generalized to fault classification of target domain data.
Figure CN115345255B_ABST
Abstract
Description
Technical Field
[0001] This application relates to the field of fault detection technology, and in particular to a fault diagnosis method, control device, terminal and storage medium.
Background Technology
[0002] Rotating mechanical structures, such as rolling bearings, are among the most commonly used and critical components in mechanical equipment. Due to complex and extreme working conditions, rolling bearings are highly prone to failure during operation. These failures not only affect the normal operation of the mechanical system and cause economic losses, but may even threaten personal safety. Therefore, real-time monitoring of bearing health and automatic, accurate diagnosis of faults occurring during rolling bearing operation are of great significance and value in ensuring the safe and reliable operation of mechanical equipment.
[0003] In recent years, deep learning has driven a wave of intelligent fault diagnosis. However, the success of deep learning models relies on the training and testing data following the same distribution. In practical applications, because working conditions are complex and variable, data collected by sensors inevitably exhibits distribution differences. When training and testing data follow different distributions, the trained model is likely to fail. Therefore, achieving effective domain transfer to eliminate feature distribution differences between the training and testing domains has become a significant challenge.
Summary of the Invention
[0004] The technical objective of this application is to provide a fault diagnosis method, control device, terminal, and storage medium to solve the problem that the pre-trained model cannot adapt to complex working conditions due to the distribution differences of data collected under different working conditions.
[0005] To address the aforementioned technical problems, embodiments of this application provide a fault diagnosis method, including:
[0006] Obtain the source domain dataset for known faults under the first operating condition, and the target domain dataset for unknown faults under the second operating condition;
[0007] A first preset number of source domain training samples are determined from the source domain dataset, and a second preset number of target domain training samples are determined from the target domain dataset.
[0008] A multi-level domain adversarial convolutional neural network (MDACNN) model is constructed. The MDACNN includes a multi-level feature extractor, a multi-level fault classifier, and a multi-level domain discriminator. The multi-level feature extractor is used to extract effective fault features from the target domain data and the source domain data. The multi-level fault classifier is used to perform hierarchical diagnosis on the source domain data. The multi-level domain discriminator is used to hierarchically determine whether the input sample is source domain data or target domain data.
[0009] During the training phase, the multi-level fault classifier and the multi-level domain discriminator in the MDACNN are alternately optimized adversarially based on the source domain training samples and the target domain training samples.
[0010] During the testing phase, test samples are input into the MDACNN to obtain diagnostic results for the test samples, where the test samples are data in the target domain dataset that were not used for training.
[0011] Specifically, in the fault diagnosis method described above, the multi-level feature extractor includes: a third preset number of feature extraction units and a fourth preset number of feature output ports. The feature extraction units are stacked sequentially to extract fault features at different levels. The third preset number is greater than or equal to the fourth preset number.
[0012] The multi-level fault classifier includes: the fourth preset number of classification prediction units, and each of the classification prediction units is connected to the multi-level feature extractor through the feature output port;
[0013] The multi-level domain discriminator includes: the fourth preset number of domain discrimination units, and each of the domain discrimination units is connected to the multi-level feature extractor through the feature output port.
[0014] Furthermore, in the fault diagnosis method described above, the feature extraction unit includes: a convolutional layer, a batch normalization layer, an activation layer, and a pooling layer.
[0015] Both the classification prediction unit and the domain discrimination unit include a fully connected layer and a softmax layer.
[0016] Preferably, in the fault diagnosis method described above, a gradient reversal layer is provided between the multi-level domain discriminator and the feature extractor.
[0017] Specifically, in the fault diagnosis method described above, during the training phase, the multi-level fault classifier and the multi-level domain discriminator in the MDACNN are alternately subjected to adversarial optimization based on the source domain training samples and the target domain training samples. The optimization of the multi-level domain discriminator includes:
[0018] The parameters of the multi-level fault classifier are fixed, and the multi-level domain discriminator is trained using a set of source domain training samples and target domain training samples. The parameters of the multi-level domain discriminator and the feature extractor are then updated. The domain discrimination loss objective function is:
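[0019] $$L_d = \sum_{k=1}^{K} B_k L_d^k, \qquad L_d^k = -\frac{1}{n+m} \sum_{i=1}^{n+m} \left[ d_i \log \hat{d}_i^k + (1 - d_i) \log\left(1 - \hat{d}_i^k\right) \right]$$
[0020] where $B_k$ is the loss weight of the k-th domain discrimination unit, and $K$ is the number of domain discrimination units (the fourth preset number);
[0021] $L_d^k$ is the loss value of the k-th domain discrimination unit;
[0022] $n+m$ is the total number of training samples in the source domain and the target domain;
[0023] $d_i$ is the domain label of the i-th training sample;
[0024] $\hat{d}_i^k$ is the domain prediction value of the k-th domain discrimination unit for the i-th training sample.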
[0025] Specifically, in the fault diagnosis method described above, during the training phase, the multi-level fault classifier and the multi-level domain discriminator in the MDACNN are alternately optimized based on the source domain training samples and the target domain training samples. Optimization of the multi-level fault classifier includes:
[0026] The parameters of the multi-level domain discriminator are fixed, and the multi-level fault classifier is trained using a set of source domain training samples. The parameters of the multi-level fault classifier and the feature extractor are then updated. The fault classification loss objective function is:
[0027] $$L_c = \sum_{k=1}^{K} A_k L_c^k, \qquad L_c^k = -\frac{1}{n} \sum_{i=1}^{n} y_i^k \log \hat{y}_i^k$$
[0028] where $A_k$ is the classification loss weight of the k-th classification prediction unit, and $K$ is the number of classification prediction units (the fourth preset number);
[0029] $L_c^k$ is the loss value of the k-th classification prediction unit;
[0030] $n$ is the number of training samples in the source domain;
[0031] $y_i^k$ is the classification label of the i-th source domain training sample at the k-th classification level;
[0032] $\hat{y}_i^k$ is the classification prediction value of the model for the i-th source domain training sample at the k-th classification level.
[0033] Furthermore, the fault diagnosis method described above also includes:
[0034] During the training phase, the training is considered complete when the preset number of training sessions is reached.
[0035] Another embodiment of this application also provides a control device, including:
[0036] The first processing module is used to obtain the source domain dataset for known faults under the first operating condition, and the target domain dataset for unknown faults under the second operating condition.
[0037] The second processing module is used to determine a first preset number of source domain training samples from the source domain dataset and a second preset number of target domain training samples from the target domain dataset.
[0038] The third processing module is used to construct MDACNN, which includes a multi-level feature extractor, a multi-level fault classifier, and a multi-level domain discriminator. The multi-level feature extractor is used to extract effective fault features from the target domain data and the source domain data. The multi-level fault classifier is used to perform hierarchical diagnosis on the source domain data. The multi-level domain discriminator is used to hierarchically determine whether the input sample is the source domain data or the target domain data.
[0039] The fourth processing module is used to alternately optimize the multi-level fault classifier and the multi-level domain discriminator in the MDACNN based on the source domain training samples and the target domain training samples during the training phase.
[0040] The fifth processing module is used to input test samples into the MDACNN during the testing phase to obtain the diagnostic results of the test samples, wherein the test samples are data in the target domain dataset that were not used for training.
[0041] Another embodiment of this application provides a terminal, including a processor, a memory, and a computer program stored in the memory and executable on the processor, wherein the computer program, when executed by the processor, implements the steps of the fault diagnosis method as described above.
[0042] Another embodiment of this application provides a computer-readable storage medium storing a computer program that, when executed by a processor, implements the steps of the fault diagnosis method described above.
[0043] Compared with the prior art, the fault diagnosis method, control device, terminal and storage medium provided in this application have at least the following beneficial effects:
[0044] This application constructs an MDACNN comprising a multi-level feature extractor, a multi-level fault classifier, and a multi-level domain discriminator. It then alternately performs adversarial optimization on the multi-level fault classifier and the multi-level domain discriminator based on data from the source and target domain datasets under different operating conditions. While minimizing the fault classification loss to ensure accuracy, it maximizes the domain discriminator loss to prevent the multi-level domain discriminator from accurately determining whether a sample belongs to the source or target domain. This makes the features extracted by the feature extractor increasingly similar between the source and target domain data, thus enabling the MDACNN trained on the source domain data to be effectively generalized to fault classification on the target domain data.
Attached Figure Description
[0045] Figure 1 is a flowchart illustrating the fault diagnosis method of this application;
[0046] Figure 2 is a schematic diagram of the structure of the MDACNN in this application;
[0047] Figure 3 is a schematic diagram of the control device of this application.
Detailed Implementation
[0048] To make the technical problems, technical solutions, and advantages of this application clearer, a detailed description will be provided below in conjunction with the accompanying drawings and specific embodiments. In the following description, specific details such as particular configurations and components are provided merely to aid in a comprehensive understanding of the embodiments of this application. Therefore, those skilled in the art should understand that various changes and modifications can be made to the embodiments described herein without departing from the scope and spirit of this application. Furthermore, for clarity and brevity, descriptions of known functions and structures have been omitted.
[0049] It should be understood that the phrase "one embodiment" or "an embodiment" throughout the specification means that a specific feature, structure, or characteristic related to the embodiment is included in at least one embodiment of this application. Therefore, "in one embodiment" or "in an embodiment" appearing throughout the specification does not necessarily refer to the same embodiment. Furthermore, these specific features, structures, or characteristics can be combined in any suitable manner in one or more embodiments.
[0050] In the various embodiments of this application, it should be understood that the sequence number of each process described below does not imply the order of execution. The execution order of each process should be determined by its function and internal logic, and should not constitute any limitation on the implementation process of the embodiments of this application.
[0051] It should be understood that the term "and / or" in this article is merely a description of the relationship between related objects, indicating that three relationships can exist. For example, A and / or B can represent: A existing alone, A and B existing simultaneously, or B existing alone. Additionally, the character " / " in this article generally indicates that the preceding and following related objects have an "or" relationship.
[0052] In the embodiments provided in this application, it should be understood that "B corresponding to A" means that B is associated with A, and B can be determined based on A. However, it should also be understood that determining B based on A does not mean determining B solely based on A; B can also be determined based on A and / or other information.
[0053] Referring to Figure 1, one embodiment of this application provides a fault diagnosis method, including:
[0054] Step S101: Obtain the source domain dataset for known faults under the first operating condition, and the target domain dataset for unknown faults under the second operating condition.
[0055] In some embodiments, a signal acquisition device monitors the rotating mechanical structure to be diagnosed, to obtain information reflecting the working and fault states of the rotating mechanical structure under different operating conditions, including but not limited to vibration signals. This information is then grouped by operating condition to construct a source domain dataset and a target domain dataset, facilitating subsequent training and testing based on the data in these datasets. It should be noted that when vibration signals are used as the source and target domain data, both datasets include data segments whose length covers at least one cycle of the fault characteristic frequency.
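As a rough illustration of the segment-length requirement above, the following sketch computes the smallest number of samples that spans one cycle of a fault characteristic frequency. The function name, the 12 kHz sampling rate and the 100 Hz fault frequency are hypothetical values chosen for illustration and are not taken from this application.

```python
import math


def min_segment_length(sampling_rate_hz: float, fault_freq_hz: float, cycles: int = 1) -> int:
    """Smallest number of samples covering `cycles` periods of the fault frequency."""
    if sampling_rate_hz <= 0 or fault_freq_hz <= 0 or cycles < 1:
        raise ValueError("rates must be positive and cycles must be at least 1")
    # One period lasts 1 / fault_freq_hz seconds, i.e. sampling_rate / fault_freq samples.
    return math.ceil(cycles * sampling_rate_hz / fault_freq_hz)


# Hypothetical acquisition settings (assumptions, not from the application).
segment_length = min_segment_length(12_000, 100.0)
print(segment_length)
```

Any data segment shorter than this length would not contain a full cycle of the fault signature.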
[0056] In some embodiments, the source domain data is data with fault labels, and the target domain data is data without fault labels.
[0057] Step S102: Determine a first preset number of source domain training samples from the source domain dataset, and determine a second preset number of target domain training samples from the target domain dataset. It should be noted that the maximum numbers of source domain training samples and target domain training samples in this step are determined based on the number of detections supported under the corresponding working conditions. Preferably, the first and second preset numbers are chosen so as not to further affect the service life or damage level of the rotating machinery structure, and the first and second preset numbers can be the same.
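A minimal sketch of step S102 is given below. The application does not state how the training samples are chosen from each dataset, so the random selection, the fixed seed, and the function name `select_training_samples` are assumptions made for illustration. The samples that are not chosen are returned separately so that they remain available as test data.

```python
import random


def select_training_samples(dataset, preset_number, seed=0):
    """Pick `preset_number` samples at random; return (training, remaining)."""
    if not 0 <= preset_number <= len(dataset):
        raise ValueError("preset number must not exceed the dataset size")
    rng = random.Random(seed)          # fixed seed for a reproducible split (assumption)
    indices = list(range(len(dataset)))
    rng.shuffle(indices)
    chosen = sorted(indices[:preset_number])
    remaining = sorted(indices[preset_number:])
    return [dataset[i] for i in chosen], [dataset[i] for i in remaining]


# Hypothetical datasets: ten source segments and ten target segments.
source_train, _ = select_training_samples(list(range(10)), preset_number=6)
target_train, target_test = select_training_samples(list(range(10)), preset_number=6)
```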
[0058] Step S103: Construct the MDACNN, which includes a multi-level feature extractor, a multi-level fault classifier, and a multi-level domain discriminator. The multi-level feature extractor is used to extract effective fault features from the target domain data and the source domain data. The multi-level fault classifier is used to perform hierarchical diagnosis of the source domain data. The multi-level domain discriminator is used to hierarchically determine whether the input sample is source domain data or target domain data. The structure of the constructed MDACNN is shown in Figure 2. During model training, minimizing the classification loss of the multi-level fault classifier enables the model to achieve excellent diagnostic performance on source domain data, while maximizing the domain loss of the multi-level domain discriminator makes the source domain and target domain features extracted by the multi-level feature extractor increasingly similar, so that the multi-level fault classifier and the multi-level domain discriminator form an adversarial relationship.
[0059] Specifically, the multi-level feature extractor includes a third preset number of feature extraction units and a fourth preset number of feature output ports. The feature extraction units are stacked sequentially to extract fault features at different levels, so that faults can be classified hierarchically. In one specific embodiment, rolling bearing faults are classified into three progressive levels: bearing condition, fault location, and damage degree. The extracted fault features of different levels are output through the feature output ports to the multi-level fault classifier and the multi-level domain discriminator, where fault diagnosis and domain alignment are performed respectively. It should be noted that the third preset number is greater than or equal to the fourth preset number, meaning that the number of output fault feature levels can be less than the number of feature extraction units. This allows different levels of fault features to be output for different structures under inspection, thereby improving the applicability of the MDACNN and facilitating customized configuration.
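One possible way to encode the three progressive levels (bearing condition, fault location, damage degree) as per-level labels is sketched below. The class names such as "inner_race" and "severe", and the choice to label a healthy bearing as "normal" at every level, are hypothetical and are not specified in this application.

```python
from typing import Optional, Tuple


def progressive_labels(location: Optional[str], degree: Optional[str]) -> Tuple[str, str, str]:
    """Return (level-1, level-2, level-3) labels for one sample.

    Level 1: bearing condition; level 2: fault location; level 3: damage degree.
    """
    if location is None:
        # Healthy bearing: assumed to carry the same label at every level.
        return ("normal", "normal", "normal")
    if degree is None:
        raise ValueError("a faulty sample needs a damage degree")
    # Each level refines the previous one.
    return ("faulty", location, f"{location}/{degree}")


print(progressive_labels("inner_race", "severe"))
```

With labels of this form, the k-th classification prediction unit would be trained against the k-th entry of the tuple.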
[0060] Specifically, the multi-level fault classifier includes: the fourth preset number of classification prediction units, and each of the classification prediction units is connected to the multi-level feature extractor through the feature output port; thereby enabling the multi-level fault classifier to perform multi-level fault diagnosis based on the fault features of each level output by the multi-level feature extractor, so that the multi-level fault classifier can be trained in multiple levels during training to ensure the accuracy of fault diagnosis in the testing phase or actual use phase.
[0061] The multi-level domain discriminator includes the fourth preset number of domain discrimination units, and each of the domain discrimination units is connected to the multi-level feature extractor through the feature output port. This allows the multi-level domain discriminator to determine whether a sample belongs to the source domain dataset or the target domain dataset based on the fault features output at each level, facilitating multi-level domain alignment. During training, the multi-level domain discriminator can be trained at multiple levels to eliminate feature distribution differences between the target domain data and the source domain data from multiple perspectives. This makes it easier to treat the target domain data like source domain data and classify its faults during the testing or actual use phases, thereby improving the applicability of the MDACNN to different working conditions.
[0062] It should be noted that when constructing the MDACNN, the network configuration is determined for the fault diagnosis task of the first working condition, and the initial multi-level feature extractor, multi-level fault classifier and multi-level domain discriminator are configured based on the network configuration. The network configuration includes, but is not limited to: the number of fault categories, the third and fourth preset numbers, and the specific structural parameters of the feature extraction unit.
[0063] In one specific embodiment, the feature extraction unit includes: a convolutional layer, a batch normalization layer, an activation layer, and a pooling layer;
[0064] Both the classification prediction unit and the domain discrimination unit include a fully connected layer and a softmax layer.
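To make the stacked structure more concrete, the sketch below tracks the length of a one-dimensional feature map through a stack of feature extraction units and checks that the number of feature output ports does not exceed the number of units. It assumes unpadded, stride-1 convolution and non-overlapping pooling; the kernel sizes, pool sizes, input length and port positions are hypothetical, since the application does not give these structural parameters.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class UnitConfig:
    kernel_size: int  # convolution kernel width (hypothetical value)
    pool_size: int    # pooling window, also used as the pooling stride


def feature_lengths(input_length: int, units: List[UnitConfig]) -> List[int]:
    """Feature-map length after each unit (batch norm and activation keep the length)."""
    lengths, length = [], input_length
    for unit in units:
        length = length - unit.kernel_size + 1  # unpadded, stride-1 convolution
        length = length // unit.pool_size        # non-overlapping pooling
        if length < 1:
            raise ValueError("input too short for this stack of units")
        lengths.append(length)
    return lengths


def validate_ports(num_units: int, port_units: List[int]) -> None:
    """Ports tap distinct units, so their count cannot exceed the unit count."""
    if len(set(port_units)) != len(port_units):
        raise ValueError("each port must tap a different unit")
    if any(not 1 <= u <= num_units for u in port_units):
        raise ValueError("port refers to a unit that does not exist")


units = [UnitConfig(16, 2), UnitConfig(8, 2), UnitConfig(4, 2), UnitConfig(4, 2)]
validate_ports(len(units), [2, 3, 4])  # four units, three output levels
print(feature_lengths(1024, units))
```

Each tapped length fixes the input size of the fully connected layer in the corresponding classification prediction unit and domain discrimination unit.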
[0065] Step S104: During the training phase, the multi-level fault classifier and the multi-level domain discriminator in the MDACNN are alternately optimized adversarially based on the source domain training samples and the target domain training samples. That is, during the training phase of the MDACNN, the multi-level fault classifier and the multi-level domain discriminator are optimized separately, in alternation. Specifically, target domain training samples and source domain training samples are assembled in advance into a training sample group. During training, the multi-level fault classifier and the multi-level domain discriminator are trained alternately on this training sample group. This keeps the training conditions relatively stable, which helps ensure the optimization effect in each training step. Furthermore, through alternating adversarial optimization, the fault classification loss is minimized to maintain accuracy while the domain discrimination loss is maximized, making it difficult for the multi-level domain discriminator to accurately determine whether a sample belongs to the source domain or the target domain. This makes the features extracted by the feature extractor increasingly similar between the source domain data and the target domain data, thus enabling the MDACNN trained on the source domain data to be effectively generalized to fault classification of the target domain data.
[0066] It should be noted that when the source domain training samples and / or target domain training samples are few, the sample data can be expanded through methods such as overlapping sampling.
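Overlapping sampling can be sketched as a sliding window whose stride is smaller than the segment length, so that consecutive segments share samples. The segment length and stride below are hypothetical; the application does not specify them.

```python
def overlapping_segments(signal, segment_length, stride):
    """Cut a 1-D signal into fixed-length segments that start every `stride` samples."""
    if segment_length < 1 or stride < 1:
        raise ValueError("segment_length and stride must be positive")
    last_start = len(signal) - segment_length
    return [signal[start:start + segment_length]
            for start in range(0, last_start + 1, stride)]


signal = list(range(10))                       # stand-in for a vibration record
without_overlap = overlapping_segments(signal, 4, stride=4)
with_overlap = overlapping_segments(signal, 4, stride=2)
print(len(without_overlap), len(with_overlap))
```

Halving the stride in this example doubles the number of segments obtained from the same record.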
[0067] It should be noted that the multi-level feature extractor is optimized during the process of alternately optimizing the multi-level domain discriminator and the multi-level fault classifier.
[0068] Step S105: In the testing phase, test samples are input to the MDACNN to obtain the diagnostic results of the test samples. The test samples are data from the target domain dataset that were not used for training. In this step, after the MDACNN has been trained in the training phase, it can be considered capable of diagnosing faults under the first and second operating conditions. To verify the accuracy of the MDACNN in diagnosing faults under the second operating condition, test samples from the target domain dataset can be input to the MDACNN for testing. When the accuracy of the test results meets the requirements, the MDACNN can be put into practical use; alternatively, the testing phase can itself serve as the practical application for actual fault diagnosis.
In summary, this application constructs an MDACNN comprising a multi-level feature extractor, a multi-level fault classifier, and a multi-level domain discriminator. It then alternately performs adversarial optimization on the multi-level fault classifier and the multi-level domain discriminator based on data from the source and target domain datasets under different operating conditions. While minimizing the fault classification loss to ensure accuracy, it maximizes the domain discriminator loss to prevent the multi-level domain discriminator from accurately determining whether a sample belongs to the source or target domain. This makes the features extracted by the feature extractor increasingly similar between the source and target domain data, thus enabling the MDACNN trained on the source domain data to be effectively generalized to fault classification on the target domain data.
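The accuracy check mentioned in step S105 could look like the sketch below. It assumes that reference labels are available for the held-out target domain test samples and that the requirement is expressed as a minimum accuracy; both the threshold value and the function names are hypothetical.

```python
def accuracy(predicted, actual):
    """Fraction of test samples whose predicted label matches the reference label."""
    if len(predicted) != len(actual) or not actual:
        raise ValueError("need two non-empty label lists of equal length")
    correct = sum(1 for p, a in zip(predicted, actual) if p == a)
    return correct / len(actual)


def meets_requirement(predicted, actual, required_accuracy):
    """True when the test accuracy reaches the required level."""
    return accuracy(predicted, actual) >= required_accuracy


# Hypothetical diagnostic results for four target domain test samples.
predicted_labels = [0, 1, 2, 2]
reference_labels = [0, 1, 2, 1]
print(accuracy(predicted_labels, reference_labels))
```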
[0069] It should also be noted that, in the fault diagnosis method described above, a gradient reversal layer is provided between the multi-level domain discriminator and the feature extractor.
[0070] Specifically, in the fault diagnosis method described above, during the training phase, the multi-level fault classifier and the multi-level domain discriminator in the MDACNN are alternately subjected to adversarial optimization based on the source domain training samples and the target domain training samples. The optimization of the multi-level domain discriminator includes:
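The behaviour of such a layer between the discriminator and the feature extractor can be illustrated with a small framework-free sketch: the forward pass leaves the features unchanged, and the backward pass flips the sign of the gradient flowing back toward the feature extractor. The class name and the optional `scale` factor are assumptions made for illustration; the application does not describe the layer's internals.

```python
class GradientReversal:
    """Identity in the forward pass, sign-flipped gradient in the backward pass."""

    def __init__(self, scale: float = 1.0):
        self.scale = scale  # hypothetical scaling factor, not from the application

    def forward(self, features):
        # Features pass through to the domain discrimination unit unchanged.
        return list(features)

    def backward(self, grad_from_discriminator):
        # The discriminator's gradient is negated before reaching the extractor.
        return [-self.scale * g for g in grad_from_discriminator]


layer = GradientReversal()
print(layer.forward([1.0, -2.0]))
print(layer.backward([0.5, -0.25]))
```

Because of the sign flip, a single backward pass that lowers the domain discrimination loss for the discriminator parameters raises it for the feature extractor parameters, which matches the adversarial objective described above.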
[0071] The parameters of the multi-level fault classifier are fixed, and the multi-level domain discriminator is trained using a set of source domain training samples and target domain training samples. The parameters of the multi-level domain discriminator and the feature extractor are then updated. The domain discrimination loss objective function is defined as the weighted sum of the loss values of all domain discrimination units:
[0072] $$L_d = \sum_{k=1}^{K} B_k L_d^k$$
[0073] where $B_k$ is the loss weight of the k-th domain discrimination unit, with $0 \le B_k \le 1$ and $\sum_{k=1}^{K} B_k = 1$, and $K$ is the number of domain discrimination units (the fourth preset number);
[0074] $L_d^k$ is the loss value of the k-th domain discrimination unit, given by the cross-entropy function, i.e. $L_d^k = -\frac{1}{n+m} \sum_{i=1}^{n+m} \left[ d_i \log \hat{d}_i^k + (1 - d_i) \log\left(1 - \hat{d}_i^k\right) \right]$;
[0075] $n+m$ is the total number of training samples in the source domain and the target domain;
[0076] $d_i$ is the domain label of the i-th training sample;
[0077] $\hat{d}_i^k$ is the domain prediction value of the k-th domain discrimination unit for the i-th training sample.
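The weighted domain discrimination loss described above can be sketched in plain Python as follows. It assumes binary domain labels (0 or 1), a predicted probability per sample from each domain discrimination unit, and a small clipping constant `eps` for numerical safety; the function names and example values are hypothetical.

```python
import math


def binary_cross_entropy(domain_labels, predictions, eps=1e-12):
    """Mean binary cross-entropy over all n + m training samples."""
    total = 0.0
    for d, p in zip(domain_labels, predictions):
        p = min(max(p, eps), 1.0 - eps)  # keep log() finite
        total += d * math.log(p) + (1 - d) * math.log(1.0 - p)
    return -total / len(domain_labels)


def domain_loss(weights, domain_labels, predictions_per_unit):
    """Weighted sum of the per-unit losses; weights lie in [0, 1] and sum to 1."""
    if any(not 0.0 <= w <= 1.0 for w in weights) or abs(sum(weights) - 1.0) > 1e-9:
        raise ValueError("weights must lie in [0, 1] and sum to 1")
    return sum(w * binary_cross_entropy(domain_labels, preds)
               for w, preds in zip(weights, predictions_per_unit))


# Two source samples (label 1) and two target samples (label 0), three units.
labels = [1, 1, 0, 0]
undecided = [[0.5, 0.5, 0.5, 0.5]] * 3
print(domain_loss([0.2, 0.3, 0.5], labels, undecided))
```

When every unit outputs 0.5 for every sample, the discriminator cannot tell the domains apart and the loss equals log 2, whatever the weights.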
[0078] Specifically, in the fault diagnosis method described above, during the training phase, the multi-level fault classifier and the multi-level domain discriminator in the MDACNN are alternately optimized based on the source domain training samples and the target domain training samples. Optimization of the multi-level fault classifier includes:
[0079] The parameters of the multi-level domain discriminator are fixed, and the multi-level fault classifier is trained using a set of source domain training samples. The parameters of the multi-level fault classifier and the feature extractor are then updated. The fault classification loss objective function is defined as the weighted sum of the loss values of all classification prediction units:
[0080] $$L_c = \sum_{k=1}^{K} A_k L_c^k$$
[0081] where $A_k$ is the classification loss weight of the k-th classification prediction unit, with $0 \le A_k \le 1$ and $\sum_{k=1}^{K} A_k = 1$, and $K$ is the number of classification prediction units (the fourth preset number);
[0082] $L_c^k$ is the loss value of the k-th classification prediction unit, preferably given by the cross-entropy function, i.e. $L_c^k = -\frac{1}{n} \sum_{i=1}^{n} y_i^k \log \hat{y}_i^k$;
[0083] $n$ is the number of training samples in the source domain;
[0084] $y_i^k$ is the classification label of the i-th source domain training sample at the k-th classification level;
[0085] $\hat{y}_i^k$ is the classification prediction value of the model for the i-th source domain training sample at the k-th classification level.
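A plain-Python sketch of the weighted fault classification loss follows. It assumes that each label is a class index and that each classification prediction unit outputs a softmax probability list per sample, so the cross-entropy reduces to the negative log-probability of the labelled class. The function names, the clipping constant `eps` and the example values are hypothetical.

```python
import math


def cross_entropy(class_indices, probabilities, eps=1e-12):
    """Mean cross-entropy over the n source domain training samples."""
    total = 0.0
    for label, probs in zip(class_indices, probabilities):
        total += math.log(max(probs[label], eps))  # log-probability of the true class
    return -total / len(class_indices)


def classification_loss(weights, labels_per_level, probs_per_level):
    """Weighted sum of the per-level losses; weights lie in [0, 1] and sum to 1."""
    if any(not 0.0 <= w <= 1.0 for w in weights) or abs(sum(weights) - 1.0) > 1e-9:
        raise ValueError("weights must lie in [0, 1] and sum to 1")
    return sum(w * cross_entropy(labels, probs)
               for w, labels, probs in zip(weights, labels_per_level, probs_per_level))


# Two samples, two levels: a 2-class level and a 4-class level, uniform predictions.
level_labels = [[0, 1], [0, 3]]
level_probs = [[[0.5, 0.5]] * 2, [[0.25] * 4] * 2]
print(classification_loss([0.5, 0.5], level_labels, level_probs))
```

With uniform predictions the two levels contribute log 2 and log 4 respectively, so the equally weighted total is 1.5 times log 2.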
[0086] Furthermore, the fault diagnosis method described above also includes:
[0087] During the training phase, the training is considered complete when the preset number of training sessions is reached.
[0088] In one specific embodiment of this application, the condition for ending training is the number of training sessions. That is, a preset number of training sessions is determined in advance, and training ends when that number is reached. It should be noted that each training session of the MDACNN involves performing one multi-level domain discriminator optimization and one multi-level fault classifier optimization.
[0089] Optionally, the criterion for ending training can also be that the value of the fault classification loss objective function reaches a first threshold and / or the value of the domain discrimination loss objective function reaches a second threshold.
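The alternating schedule and the two kinds of stopping condition can be sketched as a simple loop. The two optimization steps are passed in as callables that each return the current loss value, and the threshold test is left to a caller-supplied `should_stop` function because the application does not say in which direction each loss must reach its threshold. All names and the example loss values are hypothetical.

```python
from typing import Callable, Optional


def train_alternately(step_discriminator: Callable[[], float],
                      step_classifier: Callable[[], float],
                      max_sessions: int,
                      should_stop: Optional[Callable[[float, float], bool]] = None) -> int:
    """Run training sessions until the preset count or the stop test is met.

    Returns the number of sessions actually run.
    """
    sessions_run = 0
    for _ in range(max_sessions):
        domain_loss = step_discriminator()  # fault classifier parameters held fixed
        class_loss = step_classifier()      # domain discriminator parameters held fixed
        sessions_run += 1
        if should_stop is not None and should_stop(class_loss, domain_loss):
            break
    return sessions_run


# Stand-in steps: the classification loss follows a made-up decreasing sequence.
fake_class_losses = iter([0.9, 0.5, 0.1, 0.05])
sessions = train_alternately(lambda: 0.7, lambda: next(fake_class_losses),
                             max_sessions=4,
                             should_stop=lambda c, d: c <= 0.1)
print(sessions)
```

Omitting `should_stop` reduces the loop to the preset-count criterion alone.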
[0090] Referring to Figure 3, another embodiment of this application also provides a control device, including:
[0091] The first processing module 301 is used to obtain the source domain dataset of known faults under the first working condition and the target domain dataset of unknown faults under the second working condition.
[0092] The second processing module 302 is used to determine a first preset number of source domain training samples from the source domain dataset and a second preset number of target domain training samples from the target domain dataset.
[0093] The third processing module 303 is used to construct MDACNN, which includes a multi-level feature extractor, a multi-level fault classifier, and a multi-level domain discriminator. The multi-level feature extractor is used to extract effective fault features from the target domain data and the source domain data. The multi-level fault classifier is used to perform hierarchical diagnosis on the source domain data. The multi-level domain discriminator is used to hierarchically determine whether the input sample is the source domain data or the target domain data.
[0094] The fourth processing module 304 is used to alternately optimize the multi-level fault classifier and the multi-level domain discriminator in the MDACNN based on the source domain training samples and the target domain training samples during the training phase.
[0095] The fifth processing module 305 is used to input test samples into the MDACNN during the testing phase to obtain the diagnostic results of the test samples, wherein the test samples are data in the target domain dataset that were not used for training.
[0096] Specifically, in the control device described above, the multi-level feature extractor includes: a third preset number of feature extraction units and a fourth preset number of feature output ports. The feature extraction units are stacked sequentially to extract fault features of different levels. The third preset number is greater than or equal to the fourth preset number.
[0097] The multi-level fault classifier includes: the fourth preset number of classification prediction units, and each of the classification prediction units is connected to the multi-level feature extractor through the feature output port;
[0098] The multi-level domain discriminator includes: the fourth preset number of domain discrimination units, and each of the domain discrimination units is connected to the multi-level feature extractor through the feature output port.
[0099] Furthermore, in the control device described above, the feature extraction unit includes: a convolutional layer, a batch normalization layer, an activation layer, and a pooling layer;
[0100] Both the classification prediction unit and the domain discrimination unit include a fully connected layer and a softmax layer.
[0101] Preferably, in the control device described above, a gradient inversion layer is provided between the multi-level neighborhood discriminator and the feature extractor.
[0102] Specifically, in the control device described above, the fourth processing module, for optimizing the multi-level domain discriminator, includes:
[0103] The first processing submodule is used to fix the parameters of the multi-level fault classifier, train the multi-level domain discriminator using a set of source domain training samples and target domain training samples, and update the parameters of the multi-level domain discriminator and the feature extractor, wherein the domain discrimination loss objective function is:
[0104]
[0105] where $B_k$ is the loss weight of the k-th domain discrimination unit;
[0106] $L_d^k$ is the loss value of the k-th domain discrimination unit;
[0107] $n+m$ is the total number of training samples from the source domain and the target domain;
[0108] $d_i$ is the domain label of the i-th training sample;
[0109] $\hat{d}_i^k$ is the domain prediction value of the k-th domain discrimination unit for the i-th training sample.
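The formula itself is not reproduced in this text. One form that would be consistent with the quantities listed (a loss weight per domain discrimination unit, a per-unit loss value, $n+m$ training samples, a domain label, and a per-unit domain prediction) is a weighted sum of per-level binary cross-entropies. This is an assumption for illustration only: the cross-entropy form, the number of levels $K$, and the symbols $L_d$, $L_d^k$, and $\hat{d}_i^k$ are not confirmed by the source.

```latex
L_d = \sum_{k=1}^{K} B_k \, L_d^k,
\qquad
L_d^k = -\frac{1}{n+m} \sum_{i=1}^{n+m}
\left[ d_i \log \hat{d}_i^k + (1 - d_i) \log\left(1 - \hat{d}_i^k\right) \right]
```

Here $B_k$ weights the k-th level, $d_i \in \{0, 1\}$ is the domain label of the i-th sample, and $\hat{d}_i^k$ is the probability the k-th domain discrimination unit assigns to the domain labeled 1.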
[0110] Specifically, in the control device described above, the fourth processing module, for optimizing the multi-level fault classifier, includes:
[0111] The second processing submodule is used to fix the parameters of the multi-level domain discriminator, train the multi-level fault classifier using a set of source domain training samples, and update the parameters of the multi-level fault classifier and the feature extractor. The fault classification loss objective function is:
[0112]
[0113] where $A_k$ is the classification loss weight of the k-th classification prediction unit;
[0114] $L_c^k$ is the loss value of the k-th classification prediction unit;
[0115] $n$ is the number of training samples in the source domain;
[0116] $y_i^k$ is the classification label of the i-th source domain training sample at the k-th classification level;
[0117] $\hat{y}_i^k$ is the classification prediction value of the model for the i-th source domain training sample at the k-th classification level.
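The weighted combination of per-level classification losses can be sketched as follows. The weighted sum over classification prediction units follows from the definitions above; the per-level loss is assumed here to be the mean cross-entropy over the $n$ source domain samples, which the text does not confirm, and all function names are illustrative.

```python
import math


def cross_entropy(probabilities, label):
    """Negative log-probability assigned to the true class index."""
    return -math.log(probabilities[label])


def level_loss(predictions, labels):
    """Mean cross-entropy of one classification prediction unit over n samples."""
    n = len(labels)
    return sum(cross_entropy(p, y) for p, y in zip(predictions, labels)) / n


def fault_classification_loss(weights, predictions_per_level, labels_per_level):
    """Weighted sum over levels: sum over k of A_k times the k-th level loss."""
    return sum(
        a_k * level_loss(preds, labels)
        for a_k, preds, labels in zip(weights, predictions_per_level, labels_per_level)
    )
```

Each entry of `predictions_per_level` holds the softmax outputs of one classification prediction unit for all source samples, and the matching entry of `labels_per_level` holds the class indices at that classification level.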
[0118] Furthermore, the control device described above also includes:
[0119] A sixth processing module, used to determine that training has ended when the number of training iterations reaches a preset number during the training phase.
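The alternating schedule with a fixed iteration budget can be sketched as a simple loop. This is a hedged outline only: the two step functions are hypothetical callbacks standing in for the actual parameter updates, and both the order of the two steps within an iteration and the convention that one iteration covers one pair of steps are assumptions.

```python
def train_alternating(discriminator_step, classifier_step, max_iterations):
    """Alternate the two updates until the preset iteration count is reached.

    Both step functions are hypothetical callbacks supplied by the caller:
    - discriminator_step(i): classifier frozen; update the domain discriminator
      and feature extractor on a batch of source and target samples.
    - classifier_step(i): discriminator frozen; update the fault classifier
      and feature extractor on a batch of source samples.
    Each callback returns the loss value it observed.
    """
    history = []
    for iteration in range(max_iterations):
        history.append(("domain", discriminator_step(iteration)))
        history.append(("fault", classifier_step(iteration)))
    return history
```

Training stops as soon as `max_iterations` pairs of updates have run, which mirrors the preset-count stopping rule; no convergence check is performed.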
[0120] The control device embodiments of this application correspond to the method embodiments described above. All implementation means of the method embodiments are applicable to the control device embodiments and achieve the same technical effect.
[0121] Another embodiment of this application provides a terminal, including a processor, a memory, and a computer program stored in the memory and executable on the processor, wherein the computer program, when executed by the processor, implements the steps of the fault diagnosis method as described above.
[0122] Another embodiment of this application provides a computer-readable storage medium storing a computer program that, when executed by a processor, implements the steps of the fault diagnosis method described above.
[0123] Furthermore, reference numerals and/or letters may be repeated in different examples within this application. Such repetition is for the purpose of simplification and clarity and does not in itself indicate a relationship between the various embodiments and/or settings discussed.
[0124] It should also be noted that, in this document, relational terms such as "first" and "second" are used only to distinguish one entity or operation from another, and do not necessarily require or imply any such actual relationship or order between these entities or operations. Furthermore, the terms "comprising," "including," or any other variations thereof are intended to cover non-exclusive inclusion.
[0125] The above description is the preferred embodiment of this application. It should be noted that for those skilled in the art, several improvements and modifications can be made without departing from the principles described in this application, and these improvements and modifications should also be considered within the scope of protection of this application.
Claims
1. A fault diagnosis method, characterized by comprising:
obtaining a source domain dataset of known faults under a first operating condition and a target domain dataset of unknown faults under a second operating condition;
determining a first preset number of source domain training samples from the source domain dataset and a second preset number of target domain training samples from the target domain dataset;
constructing a multi-level domain adversarial convolutional neural network (MDACNN), wherein the MDACNN includes a multi-level feature extractor, a multi-level fault classifier, and a multi-level domain discriminator; the multi-level feature extractor is used to extract effective fault features from the target domain data and the source domain data; the multi-level fault classifier is used to perform hierarchical diagnosis on the source domain data; and the multi-level domain discriminator is used to hierarchically determine whether an input sample is source domain data or target domain data;
during the training phase, alternately and adversarially optimizing the multi-level fault classifier and the multi-level domain discriminator in the MDACNN based on the source domain training samples and the target domain training samples, wherein the multi-level fault classifier is optimized by minimizing the fault classification loss and the multi-level domain discriminator is optimized by maximizing the domain discrimination loss;
during the testing phase, inputting test samples into the MDACNN to obtain diagnostic results for the test samples, wherein the test samples are data in the target domain dataset that were not used for training;
wherein the multi-level feature extractor includes a third preset number of feature extraction units and a fourth preset number of feature output ports, the feature extraction units are stacked sequentially to extract fault features at different levels, and the third preset number is greater than or equal to the fourth preset number;
the multi-level fault classifier includes the fourth preset number of classification prediction units, each of which is connected to the multi-level feature extractor through a feature output port;
the multi-level domain discriminator includes the fourth preset number of domain discrimination units, each of which is connected to the multi-level feature extractor through a feature output port;
optimizing the multi-level domain discriminator during the training phase includes: fixing the parameters of the multi-level fault classifier, training the multi-level domain discriminator using a set of source domain training samples and target domain training samples, and updating the parameters of the multi-level domain discriminator and the feature extractor; and
optimizing the multi-level fault classifier during the training phase includes: fixing the parameters of the multi-level domain discriminator, training the multi-level fault classifier using a set of source domain training samples, and updating the parameters of the multi-level fault classifier and the feature extractor.
2. The fault diagnosis method according to claim 1, characterized in that the feature extraction unit includes a convolutional layer, a batch normalization layer, an activation layer, and a pooling layer; and both the classification prediction unit and the domain discrimination unit include a fully connected layer and a softmax layer.
3. The fault diagnosis method according to claim 1, characterized in that a gradient reversal layer is provided between the multi-level domain discriminator and the feature extractor.
4. The fault diagnosis method according to claim 1, characterized in that, during the training phase, when the multi-level fault classifier and the multi-level domain discriminator in the MDACNN are alternately and adversarially optimized based on the source domain training samples and the target domain training samples, the optimization of the multi-level domain discriminator further includes: the domain discrimination loss objective function is:
wherein $B_k$ is the loss weight of the k-th domain discrimination unit; $L_d^k$ is the loss value of the k-th domain discrimination unit; $N_{total}$ is the total number of source domain training samples and target domain training samples; $d_i$ is the domain label of the i-th training sample; and $\hat{d}_i^k$ is the domain prediction value of the k-th domain discrimination unit for the i-th training sample.
5. The fault diagnosis method according to claim 1, characterized in that, during the training phase, when the multi-level fault classifier and the multi-level domain discriminator in the MDACNN are alternately and adversarially optimized based on the source domain training samples and the target domain training samples, the optimization of the multi-level fault classifier further includes: the fault classification loss objective function is:
wherein $A_k$ is the classification loss weight of the k-th classification prediction unit; $L_c^k$ is the loss value of the k-th classification prediction unit; $n$ is the number of training samples in the source domain; $y_i^k$ is the classification label of the i-th source domain training sample at the k-th classification level; and $\hat{y}_i^k$ is the classification prediction value of the model for the i-th source domain training sample at the k-th classification level.
6. The fault diagnosis method according to claim 1, characterized by further comprising: during the training phase, determining that training has ended when the number of training iterations reaches a preset number.
7. A control device, characterized by comprising:
a first processing module, used to obtain a source domain dataset of known faults under a first operating condition and a target domain dataset of unknown faults under a second operating condition;
a second processing module, used to determine a first preset number of source domain training samples from the source domain dataset and a second preset number of target domain training samples from the target domain dataset;
a third processing module, used to construct an MDACNN, wherein the MDACNN includes a multi-level feature extractor, a multi-level fault classifier, and a multi-level domain discriminator; the multi-level feature extractor is used to extract effective fault features from the target domain data and the source domain data; the multi-level fault classifier is used to perform hierarchical diagnosis on the source domain data; and the multi-level domain discriminator is used to hierarchically determine whether an input sample is source domain data or target domain data;
a fourth processing module, used to alternately optimize the multi-level fault classifier and the multi-level domain discriminator in the MDACNN based on the source domain training samples and the target domain training samples during the training phase;
a fifth processing module, used to input test samples into the MDACNN during the testing phase to obtain the diagnostic results of the test samples, wherein the test samples are data in the target domain dataset that were not used for training;
wherein the multi-level feature extractor includes a third preset number of feature extraction units and a fourth preset number of feature output ports, the feature extraction units are stacked sequentially to extract fault features at different levels, and the third preset number is greater than or equal to the fourth preset number;
the multi-level fault classifier includes the fourth preset number of classification prediction units, each of which is connected to the multi-level feature extractor through a feature output port;
the multi-level domain discriminator includes the fourth preset number of domain discrimination units, each of which is connected to the multi-level feature extractor through a feature output port;
the fourth processing module, for optimizing the multi-level domain discriminator, includes a first sub-processing module used to fix the parameters of the multi-level fault classifier, train the multi-level domain discriminator using a set of source domain training samples and target domain training samples, and update the parameters of the multi-level domain discriminator and the feature extractor; and
the fourth processing module, for optimizing the multi-level fault classifier, includes a second sub-processing module used to fix the parameters of the multi-level domain discriminator, train the multi-level fault classifier using a set of source domain training samples, and update the parameters of the multi-level fault classifier and the feature extractor.
8. A terminal, characterized by comprising a processor, a memory, and a computer program stored in the memory and executable on the processor, wherein the computer program, when executed by the processor, implements the steps of the fault diagnosis method according to any one of claims 1 to 6.
9. A computer-readable storage medium, characterized in that a computer program is stored on the computer-readable storage medium, and the computer program, when executed by a processor, implements the steps of the fault diagnosis method according to any one of claims 1 to 6.