Method and device for imbalanced incremental fault diagnosis of rotating machinery components based on gradient distribution correction
The gradient distribution correction method computes per-class cumulative gradient information for the bearing fault diagnosis model, weights the gradients adaptively, and applies knowledge distillation that is aware of distribution differences. This addresses catastrophic forgetting in dynamic, imbalanced data scenarios and achieves stable diagnosis of both new and old fault types.
Patent Information
- Authority / Receiving Office
- CN · China
- Patent Type
- Applications(China)
- Current Assignee / Owner
- SUZHOU UNIV
- Filing Date
- 2026-05-15
- Publication Date
- 2026-06-12
AI Technical Summary
In dynamic and imbalanced incremental data scenarios, existing technologies make it difficult for bearing fault diagnosis models to learn new fault types effectively while retaining the ability to distinguish old fault types, leading to catastrophic forgetting and a decline in diagnostic performance.
By using a gradient distribution correction method, the cumulative gradient information of samples of each category is calculated and adaptively weighted to construct a knowledge distillation strategy that is aware of distribution differences. The gradients of new and old tasks are decoupled and fused, and the model parameters are adjusted to coordinate the learning ability and knowledge retention ability of new and old tasks.
The method improves the model's diagnostic stability and recognition performance as fault types are added incrementally, effectively alleviates intra-task and inter-task imbalance, and enhances both the ability to distinguish minority fault classes and the overall diagnostic performance.
Smart Images

Figure CN122196754A_ABST
Abstract
Description
Technical Field
[0001] This invention relates to the fields of mechanical fault diagnosis and computer artificial intelligence technology, specifically to a method and device for diagnosing incremental unbalance faults in rotating mechanical components based on gradient distribution correction.
Background Technology
[0002] In recent years, with the increasing complexity of industrial systems, higher demands have been placed on the long-term stable operation and safety assurance of mechanical equipment. To avoid significant economic losses and safety accidents caused by equipment failures, fault diagnosis technology has gradually become an important technical means to ensure the safe and reliable operation of mechanical systems. By continuously monitoring and analyzing the operating status of equipment, accurate fault identification can be achieved, thereby effectively reducing maintenance costs and improving system operating efficiency. As a key basic component in rotating machinery, the operating status of rolling bearings directly affects the stability and reliability of the entire system. However, under actual operating environments such as high speed, heavy load, and complex working conditions, bearings are prone to various failure modes such as wear, cracks, and fatigue failure, and the failure evolution process is insidious and gradual. Therefore, research on high-precision fault diagnosis methods for rolling bearings is of great significance for improving the operational safety of industrial equipment.
[0003] Existing research indicates that vibration signals contain rich equipment status information, making them an important data source for bearing fault diagnosis. Traditional fault diagnosis methods mainly rely on signal processing and feature engineering techniques, such as fault characteristic frequency analysis, short-time Fourier transform, empirical mode decomposition, and sparse representation. These methods typically rely on expert experience for feature design, have limited adaptability to non-stationary signals under complex operating conditions, and suffer from low processing efficiency and insufficient generalization ability when dealing with large-scale data. Deep learning models can automatically learn discriminative features from raw vibration signals in an end-to-end manner, reducing reliance on manual feature design and demonstrating strong feature representation capabilities and diagnostic accuracy under large-scale data conditions. However, most traditional deep learning methods assume that training and test data are independent and identically distributed (i.i.d.), an assumption that is often difficult to satisfy in real-world industrial scenarios. To overcome the challenges posed by distributional differences, transfer learning techniques have been introduced into the field of rotating machinery fault diagnosis. By transferring knowledge learned from the source domain to the target domain, transfer learning methods can achieve fault identification across operating conditions and equipment even with inconsistent data distributions.
[0004] However, bearing fault diagnosis in real-world industrial environments faces more complex dynamic evolution characteristics. As equipment service life increases, fault modes gradually emerge, degrade, or are updated, with new fault types constantly being added, while samples of older fault types may be difficult to retain in large quantities. This requires fault diagnosis systems not only to handle time-varying data distributions but also to integrate new knowledge dynamically while maintaining old knowledge. Some existing continual learning methods attempt to address this problem through strategies such as memory replay, knowledge distillation, or network expansion, but they typically assume a relatively balanced data distribution.
[0005] In practical applications, dynamic data often exhibits significant imbalance. On the one hand, the number of samples for different fault categories within the same incremental task may vary greatly (intra-task imbalance). On the other hand, there may be a severe imbalance between the number of exemplar samples stored from historical tasks and the number of samples in the current new task (inter-task imbalance). This dual imbalance can exacerbate catastrophic forgetting, in which the model overwrites previously learned knowledge when learning new fault types, and the imbalance of gradient contributions can lead to insufficient learning of minority faults, ultimately reducing the overall diagnostic performance.
[0006] Therefore, how to enable bearing fault diagnosis models to effectively learn new fault types while maintaining their ability to distinguish old fault types in dynamic and unbalanced incremental data scenarios is a technical problem that urgently needs to be solved.
Summary of the Invention
[0007] This invention overcomes the shortcomings of the prior art and provides a method and device for diagnosing incremental unbalanced faults in rotating mechanical components based on gradient distribution correction; it has strong adaptability to both new and old tasks and improves the stability and recognition performance of the diagnostic model under incremental fault types.
[0008] To achieve the above objectives, the technical solution adopted by this invention is as follows: a method for diagnosing incremental unbalance faults in rotating mechanical components based on gradient distribution correction, comprising the following steps:
Obtain the vibration data of the bearing to be diagnosed;
Input the bearing vibration data to be diagnosed into a pre-trained incremental fault diagnosis model. The incremental fault diagnosis model performs diagnostic analysis on the bearing vibration data to be diagnosed, obtains the corresponding fault category label, and uses the fault category label as the diagnosis result.
The establishment of the pre-trained incremental fault diagnosis model includes:
Step S1: Collect vibration data of the bearing at different operating stages and construct an incremental dataset according to the task sequence;
Step S2: Construct an incremental fault diagnosis model and train the incremental fault diagnosis model in stages using the incremental dataset;
Step S3: During the training of the incremental fault diagnosis model, the cumulative gradient information of each category of samples in the historical training stage is calculated, and the weight coefficient of each category of samples is determined based on the cumulative gradient information. The gradients of different categories of samples are adaptively weighted according to the weight coefficients, and the model parameters of the incremental fault diagnosis model are updated using the weighted gradients to adjust the contribution of each category of samples to the update of model parameters.
Step S4: Calculate the distribution difference between new task data and historical task data, and adjust the knowledge distillation process in the incremental fault diagnosis model training based on the distribution difference.
Step S5: During model training, the gradients of new tasks and historical tasks are decoupled, and the decoupled gradients are weighted and fused to update the model parameters, so as to coordinate the incremental fault diagnosis model's ability to learn new tasks and retain knowledge of historical tasks.
[0009] In a preferred embodiment of the present invention, in step S1, the construction of the incremental dataset includes: The bearing operation process is divided into multiple consecutive stages, with the stage number denoted t. The training dataset corresponding to stage t is represented as $D_t = \{X_t, Y_t\}$, where $X_t$ represents the set of vibration signal samples collected during this stage and $Y_t$ is the corresponding fault category label set; the data for each stage are arranged sequentially in chronological order, and each stage of data constitutes an incremental task, forming an incremental dataset that arrives gradually over time.
[0010] In a preferred embodiment of the present invention, in step S2, the construction of the incremental fault diagnosis model includes: The incremental fault diagnosis model is denoted $f_\theta$, where $\theta$ represents the model parameters; in stage t, the corresponding training dataset is used to train the incremental fault diagnosis model, and a classification loss function based on cross-entropy, $L_{CE}$, is used to update the model parameters of the incremental fault diagnosis model.
[0011] In a preferred embodiment of the present invention, in step S3, for each fault category k in the current stage, the cumulative gradient magnitude up to the i-th iteration is calculated as $\Phi_k^i = \sum_{n=1}^{i} \lVert \nabla_{w_k^n} L_{CE} \rVert$, $k = 1, \dots, C$, where C represents the total number of known fault categories and $w_k^n$ is the weight vector corresponding to the k-th class in the fully connected layer during the n-th iteration of the model. The balance ratio $r_k^i = \min_{m} \Phi_m^i / \Phi_k^i$ is then calculated, that is, the ratio of the minimum cumulative gradient magnitude over all categories to the magnitude of the k-th category. The model parameters are updated according to $w_k^{i+1} = w_k^i - \alpha \, r_k^i \, \nabla_{w_k^i} L_{CE}$, where α represents the learning rate. Finally, the Softmax function is adjusted, and the updated function is calculated as $p_j = \pi_j e^{z_j} / \sum_{m=1}^{C} \pi_m e^{z_m}$, where $z_j$ is the output logit of category j and $\pi_j$ is a prior probability estimate used to adjust the output distribution.
[0012] In a preferred embodiment of the present invention, in step S4, the distribution difference between the old and new task data is calculated for each fault category in the set of fault categories of stage t. The distribution of lost training data is defined as $S_k = N_k - N_k^{e}$, where $N_k$ is the initial number of training samples for fault category k and $N_k^{e}$ is the number of exemplar samples stored for this category. Based on the distribution of lost training data, the distribution entropy is calculated as $H = -\sum_{k=1}^{C} p_k \log p_k$, where $p_k$ is the prior probability of fault category k in the training data and C is the total number of known fault categories. The weighting coefficient is then determined from $H$ and $H_{\max}$, where $H_{\max}$ is the maximum entropy under the condition of uniform distribution.
[0013] In a preferred embodiment of the present invention, a distribution-aligned knowledge distillation loss is constructed from two terms: one calculated from the original output logits and one calculated from the adjusted output logits. The adjusted output logits are obtained by calibrating the original logits using the amount of lost training data of each class m. A head class j, which has a larger amount of lost training data, produces a greater bias, so a higher intensity of knowledge distillation optimization is assigned to the head classes; a tail class, which has a smaller amount of lost data, only requires mild distillation intervention to maintain the learned knowledge. Using the distribution-aligned knowledge distillation loss, the knowledge distillation process during model training is weighted and adjusted to mitigate the impact of data imbalance between tasks.
[0014] In a preferred embodiment of the present invention, in step S5, the gradients of the learned fault types and of the new fault types are corrected separately, with the learned fault types and the new fault types forming two disjoint sets. For each fault type k, a new/old task balance ratio μ is introduced. It is defined from three quantities: the ratio of the average cumulative gradient of the learned fault types to that of the new fault types, a parameter that adjusts the attenuation factor, and the cumulative number of historical task samples together with the total number of samples after the new task samples are added. A loss balance ratio ε is then defined to balance the magnitudes of the knowledge distillation gradient and the cross-entropy gradient. The model parameters are updated from a weighted sum of the two gradients, where β is the balance factor and the two gradients are those of the cross-entropy loss and of the distribution-aligned knowledge distillation loss. In this way the gradients of the new and old tasks are decoupled and then fused by weighting and summing the two types of gradients, so as to coordinate the model's ability to learn new tasks and retain knowledge of historical tasks.
[0015] In a preferred embodiment of the present invention, the incremental fault diagnosis model performs diagnostic analysis on the bearing vibration data to be diagnosed, including the following steps: the incremental fault diagnosis model extracts features from the input bearing vibration data to be diagnosed, outputs the predicted probability corresponding to each fault category through the classification layer, and determines the fault category to which the sample to be diagnosed belongs based on the principle of maximum probability.
[0016] In a preferred embodiment of the present invention, a computer device includes a memory, a processor, and a computer program stored in the memory and executable on the processor. When the processor executes the program, it implements a method for diagnosing incremental faults of unbalanced rotating mechanical components based on gradient distribution correction.
[0017] This invention addresses the deficiencies in the technical background, and the beneficial technical effects of this invention are: A method and device for incremental fault diagnosis of unbalanced rotating mechanical components based on gradient distribution correction; it has strong adaptability to new and old tasks, and improves the stability and recognition performance of the diagnostic model under incremental fault types.
[0018]
1. By introducing a class-adaptive weighting mechanism based on historical cumulative gradient information, the model dynamically adjusts the samples of different classes from the perspective of gradient contribution, effectively alleviating the class imbalance problem within the task and improving the model's ability to distinguish minority class faults.
2. By constructing a weighted knowledge distillation strategy based on the differences in the distribution of new and old tasks, the degree of data imbalance is introduced into the distillation process, thereby achieving adaptive control of the intensity of knowledge transfer of different categories, and effectively reducing the interference of sample imbalance between tasks on the model learning process.
3. By decoupling the gradient of new tasks from that of historical tasks and designing a corresponding weighted fusion mechanism, collaborative optimization of new and old knowledge is achieved during the parameter update process, thereby enhancing the learning ability for new fault types and effectively suppressing catastrophic forgetting.
4. This invention, starting from the perspective of gradient distribution modeling, constructs a unified dual imbalance control mechanism, which enables the model to have stronger robustness and adaptability under incremental fault types and complex imbalance scenarios, thereby improving the incremental fault diagnosis performance as a whole.
Description of the Drawings
[0019] The present invention will be further described below with reference to the accompanying drawings and embodiments.
[0020]
Figure 1 is a step diagram illustrating the mechanical equipment fault diagnosis method in a preferred embodiment of the present invention;
Figure 2 is a diagram of the gradient distribution correction replay network model structure in a preferred embodiment of the present invention;
Figure 3 is a schematic diagram illustrating the expansion of diagnostic capabilities in a preferred embodiment of the present invention;
Figure 4 is a schematic diagram comparing the diagnostic results of the present invention with those of different methods;
Figure 5 is a schematic diagram of the gradients before and after balancing in a preferred embodiment of the present invention;
Figure 6 is a schematic diagram of the confusion matrix obtained after completing the incremental task in a preferred embodiment of the present invention;
Figure 7 is a schematic diagram of feature visualization obtained after completing the incremental task in a preferred embodiment of the present invention;
Figure 8 is a table showing the health status and incremental stage settings of the bearing in a preferred embodiment of the present invention.
Detailed Implementation
[0021] The present invention will now be described in further detail with reference to the accompanying drawings and embodiments. These drawings are simplified schematic diagrams, which are only used to illustrate the basic structure of the present invention and therefore only show the components relevant to the present invention.
[0022] It should be noted that if directional indicators (such as up, down, bottom, top, etc.) are involved in the embodiments of the present invention, these directional indicators are only used to explain the relative positional relationship and movement of the components in a specific posture. If the specific posture changes, the directional indicators will also change accordingly. The terms "first" and "second" are used for descriptive purposes only and should not be construed as indicating or implying relative importance or implicitly specifying the number of indicated technical features. Therefore, features defined with "first" and "second" may explicitly or implicitly include one or more of that feature. Unless otherwise explicitly specified and limited, the terms "set," "connected," and "linked" should be interpreted broadly. For example, they can refer to a fixed connection, a detachable connection, or an integral connection; they can refer to a direct connection or an indirect connection through an intermediate medium; they can refer to the internal connection of two components. For those skilled in the art, the specific meaning of the above terms in the present invention can be understood according to the specific circumstances.
[0023] Example 1: As shown in Figures 1 and 2, a method for diagnosing incremental imbalance faults in rotating mechanical components based on gradient distribution correction includes the following steps: Obtain the vibration data of the bearing to be diagnosed; input the bearing vibration data to be diagnosed into a pre-trained incremental fault diagnosis model. The incremental fault diagnosis model performs diagnostic analysis on the bearing vibration data to be diagnosed, obtains the corresponding fault category label, and uses the fault category label as the diagnosis result.
[0024] Specifically, the establishment of the pre-trained incremental fault diagnosis model includes: Step S1: Collect vibration data of the bearing at different operating stages and construct an incremental dataset according to the task sequence; Step S2: Construct an incremental fault diagnosis model and train the incremental fault diagnosis model in stages using the incremental dataset.
[0025] Step S3: During the training of the incremental fault diagnosis model, the cumulative gradient information of each category of samples in the historical training stage is calculated, and the weight coefficient of each category of samples is determined based on the cumulative gradient information. The gradients of different categories of samples are adaptively weighted according to the weight coefficients, and the model parameters of the incremental fault diagnosis model are updated using the weighted gradients to adjust the contribution of each category of samples to the update of model parameters.
[0026] Step S4: Calculate the distribution difference between new task data and historical task data, and adjust the knowledge distillation process in the incremental fault diagnosis model training based on the distribution difference.
[0027] Step S5: During model training, the gradients of new tasks and historical tasks are decoupled, and the decoupled gradients are weighted and fused to update the model parameters, so as to coordinate the incremental fault diagnosis model's ability to learn new tasks and retain knowledge of historical tasks.
[0028] Example 2: As shown in Figures 1 and 2, an incremental fault diagnosis method for unbalanced rotating mechanical components based on gradient distribution correction is proposed. Building on Example 1, the diagnostic analysis performed by the incremental fault diagnosis model on the bearing vibration data to be diagnosed includes the following steps: The incremental fault diagnosis model extracts features from the input bearing vibration data to be diagnosed, outputs the predicted probability corresponding to each fault category through the classification layer, and determines the fault category to which the sample to be diagnosed belongs based on the principle of maximum probability.
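The maximum-probability decision described above can be sketched in a few lines. This is only an illustration under assumed names: `softmax` and `diagnose` are hypothetical helpers, and the logits stand in for the output of the classification layer, which the text does not specify further.

```python
import math

def softmax(logits):
    """Convert classification-layer logits into probabilities."""
    peak = max(logits)  # subtract the maximum for numerical stability
    exps = [math.exp(z - peak) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

def diagnose(logits):
    """Return (predicted label, probabilities) using the maximum-probability rule."""
    probs = softmax(logits)
    label = max(range(len(probs)), key=probs.__getitem__)
    return label, probs

# Hypothetical logits for four fault categories.
label, probs = diagnose([0.2, 2.5, -1.0, 0.7])
```

The returned label is simply the index of the largest probability, so ties resolve to the lowest index.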
[0029] The establishment of the pre-trained incremental fault diagnosis model includes: In step S1, vibration data of the bearing at different operating stages are collected, and an incremental dataset is constructed according to the task sequence. The construction of the incremental dataset includes dividing the bearing operation process into multiple consecutive stages, with the stage number denoted t. The training dataset corresponding to stage t is represented as $D_t = \{X_t, Y_t\}$, where $X_t$ represents the set of vibration signal samples collected during this stage and $Y_t$ is the corresponding fault category label set; the data for each stage are arranged sequentially in chronological order, and each stage of data constitutes an incremental task, forming an incremental dataset that arrives gradually over time.
[0030] In step S2, an incremental fault diagnosis model is constructed, and the incremental fault diagnosis model is trained stage by stage using the incremental dataset. The construction of the incremental fault diagnosis model includes: the incremental fault diagnosis model is denoted $f_\theta$, where $\theta$ represents the model parameters; in stage t, the corresponding training dataset is used to train the incremental fault diagnosis model, and a classification loss function based on cross-entropy, $L_{CE}$, is used to update the model parameters of the incremental fault diagnosis model.
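As a rough sketch of how per-stage data could be arranged into a chronologically ordered task sequence, the snippet below groups records by stage number. The `StageData` container, the `build_incremental_dataset` helper, and the record layout `(stage, signal, label)` are all assumptions made for illustration, not structures given in the text.

```python
from dataclasses import dataclass
from typing import List, Sequence

@dataclass
class StageData:
    """One incremental task: the signals and labels collected in a single stage."""
    stage: int
    signals: List[Sequence[float]]
    labels: List[int]

def build_incremental_dataset(records):
    """Group (stage, signal, label) records by stage and order the stages in time."""
    by_stage = {}
    for stage, signal, label in records:
        data = by_stage.setdefault(stage, StageData(stage, [], []))
        data.signals.append(signal)
        data.labels.append(label)
    return [by_stage[s] for s in sorted(by_stage)]

# Hypothetical records arriving out of order.
records = [(1, [0.3, 0.1], 5), (0, [0.0, 0.2], 0), (0, [0.1, 0.4], 1)]
tasks = build_incremental_dataset(records)
```

Each element of `tasks` then plays the role of one incremental task, to be consumed in order during staged training.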
[0031] In step S3, during the training of the incremental fault diagnosis model, the cumulative gradient information of each category of samples in the historical training stage is calculated, and the weight coefficient of each category of samples is determined based on the cumulative gradient information. The gradients of different categories of samples are adaptively weighted according to the weight coefficients, and the model parameters of the incremental fault diagnosis model are updated using the weighted gradients to adjust the contribution of each category of samples to the update of model parameters.
[0032] Specifically, for each fault category k in the current stage, the cumulative gradient magnitude up to the i-th iteration is calculated using the following formula: $\Phi_k^i = \sum_{n=1}^{i} \lVert \nabla_{w_k^n} L_{CE} \rVert$, $k = 1, \dots, C$, where C represents the total number of known fault categories, $w_k^n$ is the weight vector corresponding to the k-th class in the fully connected layer during the n-th iteration of the model, and $\nabla_{w_k^n} L_{CE}$ is the gradient of the cross-entropy loss function with respect to that weight vector.
[0033] Calculate the balance ratio $r_k^i = \min_{m} \Phi_m^i / \Phi_k^i$, that is, the ratio of the minimum cumulative gradient magnitude over all categories to the magnitude of the k-th category.
[0034] The model parameters are updated according to $w_k^{i+1} = w_k^i - \alpha \, r_k^i \, \nabla_{w_k^i} L_{CE}$, where α represents the learning rate and $w_k^{i+1}$ is the weight vector of the k-th category after the (i+1)-th update step.
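The class-wise reweighting described in the preceding paragraphs can be sketched as follows. This is a minimal pure-Python illustration under assumptions: the gradient norm is taken to be the L2 norm, the per-class gradients are passed in as plain lists rather than computed by a framework, and `GradientBalancer` and `reweighted_step` are hypothetical names.

```python
import math

def l2_norm(vec):
    return math.sqrt(sum(v * v for v in vec))

class GradientBalancer:
    """Tracks the cumulative gradient magnitude of each class weight vector."""

    def __init__(self, num_classes):
        self.cumulative = [0.0] * num_classes

    def accumulate(self, class_grads):
        # Add the current iteration's gradient norm of every class.
        for k, grad in enumerate(class_grads):
            self.cumulative[k] += l2_norm(grad)

    def balance_ratios(self, tiny=1e-12):
        # Ratio of the smallest cumulative magnitude to each class's magnitude.
        smallest = min(self.cumulative)
        return [smallest / max(c, tiny) for c in self.cumulative]

def reweighted_step(weights, class_grads, ratios, lr):
    """Gradient step in which each class gradient is scaled by its balance ratio."""
    return [
        [w - lr * r * g for w, g in zip(w_k, g_k)]
        for w_k, g_k, r in zip(weights, class_grads, ratios)
    ]

# Class 0 has a gradient five times larger than class 1.
balancer = GradientBalancer(num_classes=2)
grads = [[3.0, 4.0], [0.6, 0.8]]
balancer.accumulate(grads)
ratios = balancer.balance_ratios()
new_weights = reweighted_step([[1.0, 1.0], [1.0, 1.0]], grads, ratios, lr=0.1)
```

In this toy case the dominant class is scaled down by its ratio so that both classes end up taking steps of the same size, which is the equalizing effect the balance ratio is meant to have.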
[0035] The Softmax function is adjusted, and the updated function is calculated as follows: $p_j = \pi_j e^{z_j} / \sum_{m=1}^{C} \pi_m e^{z_m}$, where $z_j$ is the output logit of category j, $\pi_j$ is the prior probability estimate for category j, $z_m$ is the output logit of category m, and $\pi_m$ is the prior probability estimate for category m; the priors are used to adjust the output distribution.
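A prior-adjusted Softmax of this kind can be illustrated with a short sketch. It assumes the prior of each category is estimated from its training sample count, which the text does not state, and `prior_adjusted_softmax` is a hypothetical helper name.

```python
import math

def prior_adjusted_softmax(logits, class_counts):
    """Softmax in which each logit is shifted by the log of its class prior.

    Assumption: priors are estimated as class_count / total_count,
    and every class count is positive.
    """
    total = sum(class_counts)
    log_priors = [math.log(n / total) for n in class_counts]
    shifted = [z + lp for z, lp in zip(logits, log_priors)]
    peak = max(shifted)  # subtract the maximum for numerical stability
    exps = [math.exp(s - peak) for s in shifted]
    denom = sum(exps)
    return [e / denom for e in exps]

# With equal logits, the output reduces to the priors themselves.
probs = prior_adjusted_softmax([1.0, 1.0], [90, 10])
```

Adding the log prior to a logit is algebraically the same as multiplying the exponentiated logit by the prior, so this matches a prior-weighted Softmax while staying numerically stable.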
[0036] Specifically, the gradient of the corresponding class samples is adaptively weighted and corrected using class weight coefficients, and the model parameters of the incremental fault diagnosis model are updated based on the corrected gradients to adjust the contribution of different classes to the model parameter update.
[0037] In step S4, the distribution difference between the new task data and the historical task data is calculated, and the knowledge distillation process in the incremental fault diagnosis model training is weighted and adjusted based on the distribution difference.
[0038] This involves calculating the distribution difference between the old and new task data for each fault category in the set of fault categories of stage t. The distribution of lost training data is defined as $S_k = N_k - N_k^{e}$, where $N_k$ is the initial number of training samples for fault category k and $N_k^{e}$ is the number of exemplar samples stored for this category.
[0039] Based on the distribution of lost training data, the distribution entropy is calculated as $H = -\sum_{k=1}^{C} p_k \log p_k$, where $p_k$ is the prior probability of fault category k in the training data and C is the total number of known fault categories.
[0040] The weighting coefficient is determined from $H$ and $H_{\max}$, where $H_{\max}$ is the maximum entropy under the condition of uniform distribution (i.e., the amount of data in all categories is balanced).
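The entropy quantities in this step can be illustrated as below. The sketch assumes that the lost data of a category is its initial sample count minus its stored exemplar count, that the category probabilities are the normalized lost counts, and that the natural logarithm is used; it stops at the entropy and its maximum, since how they combine into the weighting coefficient is not reproduced here. `lost_data_entropy` is a hypothetical name.

```python
import math

def lost_data_entropy(initial_counts, stored_counts):
    """Return (lost counts, entropy of the lost-data distribution, maximum entropy)."""
    lost = [n - e for n, e in zip(initial_counts, stored_counts)]
    total = sum(lost)
    if total <= 0:
        raise ValueError("no training data was lost, entropy is undefined")
    probs = [s / total for s in lost]
    # Categories with zero lost data contribute nothing to the entropy.
    entropy = -sum(p * math.log(p) for p in probs if p > 0)
    max_entropy = math.log(len(lost))  # reached when all categories lose equally
    return lost, entropy, max_entropy

# Balanced loss across four categories reaches the maximum entropy.
_, h_balanced, h_max4 = lost_data_entropy([100, 100, 100, 100], [20, 20, 20, 20])
# A head category that loses far more data lowers the entropy.
lost, h_skewed, h_max2 = lost_data_entropy([400, 100], [10, 10])
```

The gap between the entropy and its maximum is a simple measure of how unevenly the discarded historical data is spread over the categories.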
[0041] In a preferred embodiment of the present invention, a distribution-aligned knowledge distillation loss is constructed from two terms: one calculated from the original output logits and one calculated from the adjusted output logits. The adjusted output logits are obtained by calibrating the original logits using the amount of lost training data of each class m. A head class j, which has a larger amount of lost training data, produces a greater bias, so a higher intensity of knowledge distillation optimization is assigned to the head classes; a tail class, which has a smaller amount of lost data, only requires mild distillation intervention to maintain the learned knowledge. Using the distribution-aligned knowledge distillation loss, the knowledge distillation process during model training is weighted and adjusted to mitigate the impact of data imbalance between tasks.
[0042] In step S5, during model training, the gradients of the new task and the gradients of the historical task are decoupled, and the decoupled gradients are weighted and fused to update the model parameters, so as to coordinate the incremental fault diagnosis model's ability to learn new tasks and retain knowledge of historical tasks.
[0043] Specifically, the gradients of the learned fault types and of the new fault types are corrected separately, with the learned fault types and the new fault types forming two disjoint sets. For each fault type k, a new/old task balance ratio μ is introduced. It is defined from three quantities: the ratio of the average cumulative gradient of the learned fault types to that of the new fault types, a parameter that adjusts the attenuation factor, and the cumulative number of historical task samples together with the total number of samples after the new task samples are added. A loss balance ratio ε is then defined to adjust the magnitude balance between the knowledge distillation gradient and the cross-entropy gradient. The model parameters are updated from a weighted sum of the two gradients, where β is the balance factor and the two gradients are those of the cross-entropy loss and of the distribution-aligned knowledge distillation loss. In this way the gradients of the new and old tasks are decoupled and then fused by weighting and summing the two types of gradients, so as to coordinate the model's ability to learn new tasks and retain knowledge of historical tasks.
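The weighted fusion of the two gradients can be sketched as follows. The exact formulas are not available in this text, so the sketch makes two loudly labeled assumptions: the loss balance ratio ε is taken as the ratio of the cross-entropy gradient norm to the distillation gradient norm, and the fused gradient is the cross-entropy gradient plus β times ε times the distillation gradient. `fuse_gradients` and `sgd_step` are hypothetical names.

```python
import math

def norm(vec):
    return math.sqrt(sum(v * v for v in vec))

def fuse_gradients(grad_ce, grad_kd, beta, tiny=1e-12):
    """Weighted sum of the cross-entropy and distillation gradients.

    Assumption: epsilon rescales the distillation gradient to the
    magnitude of the cross-entropy gradient before beta is applied.
    """
    epsilon = norm(grad_ce) / max(norm(grad_kd), tiny)
    fused = [gc + beta * epsilon * gk for gc, gk in zip(grad_ce, grad_kd)]
    return fused, epsilon

def sgd_step(params, fused, lr):
    """Plain gradient step using the fused gradient."""
    return [p - lr * g for p, g in zip(params, fused)]

# The distillation gradient is ten times smaller than the cross-entropy gradient.
fused, epsilon = fuse_gradients([3.0, 4.0], [0.0, 0.5], beta=0.5)
params = sgd_step([1.0, 1.0], fused, lr=0.1)
```

Under these assumptions the rescaling keeps a small distillation gradient from being drowned out by the cross-entropy gradient, while β still controls how much weight retention of old knowledge receives.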
[0044] Working principle: An incremental fault diagnosis method for unbalanced rotating machinery components based on gradient distribution correction is proposed. This method collects bearing vibration data from different stages, constructs an incremental dataset according to task order, and sequentially inputs it into the constructed incremental bearing fault diagnosis model for training. During training, historical cumulative gradient information for each category is calculated, and the gradients of different categories are adaptively weighted based on this gradient information to alleviate intra-task imbalance. Simultaneously, the distribution difference between old and new task data is calculated, and the knowledge distillation process is weighted accordingly to reduce the impact of inter-task imbalance. The gradients of old and new tasks are decoupled, and the decoupled gradients are reweighted to update model parameters, improving the model's adaptability to both old and new tasks. The bearing fault diagnosis result is output based on the trained model. This method can effectively complete fault diagnosis tasks in unbalanced incremental scenarios.
[0045] Example 3: As shown in Figures 1 to 4, with the method steps illustrated in Figure 1 and the corresponding gradient distribution correction replay network model structure in Figure 2, this embodiment takes rolling bearings as the diagnostic object and details the specific implementation process. In Figure 2, β is the balance factor, μ is the balance ratio between new and old tasks, and ε is the loss balance ratio.
[0046] Step S1: Under normal bearing operation and at different fault levels, vibration signals are collected using an accelerometer. The equipment speed is set to 1200 r/min, the load to 0.8 kN, and the sampling frequency to 12.8 kHz. Every 2048 consecutive data points form one sample. The data from each stage is divided into several incremental tasks according to time sequence. Let the training dataset corresponding to the t-th task be $D_t = \{X_t, Y_t\}$, where $X_t$ is the vibration signal sample set and $Y_t$ is the corresponding fault category label set.
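The sampling settings above imply a fixed sample duration, and the segmentation itself is simple to sketch. The snippet assumes non-overlapping windows and drops any incomplete tail, neither of which is stated in the text; `segment_signal` is a hypothetical helper.

```python
SAMPLING_RATE_HZ = 12_800   # 12.8 kHz, from the embodiment
POINTS_PER_SAMPLE = 2048    # consecutive data points per sample
SHAFT_SPEED_RPM = 1200      # equipment speed

def segment_signal(signal, length=POINTS_PER_SAMPLE):
    """Split a long recording into non-overlapping fixed-length samples."""
    count = len(signal) // length  # an incomplete tail is discarded
    return [signal[i * length:(i + 1) * length] for i in range(count)]

# Each sample spans 2048 / 12800 seconds of vibration data.
sample_duration_s = POINTS_PER_SAMPLE / SAMPLING_RATE_HZ
# Number of shaft revolutions covered by one sample at 1200 r/min.
revolutions_per_sample = (SHAFT_SPEED_RPM / 60) * sample_duration_s

# A hypothetical recording of 5000 points yields two full samples.
samples = segment_signal([0.0] * 5000)
```

One sample therefore covers about 0.16 s, or roughly 3.2 shaft revolutions at the stated speed.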
[0047] The constructed incremental fault dataset includes: health states N, I, O, R, IR, and OR; N represents normal bearing condition, I represents inner ring crack fault, O represents outer ring crack fault, R represents roller crack fault, IR represents inner ring-roller crack fault, and OR represents outer ring-roller crack fault. Except for the normal state, each fault type includes five fault sizes (0.2-0.6 mm). The bearing health states and incremental stages are set as follows: Figure 8 As shown.
[0048] As shown in Figure 8, the initial task has a total of 5 fault types (labels 0-4). Phases 1-4 are four incremental phases, and each incremental phase contains 5 new fault types (labels 5-9 correspond to phase 1, labels 10-14 correspond to phase 2, labels 15-19 correspond to phase 3, and labels 20-24 correspond to phase 4).
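The label-to-stage layout described above can be written down as a short sketch. The label ranges come from the embodiment; the helper names `stage_of_label` and `labels_of_stage` are hypothetical, and the initial task is treated as stage 0.

```python
CLASSES_PER_STAGE = 5  # 5 fault types per stage, from the embodiment
NUM_STAGES = 5         # initial task (stage 0) plus incremental phases 1-4


def stage_of_label(label):
    """Return the stage in which a fault label is first introduced."""
    if not 0 <= label < CLASSES_PER_STAGE * NUM_STAGES:
        raise ValueError(f"label {label} is outside 0-24")
    return label // CLASSES_PER_STAGE


def labels_of_stage(stage):
    """Return the fault labels introduced in a given stage."""
    if not 0 <= stage < NUM_STAGES:
        raise ValueError(f"stage {stage} is outside 0-4")
    start = stage * CLASSES_PER_STAGE
    return list(range(start, start + CLASSES_PER_STAGE))


print(labels_of_stage(0), labels_of_stage(4), stage_of_label(12))
```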
[0049] To simulate the data imbalance commonly encountered in actual industrial production, this embodiment introduces a removal factor and randomly eliminates training samples of each fault type in each stage. Specifically, in each training phase the removal factor is set so that 20% to 80% of the samples of each fault type are randomly retained and the rest are removed. In this way, the fault categories in each stage exhibit varying degrees of data imbalance, more closely reflecting real-world operating conditions. The herding algorithm is used for exemplar selection, with a total memory budget of 200.
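The random elimination described above can be sketched as follows. The 20% to 80% retention range is from the embodiment; drawing one retention ratio per class from a uniform distribution, the fixed seed, and the function name `retain_imbalanced` are assumptions made for illustration.

```python
import random


def retain_imbalanced(samples_by_class, low=0.2, high=0.8, seed=0):
    """Randomly keep between `low` and `high` of each class's samples.

    The per-class retention ratio is drawn uniformly (an assumption);
    the source only states the 20%-80% range.
    """
    rng = random.Random(seed)
    retained = {}
    for label, samples in samples_by_class.items():
        ratio = rng.uniform(low, high)
        keep = max(1, round(ratio * len(samples)))
        retained[label] = rng.sample(samples, keep)
    return retained


# Hypothetical stage with 5 classes of 100 sample IDs each.
stage_data = {label: list(range(100)) for label in range(5)}
imbalanced = retain_imbalanced(stage_data)
print({label: len(kept) for label, kept in imbalanced.items()})
```

Because each class receives its own ratio, the class sizes within a stage differ, which is the intra-task imbalance the embodiment aims to reproduce.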
[0050] Step S2: In this embodiment, an incremental fault diagnosis model is constructed together with its model parameters. The backbone network is ResNet-18, and the optimizer is stochastic gradient descent (SGD) with momentum.
[0051] In the initial stage, the model is pre-trained on the initial dataset using cross-entropy loss, following the gradient distribution correction optimization strategy described in step S3, in order to eliminate the training bias caused by the internal class imbalance of the initial stage.
[0052] For each incremental stage t (t ≥ 1), the model is incrementally updated using the current task dataset combined with the small number of exemplar samples stored in the exemplar library. The gradient distribution correction strategy proposed in this invention is used for these model updates.
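How the stored samples might be chosen can be sketched with a greedy herding selection, which picks samples whose running mean stays closest to the class mean in feature space. The herding algorithm and the total budget of 200 are named in the embodiment; the equal split of the budget across seen classes, the one-dimensional toy features, and the function names are assumptions made for illustration.

```python
def per_class_budget(total_budget, num_seen_classes):
    """Split the memory budget equally across seen classes (assumed policy)."""
    return total_budget // num_seen_classes


def herding_select(features, budget):
    """Return indices of up to `budget` samples chosen greedily by herding."""
    dim = len(features[0])
    count = len(features)
    class_mean = [sum(f[d] for f in features) / count for d in range(dim)]
    selected = []
    running_sum = [0.0] * dim
    remaining = list(range(count))
    while remaining and len(selected) < budget:
        k = len(selected) + 1

        def gap(i):
            # Squared distance between the class mean and the mean of the
            # already selected samples plus candidate i.
            return sum(
                (class_mean[d] - (running_sum[d] + features[i][d]) / k) ** 2
                for d in range(dim)
            )

        best = min(remaining, key=gap)
        selected.append(best)
        remaining.remove(best)
        running_sum = [running_sum[d] + features[best][d] for d in range(dim)]
    return selected


# Hypothetical 1-D features for one class; the class mean is 3.25.
toy_features = [[0.0], [1.0], [2.0], [10.0]]
print(herding_select(toy_features, 2), per_class_budget(200, 25))
```

With 25 classes seen after the last phase, an equal split of the 200-sample budget would leave 8 stored samples per class.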
[0053] Step S3: To alleviate the training bias problem caused by the imbalance in the number of samples of different categories in the incremental task, this embodiment introduces a category adaptive weighting mechanism based on historical accumulated gradient information to dynamically adjust the gradient contribution of samples of different categories.
[0054] Specifically, let the total number of fault categories currently learned be C. For each category k, the cumulative gradient magnitude at the i-th iteration is defined by accumulating, over the iterations n up to i, the magnitude of the gradient of the cross-entropy loss function with respect to the weight vector corresponding to the k-th class in the fully connected layer at the n-th iteration of the model.
[0055] Based on the accumulated gradient magnitude, the balance ratio coefficient corresponding to each category is further calculated as the ratio of the minimum cumulative gradient magnitude among all categories to the cumulative gradient magnitude of that category. The minimum serves as a normalization reference, so that categories with larger gradient contributions receive smaller weight coefficients, while categories with smaller gradient contributions receive larger weight coefficients.
[0056] On this basis, the update of the model parameters is weighted: the parameters are updated by gradient descent with learning rate α, with the gradient of each category scaled by its balance ratio coefficient. Introducing the balance ratio coefficient enables adaptive adjustment of the gradient update magnitude for different categories, thereby preventing majority class samples from dominating the model training process.
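The weighting in step S3 can be illustrated with a minimal sketch. It assumes the gradient magnitude is an L2 norm accumulated as a running sum over iterations, and that the balance ratio is the minimum cumulative magnitude divided by each category's cumulative magnitude, as the text describes; the function names and the toy gradients are hypothetical.

```python
import math


def l2_norm(vec):
    return math.sqrt(sum(v * v for v in vec))


def accumulate(cumulative, class_grads):
    """Add each class's current gradient norm to its running total."""
    return [c + l2_norm(g) for c, g in zip(cumulative, class_grads)]


def balance_ratios(cumulative):
    """Minimum cumulative magnitude divided by each class's magnitude (assumes all > 0)."""
    smallest = min(cumulative)
    return [smallest / c for c in cumulative]


def weighted_step(weights, class_grads, ratios, lr):
    """Gradient descent on per-class weight vectors, scaled by the balance ratios."""
    return [
        [w - lr * r * g for w, g in zip(w_k, g_k)]
        for w_k, g_k, r in zip(weights, class_grads, ratios)
    ]


# Toy per-class gradients of the classifier weights (norms 5, 1, 2).
grads = [[3.0, 4.0], [0.0, 1.0], [0.0, 2.0]]
cumulative = [0.0, 0.0, 0.0]
for _ in range(2):  # two iterations with the same gradients
    cumulative = accumulate(cumulative, grads)

ratios = balance_ratios(cumulative)
new_weights = weighted_step([[1.0, 1.0]] * 3, grads, ratios, lr=0.1)
print(cumulative, ratios, new_weights)
```

In this toy case the class with the largest cumulative gradient receives the smallest ratio, and the scaled gradient norms of the three classes become equal.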
[0057] In addition, to further correct the model output distribution, a correction mechanism based on class prior probability is introduced in the classification layer to adjust the Softmax output. The adjusted Softmax is computed from the output logit value for category j and the prior probability estimate of the corresponding category, which is obtained from the total number of class-j samples among the training samples seen so far. By introducing the prior probability factor, the output probability distribution is recalibrated, enabling the model to pay more attention to low-frequency classes during the prediction phase.
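One common way to realize a prior-based Softmax correction is to subtract the log prior from each logit before normalizing. The sketch below uses that form purely as an assumption, since the exact expression of the embodiment is not reproduced in this text; the function names and the parameter `tau` are hypothetical.

```python
import math


def class_priors(counts):
    """Estimate class priors from the number of samples seen per class."""
    total = sum(counts)
    return [c / total for c in counts]


def prior_adjusted_softmax(logits, priors, tau=1.0):
    """Softmax over logits shifted by -tau * log(prior) (assumed correction form)."""
    adjusted = [z - tau * math.log(p) for z, p in zip(logits, priors)]
    peak = max(adjusted)  # subtract the maximum for numerical stability
    exps = [math.exp(a - peak) for a in adjusted]
    total = sum(exps)
    return [e / total for e in exps]


# Two classes with equal logits but a 90/10 sample imbalance.
priors = class_priors([90, 10])
probs = prior_adjusted_softmax([0.0, 0.0], priors)
print(priors, probs)
```

With equal logits, this correction shifts probability toward the class with fewer samples, which matches the stated goal of paying more attention to low-frequency classes.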
[0058] By using the adaptive weighting and output distribution correction mechanism based on cumulative gradients, the influence of different classes on model training is adjusted in a coordinated manner from the perspectives of gradient update and output probability. This effectively alleviates the problem of class imbalance within the task during incremental learning and improves the model's ability to identify minority class faults.
[0059] Step S4: In this embodiment, to alleviate the knowledge transfer bias problem caused by the imbalance in the number of new and old task samples during incremental learning, a weighted knowledge distillation mechanism based on data distribution differences is introduced to adaptively adjust the distillation intensity for different categories.
[0060] Specifically, for each fault category k in the current stage, the lost training data distribution of this category is first calculated from the number of samples of category k in the original training data and the number of samples of this category retained from historical tasks, i.e., the number of stored exemplars.
[0061] Based on the aforementioned distribution differences, the corresponding category probability distribution is calculated, and the distribution entropy is then obtained from the prior probability of each category k, where C represents the total number of categories currently learned.
[0062] The distillation weighting coefficient is defined based on the distribution entropy and the maximum entropy under a uniform distribution (i.e., a balanced amount of data across all categories). Based on this weighting coefficient, a distribution-aligned knowledge distillation loss function is constructed from two terms: the distillation loss computed on the original output logits and the distillation loss computed on the distribution-corrected output logits.
[0063] The calculation of the distribution-corrected output involves the sum of the distribution differences over all categories.
[0064] Using the above methods, for categories with a large degree of data missing, the distillation process will rely more on the output of the teacher model, thereby enhancing the preservation of historical knowledge for that category; while for categories with relatively sufficient data distribution, the distillation constraint strength will be reduced to enhance the model's ability to learn new knowledge.
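The entropy computation underlying the distillation weighting can be sketched as follows. The sketch assumes that the lost amount of a category is the difference between its original and stored sample counts, and that the category probabilities are these amounts normalized to sum to one. How the entropy is mapped to the final weighting coefficient is not reproduced in this text, so the sketch stops at the ratio of the entropy to the maximum entropy. All function names are hypothetical.

```python
import math


def lost_data_distribution(original_counts, stored_counts):
    """Normalized share of lost training data per category (assumed definition)."""
    lost = [n - m for n, m in zip(original_counts, stored_counts)]
    total = sum(lost)
    return [d / total for d in lost]


def entropy(probs):
    """Shannon entropy in nats; zero-probability terms contribute nothing."""
    return -sum(p * math.log(p) for p in probs if p > 0)


def normalized_entropy(probs):
    """Entropy divided by the maximum entropy log(C) of a uniform distribution."""
    return entropy(probs) / math.log(len(probs))


# Balanced case: every category loses the same number of samples.
balanced = lost_data_distribution([100, 100, 100, 100], [10, 10, 10, 10])
# Imbalanced case: one head category loses far more data than the others.
skewed = lost_data_distribution([400, 50, 30, 20], [10, 10, 10, 10])
print(normalized_entropy(balanced), normalized_entropy(skewed))
```

The ratio equals 1 when the lost data is spread evenly over the categories and falls below 1 as the loss concentrates in a few categories, which gives the weighting coefficient a bounded measure of inter-task imbalance to work with.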
[0065] By introducing a weighted distillation mechanism that is aware of distribution differences, adaptive adjustment to the imbalance problem between tasks is achieved, thereby improving the rationality of knowledge transfer between new and old tasks.
[0066] Step S5: In this embodiment, in order to further coordinate the balance between learning new tasks and maintaining knowledge of historical tasks, the gradients corresponding to the new and old tasks are decoupled and modeled, and a corresponding weighted fusion strategy is designed.
[0067] Specifically, two category sets are defined: the set of learned fault types (i.e., the historical fault categories learned so far) and the set of new fault types (i.e., the fault categories newly introduced in the current stage).
[0068] For any category k, a gradient balancing ratio between the old and new tasks (μ in Figure 2) is defined. It depends on the ratio of the average cumulative gradient of the learned fault types to that of the new fault types, on a parameter that adjusts the attenuation factor, and on the cumulative number of historical task samples and the total number of samples after the new task samples are added. On this basis, the loss balance coefficient (ε in Figure 2) is further defined. The remaining quantities in these definitions are the category balance coefficients obtained in step S3, the classification loss gradient, and the distribution-aligned knowledge distillation loss gradient.
[0069] Finally, the model parameters are updated using the weighted fusion of these gradients. Through the aforementioned gradient decoupling and fusion mechanism, the two optimization objectives of classification learning and knowledge distillation are modeled in a unified manner at the gradient level, and dynamic weights are allocated according to the characteristics of new and old tasks, thereby enabling the model to adapt quickly to new tasks while effectively retaining knowledge of historical tasks.
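The fusion of the two gradients can be illustrated with a deliberately simplified sketch. It assumes that the loss balance ratio is the ratio of the gradient norms, so that the distillation gradient is rescaled to the magnitude of the classification gradient, and that a single scalar weight stands in for the category and new/old balance coefficients. These forms and all names are assumptions; the exact expressions of the embodiment are not reproduced in this text.

```python
import math


def l2_norm(vec):
    return math.sqrt(sum(v * v for v in vec))


def loss_balance_ratio(grad_ce, grad_kd, eps=1e-12):
    """Assumed form: norm of the classification gradient over norm of the distillation gradient."""
    return l2_norm(grad_ce) / (l2_norm(grad_kd) + eps)


def fused_update(params, grad_ce, grad_kd, class_weight, lr):
    """One descent step on the weighted sum of the two gradients."""
    ratio = loss_balance_ratio(grad_ce, grad_kd)
    fused = [class_weight * c + ratio * d for c, d in zip(grad_ce, grad_kd)]
    return [p - lr * f for p, f in zip(params, fused)]


# Toy gradients: classification gradient with norm 5, distillation gradient with norm 0.5.
grad_ce = [3.0, 4.0]
grad_kd = [0.0, 0.5]
ratio = loss_balance_ratio(grad_ce, grad_kd)
updated = fused_update([1.0, 1.0], grad_ce, grad_kd, class_weight=1.0, lr=0.1)
print(ratio, updated)
```

Under this assumption the much smaller distillation gradient is scaled up by about 10, so neither objective dominates the update purely because of its raw magnitude.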
[0070] This method establishes a collaborative optimization mechanism between new and old knowledge from the perspective of gradient update, which effectively alleviates the catastrophic forgetting problem in the incremental learning process.
[0071] Step S6: In this embodiment, after completing the model training in steps S1 to S5, the bearing vibration signal to be diagnosed is input into the trained incremental fault diagnosis model. The incremental fault diagnosis model first extracts features from the input signal, outputs the predicted probability corresponding to each fault category through a classification layer, and finally determines the fault category to which the sample to be diagnosed belongs based on the maximum probability principle.
[0072] By using the above methods, accurate identification of different operating stages and different fault types can be achieved, thereby completing the bearing fault diagnosis task.
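The maximum probability principle of step S6 amounts to selecting the category with the largest predicted probability. A minimal sketch, with a hypothetical function name and made-up probabilities:

```python
def diagnose(probabilities):
    """Return the index of the fault category with the highest predicted probability."""
    if not probabilities:
        raise ValueError("probabilities must not be empty")
    return max(range(len(probabilities)), key=lambda j: probabilities[j])


# Hypothetical classifier output for one vibration sample over four categories.
predicted_label = diagnose([0.05, 0.10, 0.70, 0.15])
print(predicted_label)
```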
[0073] In summary, this embodiment presents a bearing fault diagnosis method for data-imbalance scenarios based on gradient distribution modeling and incremental learning mechanisms. Compared with traditional methods, this embodiment mitigates the impact of intra-task class imbalance and inter-task sample imbalance on the model training process by introducing adaptive weighting of category gradients, knowledge distillation with distribution difference awareness, and decoupling and fusion strategies for new and old task gradients. This improves the stability and recognition performance of the diagnostic model under incremental fault types, making it more suitable for the needs of practical industrial applications.
[0074] Working principle: As shown in Figures 1-3, a method and device for incremental fault diagnosis of unbalanced rotating mechanical components based on gradient distribution correction is proposed. This method addresses the problem that existing incremental fault diagnosis methods struggle to handle class imbalance within a task and sample imbalance between tasks at the same time in dynamic data environments, which leads to gradient contribution imbalance, catastrophic forgetting, and insufficient ability to identify minority class faults during model training, thus affecting overall diagnostic performance.
[0075] 1. By introducing a class-adaptive weighting mechanism based on historical cumulative gradient information, the model dynamically adjusts the samples of different classes from the perspective of gradient contribution, effectively alleviating the class imbalance problem within the task and improving the model's ability to distinguish minority class faults.
2. By constructing a weighted knowledge distillation strategy based on the differences in the distribution of new and old tasks, the degree of data imbalance is introduced into the distillation process, thereby achieving adaptive control of the intensity of knowledge transfer of different categories, and effectively reducing the interference of sample imbalance between tasks on the model learning process.
3. By decoupling the gradient of new tasks from that of historical tasks and designing a corresponding weighted fusion mechanism, collaborative optimization of new and old knowledge is achieved during the parameter update process, thereby enhancing the learning ability for new fault types and effectively suppressing catastrophic forgetting.
4. Starting from the perspective of gradient distribution modeling, this invention constructs a unified dual imbalance control mechanism, which gives the model stronger robustness and adaptability under incremental fault types and complex imbalance scenarios, thereby improving the incremental fault diagnosis performance as a whole.
[0076] Example 4: A computer device includes a memory, a processor, and a computer program stored in the memory and executable on the processor. When the processor executes the program, it implements the incremental fault diagnosis method for unbalanced rotating mechanical components based on gradient distribution correction from any of Examples 1 to 3.
[0077] The above specific embodiments are specific support for the concept proposed in this invention, and should not be used to limit the scope of protection of this invention. Any equivalent changes or modifications made on the basis of this technical solution in accordance with the technical concept proposed in this invention shall still fall within the scope of protection of this invention.
Claims
1. A method for diagnosing incremental unbalance faults in rotating mechanical components based on gradient distribution correction, characterized in that it includes the following steps:
obtaining the vibration data of the bearing to be diagnosed;
inputting the bearing vibration data to be diagnosed into a pre-trained incremental fault diagnosis model, wherein the incremental fault diagnosis model performs diagnostic analysis on the bearing vibration data to be diagnosed, obtains the corresponding fault category label, and uses the fault category label as the diagnosis result;
wherein the establishment of the pre-trained incremental fault diagnosis model includes:
Step S1: Collect vibration data of the bearing at different operating stages and construct an incremental dataset according to the task sequence;
Step S2: Construct an incremental fault diagnosis model and train the incremental fault diagnosis model in stages using the incremental dataset;
Step S3: During the training of the incremental fault diagnosis model, the cumulative gradient information of each category of samples in the historical training stage is calculated, and the weight coefficient of each category of samples is determined based on the cumulative gradient information; the gradients of different categories of samples are adaptively weighted according to the weight coefficients, and the model parameters of the incremental fault diagnosis model are updated using the weighted gradients to adjust the contribution of each category of samples to the update of model parameters;
Step S4: Calculate the distribution difference between new task data and historical task data, and adjust the knowledge distillation process in the incremental fault diagnosis model training based on the distribution difference;
Step S5: During model training, the gradients of new tasks and historical tasks are decoupled, and the decoupled gradients are weighted and fused to update the model parameters, so as to coordinate the incremental fault diagnosis model's ability to learn new tasks and retain knowledge of historical tasks.
2. The method for diagnosing incremental unbalance faults in rotating mechanical components based on gradient distribution correction according to claim 1, characterized in that: in step S1, the construction of the incremental dataset includes: the bearing operation process is divided into multiple consecutive stages, with the stage number denoted by t; the training dataset corresponding to stage t consists of $X_t$ and $Y_t$, where $X_t$ represents the set of vibration signal samples collected during this stage and $Y_t$ is the corresponding fault category label set; the data for each stage are arranged sequentially in chronological order, and each stage of data constitutes an incremental task, forming an incremental dataset that arrives gradually over time.
3. The method for diagnosing incremental unbalance faults in rotating mechanical components based on gradient distribution correction according to claim 2, characterized in that: in step S2, the construction of the incremental fault diagnosis model includes: the incremental fault diagnosis model is defined together with its model parameters; in stage t, the incremental fault diagnosis model is trained on the corresponding training dataset, and a classification loss function based on cross-entropy is used to update the model parameters of the incremental fault diagnosis model.
4. The method for diagnosing incremental unbalance faults in rotating mechanical components based on gradient distribution correction according to claim 3, characterized in that: in step S3, for each fault category k in the current stage, the cumulative gradient magnitude at the i-th iteration is calculated from the gradient with respect to the weight vector corresponding to the k-th class in the fully connected layer at the n-th iteration of the model, where C represents the total number of known fault categories; a balance ratio is calculated as the ratio of the minimum cumulative gradient magnitude of all categories to the cumulative gradient magnitude of the k-th category; the model parameters are updated according to the balance ratio, where α represents the learning rate; finally, the Softmax function is adjusted, the updated function being calculated from the output logit of category j and a prior probability estimate used to adjust the output distribution.
5. The method for diagnosing incremental unbalance faults in rotating mechanical components based on gradient distribution correction according to claim 4, characterized in that: in step S4, the distribution difference between the old and new task data is calculated for each fault category k in the set of fault categories of the t-th stage; the lost training data distribution of category k is defined from the initial number of training samples of fault category k and the number of exemplar samples stored for this category; based on the lost training data distribution, the distribution entropy is calculated from the prior probability of each fault category k in the training data, where C is the total number of known fault categories; and the weighting coefficient is determined from the distribution entropy and the maximum entropy under the condition of uniform distribution.
6. The method for diagnosing incremental unbalance faults in rotating mechanical components based on gradient distribution correction according to claim 5, characterized in that: a distribution-aligned knowledge distillation loss is constructed from two terms, one calculated on the original output logit and one calculated on the adjusted output logit; the adjusted output logit is obtained using the amount of lost training data of each class m; for the calibrated logit, a head class j, which has a larger amount of lost training data, produces a greater bias, so a higher intensity of knowledge distillation optimization is assigned to the head class, while a tail class, which has a smaller amount of lost data, only requires mild distillation intervention to maintain the learned knowledge; based on the distribution-aligned knowledge distillation loss, the knowledge distillation process during model training is weighted and adjusted to mitigate the impact of data imbalance between tasks.
7. The method for diagnosing incremental unbalance faults in rotating mechanical components based on gradient distribution correction according to claim 6, characterized in that: in step S5, the gradients of the learned fault types and the new fault types are corrected respectively, with the set of learned fault types and the set of new fault types defined separately; for each fault type k, a new/old task balancing ratio is introduced, which depends on the ratio of the average cumulative gradient of the learned fault types to that of the new fault types, on a parameter that adjusts the attenuation factor, and on the cumulative number of historical task samples and the total number of samples after the new task samples are added; a loss balance ratio is defined to balance the magnitudes of the knowledge distillation gradient and the cross-entropy gradient; the model parameters are updated using the balance factor, the cross-entropy loss gradient, and the distribution-aligned knowledge distillation loss gradient; the gradients of the new and old tasks are decoupled and fused by weighting and summing the two types of gradients respectively, so as to coordinate the model's ability to learn new tasks and retain knowledge of historical tasks.
8. The method for diagnosing incremental unbalance faults in rotating mechanical components based on gradient distribution correction according to claim 1, characterized in that: The incremental fault diagnosis model performs diagnostic analysis on the bearing vibration data to be diagnosed, including the following steps: The incremental fault diagnosis model extracts features from the input bearing vibration data to be diagnosed, outputs the predicted probability corresponding to each fault category through the classification layer, and determines the fault category to which the sample to be diagnosed belongs based on the principle of maximum probability.
9. A computer device, comprising a memory, a processor, and a computer program stored in the memory and executable on the processor, characterized in that, When the processor executes the program, it implements the incremental fault diagnosis method for unbalanced rotating mechanical components based on gradient distribution correction as described in any one of claims 1 to 8.