Tool state cross-domain prediction method based on segmented feature alignment and trusted pseudo label
By using segmented feature alignment and reliable pseudo-label filtering, the problem of sample feature aliasing in cross-domain prediction of tool status is solved, achieving high-precision tool wear prediction in unlabeled target domains and improving the stability and accuracy of the model.
Patent Information
- Authority / Receiving Office
- CN · China
- Patent Type
- Applications(China)
- Current Assignee / Owner
- HARBIN INST OF TECH
- Filing Date
- 2026-03-19
- Publication Date
- 2026-06-19
AI Technical Summary
Existing cross-domain tool condition prediction methods cannot accurately align conditional distributions, resulting in overlapping sample features from different wear stages and reducing the accuracy of regression prediction models.
A method based on segmented feature alignment and credible pseudo-labels is adopted. By classifying and clustering source and target domain data, credible pseudo-labels are selected to achieve accurate alignment and transfer of cross-domain regression prediction models.
Without requiring target domain labeled data, it improves the accuracy and reliability of cross-domain prediction of tool state, reduces the risk of negative transfer, and enhances the model's prediction accuracy in the target domain.
Figure CN122241308A_ABST
Abstract
Description
Technical Field
[0001] This invention belongs to the field of intelligent manufacturing and machining.
Background Technology
[0002] With the development of CNC machining technology and intelligent manufacturing, the changes in the state of cutting tools during the machining process directly affect machining accuracy, efficiency, and equipment safety. Therefore, predicting tool wear or remaining service life has become an important research direction in the field of intelligent manufacturing. In recent years, with the development of various sensor technologies such as vibration, current, and acoustic emission, researchers have been able to collect multi-source sensor signals during the machining process and use data-driven methods to build predictive models, thereby achieving online monitoring and prediction of tool wear.
[0003] In practical industrial applications, due to variations in processing materials, processing parameters, tool types, and machine tool equipment, data collected under different processing conditions often exhibits significant distributional differences. Data with labels is typically referred to as source domain data, while data from different processing conditions but lacking labels is called target domain data. When a prediction model trained on source domain data is directly applied to target domain data, the model's predictive performance often deteriorates significantly due to the distributional differences between the two domains. Therefore, how to effectively transfer tool condition prediction models between different processing conditions has become a crucial issue in tool condition monitoring research.
[0004] Existing research mainly employs two types of methods to achieve cross-domain transfer of tool condition prediction models. The first type is transfer learning based on model fine-tuning. This method typically trains a prediction model on source domain data and then retrains or fine-tunes the model using a small amount of labeled data from the target domain, gradually adapting the model to the data distribution of the target domain. However, this type of method relies on manual labeling of the target domain data, while in real-world industrial environments, obtaining tool wear labels usually requires offline measurement or manual inspection, which is not only complex but also costly. The second type is transfer learning based on domain adaptation. This type of method typically introduces feature distribution alignment strategies during model training, minimizing the difference in feature distribution between the source and target domains, enabling the model to maintain good predictive ability in the target domain. Compared to fine-tuning-based methods, this type of method usually does not require labeled target domain data, thus having an advantage when target domain data is difficult to label. However, in regression tasks such as tool condition prediction, the aforementioned domain adaptation methods still have certain limitations. Existing methods primarily achieve cross-domain transfer by aligning the overall feature distributions of the source and target domains. However, in tasks with distinct stages or degradation processes, such as tool wear, the data distributions at different degradation stages differ significantly. This leads to aliasing of data from different degradation stages in the feature space, making it impossible to accurately align the conditional distributions. 
Consequently, the aliasing of sample features from different wear stages results in negative transfer in the regression prediction model, reducing its accuracy in predicting tool condition. Therefore, in situations where labeled data is lacking in the target domain, how to fully utilize existing source domain information and unlabeled data in the target domain to construct a cross-domain regression prediction method that can accurately align the feature distributions of different wear stages and effectively utilize reliable pseudo-label information remains a crucial technology that urgently needs to be explored in the field of tool condition prediction.
Summary of the Invention
[0005] The purpose of this invention is to address the problem that existing cross-domain tool condition prediction methods cannot accurately align conditional distributions, so that sample features from different wear stages are aliased, which causes negative transfer in the regression prediction model and reduces the accuracy of tool condition prediction. This invention provides a cross-domain tool condition prediction method based on segmented feature alignment and reliable pseudo-labels.
[0006] A cross-domain tool state prediction method based on segmented feature alignment and reliable pseudo-labels includes:
[0007] S1. Data Acquisition and Preprocessing: Acquire multi-channel sensor data and corresponding tool wear value labels collected at each machining step during the source domain tool machining process, and acquire unlabeled multi-channel sensor data collected at each machining step during the target domain tool machining process; preprocess the tool data of the source domain and the target domain to obtain the source domain dataset and the target domain dataset.
[0008] The multi-channel sensor data collected at each machining step during the source domain tool machining process and their corresponding tool wear value labels are used as the input and output data of the source domain regression samples in the source domain dataset, respectively; the unlabeled multi-channel sensor data collected at each machining step during the target domain tool machining process are used as the input data of the target domain regression samples in the target domain dataset.
[0009] The source domain dataset and the target domain dataset have the same number of channels for multi-channel sensor data, and the sensor data categories may be the same or different.
[0010] S2. Source Domain Regression Model Training: The source domain regression model is trained using each source domain regression sample.
[0011] S3. Classification Prior Guidance: Cluster the wear value labels of the source domain dataset to obtain the classification labels of the source domain dataset. The classification labels include health and wear.
[0012] The multi-channel sensor data and their corresponding classification labels corresponding to each processing step in the source domain dataset are used as the input and output data of the source domain classification samples, respectively, to train the source domain classification model.
[0013] S4. Cross-domain classification feature alignment:
[0014] Based on source domain classification samples and target domain classification samples, the distribution difference measure algorithm (MMD algorithm) is used to transfer the trained source domain classification model to the target domain to obtain the target domain classification model;
[0015] The multi-channel sensor data corresponding to each processing step in the target domain dataset is used as the input data for the target domain classification sample; the target domain classification model is used to infer the input data of each target domain classification sample to obtain the classification label of the target domain classification sample.
[0016] S5. Cross-domain preliminary segmented alignment of regression features: Based on the samples with the same classification label in the source domain dataset and the target domain dataset, the distribution difference measurement algorithm is used to preliminarily transfer the trained source domain regression model to the target domain to obtain the preliminarily transferred target domain regression model.
[0017] S6. Credible pseudo-label screening: Use the preliminarily transferred target domain regression model to extract features from, and run inference on, the input data of each regression sample in the source domain dataset and the target domain dataset respectively, obtaining the source domain regression features, source domain predicted values, target domain regression features, and target domain predicted values accordingly.
[0018] Based on the source domain regression features, source domain predicted values, target domain regression features, and target domain predicted values, reliable pseudo-labels for the target domain are selected from the target domain predicted values and used as reliable pseudo-labels for the target domain regression samples.
[0019] The input data of each target domain regression sample and its corresponding target domain credible pseudo-label are used as the input data and output data of the candidate regression sample, respectively.
[0020] S7. Cross-domain fine alignment of regression features: Based on each source domain regression sample and the candidate regression samples, the distribution difference measurement algorithm is used to transfer the preliminarily transferred target domain regression model to the target domain again to obtain the final target domain regression model.
[0021] S8. Actual Inference Stage:
[0022] Multi-channel sensor data is collected during the current target domain tool machining process and used as input data for the final target domain regression model to predict tool wear value, thus obtaining the tool wear value prediction result.
[0023] Preferably, the preprocessing of the source domain and target domain tool data in step S1 includes data filtering and cleaning.
[0024] Preferably, in step S4, the trained source domain classification model is transferred to the target domain to obtain the target domain classification model, as follows:
[0025] Take any source domain classification sample from the source domain dataset and any target domain classification sample from the target dataset as a set of classification samples;
[0026] The input data of the two classification samples in each group of classification samples are used as the input data of the trained source domain classification model, and the classification label of the source domain classification sample in the group of classification samples is used as the true value.
[0027] While training the source domain classification model using source domain classification samples from each group of classification samples, a distribution difference metric algorithm is used to calculate and minimize the difference in classification features between the input data of two classification samples in each group of classification samples input to the trained source domain classification model, thereby obtaining the target domain classification model.
[0028] Preferably, in step S5, the trained source domain regression model is initially transferred to the target domain to obtain the initially transferred target domain regression model, which is achieved as follows:
[0029] Take any one source domain regression sample from the source domain dataset and any one target domain regression sample from the target dataset as a set of regression samples, and the tool wear value label category of the two regression samples in this set of regression samples is the same.
[0030] The input data of the two regression samples in each group of regression samples are used as the input data of the trained source domain regression model, and the tool wear value label of the source domain regression sample in the group of regression samples is used as the true value.
[0031] While training the source domain regression model using the source domain regression sample from each group of regression samples, the distribution difference metric algorithm is used to calculate and minimize the difference in regression features between the input data of the two regression samples in each group that are input to the trained source domain regression model, thereby obtaining the preliminarily transferred target domain regression model.
[0032] Preferably, in step S6, the method for using the preliminary transfer target domain regression model to extract features and infer the input data of each source domain regression sample in the source domain dataset, and thus obtaining the source domain regression features and source domain predicted values, is as follows:
[0033] The target domain regression model of the initial transfer is used to infer the input data of each source domain regression sample, predict the corresponding source domain prediction value, and extract the source domain regression features in the inference process of the target domain regression model of the initial transfer.
[0034] In step S6, the method for using the preliminary transfer target domain regression model to extract features and infer the input data of each target domain regression sample in the target domain dataset, and thus obtaining the target domain regression features and target domain predicted values, is as follows:
[0035] The target domain regression model with preliminary transfer is used to infer the features of the input data of each target domain regression sample, predict the corresponding target domain prediction value, and extract the target domain regression features in the inference process of the target domain regression model with preliminary transfer.
[0036] Preferably, in step S6, the method for filtering out the reliable pseudo-labels of the target domain from the predicted values of the target domain is as follows:
[0037] S51. Divide all the target domain predicted values and their corresponding target domain regression features under each processing step into a set of target domain data;
[0038] S52. Use all source domain regression features and source domain predicted values to filter a set of target domain data under each processing step, and select candidate pseudo-labels under that processing step.
[0039] S53. Perform spatial clustering analysis on the candidate pseudo-labels under each processing step. When the clustering results have a main cluster and the proportion of candidate pseudo-labels in the main cluster exceeds the threshold, the average value of the candidate pseudo-labels in the main cluster is taken as the target domain credible pseudo-label of all target domain regression samples under that processing step.
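The main-cluster check in S53 can be illustrated with a minimal sketch. The source does not name the spatial clustering algorithm or the threshold value, so the gap-based one-dimensional grouping below, the function name `credible_pseudo_label`, and the parameters `gap` and `min_ratio` are all assumptions chosen only for illustration.

```python
def credible_pseudo_label(candidates, gap=5.0, min_ratio=0.6):
    """Return the mean of the main cluster of candidate pseudo-labels,
    or None when no cluster holds a large enough share of the candidates.

    Assumed stand-in for the "spatial clustering analysis" of S53:
    sorted values are grouped, and a new cluster starts whenever the
    distance to the previous value exceeds `gap`.
    """
    if not candidates:
        return None
    ordered = sorted(candidates)
    clusters = [[ordered[0]]]
    for value in ordered[1:]:
        if value - clusters[-1][-1] <= gap:
            clusters[-1].append(value)      # close enough: same cluster
        else:
            clusters.append([value])        # large jump: start a new cluster
    main = max(clusters, key=len)           # the largest cluster is the main cluster
    if len(main) / len(ordered) > min_ratio:
        return sum(main) / len(main)        # one shared label for this machining step
    return None                             # no dominant cluster: step is not trusted


# Hypothetical candidate pseudo-labels (wear values) for two machining steps.
step_a = credible_pseudo_label([50.0, 51.0, 52.0, 53.0, 90.0])
step_b = credible_pseudo_label([10.0, 11.0, 40.0, 41.0])
print(step_a, step_b)
```

In this sketch a machining step whose candidates split evenly between two groups yields no pseudo-label at all, which mirrors the intent of keeping only labels supported by a dominant cluster.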
[0040] Preferably, the implementation method for S52, which filters a set of target domain data under each processing step, is as follows:
[0041] Target domain predicted values that meet the screening criteria are taken as candidate pseudo-labels;
[0042] The filtering criteria are:
[0043] ;
[0044] ;
[0045] where the symbols denote, in order: the target domain predicted value corresponding to a target domain regression sample under the current processing step; the source domain predicted values corresponding to individual source domain regression samples in the source domain dataset; the source domain predicted value that, among all source domain regression samples in the source domain dataset, has the smallest difference from the target domain predicted value; the target domain regression feature corresponding to the target domain predicted value; the source domain regression features corresponding to each of these source domain predicted values; and the total number of source domain predicted values in the source domain dataset.
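The quantities named above can be illustrated with a short sketch that, for one target domain prediction, looks up the source domain prediction with the smallest difference and the regression feature that belongs to it. The screening inequalities themselves are not reproduced in this text, so the sketch stops at computing the inputs they would compare. The function name `nearest_source_match` and the use of Euclidean distance between features are assumptions.

```python
import math


def nearest_source_match(target_pred, target_feat, source_preds, source_feats):
    """Find the source sample whose predicted value is closest to a target prediction.

    Returns the index of that source sample, the absolute prediction gap,
    and the distance between the two regression feature vectors.
    """
    idx = min(range(len(source_preds)),
              key=lambda j: abs(source_preds[j] - target_pred))
    pred_gap = abs(source_preds[idx] - target_pred)
    # Euclidean distance is an assumed feature-similarity measure.
    feat_dist = math.dist(target_feat, source_feats[idx])
    return idx, pred_gap, feat_dist


# Hypothetical source predictions and features, and one target sample.
source_preds = [30.0, 60.0, 90.0]
source_feats = [[0.0, 0.0], [1.0, 1.0], [2.0, 2.0]]
match = nearest_source_match(62.0, [1.0, 2.0], source_preds, source_feats)
print(match)
```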
[0046] Preferably, in step S7, the method for transferring the initially transferred target domain regression model back to the target domain is as follows:
[0047] Take any one source domain regression sample from the source domain dataset and any one target domain regression sample from the target dataset as a set of regression samples;
[0048] The input data of the two regression samples in each group of regression samples are used as the input data of the target domain regression model for the initial transfer, and the tool wear value label of the source domain regression sample in the group of regression samples is used as the true value.
[0049] While training the target domain regression model using source domain regression samples from each group of regression samples, the algorithm of distribution difference measure is used to calculate and minimize the difference in regression features between the input data of two regression samples in each group of regression samples input to the target domain regression model.
[0050] Preferably, the K-Means clustering algorithm is used to cluster the wear value labels of the source domain dataset in step S3.
[0051] Preferably, the distribution difference measurement algorithm is implemented using the MMD algorithm.
[0052] The beneficial effects of this invention are:
[0053] This invention proposes a cross-domain tool condition prediction method based on segmented feature alignment and reliable pseudo-label filtering. By segmenting and aligning data features at different wear stages and combining this with a reliable pseudo-label filtering mechanism, it effectively alleviates the problem of sample overlap at different stages in traditional overall distribution alignment methods, achieving more stable and accurate cross-domain tool condition prediction even when the target domain is unlabeled. Compared with existing technologies, this invention can improve the reliability and prediction accuracy of cross-domain regression prediction models without requiring labeled data in the target domain, demonstrating significant technical improvements and inventiveness. Specific advantages are as follows:
[0054] 1. This invention guides the regression transfer process by introducing classification priors and adopts a segmented feature alignment strategy, which can avoid the overlap of samples at different degradation stages and reduce the risk of negative transfer.
[0055] 2. This invention constructs a reliable pseudo-label screening mechanism that combines prediction error, feature similarity, and statistical consistency, which can effectively suppress the interference of erroneous pseudo-labels on model training;
[0056] 3. This invention employs an auxiliary alignment method using trusted pseudo-labels, which can improve the accuracy of tool wear prediction even when there are no real labels in the target domain, and has good engineering application value.
Attached Figure Description
[0057] Figure 1 is a flowchart of the cross-domain tool state prediction method based on segmented feature alignment and credible pseudo-labels described in this invention;
[0058] Figure 2 is a time-series data diagram of the target domain with 7 channels, where ch1 to ch7 represent, respectively, the X-axis, Y-axis, and Z-axis vibration signals of the machine tool spindle, the phase A, phase B, and phase C power signals of the machine tool power input, and the temperature signal of the machining area during cutting;
[0059] Figure 3 shows the tool wear prediction results of this method when the target domain has no labels, where
[0060] Figure 3(a) shows the prediction results on tool No. 1 in the target domain after using the method of the present invention;
[0061] Figure 3(b) shows the prediction results on tool No. 2 in the target domain after using the method of the present invention.
Detailed Implementation
[0062] The technical solutions of the embodiments of the present invention will be clearly and completely described below with reference to the accompanying drawings. Obviously, the described embodiments are only some embodiments of the present invention, and not all embodiments. Based on the embodiments of the present invention, all other embodiments obtained by those skilled in the art without creative effort are within the scope of protection of the present invention.
[0063] It should be noted that, unless otherwise specified, the embodiments and features described in the present invention can be combined with each other.
[0064] The present invention will be further described below with reference to the accompanying drawings and specific embodiments, but this is not intended to limit the scope of the invention.
[0065] Specific Embodiment 1: With reference to Figure 1, this embodiment describes a cross-domain tool state prediction method based on segmented feature alignment and reliable pseudo-labels, which includes:
[0066] S1. Data Acquisition and Preprocessing: Acquire multi-channel sensor data and corresponding tool wear value labels collected at each machining step during the source domain tool machining process, and acquire unlabeled multi-channel sensor data collected at each machining step during the target domain tool machining process; preprocess the tool data of the source domain and the target domain to obtain the source domain dataset and the target domain dataset.
[0067] Data preprocessing can be achieved through methods such as data filtering and cleaning.
[0068] The source domain dataset and the target domain dataset contain the same number of multi-channel sensor data, and the sensor data categories may be the same or different. In application, the multi-channel sensor data in the target domain can be: the X, Y, and Z axis vibration signals of the machine tool spindle where the tool is located, the A, B, and C phase power signals of the machine tool power input, and the temperature signal of the cutting area during machine tool machining. The multi-channel sensor data in the source domain can be: the X, Y, and Z axis vibration signals of the machine tool spindle where the tool is located, the X, Y, and Z axis cutting force signals of the machine tool spindle, and the acoustic emission signals during machine tool machining.
[0069] The multi-channel sensor data collected at each machining step during the source domain tool machining process and their corresponding tool wear value labels are used as the input and output data of the source domain regression samples in the source domain dataset, respectively; the unlabeled multi-channel sensor data collected at each machining step during the target domain tool machining process are used as the input data of the target domain regression samples in the target domain dataset.
[0070] In practical applications, the sample's input and output data are also used as the model's input and output data, respectively.
[0071] S2. Source Domain Regression Model Training: The source domain regression model is trained using each source domain regression sample.
[0072] S3. Classification Prior Guidance: Cluster the wear value labels of the source domain dataset to obtain the classification labels of the source domain dataset. The classification labels include health and wear. Specifically, the K-Means clustering algorithm is used to cluster the wear value labels of the source domain dataset.
[0073] The multi-channel sensor data and their corresponding classification labels corresponding to each processing step in the source domain dataset are used as the input and output data of the source domain classification samples, respectively, to train the source domain classification model.
[0074] S4. Cross-domain classification feature alignment:
[0075] Based on source domain classification samples and target domain classification samples, a distribution difference measurement algorithm is used to transfer the trained source domain classification model to the target domain to obtain the target domain classification model.
[0076] The multi-channel sensor data corresponding to each processing step in the target domain dataset is used as the input data for the target domain classification sample; the target domain classification model is used to infer the input data of each target domain classification sample to obtain the classification label of the target domain classification sample.
[0077] S5. Cross-domain preliminary segmented alignment of regression features: Based on the samples with the same classification label in the source domain dataset and the target domain dataset, the distribution difference measurement algorithm is used to preliminarily transfer the trained source domain regression model to the target domain to obtain the preliminarily transferred target domain regression model.
[0078] S6. Credible pseudo-label screening: Use the preliminarily transferred target domain regression model to extract features from, and run inference on, the input data of each regression sample in the source domain dataset and the target domain dataset respectively, obtaining the source domain regression features, source domain predicted values, target domain regression features, and target domain predicted values accordingly.
[0079] Based on the source domain regression features, source domain predicted values, target domain regression features, and target domain predicted values, reliable pseudo-labels for the target domain are selected from the target domain predicted values and used as reliable pseudo-labels for the target domain regression samples.
[0080] The input data of each target domain regression sample and its corresponding target domain credible pseudo-label are used as the input data and output data of the candidate regression sample, respectively.
[0081] S7. Cross-domain fine alignment of regression features: Based on each source domain regression sample and the candidate regression samples, the distribution difference measurement algorithm is used to transfer the preliminarily transferred target domain regression model to the target domain again to obtain the final target domain regression model.
[0082] S8. Actual Inference Stage:
[0083] Multi-channel sensor data is collected during the current target domain tool machining process and used as input data for the final target domain regression model to predict tool wear value, thus obtaining the tool wear value prediction result.
[0084] When applied, the distribution difference measurement algorithm can be the MMD algorithm.
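As an illustration of the distribution difference measure, the following is a minimal sketch of a squared maximum mean discrepancy (MMD) estimate between two sets of feature vectors. The Gaussian kernel, the bandwidth `sigma`, and the function names are assumptions; the source only states that the MMD algorithm can be used and does not specify a kernel.

```python
import math


def gaussian_kernel(x, y, sigma=1.0):
    """Gaussian (RBF) kernel between two feature vectors; sigma is an assumed bandwidth."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-sq_dist / (2.0 * sigma ** 2))


def mmd_squared(source_feats, target_feats, sigma=1.0):
    """Biased estimate of the squared MMD between source and target feature sets."""
    def mean_kernel(xs, ys):
        total = sum(gaussian_kernel(x, y, sigma) for x in xs for y in ys)
        return total / (len(xs) * len(ys))

    return (mean_kernel(source_feats, source_feats)
            + mean_kernel(target_feats, target_feats)
            - 2.0 * mean_kernel(source_feats, target_feats))


# Hypothetical two-dimensional features.
source = [[0.0, 0.0], [1.0, 1.0]]
shifted = [[3.0, 3.0], [4.0, 4.0]]
print(mmd_squared(source, source))    # identical sets: no discrepancy
print(mmd_squared(source, shifted))   # shifted set: clearly positive discrepancy
```

During transfer, a term of this kind would be added to the training loss so that minimizing it pulls the source and target feature distributions together.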
[0085] This invention proposes a cross-domain tool wear prediction method based on segmented feature alignment and reliable pseudo-labels. By integrating unsupervised domain adaptation and pseudo-label selection techniques, it effectively solves the problem of decreased prediction accuracy caused by differences in tool wear data distribution under different machining conditions. The core innovation of this method lies in achieving high-precision tool wear prediction model transfer from the source domain to the target domain through segmented feature alignment guided by classification priors and combined with a reliable pseudo-label selection mechanism.
[0086] By performing two types of feature alignment (cross-domain classification feature alignment and regression feature alignment) in classification and regression tasks, the distribution difference between source and target domain data in the feature space is effectively reduced, thereby significantly improving the model's predictive generalization ability in the target domain.
[0087] To enhance alignment accuracy, a classification prior is introduced. By clustering the wear values in the source domain, classification prior information on health and wear is obtained, and the source domain classification model is trained. During the transfer process, the classification model guides the regression model to perform segmented feature alignment according to different wear stages, avoiding the overlapping alignment of data from different wear states, making feature matching more physically meaningful and accurate.
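The classification prior can be sketched as a two-cluster K-Means on the scalar wear values, which is the clustering algorithm named in step S3. The pure-Python implementation below, the initialization at the minimum and maximum values, and the sample wear values are assumptions made for illustration.

```python
def two_means_1d(values, iterations=100):
    """Cluster scalar wear values into two groups with Lloyd's algorithm (k = 2).

    Returns one label per value (0 = lower-wear cluster, 1 = higher-wear cluster)
    and the two cluster centres.
    """
    low, high = min(values), max(values)    # assumed initialisation at the extremes
    for _ in range(iterations):
        # Assignment step: each value joins the nearer centre.
        labels = [0 if abs(v - low) <= abs(v - high) else 1 for v in values]
        group0 = [v for v, lab in zip(values, labels) if lab == 0]
        group1 = [v for v, lab in zip(values, labels) if lab == 1]
        # Update step: centres move to the mean of their members.
        new_low = sum(group0) / len(group0)
        new_high = sum(group1) / len(group1) if group1 else high
        if new_low == low and new_high == high:
            break                            # converged
        low, high = new_low, new_high
    return labels, (low, high)


# Hypothetical wear value labels for eight machining steps.
wear = [20.0, 25.0, 30.0, 35.0, 120.0, 130.0, 140.0, 150.0]
labels, centres = two_means_1d(wear)
stage_names = ["health" if lab == 0 else "wear" for lab in labels]
print(stage_names, centres)
```

The resulting labels serve as the outputs of the source domain classification samples, so the classifier learns the health and wear stages without any manual stage annotation.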
[0088] To address the issue of unlabeled target domains, this invention employs a reliable pseudo-label filtering mechanism. After initial transfer learning, high-confidence pseudo-label samples are selected from the target domain predictions through comparative analysis of source domain features and target domain predictions. These samples act as a bridge, providing supervisory signals for the model's fine-grained alignment and enabling effective learning on completely unlabeled target domains.
[0089] This invention employs a phased transfer strategy to ensure transfer stability, specifically a phased strategy of "preliminary transfer - pseudo-label screening - fine transfer". First, classification alignment is used to coarsely adjust the feature space; then, a preliminary regression model is used to screen reliable samples; finally, fine feature alignment is performed based on these samples. This progressive transfer method effectively prevents negative transfer and ensures that the model gradually adapts to the characteristics of the target domain data.
[0090] The method of this invention, through classification-guided segment alignment and iterative optimization of reliable pseudo-labels, has successfully achieved high-precision wear prediction across working conditions and tooling conditions, and has strong practical value and generalization ability.
[0091] Furthermore, in step S4, the trained source domain classification model is transferred to the target domain to obtain the target domain classification model. The implementation method is as follows:
[0092] Take any source domain classification sample from the source domain dataset and any target domain classification sample from the target dataset as a set of classification samples;
[0093] The input data of the two classification samples in each group of classification samples are used as the input data of the trained source domain classification model, and the classification label of the source domain classification sample in the group of classification samples is used as the true value.
[0094] While training the source domain classification model using source domain classification samples from each group of classification samples, a distribution difference metric algorithm is used to calculate and minimize the difference in classification features between the input data of two classification samples in each group of classification samples input to the trained source domain classification model, thereby obtaining the target domain classification model.
[0095] Step S4 provides a specific method for achieving cross-domain classification feature alignment. This involves randomly pairing classification samples from the source and target domains, and simultaneously inputting the input data of the paired samples into the trained source domain classification model. While using the classification labels of the source domain samples as ground truth for supervised training, a distribution difference measurement algorithm is introduced to calculate and minimize the classification feature difference between each pair of sample input data. This simultaneously completes the preservation of the classification task and the alignment of feature distributions between domains during the same training process. As a result, the final target domain classification model can accurately inherit the prior knowledge of tool health and wear status from the source domain without any target domain labels, and effectively extract domain-invariant feature representations.
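The distribution difference term used in step S4 can be illustrated with a minimal sketch of a squared maximum mean discrepancy (MMD) estimate. The Gaussian kernel, the `gamma` bandwidth, the `weight` factor, and all function names are assumptions made for illustration; the description only states that a distribution difference measurement algorithm (MMD in the example) is minimized together with the supervised loss on the source domain.

```python
import math

def rbf_kernel(x, y, gamma=1.0):
    """Gaussian (RBF) kernel between two feature vectors (kernel choice is an assumption)."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-gamma * sq_dist)

def mmd_squared(source_feats, target_feats, gamma=1.0):
    """Biased estimate of the squared MMD between two sets of feature vectors."""
    def mean_kernel(xs, ys):
        total = sum(rbf_kernel(x, y, gamma) for x in xs for y in ys)
        return total / (len(xs) * len(ys))
    return (mean_kernel(source_feats, source_feats)
            + mean_kernel(target_feats, target_feats)
            - 2.0 * mean_kernel(source_feats, target_feats))

def alignment_objective(task_loss, source_feats, target_feats, weight=1.0):
    """Supervised source-domain loss plus a weighted alignment term (hypothetical weighting)."""
    return task_loss + weight * mmd_squared(source_feats, target_feats)

# Toy features: identical sets give zero discrepancy, shifted sets a large one.
src = [[0.0, 0.0], [0.1, 0.0]]
tgt = [[2.0, 2.0], [2.1, 2.0]]
print(mmd_squared(src, src), mmd_squared(src, tgt))
```

In an actual training loop the features would come from the network's feature extraction layers and the objective would be minimized by gradient descent; the plain-Python version above only shows how the quantity is computed.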
[0096] Furthermore, in step S5, the trained source domain regression model is initially transferred to the target domain, and the implementation method for obtaining the initially transferred target domain regression model is as follows:
[0097] Take any one source domain regression sample from the source domain dataset and any one target domain regression sample from the target domain dataset as a group of regression samples, where the two regression samples in the group have the same classification label (wear-state category).
[0098] The input data of the two regression samples in each group of regression samples are used as the input data of the trained source domain regression model, and the tool wear value label of the source domain regression sample in the group of regression samples is used as the true value.
[0099] While training the source domain regression model using the source domain regression samples from each group of regression samples, the distribution difference measurement algorithm is used to calculate and minimize the difference in regression features between the input data of the two regression samples in each group that are input to the trained source domain regression model, thereby obtaining the preliminarily transferred target domain regression model.
[0100] In step S5, a pairing mechanism guided by classification priors is introduced to achieve the initial transfer of the source domain regression model to the target domain. The core of this approach is to restrict feature alignment to the same wear state category. Specifically, the method first pairs regression samples belonging to the same category (both healthy or both worn) in the source and target domains based on the classification labels obtained in step S3. Then, the input data of the paired samples are simultaneously input into the trained source domain regression model. While using the true wear values of the source domain samples for supervised regression training, a distribution difference measurement algorithm is introduced to calculate and minimize the difference between each pair of samples in the regression feature space. The key effect of this implementation is that by aligning features within the same wear stage, it avoids the confusion of physical meaning caused by mixed alignment of data distributions under different wear states. Thus, while preserving the source domain regression model's ability to predict wear values, it specifically reduces the feature distribution difference between the source and target domains within the same wear stage, providing a more consistent and physically clear initial model for subsequent screening of reliable pseudo-labels and fine-tuning of the model.
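The idea of restricting alignment to samples of the same wear stage can be sketched as follows. For brevity this sketch uses the linear-kernel form of MMD (the squared distance between feature means), which is an assumption; the function names and the toy data are also hypothetical. The example shows a case where the two domains have identical features within each stage but different stage proportions, so a global comparison reports a discrepancy while the segmented comparison does not.

```python
from collections import defaultdict

def feature_mean(feats):
    """Component-wise mean of a list of feature vectors."""
    dim = len(feats[0])
    return [sum(f[d] for f in feats) / len(feats) for d in range(dim)]

def linear_mmd(source_feats, target_feats):
    """Linear-kernel MMD: squared distance between the two feature means."""
    ms, mt = feature_mean(source_feats), feature_mean(target_feats)
    return sum((a - b) ** 2 for a, b in zip(ms, mt))

def segmented_mmd(source_feats, source_labels, target_feats, target_labels):
    """Sum the discrepancy over the wear-stage classes present in both domains."""
    by_class = defaultdict(lambda: ([], []))
    for f, c in zip(source_feats, source_labels):
        by_class[c][0].append(f)
    for f, c in zip(target_feats, target_labels):
        by_class[c][1].append(f)
    total = 0.0
    for src, tgt in by_class.values():
        if src and tgt:  # skip classes missing from one domain
            total += linear_mmd(src, tgt)
    return total

# Same per-stage features, different stage proportions in the two domains.
src_f, src_c = [[0.0], [0.0], [1.0]], ["healthy", "healthy", "worn"]
tgt_f, tgt_c = [[0.0], [1.0], [1.0]], ["healthy", "worn", "worn"]
print(linear_mmd(src_f, tgt_f))                    # global comparison: non-zero
print(segmented_mmd(src_f, src_c, tgt_f, tgt_c))   # stage-wise comparison: zero
```

Here the target domain labels stand in for the classification labels inferred by the target domain classification model, since no true target labels are available during transfer.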
[0101] Furthermore, in step S6, the preliminarily transferred target domain regression model is used to perform feature extraction and inference on the input data of each source domain regression sample in the source domain dataset, and the source domain regression features and source domain predicted values are obtained as follows:
[0102] The preliminarily transferred target domain regression model performs inference on the input data of each source domain regression sample to predict the corresponding source domain predicted value, and the source domain regression features are extracted during that inference.
[0103] In step S6, the preliminarily transferred target domain regression model is likewise used to perform feature extraction and inference on the input data of each target domain regression sample in the target domain dataset, and the target domain regression features and target domain predicted values are obtained as follows:
[0104] The preliminarily transferred target domain regression model performs inference on the input data of each target domain regression sample to predict the corresponding target domain predicted value, and the target domain regression features are extracted during that inference.
[0105] Furthermore, in step S6, the method for filtering out trustworthy pseudo-labels for the target domain from the predicted values of the target domain is as follows:
[0106] S51. Divide all the target domain predicted values and their corresponding target domain regression features under each processing step into a set of target domain data;
[0107] S52. Use all source domain regression features and source domain predicted values to filter a set of target domain data under each processing step, and select candidate pseudo-labels under that processing step.
[0108] Specifically, the method for filtering a set of target domain data under each processing step is as follows:
[0109] A target domain predicted value $\hat{y}_t^i$ that meets the screening criteria is taken as a candidate pseudo-label;
[0110] The screening criteria are:
[0111] $\left| \hat{y}_t^i - \hat{y}_s^* \right| < \min_{j \neq k} \left| \hat{y}_s^j - \hat{y}_s^k \right|$;
[0112] $\left\| f_t^i - f_s^* \right\| < \min_{j \neq k} \left\| f_s^j - f_s^k \right\|$;
[0113] where $j, k \in \{1, \dots, N\}$; $\hat{y}_t^i$ is the target domain predicted value corresponding to the $i$-th target domain regression sample under the current processing step; $\hat{y}_s^j$ is the source domain predicted value corresponding to the $j$-th source domain regression sample in the source domain dataset; $\hat{y}_s^k$ is the source domain predicted value corresponding to the $k$-th source domain regression sample in the source domain dataset; $\hat{y}_s^*$ is the source domain predicted value that, among all source domain regression samples in the source domain dataset, has the smallest difference from the target domain predicted value $\hat{y}_t^i$; $f_t^i$ is the target domain regression feature corresponding to the target domain predicted value $\hat{y}_t^i$; $f_s^*$, $f_s^j$, and $f_s^k$ are the source domain regression features corresponding to the source domain predicted values $\hat{y}_s^*$, $\hat{y}_s^j$, and $\hat{y}_s^k$, respectively; and $N$ is the total number of source domain predicted values in the source domain dataset.
[0114] S53. Perform spatial clustering analysis on the candidate pseudo-labels under each processing step. When the clustering results have a main cluster and the proportion of candidate pseudo-labels in the main cluster exceeds the threshold, the average value of the candidate pseudo-labels in the main cluster is taken as the target domain credible pseudo-label of all target domain regression samples under that processing step.
[0115] In step S6, a screening mechanism combining dual constraints with spatial clustering verification is constructed to extract high-confidence credible pseudo-labels for the target domain from the predictions of the preliminarily transferred model. Specifically, all target domain samples are first grouped by processing step. For each predicted value within a group, the source domain predicted value closest to it in wear value is found. The difference between the target domain predicted value and this nearest source domain predicted value is required to be less than the minimum difference among all source domain predicted values. At the same time, the distance between the regression features corresponding to these two predicted values is required to be less than the feature distance between the nearest samples within the source domain. This ensures that the candidate pseudo-labels are highly similar to known source domain samples in both the predicted-value space and the feature space. Subsequently, spatial clustering analysis is performed on all candidate pseudo-labels selected under the same processing step. Only when the clustering result contains a clear main cluster, and the proportion of candidates within the main cluster exceeds a preset threshold, is the mean of that main cluster used as the final credible pseudo-label for that processing step. By defining the credible range from the prior distribution of the source domain through nearest-neighbor comparison, this approach eliminates unreliable samples whose predictions are unstable or lie at the edge of the distribution. The main-cluster analysis then removes the interference of outlier noise, so that the pseudo-labels finally used are statistically representative and robust, providing reliable and concentrated supervisory signals for the subsequent fine transfer.
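The dual-constraint part of this screening (before clustering) can be sketched as below. This is one possible reading of the criteria, not a definitive implementation: the Euclidean feature distance, the use of the minimum over all distinct source pairs for both thresholds, and every function and variable name are assumptions made for illustration.

```python
import math
from itertools import combinations

def euclidean(a, b):
    """Euclidean distance between two feature vectors (distance choice is an assumption)."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def select_candidates(tgt_preds, tgt_feats, src_preds, src_feats):
    """Return target predictions of one processing step that pass both constraints."""
    pairs = list(combinations(range(len(src_preds)), 2))
    # Thresholds derived from the source domain itself: smallest pairwise
    # difference in predicted value and smallest pairwise feature distance.
    pred_threshold = min(abs(src_preds[j] - src_preds[k]) for j, k in pairs)
    feat_threshold = min(euclidean(src_feats[j], src_feats[k]) for j, k in pairs)
    candidates = []
    for y_t, f_t in zip(tgt_preds, tgt_feats):
        # Source sample whose predicted value is closest to the target prediction.
        nearest = min(range(len(src_preds)), key=lambda j: abs(y_t - src_preds[j]))
        close_in_value = abs(y_t - src_preds[nearest]) < pred_threshold
        close_in_feature = euclidean(f_t, src_feats[nearest]) < feat_threshold
        if close_in_value and close_in_feature:
            candidates.append(y_t)
    return candidates

src_preds = [10.0, 20.0, 30.0]
src_feats = [[0.0, 0.0], [1.0, 0.0], [2.0, 0.0]]
tgt_preds = [12.0, 26.0, 19.0]
tgt_feats = [[0.2, 0.0], [1.9, 0.0], [3.0, 0.0]]
print(select_candidates(tgt_preds, tgt_feats, src_preds, src_feats))
```

In the toy data the third target prediction is close to a source prediction in value but far from it in feature space, so it is rejected even though it passes the first constraint.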
[0116] Furthermore, in step S7, the method for transferring the preliminarily transferred target domain regression model to the target domain again is as follows:
[0117] Take any one source domain regression sample from the source domain dataset and any one target domain regression sample from the target domain dataset as a group of regression samples;
[0118] The input data of the two regression samples in each group of regression samples are used as the input data of the preliminarily transferred target domain regression model, and the tool wear value label of the source domain regression sample in the group is used as the true value.
[0119] While training the target domain regression model using the source domain regression samples from each group of regression samples, the distribution difference measurement algorithm is used to calculate and minimize the difference in regression features between the input data of the two regression samples in each group that are input to the target domain regression model, thereby obtaining the final target domain regression model.
[0120] In step S7, based on the obtained credible pseudo-labels, the method further refines the transfer of the initial target domain regression model to the target domain. This is achieved by continuing the paired sample alignment strategy, pairing any regression sample from the source domain with a regression sample from the target domain bearing a credible pseudo-label. The input data of the paired samples are simultaneously input into the initial transfer model. While using the true wear values of the source domain samples for supervised training to maintain regression accuracy, a distribution difference measurement algorithm is introduced again to calculate and minimize the difference between each pair of samples in the regression feature space. The core effect of this process is that by introducing target domain samples with high-quality pseudo-labels to participate in feature alignment, a transition from relying solely on source domain supervision signals to incorporating target domain information is achieved. This allows the model to directly perceive the data structure of the target domain during feature alignment, thereby achieving a more refined balance between prior knowledge of the source domain and the true distribution of the target domain, ultimately obtaining a final regression model highly adapted to the characteristics of the target domain.
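One way to write down the training objective described for step S7 is sketched below. It is an illustrative formulation rather than a formula stated in this description: the squared-error regression loss, the trade-off weight $\lambda$, and the symbols $g$, $n_s$, $F_s$, and $\tilde{F}_t$ are assumptions introduced here.

```latex
\mathcal{L}_{\mathrm{S7}}
  = \frac{1}{n_s} \sum_{i=1}^{n_s} \left( g\!\left(x_s^i\right) - y_s^i \right)^2
  + \lambda \, \mathrm{MMD}^2\!\left( F_s, \tilde{F}_t \right)
```

Here $g$ denotes the regression model being refined, $x_s^i$ and $y_s^i$ are the input data and true wear value of the $i$-th of $n_s$ source domain regression samples, $F_s$ is the set of regression features of the source domain samples, and $\tilde{F}_t$ is the set of regression features of the target domain samples that carry credible pseudo-labels. The first term keeps the regression accuracy learned from the source domain, and the second term pulls the two feature distributions together.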
[0121] Verification experiment:
[0122] The technical effect of the present invention will be verified through the following example 1, as follows:
[0123] The source domain data comes from PHM2010, a public labeled tool wear experimental dataset. This source domain dataset includes 7 channels of data: 3-axis vibration signals (X, Y, and Z axis vibration signals of the machine tool spindle), 3-axis cutting force signals (X, Y, and Z axis cutting force signals of the machine tool spindle), and an acoustic emission signal, together with the corresponding tool wear values. That is, the dataset includes data for three milling cutters and the corresponding regression labels (i.e., tool wear values). Each milling cutter includes 315 milling processes under the same cutting parameters.
[0124] The target domain data comes from sensor signals collected under machining conditions that differ from the source domain dataset. By constructing a feature extraction network and a regression prediction network, and combining them with steps S1 to S8 above, the tool wear state in the target domain is predicted. The specific steps are as follows:
[0125] (1) Install vibration sensors, current sensors, voltage sensors, and other sensors on the milling machine tool in the target domain to collect the three-axis vibration signals of the spindle (X, Y, and Z axis vibration signals of the machine tool spindle), the power signals of the A, B, and C phases of the motor, and the temperature signal of the cutting area during machining, for a total of 7 channels of data, and save them according to the machining sequence. The collected 7-channel time-series data is shown in Figure 2. A total of two milling cutters (target domain cutter 1 and target domain cutter 2) were used on the same milling machine tool for cutting experiments. Each cutter included 100 cutting processes under the same machining parameters. In addition, after each machining pass, the tool wear value was measured and recorded using a microscope. The tool wear values were mapped one-to-one to the multi-channel sensor data to construct the target domain dataset.
[0126] In Figure 2, the horizontal axis represents the sampling points (time) and the vertical axis represents the value of the sensor data at the corresponding time.
[0127] (2) Use the K-Means clustering algorithm to cluster the wear value labels of the source domain data to obtain the classification labels of the source domain data. Use these labels to train a one-dimensional convolutional neural network (1DCNN) to obtain a source domain tool state classification model. Use this classification model to extract features from the source domain data and the target domain data to obtain the classification features of the source domain data and the target domain data, respectively. Use the maximum mean discrepancy (MMD) to measure the difference in the distribution of classification features between the source domain data and the target domain data, and minimize this difference during training to obtain a target domain classification model adapted to the target domain.
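The label clustering at the start of step (2) can be illustrated with a minimal one-dimensional K-Means sketch with two clusters, matching the two classification labels (healthy and worn) used by the method. The initialization from the minimum and maximum wear values, the function name, and the toy wear values are assumptions for illustration only.

```python
def kmeans_two_stage(wear_values, iterations=100):
    """Split scalar wear values into 'healthy' (low) and 'worn' (high) with K-Means, k = 2."""
    # Initial cluster centers: smallest and largest wear value (an assumption).
    low, high = min(wear_values), max(wear_values)
    for _ in range(iterations):
        healthy = [w for w in wear_values if abs(w - low) <= abs(w - high)]
        worn = [w for w in wear_values if abs(w - low) > abs(w - high)]
        new_low = sum(healthy) / len(healthy)
        new_high = sum(worn) / len(worn) if worn else high
        if (new_low, new_high) == (low, high):
            break  # centers stopped moving: converged
        low, high = new_low, new_high
    return ["healthy" if abs(w - low) <= abs(w - high) else "worn" for w in wear_values]

# Hypothetical wear values for seven machining steps of one cutter.
wear = [50, 55, 60, 62, 140, 150, 160]
print(kmeans_two_stage(wear))
```

The resulting labels would then serve as the classification targets when training the source domain classification model.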
[0128] (3) Use source domain data and source domain regression labels to train 1DCNN to obtain source domain regression prediction model, and extract features from source domain data and target domain data respectively to obtain regression features of source domain data and regression features of target domain data; use the target domain classification model obtained in (2) to classify target domain data to obtain degradation stage classification information corresponding to regression features; use segmented MMD to calculate the difference in regression feature distribution of source domain and target domain regression features and minimize the difference during training to initially complete cross-domain transfer and obtain the initial transferred target domain regression model;
[0129] (4) Based on the preliminarily transferred target domain regression model, predict on the source domain data to obtain the source domain predicted values $\hat{y}_s$ and the corresponding regression features $f_s$, and predict on the target domain data to obtain the predicted values of the target domain samples $\hat{y}_t$ and their corresponding regression features $f_t$. All target domain predicted values and corresponding target domain regression features are grouped according to the processing step.
[0130] Select a step-sequence group that has not yet been screened. For each target domain predicted value $\hat{y}_t^i$ in the group, calculate the difference between it and all source domain predicted values, and select the source domain predicted value $\hat{y}_s^*$ with the smallest difference and the corresponding source domain regression feature $f_s^*$. Then check the following two criteria separately:
[0131] $\left| \hat{y}_t^i - \hat{y}_s^* \right| < \min_{j \neq k} \left| \hat{y}_s^j - \hat{y}_s^k \right|$;
[0132] $\left\| f_t^i - f_s^* \right\| < \min_{j \neq k} \left\| f_s^j - f_s^k \right\|$;
[0133] where $f_t^i$ is the target domain regression feature corresponding to the target domain predicted value $\hat{y}_t^i$, and $\hat{y}_s^j$, $\hat{y}_s^k$ and $f_s^j$, $f_s^k$ are the predicted values and regression features of the $j$-th and $k$-th source domain samples. If the predicted value $\hat{y}_t^i$ and its corresponding feature $f_t^i$ satisfy both criteria, the predicted value is added to the candidate pseudo-labels of the group.
[0134] After all the target domain predictions in this group have been judged as described above, DBSCAN clustering is used on the candidate pseudo-labels in the group. The judgment is based on the main cluster. If there is a main cluster in this step and the number of samples in the main cluster accounts for more than 50%, the average value of the candidate pseudo-labels that make up the main cluster is taken as the reliable pseudo-label of this step. Otherwise, the candidate pseudo-label of this step is eliminated.
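The main-cluster check described above can be sketched with a small DBSCAN written for scalar values. The `eps` and `min_samples` settings, the treatment of the largest cluster as the main cluster, and all names are assumptions for illustration; the 50% share matches the threshold given in this example.

```python
def dbscan_1d(values, eps, min_samples):
    """Minimal DBSCAN for scalar values; returns one cluster id per value (-1 = noise)."""
    n = len(values)
    neighbors = [[j for j in range(n) if abs(values[i] - values[j]) <= eps]
                 for i in range(n)]
    labels = [None] * n
    cluster = -1
    for i in range(n):
        if labels[i] is not None:
            continue
        if len(neighbors[i]) < min_samples:
            labels[i] = -1  # provisional noise, may become a border point later
            continue
        cluster += 1
        labels[i] = cluster
        queue = list(neighbors[i])
        while queue:
            j = queue.pop()
            if labels[j] == -1:
                labels[j] = cluster  # noise reached from a core point: border point
            if labels[j] is not None:
                continue
            labels[j] = cluster
            if len(neighbors[j]) >= min_samples:
                queue.extend(neighbors[j])  # expand only through core points
    return labels

def trusted_pseudo_label(candidates, eps=2.0, min_samples=3, min_share=0.5):
    """Mean of the main cluster if it holds more than min_share of the candidates, else None."""
    if not candidates:
        return None
    labels = dbscan_1d(candidates, eps, min_samples)
    cluster_ids = [c for c in labels if c != -1]
    if not cluster_ids:
        return None  # no cluster at all: discard this processing step
    main = max(set(cluster_ids), key=cluster_ids.count)
    members = [v for v, c in zip(candidates, labels) if c == main]
    if len(members) / len(candidates) <= min_share:
        return None
    return sum(members) / len(members)

print(trusted_pseudo_label([100.0, 101.0, 99.0, 100.0, 130.0]))  # one outlier is ignored
print(trusted_pseudo_label([90.0, 100.0, 110.0, 120.0]))         # no dense cluster
```

A return value of `None` corresponds to eliminating the candidate pseudo-labels of that step, so that step contributes no sample to the target domain subset used in the fine transfer.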
[0135] Iterate through all step sequence groups until each target domain group has undergone the above filtering steps. Assign a one-to-one correspondence between the adopted trusted pseudo-labels and their corresponding step sequences and corresponding multi-channel sensor data to form a target domain subset containing the trusted pseudo-labels.
[0136] (5) Using the target domain credible pseudo-labels obtained in step (4) as a reference, the regression feature centers of the source domain and the target domain are aligned. The distribution difference measurement algorithm is used to transfer the initially migrated target domain regression model to the target domain again to obtain the final target domain regression model, that is: the final cross-domain prediction model of tool status is obtained.
[0137] Then, the transferred model (i.e., the final target domain regression model) is evaluated using the wear value labels from the target domain dataset. It should be noted that the target domain label information is not used during model transfer and training; it is only used during model evaluation to verify the accuracy of the prediction results. The prediction performance of the regression model after the above adaptation process on the target domain datasets for tools 1 and 2 is shown in Figure 3 (a) and (b).
[0138] Here, Figure 3 (a) shows the prediction result on the data of target domain tool 1 after applying the method of the present invention, and Figure 3 (b) shows the prediction result on the data of target domain tool 2 after applying the method of the present invention. As can be seen from Figure 3, after the aforementioned cross-domain adaptation process, the predictions of the obtained regression model on the target domain data are largely consistent with the trend of the actual wear values, achieving a relatively accurate prediction. This result demonstrates that, through segmented feature alignment and the reliable pseudo-label filtering mechanism, the proposed method can achieve cross-domain prediction of tool condition even when the target domain lacks true labels, improving the prediction accuracy and reliability of the cross-domain regression prediction model and thus verifying the effectiveness of the method.
[0139] While the invention has been described herein with reference to specific embodiments, it should be understood that these embodiments are merely examples of the principles and applications of the invention. Therefore, it should be understood that many modifications can be made to the exemplary embodiments, and other arrangements can be designed without departing from the spirit and scope of the invention as defined by the appended claims. It should be understood that different dependent claims and features described herein can be combined in ways different from those described in the original claims. It is also understood that features described in conjunction with individual embodiments can be used in other described embodiments.
Claims
1. A cross-domain prediction method for tool state based on segmented feature alignment and reliable pseudo-labels, characterized in that it comprises:
S1. Data acquisition and preprocessing: Acquire multi-channel sensor data and corresponding tool wear value labels collected at each machining step during the source domain tool machining process, and acquire unlabeled multi-channel sensor data collected at each machining step during the target domain tool machining process; preprocess the data of the tools in the source domain and target domain to obtain the source domain dataset and the target domain dataset. The multi-channel sensor data collected at each machining step during the source domain tool machining process and their corresponding tool wear value labels are used as the input and output data of the source domain regression samples in the source domain dataset, respectively; the unlabeled multi-channel sensor data collected at each machining step during the target domain tool machining process are used as the input data of the target domain regression samples in the target domain dataset. The source domain dataset and the target domain dataset have the same number of channels of multi-channel sensor data, and the sensor data categories may be the same or different.
S2. Source domain regression model training: The source domain regression model is trained using each source domain regression sample.
S3. Classification prior guidance: Cluster the wear value labels of the source domain dataset to obtain the classification labels of the source domain dataset. The classification labels include healthy and worn. The multi-channel sensor data corresponding to each processing step in the source domain dataset and their corresponding classification labels are used as the input and output data of the source domain classification samples, respectively, to train the source domain classification model.
S4. Cross-domain classification feature alignment: Based on source domain classification samples and target domain classification samples, a distribution difference measurement algorithm is used to transfer the trained source domain classification model to the target domain to obtain the target domain classification model. The multi-channel sensor data corresponding to each processing step in the target domain dataset is used as the input data of the target domain classification sample; the target domain classification model is used to perform inference on the input data of each target domain classification sample to obtain the classification label of the target domain classification sample.
S5. Cross-domain preliminary segmented alignment of regression features: Based on the samples with the same classification label in the source domain dataset and the target domain dataset, the distribution difference measurement algorithm is used to preliminarily transfer the trained source domain regression model to the target domain to obtain the preliminarily transferred target domain regression model.
S6. Credible pseudo-label screening: Use the preliminarily transferred target domain regression model to perform feature extraction and inference on the input data of each regression sample in the source domain dataset and the target domain dataset respectively, and obtain the source domain regression features, source domain predicted values, target domain regression features, and target domain predicted values accordingly. Based on the source domain regression features, source domain predicted values, target domain regression features, and target domain predicted values, credible pseudo-labels for the target domain are selected from the target domain predicted values and used as credible pseudo-labels for the target domain regression samples. The input data of each target domain regression sample and its corresponding target domain credible pseudo-label are used as the input data and output data of the candidate regression sample, respectively.
S7. Cross-domain fine alignment of regression features: Based on each source domain regression sample and each candidate regression sample, the distribution difference measurement algorithm is used to transfer the preliminarily transferred target domain regression model to the target domain again to obtain the final target domain regression model.
S8. Actual inference stage: Multi-channel sensor data is collected during the current target domain tool machining process and used as input data for the final target domain regression model to predict the tool wear value, thus obtaining the tool wear value prediction result.
2. The cross-domain prediction method for tool state based on segmented feature alignment and reliable pseudo-labels according to claim 1, characterized in that, The implementation methods for preprocessing the data under the tool in the source domain and target domain in step S1 include: data filtering and cleaning.
3. The cross-domain prediction method for tool state based on segmented feature alignment and reliable pseudo-labels according to claim 1, characterized in that, In step S4, the trained source domain classification model is transferred to the target domain to obtain the target domain classification model. The implementation method is as follows: Take any one source domain classification sample from the source domain dataset and any one target domain classification sample from the target domain dataset as a group of classification samples; The input data of the two classification samples in each group of classification samples are used as the input data of the trained source domain classification model, and the classification label of the source domain classification sample in the group is used as the true value. While training the source domain classification model using the source domain classification samples from each group of classification samples, a distribution difference measurement algorithm is used to calculate and minimize the difference in classification features between the input data of the two classification samples in each group that are input to the trained source domain classification model, thereby obtaining the target domain classification model.
4. The cross-domain prediction method for tool state based on segmented feature alignment and reliable pseudo-labels according to claim 1, characterized in that, In step S5, the trained source domain regression model is preliminarily transferred to the target domain. The implementation method for obtaining the preliminarily transferred target domain regression model is as follows: Take any one source domain regression sample from the source domain dataset and any one target domain regression sample from the target domain dataset as a group of regression samples, where the two regression samples in the group have the same classification label. The input data of the two regression samples in each group of regression samples are used as the input data of the trained source domain regression model, and the tool wear value label of the source domain regression sample in the group is used as the true value. While training the source domain regression model using the source domain regression samples from each group of regression samples, the distribution difference measurement algorithm is used to calculate and minimize the difference in regression features between the input data of the two regression samples in each group that are input to the trained source domain regression model, thereby obtaining the preliminarily transferred target domain regression model.
5. The cross-domain prediction method for tool state based on segmented feature alignment and reliable pseudo-labels according to claim 1, characterized in that, In step S6, the preliminarily transferred target domain regression model is used to perform feature extraction and inference on the input data of each source domain regression sample in the source domain dataset. The implementation method for obtaining the source domain regression features and source domain predicted values is as follows: The preliminarily transferred target domain regression model performs inference on the input data of each source domain regression sample to predict the corresponding source domain predicted value, and the source domain regression features are extracted during that inference. In step S6, the preliminarily transferred target domain regression model is likewise used to perform feature extraction and inference on the input data of each target domain regression sample in the target domain dataset. The implementation method for obtaining the target domain regression features and target domain predicted values is as follows: The preliminarily transferred target domain regression model performs inference on the input data of each target domain regression sample to predict the corresponding target domain predicted value, and the target domain regression features are extracted during that inference.
6. The cross-domain prediction method for tool state based on segmented feature alignment and reliable pseudo-labels according to claim 1, characterized in that, In step S6, the method for filtering out trustworthy pseudo-labels for the target domain from the predicted values of the target domain is as follows: S51. Divide all the target domain predicted values and their corresponding target domain regression features under each processing step into a set of target domain data; S52. Use all source domain regression features and source domain predicted values to filter a set of target domain data under each processing step, and select candidate pseudo-labels under that processing step. S53. Perform spatial clustering analysis on the candidate pseudo-labels under each processing step. When the clustering results have a main cluster and the proportion of candidate pseudo-labels in the main cluster exceeds the threshold, the average value of the candidate pseudo-labels in the main cluster is taken as the target domain credible pseudo-label of all target domain regression samples under that processing step.
7. The cross-domain prediction method for tool state based on segmented feature alignment and reliable pseudo-labels according to claim 6, characterized in that, in S52, the method for filtering a set of target domain data under each processing step is as follows: A target domain predicted value $\hat{y}_t^i$ that meets the screening criteria is taken as a candidate pseudo-label; The screening criteria are: $\left| \hat{y}_t^i - \hat{y}_s^* \right| < \min_{j \neq k} \left| \hat{y}_s^j - \hat{y}_s^k \right|$; $\left\| f_t^i - f_s^* \right\| < \min_{j \neq k} \left\| f_s^j - f_s^k \right\|$; where $j, k \in \{1, \dots, N\}$; $\hat{y}_t^i$ is the target domain predicted value corresponding to the $i$-th target domain regression sample under the current processing step; $\hat{y}_s^j$ is the source domain predicted value corresponding to the $j$-th source domain regression sample in the source domain dataset; $\hat{y}_s^k$ is the source domain predicted value corresponding to the $k$-th source domain regression sample in the source domain dataset; $\hat{y}_s^*$ is the source domain predicted value that, among all source domain regression samples in the source domain dataset, has the smallest difference from the target domain predicted value $\hat{y}_t^i$; $f_t^i$ is the target domain regression feature corresponding to the target domain predicted value $\hat{y}_t^i$; $f_s^*$, $f_s^j$, and $f_s^k$ are the source domain regression features corresponding to the source domain predicted values $\hat{y}_s^*$, $\hat{y}_s^j$, and $\hat{y}_s^k$, respectively; and $N$ is the total number of source domain predicted values in the source domain dataset.
8. The cross-domain prediction method for tool state based on segmented feature alignment and reliable pseudo-labels according to claim 1, characterized in that, In step S7, the method for transferring the preliminarily transferred target domain regression model to the target domain again is as follows: Take any one source domain regression sample from the source domain dataset and any one target domain regression sample from the target domain dataset as a group of regression samples; The input data of the two regression samples in each group of regression samples are used as the input data of the preliminarily transferred target domain regression model, and the tool wear value label of the source domain regression sample in the group is used as the true value. While training the target domain regression model using the source domain regression samples from each group of regression samples, the distribution difference measurement algorithm is used to calculate and minimize the difference in regression features between the input data of the two regression samples in each group that are input to the target domain regression model.
9. The cross-domain prediction method for tool state based on segmented feature alignment and reliable pseudo-labels according to claim 1, characterized in that, In step S3, the wear value labels of the source domain dataset are clustered using the K-Means clustering algorithm.
10. The cross-domain prediction method for tool state based on segmented feature alignment and reliable pseudo-labels according to claim 1, characterized in that, The distribution difference measurement algorithm is implemented using the MMD algorithm.