Health state monitoring method for loading and unloading equipment based on multi-similar equipment shared learning
By using shared learning among multiple similar devices, the method addresses the difficulty of data collection and the insufficient model generalization in health status monitoring of loading and unloading equipment. It enables real-time, accurate monitoring and early warning of equipment health status and supports stable operation of the equipment.
Patent Information
- Authority / Receiving Office
- CN · China
- Patent Type
- Patents(China)
- Current Assignee / Owner
- HARBIN ENG UNIV
- Filing Date
- 2024-07-29
- Publication Date
- 2026-06-23
AI Technical Summary
Traditional methods for monitoring the health status of loading and unloading equipment suffer from difficulties in data collection and insufficient model generalization capabilities, making it impossible to achieve real-time and effective monitoring. This results in the inability of loading and unloading equipment to operate continuously, stably, and efficiently.
A method based on shared learning of multiple similar devices is adopted. Through multi-source data acquisition, preprocessing, basic model initialization and adaptive learning, performance evaluation, global model fusion and health status prediction, sensor network, deep learning and transfer learning technologies are used to realize the health status monitoring of loading and unloading equipment.
It improves the accuracy and real-time performance of monitoring, ensures the continuous and stable operation of loading and unloading equipment, provides real-time early warnings and maintenance suggestions, and enhances equipment management efficiency.
Figure CN118861865B_ABST
Description
Technical Field
[0001] This invention relates to the field of pneumatic marking machine equipment, and in particular to a method for monitoring the health status of loading and unloading equipment based on shared learning among multiple similar devices.
Background Technology
[0002] Loading and unloading equipment plays a crucial role in the storage and retrieval of materials in dry bulk ports. However, traditional health monitoring methods often fall short in the face of complex mechanical structures, variable working environments, and stringent operational requirements, encountering significant challenges such as difficulties in data collection and insufficient model generalization ability. Currently, there is no effective real-time means to monitor the health status of loading and unloading equipment, making it impossible to ensure its continuous, stable, and efficient operation.
Summary of the Invention
[0003] To overcome the aforementioned problems in the existing technology, this invention proposes a method for monitoring the health status of loading and unloading equipment based on shared learning among multiple similar devices.
[0004] The technical solution adopted by this invention to solve its technical problem is: a method for monitoring the health status of loading and unloading equipment based on shared learning of multiple similar devices, comprising the following steps:
[0005] Step 1, Multi-source data acquisition and preprocessing: Real-time collection of operational data from multiple similar loading and unloading equipment, and preprocessing of the collected data;
[0006] Step 2, Basic Model Initialization and Adaptive Learning: The server initializes a basic model for each loading and unloading device. Each loading and unloading device performs adaptive learning of the basic model based on the data collected in Step 1, and updates the model parameters in real time using an online gradient descent algorithm.
[0007] Step 3, Performance Evaluation: The model described in Step 2 is trained using a multi-loss function strategy, and its performance is evaluated through cross-validation.
[0008] Step 4, Global Model Fusion: Each loading and unloading equipment uploads its basic model, which has undergone performance evaluation, to the server. The server uses a global model aggregation strategy based on transfer learning and knowledge distillation to fuse the basic models of multiple loading and unloading equipment into the global model. The server then introduces specific information from the local data of each loading and unloading equipment into the global model to obtain a personalized model for each loading and unloading equipment.
[0009] Step 5, Health Status Prediction: Apply the model obtained in Step 4 to the target loading and unloading equipment, and predict the health status of the loading and unloading equipment based on the real-time collected operating data.
[0010] Step 6, Early Warning and Maintenance: Based on the health status prediction results obtained in Step 5, the server sends early warning information, maintenance plans, and suggestions to the operators.
[0011] In the above-mentioned method for monitoring the health status of loading and unloading equipment based on shared learning of multiple similar devices, step 1 specifically comprises:
[0012] Step 1.1: Using a sensor network and data acquisition system, collect real-time operating data from multiple similar loading and unloading equipment. The collected raw data is a set $D = \{d_1, d_2, \ldots, d_n\}$, where each $d_i$ contains information from multiple dimensions.
[0013] Step 1.2: Clean the data obtained in Step 1.1 to remove duplicate, missing, or invalid data, using an adaptive Kalman filter algorithm for filtering. The filtered data is represented as follows:
[0014] $\hat{d}_i = F(d_i)$
[0015] where $F$ is the filter function and $\hat{d}_i$ is the filtered data, with the noise model being $N(0, \sigma^2)$. Further noise reduction is performed using a deep learning model. With the noise reduction model denoted $M_{\text{denoise}}$, the denoised data is represented as:
[0016] $\tilde{d}_i = M_{\text{denoise}}(\hat{d}_i)$
[0017] where $\tilde{d}_i$ is the denoised data. A deep learning model, an autoencoder, is used to extract features from the denoised data. With the feature extraction model denoted $M_{\text{feature}}$, the extracted features are represented as:
[0018] $f_i = M_{\text{feature}}(\tilde{d}_i)$
[0019] where $f_i$ denotes the extracted features;
[0020] Step 1.3: Using an autoencoder, an unsupervised learning method, the most informative features closely related to the health status of the loading and unloading equipment are automatically discovered. The feature selection function is $S$, and the selected features are represented as follows:
[0021] $F_{\text{selected}} = S(f_1, f_2, \ldots, f_n)$
[0022] where $F_{\text{selected}}$ is the selected set of features that contains the most information.
[0023] The above-mentioned method for monitoring the health status of loading and unloading equipment based on shared learning of multiple similar devices uses a basic long short-term memory network model as the basic model in step 2. The basic long short-term memory network model includes an input layer, a hidden layer, and an output layer. The input layer receives the data obtained in step 1, and the output layer outputs a prediction of the health status of the loading and unloading equipment.
[0024] In the above-mentioned method for monitoring the health status of loading and unloading equipment based on shared learning of multiple similar devices, the adaptive learning of the basic model in step 2 adopts online learning, specifically: the input data at time $t$ is $x_t$ and the corresponding true label is $y_t$. The basic model computes the predicted output $\hat{y}_t$ from the current parameters $\theta_t$ and the input data $x_t$. The loss function between the predicted output $\hat{y}_t$ and the true label $y_t$ is defined as $L(\hat{y}_t, y_t)$. The parameter update formula is as follows:
[0025] $\theta_{t+1} = \theta_t - \eta \nabla_\theta L(\hat{y}_t, y_t)$
[0026] where $\eta$ is the learning rate, which controls the step size of parameter updates, and $\nabla_\theta L(\hat{y}_t, y_t)$ is the gradient of the loss function with respect to the parameters $\theta$.
[0027] In the above-mentioned method for monitoring the health status of loading and unloading equipment based on shared learning of multiple similar devices, the multi-loss function strategy in step 3 includes classification loss, regression loss, and distance constraint loss.
[0028] In the above-mentioned method for monitoring the health status of loading and unloading equipment based on shared learning of multiple similar devices, step 4 specifically includes:
[0029] Step 4.1: Global Model Aggregation and Upload. Each loading and unloading device uploads its trained basic model parameters to the server. The basic model parameters of the $i$-th loading and unloading device are $\theta_i$. The server uses transfer learning and knowledge distillation to fuse knowledge from multiple base models into a global model, with the initial parameters of the global model being $\theta_G$ and the global model parameters after aggregation being $\theta'_G$. If $f_i(x, \theta_i)$ is the predicted output of the $i$-th local model for input $x$ and $f_G(x, \theta'_G)$ is the global model's predicted output for input $x$, then the loss function for knowledge distillation can be expressed as:
[0030]
[0031] where $w_i$ is the weight of the $i$-th base model, determined by the attention-based weight allocation method in step 4.2.
[0032] Step 4.2: If $p_i$ is the performance of the $i$-th local model on the validation set and $s_{ij}$ is the similarity between the $i$-th and $j$-th local models, then the weight $w_i$ of the $i$-th local model can be calculated as follows:
[0033]
[0034] Here, α and β are hyperparameters used to adjust the weights of performance and similarity in the weight calculation;
[0035] Step 4.3: Global Model Update: By minimizing the knowledge distillation loss function $L_{KD}$, the server updates the parameters $\theta'_G$ of the global model;
[0036] Step 4.4: The local data of the $i$-th loading and unloading device is $D_i$. The parameters $\theta'_i$ of the personalized model are obtained by fine-tuning the global model on the local data:
[0037] $\theta'_i = \theta'_G - \eta \nabla_\theta L\big(f_G(D_i, \theta'_G), y_i\big)$
[0038] where $L$ is the loss function, $y_i$ is the true label of the local data, and $\eta$ is the learning rate.
[0039] In the above-described method for monitoring the health status of loading and unloading equipment based on shared learning of multiple similar devices, step 5 employs a prediction result optimization mechanism based on real-time feedback. By collecting operational data and health status labels, the personalized model obtained in step 4 is updated and optimized in real time. The gradient descent algorithm is used for personalized model updating, and the update formula is expressed as:
[0040] $\theta_{\text{new}} = \theta_{\text{old}} - \eta \nabla_\theta L(\theta_{\text{old}})$
[0041] where $\theta_{\text{old}}$ are the parameters before the model update, $\theta_{\text{new}}$ are the updated parameters, $\eta$ is the learning rate, and $L$ is the loss function.
[0042] The beneficial effects of this invention are that it makes full use of the structural and functional similarities between loading and unloading equipment, and achieves in-depth mining and effective utilization of a large amount of monitoring data through shared learning, thereby greatly improving the accuracy and real-time performance of monitoring.
Attached Figure Description
[0043] The present invention will be further described below with reference to the accompanying drawings and embodiments.
[0044] Figure 1 is a schematic diagram of the device shared learning structure of the present invention;
[0045] Figure 2 is a diagram showing the experimental test results of the present invention on field data;
[0046] Figure 3 is a schematic diagram comparing the present invention with other models on open-source datasets.
Detailed Implementation
[0047] To enable those skilled in the art to better understand the technical solution of the present invention, the present invention will be described in detail below with reference to the accompanying drawings and specific embodiments.
[0048] This embodiment discloses a method for monitoring the health status of loading and unloading equipment based on shared learning among multiple similar devices. Taking a stacker-reclaimer as an example, this method collects operational data from multiple similar stacker-reclaimers by deploying various sensors and data acquisition systems. Data preprocessing, feature extraction, and model training techniques are then used to achieve real-time monitoring and early warning of the stacker-reclaimer's health status. The device shared learning structure is shown in Figure 1. The specific steps are as follows:
[0049] Step 1: Multivariate Data Acquisition and Preprocessing. Utilizing advanced sensor networks and data acquisition systems, real-time operational data from multiple similar stacker-reclaimers is collected, including key parameters such as temperature, vibration, current, and voltage. A hybrid preprocessing model based on adaptive filtering and deep learning is introduced to clean, filter, and denoise the raw data, eliminating noise and outliers while retaining crucial information. A deep learning model is then used to extract features from the preprocessed data, automatically identifying the most informative features closely related to the health status of the stacker-reclaimers through unsupervised learning.
[0050] Step 1.1: Utilize advanced sensor networks and data acquisition systems to collect real-time operating data from multiple similar stacker-reclaimers. Let the collected raw data be set D = {d1, d2, ..., dn}, where each di contains information in multiple dimensions, such as key parameters like temperature (Ti), vibration (Vi), current (Ii), and voltage (Ui).
[0051] Step 1.2: Clean the original data to remove duplicates, missing data, or invalid data. To eliminate noise and outliers, an adaptive Kalman filter algorithm is used. The filtered data can be represented as:
[0052] $\hat{d}_i = F(d_i)$
[0053] where $F$ is the filter function and $\hat{d}_i$ is the filtered data, with the noise model being $N(0, \sigma^2)$. Furthermore, a deep learning model is used for noise reduction. Let the noise reduction model be $M_{\text{denoise}}$; then the denoised data can be represented as:
[0054] $\tilde{d}_i = M_{\text{denoise}}(\hat{d}_i)$
[0055] where $\tilde{d}_i$ is the denoised data. A deep learning model, an autoencoder, is used to extract features from the denoised data. Let the feature extraction model be $M_{\text{feature}}$; then the extracted features can be represented as:
[0056] $f_i = M_{\text{feature}}(\tilde{d}_i)$
[0057] where $f_i$ denotes the extracted features.
[0058] Step 1.3: Using an autoencoder, an unsupervised learning method, automatically discover the most informative features closely related to the health status of the stacker-reclaimer. Assuming the feature selection function is $S$, the selected features can be expressed as:
[0059] $F_{\text{selected}} = S(f_1, f_2, \ldots, f_n)$
[0060] where $F_{\text{selected}}$ is the selected set of features that contains the most information. Thus, through the data collection and preprocessing steps, a high-quality and information-rich dataset is obtained, laying a solid foundation for subsequent model training and health status monitoring.
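The filtering part of Step 1.2 can be illustrated with a minimal sketch. The source does not specify the form of the adaptive Kalman filter, so everything below is an assumption: a scalar random-walk state model, an innovation-based running estimate of the measurement noise variance, and the hypothetical names `adaptive_kalman_1d`, `q`, `r0`, and `alpha`. The input is a synthetic steady temperature reading with noise drawn from $N(0, \sigma^2)$.

```python
import random

def adaptive_kalman_1d(z, q=1e-3, r0=1.0, alpha=0.05):
    """Filter a 1-D signal with a scalar random-walk Kalman filter.

    The measurement-noise variance r is re-estimated from the innovations
    with an exponential moving average. This adaptation rule is an
    assumption of this sketch, not taken from the patent text.
    """
    x, p, r = z[0], 1.0, r0            # state estimate, its variance, noise variance
    out = []
    for zk in z:
        p += q                          # predict step for a random-walk state
        innov = zk - x                  # innovation (measurement residual)
        r = (1 - alpha) * r + alpha * max(innov * innov - p, 1e-6)
        gain = p / (p + r)              # Kalman gain
        x += gain * innov               # corrected state estimate
        p *= (1 - gain)                 # corrected estimate variance
        out.append(x)
    return out

def rms_error(values, reference):
    """Root-mean-square deviation of values from a constant reference."""
    return (sum((v - reference) ** 2 for v in values) / len(values)) ** 0.5

random.seed(0)
truth = 60.0                                                   # synthetic steady temperature
noisy = [truth + random.gauss(0.0, 2.0) for _ in range(200)]   # noise ~ N(0, 2^2)
filtered = adaptive_kalman_1d(noisy)
print(rms_error(noisy[100:], truth), rms_error(filtered[100:], truth))
```

The denoising model $M_{\text{denoise}}$ and the autoencoder $M_{\text{feature}}$ would operate on the output of such a filter; they are not sketched here.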
[0061] Step 2: Basic Model Initialization and Adaptive Learning
[0062] The server initializes a base model for each stacker-reclaimer, which employs a long short-term memory network deep learning architecture to ensure training starts from a high-performance base. Each stacker-reclaimer adaptively learns based on local data and the base model, updating model parameters in real time through an online gradient descent algorithm to adapt to constantly changing operating conditions and health status.
[0063] Step 2.1: Initialize the base model for each stacker-reclaimer on the server. Here, a Long Short-Term Memory (LSTM) network is used as the deep learning architecture. LSTM is a special type of recurrent neural network (RNN) that can capture long-term dependencies in sequential data, making it well-suited for processing the time-series characteristics of stacker-reclaimer operation data. Assume the base LSTM model consists of an input layer, hidden layers (including LSTM units), and an output layer. The input layer receives preprocessed stacker-reclaimer data, and the output layer outputs a health status prediction for the stacker-reclaimer. Key components in the LSTM unit include the input gate, forget gate, output gate, and internal state. Let the base parameters of the LSTM model be θ, which are randomly generated during initialization or taken from the parameters of a pre-trained model.
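To make the roles of the input gate, forget gate, output gate, and internal state concrete, the following is a minimal sketch of one time step of a single-unit LSTM cell in plain Python. It uses the standard LSTM equations. The function name `lstm_step`, the parameter layout, and the weight values are hypothetical; the patent does not give layer sizes or weights.

```python
import math

def sigmoid(a):
    return 1.0 / (1.0 + math.exp(-a))

def lstm_step(x, h_prev, c_prev, params):
    """One time step of a single-unit LSTM cell.

    x: list of input features; h_prev, c_prev: previous hidden and cell state.
    params[gate] = (input weights, recurrent weight, bias).
    """
    def pre(gate):
        w, u, b = params[gate]
        return sum(wi * xi for wi, xi in zip(w, x)) + u * h_prev + b

    i = sigmoid(pre("input"))          # input gate: how much new content to write
    f = sigmoid(pre("forget"))         # forget gate: how much old state to keep
    o = sigmoid(pre("output"))         # output gate: how much state to expose
    g = math.tanh(pre("candidate"))    # candidate content for the cell state
    c = f * c_prev + i * g             # new internal (cell) state
    h = o * math.tanh(c)               # new hidden state
    return h, c

# Hypothetical fixed weights; a trained model would learn these values.
params = {
    "input": ([0.5, -0.2], 0.1, 0.0),
    "forget": ([0.3, 0.3], 0.2, 1.0),
    "output": ([0.1, 0.4], -0.1, 0.0),
    "candidate": ([0.6, -0.5], 0.3, 0.0),
}
h, c = 0.0, 0.0
for x_t in [[0.2, 0.1], [0.4, 0.3], [0.9, 0.8]]:   # e.g. scaled temperature, vibration
    h, c = lstm_step(x_t, h, c, params)
print(h, c)
```

In practice a deep learning framework's LSTM layer would replace this hand-written cell; the sketch only shows how the gates combine into the cell and hidden states.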
[0064] Step 2.2: Each stacker-reclaimer uses local data and an initialized base LSTM model for adaptive learning. During training, an online learning approach is adopted, meaning that each time new data samples are received, they are immediately used to update the model.
[0065] Let the input data at time $t$ be $x_t$ and the corresponding true label be $y_t$ (e.g., the health status level of a stacker-reclaimer). The LSTM model computes the predicted output $\hat{y}_t$ from the current parameters $\theta_t$ and the input data $x_t$. The loss function between the predicted output $\hat{y}_t$ and the true label $y_t$ is defined as $L(\hat{y}_t, y_t)$. Common loss functions include mean squared error (MSE) and cross-entropy loss.
[0066] In online gradient descent methods (such as stochastic gradient descent, SGD), model parameters are updated based on the gradient of the loss function. Specifically, the parameter update formula is as follows:
[0067] $\theta_{t+1} = \theta_t - \eta \nabla_\theta L(\hat{y}_t, y_t)$
[0068] where $\eta$ is the learning rate, which controls the step size of parameter updates, and $\nabla_\theta L(\hat{y}_t, y_t)$ is the gradient of the loss function with respect to the parameters $\theta$. In this way, each stacker-reclaimer can adaptively learn based on local data and the initialized base LSTM model, updating the model parameters in real time to adapt to constantly changing operating conditions and health status.
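The online update rule can be sketched on a model much simpler than an LSTM. The example below assumes a linear predictor and a squared-error loss $\tfrac{1}{2}(\hat{y}_t - y_t)^2$, both chosen only for illustration; `predict`, `online_sgd_step`, and the synthetic data stream are hypothetical names and values. Each incoming sample triggers exactly one parameter update.

```python
def predict(theta, x):
    """Linear stand-in for the base model: y_hat = theta . x"""
    return sum(t * xi for t, xi in zip(theta, x))

def online_sgd_step(theta, x, y, eta=0.5):
    """theta_{t+1} = theta_t - eta * gradient of 0.5 * (y_hat - y)^2."""
    err = predict(theta, x) - y                  # dL/dy_hat for the squared-error loss
    return [t - eta * err * xi for t, xi in zip(theta, x)]

# Synthetic stream generated by y = 2 + 3 * u, with features x = [1, u].
stream = [([1.0, k / 10.0], 2.0 + 3.0 * k / 10.0) for k in range(10)] * 100

theta = [0.0, 0.0]
for x_t, y_t in stream:                          # one update per arriving sample
    theta = online_sgd_step(theta, x_t, y_t)
print(theta)
```

With an LSTM the gradient would come from backpropagation through time rather than a closed form, but the update step keeps the same shape.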
[0069] Step 3: During local model training, a multi-loss function strategy is introduced, combining classification loss, regression loss, and distance constraint loss to comprehensively evaluate model performance and capture changes in the health status of the stacker-reclaimer during operation. Cross-validation and other methods are used to evaluate the performance of the local model, ensuring good generalization ability and accuracy on the local dataset.
[0070] Step 3.1: During local model training, in order to comprehensively evaluate the model's performance and capture changes in the health status of the stacker-reclaimer during operation, a multi-loss function strategy is introduced. This strategy combines classification loss, regression loss, and distance constraint loss.
[0071] Classification Loss: Assuming the health status of the stacker-reclaimer is divided into several discrete categories (e.g., normal, minor fault, severe fault, etc.), classification loss can be used to evaluate the model's performance on classification tasks. Commonly used classification losses include cross-entropy loss, the formula of which is:
[0072] $L_{CE} = -\sum_{c=1}^{C} y_c \log(\hat{y}_c)$
[0073] where $C$ is the number of categories, $y_c$ is the one-hot encoding of the true label, and $\hat{y}_c$ is the class probability predicted by the model.
[0074] Regression Loss: If the health status of the stacker-reclaimer can be represented by continuous values (such as remaining lifetime, health index, etc.), regression loss can be used to evaluate the model's performance on regression tasks. Common regression losses include Mean Squared Error (MSE), whose formula is:
[0075] $L_{MSE} = \frac{1}{N} \sum_{i=1}^{N} (y_i - \hat{y}_i)^2$
[0076] where $N$ is the number of samples, $y_i$ is the actual value, and $\hat{y}_i$ is the model prediction.
[0077] Distance Constraint Loss: To ensure the model can capture changes in the health status of the stacker-reclaimer during operation, a distance constraint loss is introduced. This loss function aims to minimize some distance metric (such as Euclidean distance) between the predicted and actual values. Using Euclidean distance as the metric, the distance constraint loss can be expressed as:
[0078] $L_{DC} = \sqrt{\sum_{d=1}^{D} (y_d - \hat{y}_d)^2}$
[0079] Here, $D$ is the dimension of the feature. Finally, these three loss functions are weighted and combined to form a multiple loss function:
[0080] $L_{\text{total}} = \alpha L_{CE} + \beta L_{MSE} + \gamma L_{DC}$
[0081] Here, α, β, and γ are hyperparameters used to adjust the weights between different loss functions.
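A small numeric sketch shows how the three losses combine. The cross-entropy and MSE functions follow their standard definitions. The distance constraint is taken here as the plain Euclidean distance between predicted and actual vectors, which is an assumption. The weight values `alpha`, `beta`, and `gamma` and all sample numbers are hypothetical.

```python
import math

def cross_entropy(y_onehot, probs, eps=1e-12):
    """Classification loss for one sample with a one-hot true label."""
    return -sum(y * math.log(p + eps) for y, p in zip(y_onehot, probs))

def mse(y_true, y_pred):
    """Regression loss averaged over samples."""
    return sum((a - b) ** 2 for a, b in zip(y_true, y_pred)) / len(y_true)

def euclidean(y_true, y_pred):
    """Distance constraint, assumed here to be the Euclidean distance."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(y_true, y_pred)))

def total_loss(l_ce, l_mse, l_dc, alpha=1.0, beta=0.5, gamma=0.1):
    """L_total = alpha * L_CE + beta * L_MSE + gamma * L_DC."""
    return alpha * l_ce + beta * l_mse + gamma * l_dc

l_ce = cross_entropy([0, 1, 0], [0.1, 0.8, 0.1])   # true class: "minor fault"
l_mse = mse([0.9], [0.8])                          # e.g. a health index
l_dc = euclidean([0.2, 0.4], [0.2, 0.1])           # feature-space distance
loss = total_loss(l_ce, l_mse, l_dc)
print(l_ce, l_mse, l_dc, loss)
```

Raising `gamma` relative to `alpha` and `beta` would make the training signal more sensitive to drift in the predicted vector and less to class errors.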
[0082] Step 3.2: During training, evaluate the performance of the local model using methods such as cross-validation. Cross-validation evaluates the model's performance on unseen data by dividing the dataset into training and test sets. K-fold cross-validation is used, dividing the dataset into K subsets. For each fold, K-1 subsets are selected as the training set, and the remaining subset is used as the validation set. This process is repeated K times in K-fold cross-validation, selecting a different subset as the validation set each time. Finally, the results of the K validations are averaged to obtain the overall performance evaluation of the model. This performance evaluation ensures that the model has good generalization ability and accuracy on the local dataset.
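The K-fold procedure described above can be sketched as follows. The helper names `k_fold_indices` and `cross_validate` are hypothetical, the folds are contiguous and unshuffled for simplicity, and the toy "model" (the mean label) stands in for the local LSTM model.

```python
def k_fold_indices(n, k):
    """Yield (train_idx, val_idx) pairs; each sample is validated exactly once."""
    fold_sizes = [n // k + (1 if i < n % k else 0) for i in range(k)]
    start = 0
    for size in fold_sizes:
        val = list(range(start, start + size))
        train = list(range(0, start)) + list(range(start + size, n))
        yield train, val
        start += size

def cross_validate(xs, ys, k, fit, score):
    """Train on K-1 folds, score on the held-out fold, and average the K scores."""
    scores = []
    for train, val in k_fold_indices(len(xs), k):
        model = fit([xs[i] for i in train], [ys[i] for i in train])
        scores.append(score(model, [xs[i] for i in val], [ys[i] for i in val]))
    return sum(scores) / k

# Toy stand-ins: the "model" is the mean label, the score is validation MSE.
fit = lambda xs, ys: sum(ys) / len(ys)
score = lambda m, xs, ys: sum((y - m) ** 2 for y in ys) / len(ys)
avg = cross_validate(list(range(6)), [1.0] * 6, 3, fit, score)
print(avg)
```

For time-series data such as equipment operating records, shuffling or time-ordered splits may be preferable to contiguous folds; the source does not say which is used.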
[0083] Step 4: Global Model Aggregation and Personalized Customization. Each stacker-reclaimer uploads its trained local model to the server. The server uses a global model aggregation strategy based on transfer learning and knowledge distillation to fuse the knowledge from multiple local models into the global model. During model aggregation, an attention-based weight allocation method is employed to dynamically adjust the weights of each local model based on its performance and similarity, resulting in a more accurate and robust global model. The server then customizes personalized models for each stacker-reclaimer. These models incorporate specific information from the local data into the global model to better adapt to the actual operating conditions and health status of different stacker-reclaimers.
[0084] Step 4.1: Global Model Aggregation and Upload. Each stacker-reclaimer uploads its trained local model parameters to the server. Assume the local model parameters of the $i$-th stacker-reclaimer are $\theta_i$. Global model aggregation is based on transfer learning and knowledge distillation:
[0085] The server uses transfer learning and knowledge distillation to fuse knowledge from multiple local models into a global model. Assume the initial parameters of the global model are $\theta_G$ and the global model parameters after aggregation are $\theta'_G$.
[0086] Transfer learning allows knowledge learned from one task to be applied to another. Here, each local model is used as a "teacher" model, and the global model as the "student" model. Knowledge distillation is achieved by minimizing the output difference between the teacher and student models. Let $f_i(x, \theta_i)$ be the predicted output of the $i$-th local model for input $x$ and $f_G(x, \theta'_G)$ be the global model's predicted output for input $x$; then the loss function for knowledge distillation can be expressed as:
[0087]
[0088] where $w_i$ is the weight of the $i$-th local model, determined by the attention-based weight allocation method described below.
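The distillation formula itself is not reproduced in this text. As a hedged illustration only, the sketch below assumes that the "output difference" is a squared error and that $L_{KD}$ is the $w_i$-weighted sum of those differences over the local (teacher) models. Under that assumption, the student output that minimizes the loss for a given input is the weighted average of the teacher outputs. The names `kd_loss` and `kd_minimizer` and the sample numbers are hypothetical.

```python
def kd_loss(teacher_outputs, student_output, weights):
    """Assumed form: sum_i w_i * ||f_i(x) - f_G(x)||^2 (squared-error distillation)."""
    return sum(
        w * sum((t - s) ** 2 for t, s in zip(t_out, student_output))
        for w, t_out in zip(weights, teacher_outputs)
    )

def kd_minimizer(teacher_outputs, weights):
    """Student output minimizing kd_loss: the weight-averaged teacher output."""
    total = sum(weights)
    dim = len(teacher_outputs[0])
    return [
        sum(w * t_out[d] for w, t_out in zip(weights, teacher_outputs)) / total
        for d in range(dim)
    ]

teachers = [[0.9, 0.1], [0.6, 0.4]]     # outputs of two local models for one input x
weights = [0.75, 0.25]                  # w_i from the weight allocation step
best = kd_minimizer(teachers, weights)
print(best, kd_loss(teachers, best, weights))
```

A real global model cannot set its output freely for every input, so the server would instead reduce this loss by gradient descent on $\theta'_G$; a divergence such as KL could replace the squared error without changing the overall structure.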
[0089] Step 4.2: Weight Allocation Based on Attention Mechanism. To dynamically adjust the weights of each local model based on its performance and similarity, an attention-based weight allocation method is adopted. The core idea of the attention mechanism is to assign different weights to different types of information to highlight important information and suppress irrelevant information.
[0090] Let $p_i$ be the performance of the $i$-th local model on the validation set and $s_{ij}$ be the similarity between the $i$-th and $j$-th local models (e.g., the cosine similarity of model parameters); then the weight $w_i$ of the $i$-th local model can be calculated as follows:
[0091]
[0092] Here, α and β are hyperparameters used to adjust the weights of performance and similarity in the weight calculation.
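The weight formula is not reproduced in this text, so the following is only one plausible reading: score each local model by $\alpha p_i$ plus $\beta$ times its mean cosine similarity to the other models, then normalize the scores with a softmax. The scoring rule, the softmax, and the names `cosine` and `attention_weights` are all assumptions of this sketch.

```python
import math

def cosine(u, v):
    """Cosine similarity of two non-zero parameter vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def attention_weights(perf, params, alpha=1.0, beta=1.0):
    """Softmax over alpha * p_i + beta * mean_j s_ij (assumed scoring rule).

    Requires at least two local models.
    """
    n = len(perf)
    scores = []
    for i in range(n):
        sim = sum(cosine(params[i], params[j]) for j in range(n) if j != i) / (n - 1)
        scores.append(alpha * perf[i] + beta * sim)
    m = max(scores)                               # subtract max for numerical stability
    exps = [math.exp(s - m) for s in scores]
    z = sum(exps)
    return [e / z for e in exps]

perf = [0.92, 0.85, 0.60]                         # validation accuracy of each local model
params = [[1.0, 0.5], [0.9, 0.6], [-0.2, 1.0]]    # flattened model parameters (toy values)
w = attention_weights(perf, params)
print(w)
```

With this rule, a model that performs well and agrees with its peers receives more weight, while an outlier with poor validation performance is down-weighted.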
[0093] Step 4.3: Global Model Update: By minimizing the knowledge distillation loss function $L_{KD}$, the server updates the parameters $\theta'_G$ of the global model.
[0094] Step 4.4: The server customizes a personalized model for each stacker-reclaimer. These models incorporate local data-specific information on top of the global model to better adapt to the actual operating conditions and health status of different stacker-reclaimers. Let the parameters of the global model be $\theta'_G$ and the local data of the $i$-th stacker-reclaimer be $D_i$. Then the parameters $\theta'_i$ of the personalized model can be obtained by fine-tuning the global model on the local data:
[0095] $\theta'_i = \theta'_G - \eta \nabla_\theta L\big(f_G(D_i, \theta'_G), y_i\big)$
[0096] where $L$ is the loss function, $y_i$ is the true label of the local data, and $\eta$ is the learning rate. Through personalized customization, a more accurate predictive model that fits the actual working conditions of each stacker-reclaimer can be provided.
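As a worked illustration of a single fine-tuning step, assume one gradient-descent update that starts from the global parameters. The scalar values below are hypothetical and are not taken from the source: a global parameter of 0.80, a local-data loss gradient $g$ of 0.5 at that point, and a learning rate of 0.1.

```latex
\theta'_i \;=\; \theta'_G - \eta\, g
        \;=\; 0.80 - 0.1 \times 0.5
        \;=\; 0.75,
\qquad g = \nabla_\theta L \,\big|_{\theta = \theta'_G}
```

A positive gradient means the local loss grows with the parameter, so the personalized value moves below the global one; repeating the step over more local samples moves it further toward the local optimum.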
[0097] Step 5: Health Status Prediction and Real-time Feedback. The customized model is applied to the target stacker-reclaimer, and health status is predicted based on real-time collected data. A prediction result optimization mechanism based on real-time feedback is introduced. By continuously collecting new operational data and health status labels, the customized model is updated and optimized in real time to improve the accuracy and reliability of the prediction results. The gradient descent algorithm is used for model updating, and the update formula can be expressed as:
[0098] $\theta_{\text{new}} = \theta_{\text{old}} - \eta \nabla_\theta L(\theta_{\text{old}})$
[0099] where $\theta_{\text{old}}$ are the parameters before the model update, $\theta_{\text{new}}$ are the updated parameters, $\eta$ is the learning rate, and $L$ is the loss function.
[0100] Step 6: Early Warning and Maintenance Recommendations. Based on the prediction results, the system sends early warning information to operators, reminding them to take timely maintenance measures so that operators can quickly locate problems and implement corresponding solutions. The system can automatically generate maintenance plans and recommendations to help operators better manage the stacker-reclaimer and ensure the stable operation of the production line. Simultaneously, the system also provides historical data analysis functions to help operators understand the stacker-reclaimer's operating trends and potential problems, enabling them to take preventative measures in advance.
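The step from a prediction to a warning can be sketched as a simple lookup. The class names follow the example categories mentioned earlier (normal, minor fault, severe fault). The action strings and the function name `early_warning` are hypothetical; the source does not specify how maintenance suggestions are generated.

```python
# Hypothetical mapping from predicted health class to a suggested action.
ACTIONS = {
    "normal": None,
    "minor fault": "schedule inspection at next planned stop",
    "severe fault": "stop equipment and dispatch maintenance",
}

def early_warning(class_probs):
    """Return (predicted_class, probability, suggested_action) for one prediction."""
    label, prob = max(class_probs.items(), key=lambda kv: kv[1])
    return label, prob, ACTIONS[label]

alert = early_warning({"normal": 0.1, "minor fault": 0.7, "severe fault": 0.2})
print(alert)
```

A deployed system would likely also apply probability thresholds and look at trends over time before alerting operators, so that a single noisy prediction does not trigger a stop.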
[0101] Experimental verification:
[0102] A model was trained using the stacker-reclaimer dataset collected on-site, and a comparison chart of the training and test results was plotted (as shown in Figure 2). The model converges rapidly in the early stages of training, and its performance gradually improves with each training iteration. Furthermore, the training results for different equipment data are comprehensively evaluated using metrics such as the area under the ROC curve (AUC). Compared with other mainstream algorithms, the proposed stacker-reclaimer health status monitoring method based on shared learning among multiple similar devices shows clear advantages. In particular, when validated on the stacker-reclaimer dataset collected on-site, the model achieved an accuracy of 96%, higher than the other models, demonstrating the effectiveness of this method in practical applications of gearbox health status monitoring.
[0103] Public dataset validation
[0104] To further validate the model's generalization ability, experiments were conducted on publicly available datasets, namely the Case Western Reserve University dataset, the MFPT dataset, and the Paderborn University dataset (as shown in Figure 3). Figure 3 compares four models (ResNet-50, EfficientNet, Few-shot Learning, and the model of this embodiment) on the three datasets. The performance evaluation metrics are accuracy, recall, F1 score, and AUC (area under the ROC curve).
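For reference, the four reported metrics can be computed as in the sketch below for a binary fault / no-fault case. The labels and scores are made-up toy values, not data from the experiments, and the function names are hypothetical. AUC is computed from its rank interpretation: the probability that a randomly chosen positive sample scores higher than a randomly chosen negative one.

```python
def binary_metrics(y_true, y_pred):
    """Return (accuracy, recall, F1) for binary labels in {0, 1}."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    accuracy = (tp + tn) / len(y_true)
    recall = tp / (tp + fn) if tp + fn else 0.0
    precision = tp / (tp + fp) if tp + fp else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return accuracy, recall, f1

def auc(y_true, scores):
    """Fraction of (positive, negative) pairs ranked correctly; ties count one half."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

y_true = [1, 1, 0, 0, 1, 0]
y_pred = [1, 0, 0, 1, 1, 0]
scores = [0.9, 0.4, 0.3, 0.6, 0.8, 0.1]
accuracy, recall, f1 = binary_metrics(y_true, y_pred)
area = auc(y_true, scores)
print(accuracy, recall, f1, area)
```

For multi-class fault diagnosis these metrics are usually averaged per class (macro or weighted); the source does not state which averaging was used.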
[0105] ResNet-50: On the Case Western Reserve University dataset, the model achieves an accuracy of 86.17%, a recall of 90.17%, an F1 score of 86.1%, and an AUC of 93.02%. On the MFPT dataset, its accuracy is 91.27%, its recall and F1 score are 93.37% and 91.02%, respectively, and its AUC is 93.16%. On the Paderborn University dataset, the model achieves an accuracy of 93.18%, a recall of 87.36%, an F1 score of 86.74%, and an AUC of 88.69%.
[0106] EfficientNet: On the Case Western Reserve University dataset, EfficientNet achieves an accuracy of 88.2%, but has a lower AUC of 86.8%. On the MFPT dataset, it has high accuracy and recall, but a relatively low F1 score and an AUC of 91.68%. On the Paderborn University dataset, EfficientNet achieves high accuracy and AUC of 94.36% and 92.89%, respectively, but has lower recall and F1 score.
[0107] Few-shot Learning: On the Case Western Reserve University dataset, this model achieves the highest accuracy (91.12%), but its recall and F1 score are relatively low. On the MFPT dataset, its performance is relatively poor, with both accuracy and recall being low. On the Paderborn University dataset, Few-shot Learning achieves slightly higher accuracy and F1 score than ResNet-50, but its recall and AUC are lower.
[0108] The model of this embodiment (labeled "Ours" in Figure 3): This model achieved the best performance of the four on all three datasets. On the Case Western Reserve University dataset, it achieved an accuracy of 96.98% and a recall of 93.46%, along with high F1 score and AUC. On the MFPT and Paderborn University datasets, it also outperformed the other three models, demonstrating its generalization ability across datasets.
[0109] In summary, the model in this embodiment achieved state-of-the-art performance on all three datasets, particularly demonstrating significant advantages in the key metrics of accuracy and AUC. This suggests that the model in this embodiment may possess more powerful feature extraction and classification capabilities, making it suitable for tasks on different datasets.
[0110] The above embodiments are merely exemplary embodiments of the present invention and are not intended to limit the present invention. Those skilled in the art can make various modifications or equivalent substitutions to the present invention within its scope and spirit, and such modifications or equivalent substitutions should also be considered to fall within the scope of protection of the present invention.
Claims
1. A method for monitoring the health status of loading and unloading equipment based on shared learning among multiple similar devices, characterized in that, Includes the following steps: Step 1, Multi-source data acquisition and preprocessing: Real-time collection of operating data from multiple similar loading and unloading equipment, including temperature, vibration, current, and voltage, and preprocessing of the collected data; Step 2, Basic Model Initialization and Adaptive Learning: The server initializes a basic model for each loading and unloading device. Each loading and unloading device performs adaptive learning of the basic model based on the data collected in Step 1, and updates the model parameters in real time by learning the gradient descent algorithm online. Step 3, Performance Evaluation: The model described in Step 2 is trained using a multi-loss function strategy, and its performance is evaluated through cross-validation. Step 4, Global Model Fusion: Each loading and unloading equipment uploads its basic model, which has undergone performance evaluation, to the server. The server uses a global model aggregation strategy based on transfer learning and knowledge distillation to fuse the basic models of multiple loading and unloading equipment into the global model. The server then introduces specific information from the local data of each loading and unloading equipment into the global model to obtain a personalized model for each loading and unloading equipment. Step 5, Health Status Prediction: Apply the model obtained in Step 4 to the target loading and unloading equipment, and predict the health status of the loading and unloading equipment based on the real-time collected operating data. 
Step 6, early warning and maintenance: based on the health status prediction results obtained in Step 5, the server sends early-warning information, maintenance plans, and suggestions to the operators;
wherein Step 4 specifically involves:
Step 4.1, global model aggregation and upload: each loading and unloading device uploads its trained basic model parameters to the server; the server uses transfer learning and knowledge distillation to fuse the knowledge of the multiple basic models into a global model, whose parameters go from their initial values to the aggregated values; the knowledge distillation loss $L_{KD}$ is defined between the predicted output of the $i$-th local model for an input $x$ and the predicted output of the global model for the same input $x$, and is expressed as: [formula not reproduced in the source]; where $w_i$ is the weight of the $i$-th basic model, determined by the attention-based weight allocation method of Step 4.2;
Step 4.2: $p_i$ is the performance of the $i$-th local model on the validation set, and $s_{ij}$ is the similarity between the $i$-th and the $j$-th local models; the weight $w_i$ of the $i$-th local model is then calculated as: [formula not reproduced in the source]; where $\alpha$ and $\beta$ are hyperparameters that adjust the relative weight of performance and similarity in the weight calculation;
Step 4.3, global model update: the server updates the parameters of the global model by minimizing the knowledge distillation loss $L_{KD}$: [formula not reproduced in the source];
Step 4.4: the local data of the $i$-th loading and unloading device is $D_i$; the parameters of its personalized model are obtained by fine-tuning the global model on this local data: [formula not reproduced in the source]; where $L$ is the loss function, $y_i$ are the true labels of the local data, and $\eta$ is the learning rate.
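The weighting and distillation of Steps 4.1 and 4.2 can be illustrated with a minimal sketch. The claim text does not show the formulas, so the functional forms below are assumptions rather than the claimed method: the weights are taken as a softmax over $\alpha p_i + \beta \bar{s}_i$, where $\bar{s}_i$ is the mean similarity of model $i$ to the other models, and the distillation loss is taken as a weighted sum of KL divergences from each local output distribution to the global one. The function names `attention_weights`, `kl_divergence`, and `distillation_loss` are hypothetical.

```python
import math

def attention_weights(perf, sim, alpha=1.0, beta=1.0):
    """Assumed form: softmax over alpha * p_i + beta * mean_j(s_ij), j != i."""
    n = len(perf)
    scores = []
    for i in range(n):
        others = [sim[i][j] for j in range(n) if j != i]
        mean_sim = sum(others) / len(others) if others else 0.0
        scores.append(alpha * perf[i] + beta * mean_sim)
    top = max(scores)  # subtract the maximum for numerical stability
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def kl_divergence(p, q, eps=1e-12):
    """KL(p || q) for two discrete probability distributions."""
    return sum(pi * math.log((pi + eps) / (qi + eps)) for pi, qi in zip(p, q))

def distillation_loss(local_outputs, global_output, weights):
    """Assumed form: sum_i w_i * KL(f_i(x) || f_g(x))."""
    return sum(w * kl_divergence(f_i, global_output)
               for w, f_i in zip(weights, local_outputs))

# Illustrative values only: validation performance p_i and similarities s_ij.
perf = [0.92, 0.85, 0.78]
sim = [[1.0, 0.8, 0.3],
       [0.8, 1.0, 0.4],
       [0.3, 0.4, 1.0]]
weights = attention_weights(perf, sim)

# Class-probability outputs of three local models and the global model for one input x.
local_outputs = [[0.7, 0.2, 0.1], [0.6, 0.3, 0.1], [0.5, 0.3, 0.2]]
global_output = [0.6, 0.3, 0.1]
loss = distillation_loss(local_outputs, global_output, weights)
```

Under these assumptions a local model that performs well and resembles its peers receives a larger weight, so its predictions pull the global model harder during distillation.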
2. The method for monitoring the health status of loading and unloading equipment based on shared learning of multiple similar devices according to claim 1, characterized in that Step 1 specifically involves:
Step 1.1: using sensor networks and data acquisition systems to collect operating data from multiple similar loading and unloading devices in real time; the collected raw data form a set, each element of which contains information from multiple dimensions;
Step 1.2: cleaning the data obtained in Step 1.1 with an adaptive Kalman filter algorithm to remove duplicate, missing, or invalid data; the filtered data is obtained by applying the filter function $F$ to the raw data, under a noise model [formula not reproduced in the source]; further noise reduction is performed using a deep learning model, which yields the denoised data; an autoencoder, a deep learning model, is then used as the feature extraction model to extract features from the denoised data;
Step 1.3: using the autoencoder, an unsupervised learning method, the most informative features closely related to the health status of the loading and unloading equipment are discovered automatically; the feature selection function is $S$, and the selected features, obtained by applying $S$ to the extracted features, form the feature set that contains the most information.
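The cleaning and filtering of Step 1.2 can be sketched as follows. This is a simplified stand-in, not the claimed algorithm: it uses a plain scalar Kalman filter with a constant-state model and fixed noise variances, whereas an adaptive variant would additionally re-estimate those variances from the data. The function name `kalman_smooth` and the variances `q` and `r` are hypothetical.

```python
import math

def kalman_smooth(readings, q=1e-4, r=0.25):
    """Scalar Kalman filter with a constant-state model.

    Missing (None) and invalid (NaN) readings are dropped before filtering.
    q is the assumed process-noise variance, r the measurement-noise variance.
    """
    valid = [z for z in readings if z is not None and not math.isnan(z)]
    if not valid:
        return []
    x, p = valid[0], 1.0      # initial state estimate and its variance
    filtered = [x]
    for z in valid[1:]:
        p = p + q             # predict: the state is assumed constant
        k = p / (p + r)       # Kalman gain
        x = x + k * (z - x)   # update the estimate with the new measurement
        p = (1.0 - k) * p     # update the estimate variance
        filtered.append(x)
    return filtered

# Illustrative temperature readings with a gap, a spike, and an invalid value.
raw = [60.0, 60.4, None, 59.7, 75.0, 60.1, float("nan"), 60.2]
filtered = kalman_smooth(raw)
```

The spike at 75.0 is damped rather than removed, because each estimate is a blend of the previous estimate and the new measurement.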
3. The method for monitoring the health status of loading and unloading equipment based on shared learning of multiple similar devices according to claim 1, characterized in that the basic model in Step 2 is a basic long short-term memory (LSTM) network model comprising an input layer, a hidden layer, and an output layer; the input layer receives the data obtained in Step 1, and the output layer outputs a health status prediction for the loading and unloading equipment.
4. The method for monitoring the health status of loading and unloading equipment based on shared learning of multiple similar devices according to claim 3, characterized in that, in Step 2, the adaptive learning of the basic model employs online learning, specifically: the input data at time $t$ is $x_t$ and the corresponding true label is $y_t$; the basic model computes the predicted output $\hat{y}_t$ from the current parameters $\theta_t$ and the input data $x_t$; the loss function between the predicted output $\hat{y}_t$ and the true label $y_t$ is defined as $L(\hat{y}_t, y_t)$; the parameter update formula is: $\theta_{t+1} = \theta_t - \eta \nabla_{\theta} L(\hat{y}_t, y_t)$; where $\eta$ is the learning rate, which controls the step size of the parameter update, and $\nabla_{\theta} L(\hat{y}_t, y_t)$ is the gradient of the loss function with respect to the parameters $\theta_t$.
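The online update of claim 4 can be illustrated with a small sketch in which each sample triggers one gradient descent step as it arrives. As a stated simplification, a linear model with squared-error loss replaces the LSTM basic model; the function name `online_sgd_step`, the data stream, and the learning rate are illustrative assumptions.

```python
def online_sgd_step(theta, x_t, y_t, eta=0.2):
    """One online gradient descent step for a linear model with loss 0.5 * (y_hat - y)^2."""
    y_hat = sum(w * x for w, x in zip(theta, x_t))   # predicted output at time t
    error = y_hat - y_t
    grad = [error * x for x in x_t]                  # gradient of the loss w.r.t. theta
    new_theta = [w - eta * g for w, g in zip(theta, grad)]
    return new_theta, 0.5 * error ** 2

# Illustrative stream generated by y = 1 + 2 * x; the first feature is a constant bias input.
stream = [([1.0, 0.5], 2.0), ([1.0, 1.5], 4.0), ([1.0, 1.0], 3.0)] * 500

theta = [0.0, 0.0]
losses = []
for x_t, y_t in stream:
    theta, loss_t = online_sgd_step(theta, x_t, y_t)
    losses.append(loss_t)
```

Because the parameters change after every sample, the model can follow gradual drift in a device's operating conditions without retraining from scratch.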
5. The method for monitoring the health status of loading and unloading equipment based on shared learning of multiple similar devices according to claim 1, characterized in that the multi-loss function strategy in Step 3 specifically includes a classification loss, a regression loss, and a distance constraint loss.
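One way to combine the three losses named in claim 5 is a weighted sum, sketched below. The claim does not specify the individual loss definitions or how they are combined, so everything here is an assumption: cross-entropy for classification, squared error for regression, and a squared distance between a feature vector and its class center as the distance constraint. The function names and the weights in `lambdas` are hypothetical.

```python
import math

def cross_entropy(probs, label):
    """Classification loss: negative log-probability assigned to the true class."""
    return -math.log(max(probs[label], 1e-12))

def squared_error(pred, target):
    """Regression loss, for example on a continuous health index."""
    return (pred - target) ** 2

def distance_constraint(feature, center):
    """Assumed distance constraint: squared distance of a feature vector to its class center."""
    return sum((f - c) ** 2 for f, c in zip(feature, center))

def multi_loss(probs, label, reg_pred, reg_target, feature, center,
               lambdas=(1.0, 0.5, 0.1)):
    """Weighted sum of the three losses; the weights are illustrative."""
    l_cls = cross_entropy(probs, label)
    l_reg = squared_error(reg_pred, reg_target)
    l_dist = distance_constraint(feature, center)
    return lambdas[0] * l_cls + lambdas[1] * l_reg + lambdas[2] * l_dist

# Illustrative sample: class probabilities, a health index, and a 2-D feature vector.
total = multi_loss(probs=[0.8, 0.15, 0.05], label=0,
                   reg_pred=0.7, reg_target=0.9,
                   feature=[1.0, 2.0], center=[1.5, 2.0])
```

The weights trade off the three objectives; in practice they would be tuned on validation data.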
6. The method for monitoring the health status of loading and unloading equipment based on shared learning of multiple similar devices according to claim 1, characterized in that Step 5 employs a mechanism that optimizes the prediction results through real-time feedback: by collecting operating data and health status labels, the personalized model obtained in Step 4 is updated and optimized in real time; the personalized model is updated with the gradient descent algorithm, and the update formula is expressed as: $\theta_{\text{new}} = \theta_{\text{old}} - \eta \nabla_{\theta} L$; where $\theta_{\text{old}}$ denotes the parameters before the model update, $\theta_{\text{new}}$ denotes the updated parameters, $\eta$ is the learning rate, and $L$ is the loss function.