Method for real-time visual monitoring of bag mouth drop-off for ton bag feeding process
By integrating color and depth image features through an improved CNN architecture, combined with multi-view cameras and edge computing, real-time and high-precision monitoring of the bag opening status of ton bags is achieved. This solves the problem of monitoring bag opening detachment during ton bag feeding and improves the safety and economy of the production line.
Patent Information
- Authority / Receiving Office
- CN · China
- Patent Type
- Applications(China)
- Current Assignee / Owner
- GUANGZHOU HENLL ELECTRONICS EQUIP CO LTD
- Filing Date
- 2026-03-05
- Publication Date
- 2026-06-19
AI Technical Summary
It is difficult to achieve high-precision, real-time monitoring of bag opening detachment during the feeding process of ton bags. Existing monitoring methods have problems such as low monitoring accuracy, slow response, inability to identify the semi-detached state, and lack of preventive control.
The method adopts an improved CNN architecture that fuses features from two-dimensional color images and three-dimensional depth images, constructs multi-level state determination rules and a graded response mechanism, recognizes the bag opening status in real time and with high precision through an AI vision system, and combines edge computing with multi-view cameras for image acquisition and processing.
The method achieves real-time, high-precision identification of the bag opening status, adapts to ton bags of different specifications, has strong anti-interference capability, forms a complete closed-loop monitoring system, reduces production costs, and improves the safety and economy of the production line.
Figure CN122244785A_ABST
Description
Technical Field
[0001] This invention relates to the field of intelligent monitoring technology for industrial automation, and specifically to a real-time visual monitoring method for bag mouth detachment during the ton bag feeding process.
Background Technology
[0002] Ton bags, also known as flexible intermediate bulk containers (FIBCs), are flexible packaging containers used for loading large quantities of bulk materials. They have advantages such as large capacity, convenient loading and unloading, and reusability, and are widely used in material storage and transportation in industries such as chemicals, cement, grain, ore, and plastic granules. In industrial automated production lines, ton bag feeding is one of the core processes. It proceeds as follows: a lifting mechanism suspends the ton bag filled with material above the feeding pipe, the bag opening is sleeved over the outside of the feeding pipe, and a fastening mechanism secures the bag opening to the feeding pipe. Then, the feeding drive is activated to transport the material inside the ton bag through the feeding pipe to the subsequent production process.
[0003] During the ton bag feeding process, the connection status between the bag opening and the feeding pipe directly determines the safety, stability, and economy of the production process. However, due to various factors, there is a certain probability that the bag opening of the ton bag will detach from the feeding pipe. The main reasons include: First, differences in ton bag materials. Some ton bag openings are made of flexible materials such as polypropylene, which are prone to elastic deformation and wear under the influence of material gravity and conveying vibration, leading to a decrease in the covering force between the bag opening and the feeding pipe. Second, complex production environments. Industrial production lines have problems such as high dust levels, strong vibrations, and uneven lighting. Traditional mechanical contact monitoring devices are easily clogged by dust and disturbed by vibration, leading to monitoring failure. Third, diverse ton bag specifications. The diameter, thickness, and material of ton bags vary significantly between industries and materials, making it difficult for existing monitoring devices to adapt to multiple ton bag specifications. Fourth, insufficient automation. In the traditional ton bag feeding process, the bag opening status is mainly monitored visually by operators. Manual monitoring is subjective, slow to respond, and labor-intensive, making 24-hour continuous monitoring difficult. Especially on high-speed automated production lines, manual monitoring cannot detect abnormal bag opening detachment in time, which can easily lead to serious consequences.
[0004] The detachment or partial detachment of the bag opening can lead to a series of production problems: First, material leaks from the gap between the bag opening and the feed pipe, causing serious material waste and increasing production costs. Second, the leaked material is mostly powdery or granular, generating a large amount of dust, polluting the production environment, endangering the health of operators, and, when the dust reaches a certain concentration, can easily cause safety accidents such as explosions and fires, especially in the chemical industry, where the leakage of flammable and explosive materials poses a great safety hazard. Third, the detachment of the bag opening can cause material conveying to be interrupted, forcing the production line to stop, affecting production progress, and reducing production efficiency. Finally, the leaked material will accumulate on the production site, increasing the workload of on-site cleanup, and may even be caught in the transmission mechanism of the production line, causing equipment damage and further increasing production and maintenance costs.
[0005] To address the issue of monitoring bag opening detachment during ton bag feeding, various monitoring schemes have been proposed in existing technologies, mainly categorized into three types: mechanical contact monitoring, sensor monitoring, and visual monitoring.
[0006] Mechanical contact monitoring was the earliest method used. Its core principle is that a mechanical structure contacts the bag opening; when the bag opening detaches, the mechanical structure shifts, triggering a limit switch or travel switch to activate an alarm. This type of solution is simple in structure and low in cost, but it has significant limitations: firstly, the direct contact between the mechanical structure and the bag opening easily causes wear; secondly, it has poor adaptability; thirdly, it has weak anti-interference capabilities; and fourthly, it cannot monitor partial detachment states such as localized lifting or loosening.
[0007] Sensor monitoring solutions indirectly monitor the bag opening status by deploying various sensors, commonly including pressure sensors, tension sensors, infrared sensors, and ultrasonic sensors. Compared to mechanical contact solutions, sensor monitoring solutions reduce direct contact with the bag opening, but still have shortcomings: firstly, the monitoring dimensions are limited; secondly, they are susceptible to environmental interference; and thirdly, they cannot achieve anomaly location.
[0008] Visual monitoring solutions, based on machine vision technology, use industrial cameras to capture images of bag openings and employ image processing algorithms to identify the bag's condition. This represents a recent trend in industrial automation monitoring. While these solutions achieve non-contact monitoring and reduce damage to bag openings, existing visual monitoring solutions still face several technical bottlenecks: First, traditional image processing algorithms have poor robustness; second, the use of a single camera layout creates blind spots; third, the lack of depth information makes it difficult to distinguish between slight loosening and normal conditions; and fourth, a lack of intelligent response mechanisms.
[0009] With the rapid development of artificial intelligence technology, Convolutional Neural Networks (CNNs), as one of the core algorithms of deep learning, have made breakthroughs in image classification, object detection, semantic segmentation, and other fields, providing a new technical path for industrial visual monitoring. CNN algorithms possess powerful feature extraction and generalization capabilities, enabling them to automatically extract key features from complex industrial images and achieve high-precision state recognition. However, applying CNN algorithms to real-time monitoring of bag opening detachment in ton bag feeders still faces several technical challenges: First, industrial scenarios involve significant dust and lighting noise, requiring the design of interference-resistant CNN model architectures to improve the robustness of feature extraction; second, the feature differences in the partially detached state of the bag opening (such as partial lifting or slight loosening) are small, necessitating the construction of a refined feature fusion module to achieve accurate identification of minute anomalies; third, production lines have high real-time monitoring requirements, and traditional CNN models have high computational demands, making real-time inference difficult to achieve on edge computing devices; fourth, a complete closed-loop monitoring system needs to be built to achieve full-process collaboration from image acquisition and state recognition to execution control, adapting to the automation needs of industrial production lines.
Summary of the Invention
[0010] The primary objective of this invention is to provide a real-time visual monitoring method for bag opening detachment during the feeding process of ton bags. This method is based on an improved CNN architecture, integrating two-dimensional color image features and three-dimensional depth image features to achieve real-time, high-precision identification of six states of the ton bag opening: complete coverage, partial lifting, loosening, slippage, complete detachment, and material overflow. By constructing multi-level state judgment rules and a graded response mechanism, it achieves early warning of abnormal states, automatic tightening, and emergency shutdown control, solving the problems of low monitoring accuracy, delayed response, inability to identify partial detachment states, and lack of preventive control in traditional monitoring methods.
[0011] To achieve the above-mentioned objectives, the present invention adopts the following technical solution, specifically including a real-time visual monitoring method for bag mouth detachment during the ton bag feeding process, and an AI visual monitoring system for implementing the method.
[0012] A real-time visual monitoring method for bag opening detachment during the feeding process of ton bags is applied to an automated ton bag feeding production line. The production line includes a ton bag lifting mechanism, a feeding pipe, a feeding drive device, and a bag opening fastening mechanism. The method is based on an AI vision system employing an improved CNN architecture. The method includes the following steps:
[0013] S100. Visual monitoring system initialization and parameter configuration.
[0014] S110. Hardware Initialization: Start the vision perception unit, edge computing unit, execution control unit and human-machine interaction unit, complete the communication handshake between the units, detect the working status of the vision lens, image acquisition card, industrial camera and light source module, and if there is a hardware fault, trigger a hardware alarm and stop the initialization process.
[0015] S120. Software Parameter Configuration: Monitoring task parameters are entered through the human-machine interaction unit, including ton bag specifications, feed pipe dimensions, production environment parameters, monitoring accuracy thresholds, anomaly response levels, and CNN model operating parameters. The ton bag specifications include bag opening diameter, bag opening material, bag opening color, and ton bag thickness. The feed pipe dimensions include feed pipe outer diameter, feed pipe length, feed pipe material, and feed pipe surface features. The production environment parameters include light intensity range, dust concentration threshold, shooting distance, and shooting angle. The monitoring accuracy thresholds include detachment judgment thresholds, semi-detachment judgment thresholds, lifting judgment thresholds, slippage judgment thresholds, and material overflow judgment thresholds. The anomaly response levels include Level 1 response, Level 2 response, and Level 3 response. The CNN model operating parameters include batch size, learning rate, inference frame rate, and feature extraction dimension.
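The parameter groups entered in S120 can be pictured as a small typed configuration. The following is a minimal sketch, not the patent's data model: all class and field names are hypothetical, only a subset of the listed parameters is shown, and the sanity check (bag opening wider than the feed pipe, since the opening is fitted over the outside of the pipe) is an assumption drawn from the background section. The 0.95 default is the confidence threshold stated later in S420.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class TonBagSpec:
    # Values are entered by the operator; S120 gives no defaults.
    opening_diameter_mm: float
    opening_material: str
    opening_color: str
    thickness_mm: float


@dataclass(frozen=True)
class FeedPipeSpec:
    outer_diameter_mm: float
    length_mm: float
    material: str


@dataclass(frozen=True)
class MonitoringConfig:
    bag: TonBagSpec
    pipe: FeedPipeSpec
    confidence_threshold: float = 0.95  # value stated in S420

    def problems(self) -> List[str]:
        """Return human-readable configuration problems (empty list if none)."""
        issues = []
        # Assumed check: the opening must fit over the outside of the pipe.
        if self.bag.opening_diameter_mm <= self.pipe.outer_diameter_mm:
            issues.append("bag opening is not wider than the feed pipe")
        if not 0.0 < self.confidence_threshold <= 1.0:
            issues.append("confidence threshold must be in (0, 1]")
        return issues
```

Rejecting an implausible configuration before S130 would keep a typing error at the operator panel from surfacing later as a calibration failure.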
[0016] S130. Model Loading and Calibration: Load the pre-trained improved CNN bag state recognition model into the inference engine of the edge computing unit, input the standard calibration image set for model calibration, adjust the feature extraction weights and classification thresholds of the model, and ensure that the recognition accuracy of the model under the current production parameters meets the monitoring accuracy threshold requirements; if the calibration result is not up to standard, call the model fine-tuning module again for online fine-tuning until the calibration is successful.
[0017] S200. Visual perception unit layout and image acquisition preprocessing.
[0018] S210. Multi-view visual layout confirmation: Confirm that multiple sets of industrial cameras of the visual perception unit have been deployed in the feeding tube area according to the preset layout. Among them, the main camera is deployed directly above the feeding tube along the axis of the feeding tube to capture the global top view of the connection between the bag opening and the feeding tube; the side cameras are evenly deployed along the circumference of the feeding tube, with a deployment quantity of 2-6 units, to capture the local side view of the connection between the bag opening and the feeding tube; the laser supplementary light camera is deployed on the outside of the feeding tube to capture the depth information map of the connection between the bag opening and the feeding tube; the shooting range of each camera overlaps with each other to form a visual monitoring area without blind spots;
[0019] S220. Adaptive Light Source Control: The light source module automatically switches between supplementary lighting modes, including constant light mode, strobe mode, and laser structured light mode, based on the light intensity in the production environment parameters. When the light intensity in the production environment is lower than a preset threshold, the constant light mode is activated for ambient supplementary lighting. When there is high-speed motion interference, the strobe mode is activated, and the strobe frequency is synchronized with the image acquisition frame rate. When it is necessary to obtain the three-dimensional shape information of the bag opening, the laser structured light mode is activated to project laser stripes onto the surface of the bag opening.
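The mode switching in S220 amounts to a small decision function. The sketch below is illustrative only: the source lists the three triggers but not their precedence, so the priority order used here (3D capture, then motion, then low light) and the extra `OFF` mode are assumptions, as are all names.

```python
from enum import Enum


class LightMode(Enum):
    OFF = "off"                        # assumption: no supplementary light needed
    CONSTANT = "constant"
    STROBE = "strobe"
    LASER_STRUCTURED = "laser_structured"


def select_light_mode(lux, lux_threshold, high_speed_motion, need_3d_shape):
    """Pick a supplementary lighting mode from the S220 triggers."""
    if need_3d_shape:
        return LightMode.LASER_STRUCTURED   # project laser stripes on the bag opening
    if high_speed_motion:
        return LightMode.STROBE
    if lux < lux_threshold:
        return LightMode.CONSTANT
    return LightMode.OFF


def strobe_frequency_hz(acquisition_fps):
    """S220 synchronizes the strobe with the frame rate: one flash per frame."""
    return float(acquisition_fps)
```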
[0020] S230. Real-time image acquisition: Each industrial camera acquires images according to a preset acquisition frame rate. The main camera's acquisition frame rate is 30-60fps, the side cameras' acquisition frame rate is consistent with the main camera, and the laser supplementary lighting camera's acquisition frame rate is 15-30fps. The acquired image data includes a global color image, a local side-view color image, and a depth information image. Each image data carries a timestamp, camera number, and acquisition parameter label, and is transmitted to the edge computing unit in real time through a high-speed data interface.
[0021] S240. Image Preprocessing: The edge computing unit performs real-time preprocessing on the received image data, sequentially completing image denoising, image enhancement, image registration, image cropping, and image normalization operations to obtain a standardized feature image, specifically including:
[0022] S241. Image Denoising: An adaptive median filtering algorithm combined with a wavelet thresholding algorithm is used to remove dust noise, illumination noise, and electronic noise from the image. For the global color image and the local side-view color image, salt-and-pepper noise is removed by adaptive median filtering and Gaussian noise is removed by wavelet thresholding. For the depth information map, a bilateral filtering algorithm is used to remove depth noise while preserving depth edge features.
[0023] S242. Image Enhancement: To address uneven lighting and low contrast between the bag opening and the feeding tube in industrial scenarios, the CLAHE (Contrast Limited Adaptive Histogram Equalization) algorithm is used to equalize the brightness of the color image, and the Laplacian operator is applied to enhance the edge features of the image. For the depth information map, a histogram stretching algorithm is used to improve the dynamic range of the depth values and highlight the depth difference between the bag opening and the feeding tube.
[0024] S243. Image Registration: Based on a feature point matching algorithm, the local side-view color images acquired by the side cameras are registered with the global color image acquired by the main camera, and the depth information map acquired by the laser supplementary lighting camera is registered with the color image at the pixel level to obtain the fused multimodal image. The feature point matching algorithm uses SIFT (Scale-Invariant Feature Transform) to extract image feature points, matches them with FLANN (Fast Library for Approximate Nearest Neighbors), and performs perspective transformation and registration of the image based on the homography matrix.
[0025] S244. Image Cropping: Based on the preset monitoring area ROI (Region of Interest), the registered multimodal image is cropped to remove irrelevant background areas, retaining only the core monitoring areas of the feeding pipe and the bag opening, reducing the computational load of model inference; the coordinates of the ROI area are automatically generated based on the feeding pipe size parameters and camera layout parameters, and manual fine-tuning is supported;
[0026] S245. Image Normalization: The cropped image is scaled to a preset size, the main camera image is scaled to 512×512 pixels, the side camera image is scaled to 256×256 pixels, and the depth information map is scaled to 512×512 pixels; the image pixel values are normalized to the [0,1] range, and normalization is performed according to the RGB channel and the depth channel respectively to obtain a standardized feature image, which is stored in the cache area of the edge computing unit.
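The per-channel normalization in S245 can be sketched in plain Python. This is a minimal illustration under two assumptions the source does not state: color channels are 8-bit (so dividing by 255 maps them to [0,1]), and depth is min-max scaled against a configured working range `d_min`..`d_max`, with out-of-range readings clamped. Function names are hypothetical.

```python
def normalize_rgb(image):
    """Map rows of 8-bit (r, g, b) tuples to floats in [0, 1]."""
    return [[tuple(c / 255.0 for c in px) for px in row] for row in image]


def normalize_depth(depth, d_min, d_max):
    """Min-max scale depth values to [0, 1], clamping readings outside the range."""
    span = d_max - d_min
    if span <= 0:
        raise ValueError("d_max must be greater than d_min")
    return [[min(1.0, max(0.0, (d - d_min) / span)) for d in row] for row in depth]
```

Keeping the depth range fixed rather than recomputing it per frame means the same physical gap maps to the same normalized value from frame to frame.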
[0027] S300. Training and optimization of an improved CNN bag opening state recognition model.
[0028] S310. Sample Dataset Construction: Construct a sample dataset of ton bag opening status, including a training set, a validation set, and a test set, with the sample data divided in a 7:2:1 ratio; the samples in the sample dataset include images of bag opening status under different ton bag specifications, different feed pipe sizes, and different production environments, along with corresponding annotation information. The annotation information includes category labels for bag opening complete coverage, partial lifting, loosening, slippage, complete detachment, and material overflow, as well as the bounding box coordinates, anomaly type, and severity level of abnormal areas at the bag opening;
[0029] S311. Sample Collection: Sample images were collected through actual ton bag feeding production lines, covering all scenarios including normal production, slight bag opening lifting, moderate bag opening loosening, severe bag opening slippage, complete bag opening detachment, and material spillage; at the same time, data augmentation technology was used to generate expanded samples, including random flipping, random cropping, random rotation, lighting changes, noise addition, and color jitter, expanding the sample dataset to more than 100,000 images;
[0030] S312. Sample Labeling: A combination of manual and semi-automatic labeling was used, with the LabelMe labeling tool applied to the sample images. Bag opening status was labeled with six categories: "Normal," "Lifting," "Loosening," "Slipping," "Detachment," and "Overflow." For abnormal areas, the bounding box coordinates and the severity of the anomaly were labeled, with severity categorized into three levels: "Slight," "Moderate," and "Severe." After labeling, the labeled data was validated, erroneous and duplicate labels were removed, and a labeling file in VOC or COCO format was generated.
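The 7:2:1 division from S310 can be sketched as a seeded shuffle followed by integer slicing. The function name and the fixed seed are illustrative assumptions; the source specifies only the ratio.

```python
import random


def split_7_2_1(samples, seed=0):
    """Shuffle deterministically and split into train/validation/test at 7:2:1."""
    items = list(samples)
    random.Random(seed).shuffle(items)
    n_train = len(items) * 7 // 10   # integer arithmetic avoids float rounding
    n_val = len(items) * 2 // 10
    train = items[:n_train]
    val = items[n_train:n_train + n_val]
    test = items[n_train + n_val:]   # remainder goes to the test set
    return train, val, test
```

One design point worth noting: because S311 expands the dataset by augmentation, splitting by original image before augmenting would keep near-duplicate variants of one photo out of the validation and test sets. The source does not say which order was used.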
[0031] S320. Improved CNN Model Architecture Design: The improved CNN bag opening status recognition model includes a feature extraction module, a multi-scale feature fusion module, an anomaly classification module, an anomaly localization module, and a depth feature enhancement module. The model is based on the ResNet50 architecture and is improved to meet the multi-feature recognition requirements of ton bag opening status. The specific architecture is as follows:
[0032] S321. Feature Extraction Module: Based on the improved ResNet50, the fully connected layers of the original ResNet50 are removed, while convolutional layers, batch normalization layers, and activation function layers are retained. Deformable convolutional kernels are introduced into the convolutional layers to replace the traditional fixed convolutional kernels. The deformable convolutional kernels can adaptively adjust the sampling position of the convolutional kernels according to the shape changes of the bag opening, improving the ability to extract irregular features such as bag opening protrusions and wrinkles. At the same time, an SE (squeeze and excitation) attention mechanism is added to each residual block. By allocating weights to the feature channels, the key features of the connection area between the bag opening and the feed tube are highlighted, and background noise features are suppressed.
[0033] S322. Multi-scale Feature Fusion Module: This module fuses feature maps of different scales output by the feature extraction module, including shallow, mid-scale, and deep feature maps. Shallow feature maps correspond to detailed features such as edges and textures in the image; mid-scale feature maps correspond to local features such as the outline and shape of the bag opening; and deep feature maps correspond to global features such as the overall state and category of the bag opening. A combination of lateral connections and pyramid pooling is used to upsample, concatenate, and pool the multi-scale feature maps to generate a fused feature map with a dimension of 2048.
[0034] S323. Depth Feature Enhancement Module: The preprocessed depth information map is input into an independent depth feature extraction sub-network. This sub-network adopts a lightweight CNN architecture, including 3 convolutional layers, 2 batch normalization layers and 1 pooling layer, to extract depth feature vectors. The depth feature vectors are then concatenated with the fused feature map output by the multi-scale feature fusion module to obtain multimodal fusion features, which compensates for the limitations of color images in recognizing the 3D shape of the bag opening.
[0035] S324. Anomaly Classification Module: The multimodal fusion features are input into fully connected layers with 1024, 512, and 6 neurons respectively. The probability distribution over the six bag opening states is then output through the Softmax activation function to classify the bag opening state. The classification loss combines the cross-entropy loss function with the focal loss function to address the imbalance between normal-state and abnormal-state samples in the sample data.
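The classification loss in S324 can be illustrated for a single sample, assuming the second term is the standard focal loss, FL = -α(1 - p_t)^γ log(p_t). The source does not say how the two terms are combined, so the weighted sum and the values `focal_weight=0.5`, `gamma=2.0`, and `alpha=1.0` below are assumptions.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def cross_entropy(probs, target):
    return -math.log(probs[target])


def focal_loss(probs, target, gamma=2.0, alpha=1.0):
    """Down-weights well-classified samples by the factor (1 - p_t)^gamma."""
    p_t = probs[target]
    return -alpha * (1.0 - p_t) ** gamma * math.log(p_t)


def combined_loss(logits, target, focal_weight=0.5, gamma=2.0):
    """Assumed combination: convex mix of cross-entropy and focal loss."""
    probs = softmax(logits)
    ce = cross_entropy(probs, target)
    fl = focal_loss(probs, target, gamma=gamma)
    return (1.0 - focal_weight) * ce + focal_weight * fl
```

With `gamma=0` the focal term reduces to plain cross-entropy. Larger values shrink the contribution of the abundant, easily classified "normal" frames relative to the rarer abnormal ones.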
[0036] S325. Anomaly Localization Module: Based on the YOLOv8 detection head architecture, it performs convolution operations on multimodal fusion features and outputs the bounding box coordinates, confidence scores, and anomaly types of the anomaly regions; the anomaly localization module uses the CIoU (Complete Intersection over Union) loss function to improve the accuracy of anomaly region localization;
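For reference, the CIoU loss named in S325 is commonly written as 1 - IoU + ρ²/c² + αv, where ρ is the distance between box centers, c is the diagonal of the smallest enclosing box, and v measures aspect-ratio mismatch. The sketch below computes it for one pair of boxes in `(x1, y1, x2, y2)` form. It follows the standard published formulation, not necessarily the exact variant used in the patent's model.

```python
import math


def ciou_loss(box_p, box_g, eps=1e-9):
    """CIoU loss between a predicted and a ground-truth box, each (x1, y1, x2, y2)."""
    px1, py1, px2, py2 = box_p
    gx1, gy1, gx2, gy2 = box_g
    pw, ph = px2 - px1, py2 - py1
    gw, gh = gx2 - gx1, gy2 - gy1

    # Intersection over union
    iw = max(0.0, min(px2, gx2) - max(px1, gx1))
    ih = max(0.0, min(py2, gy2) - max(py1, gy1))
    inter = iw * ih
    union = pw * ph + gw * gh - inter
    iou = inter / (union + eps)

    # Squared distance between box centers
    rho2 = ((px1 + px2) / 2 - (gx1 + gx2) / 2) ** 2 + ((py1 + py2) / 2 - (gy1 + gy2) / 2) ** 2
    # Squared diagonal of the smallest box enclosing both
    cw = max(px2, gx2) - min(px1, gx1)
    ch = max(py2, gy2) - min(py1, gy1)
    c2 = cw ** 2 + ch ** 2 + eps

    # Aspect-ratio consistency term and its trade-off weight
    v = (4.0 / math.pi ** 2) * (math.atan(gw / (gh + eps)) - math.atan(pw / (ph + eps))) ** 2
    alpha = v / (1.0 - iou + v + eps)
    return 1.0 - iou + rho2 / c2 + alpha * v
```

Unlike plain IoU, the center-distance term still gives a useful gradient when the predicted box does not overlap the labeled anomaly region at all.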
[0037] S330. Model training process:
[0038] S331. Training Environment Setup: Set up a model training environment based on the PyTorch framework. The hardware used is an NVIDIA Tesla V100 GPU, and the software used is Python 3.8, CUDA 11.2, and cuDNN 8.0.
[0039] S332. Training parameter settings: The initial learning rate is set to 0.001, and a cosine annealing learning rate scheduling strategy is adopted, with the learning rate gradually decreasing with each training epoch; the batch size is set to 32, and the number of training epochs is set to 100; the AdamW optimizer is used, with a weight decay coefficient of 0.0001.
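The cosine annealing schedule in S332 can be written in closed form. The sketch below uses the stated initial rate of 0.001 over 100 epochs; the floor `lr_min=0.0` and the absence of warm restarts are assumptions, since the source does not specify them.

```python
import math


def cosine_annealing_lr(epoch, total_epochs=100, lr_max=0.001, lr_min=0.0):
    """Learning rate at a given epoch under single-cycle cosine annealing."""
    cos_term = math.cos(math.pi * epoch / total_epochs)
    return lr_min + 0.5 * (lr_max - lr_min) * (1.0 + cos_term)
```

The rate starts at `lr_max`, passes through the midpoint of `lr_max` and `lr_min` halfway through training, and reaches `lr_min` at the final epoch.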
[0040] S333. Model Training: Input the training set samples into the improved CNN model, perform forward propagation to obtain classification and localization results, calculate the loss function value, and update the model's weight parameters through backpropagation. Every 5 training epochs, validate the model on the validation set and calculate its classification accuracy, recall, and mAP (mean average precision). If the validation set accuracy does not improve for 10 consecutive epochs, stop training early to avoid overfitting.
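The validation and early-stopping schedule in S333 can be sketched as a loop skeleton. Since validation runs only every 5 epochs, "no improvement for 10 consecutive epochs" is read here as two successive validation rounds without a new best accuracy. That reading, and the callback `val_accuracy_at` standing in for real training and evaluation, are assumptions.

```python
def run_with_early_stopping(val_accuracy_at, max_epochs=100, val_every=5, patience_epochs=10):
    """Return (stop_epoch, best_epoch, best_accuracy).

    val_accuracy_at(epoch) is a hypothetical callback that evaluates the
    model on the validation set after the given epoch.
    """
    best_acc = float("-inf")
    best_epoch = 0
    for epoch in range(1, max_epochs + 1):
        # One epoch of forward/backward passes would run here.
        if epoch % val_every != 0:
            continue
        acc = val_accuracy_at(epoch)
        if acc > best_acc:
            best_acc, best_epoch = acc, epoch
        elif epoch - best_epoch >= patience_epochs:
            return epoch, best_epoch, best_acc   # early stop
    return max_epochs, best_epoch, best_acc
```

In practice the weights from `best_epoch` would be the ones kept, which is why the sketch reports it separately from the stopping epoch.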
[0041] S334. Model Fine-tuning: Input the test set samples into the trained model. If the model's recognition accuracy does not reach the monitoring accuracy threshold, select the erroneous samples from the test set and add them to the training set for secondary training. Adjust the model's classification threshold and attention mechanism weights until the model's test set classification accuracy is ≥99.5% and the anomaly localization mAP is ≥99.0%.
[0042] S340. Lightweight Model Optimization: To adapt to the deployment requirements of edge computing units, lightweight optimization is performed on the trained improved CNN model, including model pruning, model quantization, and knowledge distillation.
[0043] S341. Model pruning: A structured pruning algorithm is used to remove convolutional kernels and neurons with low contribution from the model. The pruning ratio is 30%-50%, while retaining the model's core feature extraction capabilities.
[0044] S342. Model Quantization: Quantize the model's 32-bit floating-point parameters into 16-bit floating-point or 8-bit integer parameters to reduce the model's storage capacity and computational load, thereby improving inference speed;
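The 8-bit integer case in S342 can be illustrated with symmetric per-tensor quantization, one common scheme. The source does not specify the scheme (symmetric or asymmetric, per-tensor or per-channel), so this choice and the function names are assumptions.

```python
def quantize_int8(weights):
    """Symmetric quantization of float weights to integers in [-127, 127]."""
    max_abs = max(abs(w) for w in weights)
    scale = max_abs / 127.0 if max_abs > 0 else 1.0
    q = [max(-127, min(127, round(w / scale))) for w in weights]
    return q, scale


def dequantize(q, scale):
    """Recover approximate float weights from the quantized integers."""
    return [v * scale for v in q]
```

Each weight is stored in one byte instead of four, and the reconstruction error per weight is bounded by half the scale step.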
[0045] S343. Knowledge Distillation: Using the high-precision model that has been trained as the teacher model and the lightweight model as the student model, the knowledge of the teacher model is transferred to the student model through the distillation loss function, thereby improving the inference speed of the lightweight model by 2-3 times while ensuring the accuracy of the model.
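The source names a "distillation loss function" without defining it. A minimal sketch of the widely used temperature-scaled formulation follows: a KL divergence between softened teacher and student distributions plus a hard-label cross-entropy term. The temperature `T=4.0` and mixing weight `alpha=0.7` are illustrative assumptions, not values from the patent.

```python
import math


def softmax_t(logits, temperature):
    """Softmax of logits divided by a temperature (higher = softer distribution)."""
    scaled = [z / temperature for z in logits]
    m = max(scaled)
    exps = [math.exp(z - m) for z in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def distillation_loss(student_logits, teacher_logits, target, T=4.0, alpha=0.7):
    """alpha * T^2 * KL(teacher || student) + (1 - alpha) * hard-label cross-entropy."""
    p_teacher = softmax_t(teacher_logits, T)
    p_student = softmax_t(student_logits, T)
    kl = sum(pt * math.log(pt / ps) for pt, ps in zip(p_teacher, p_student) if pt > 0)
    hard = -math.log(softmax_t(student_logits, 1.0)[target])
    return alpha * T * T * kl + (1.0 - alpha) * hard
```

The `T * T` factor keeps the gradient magnitude of the soft term comparable to the hard term as the temperature changes.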
[0046] S400. Real-time identification and determination of bag opening status based on an improved CNN model.
[0047] S410. Feature Image Input: The edge computing unit inputs the preprocessed standardized feature images in batches into the inference engine of the lightweight improved CNN bag opening state recognition model. The inference engine is accelerated by TensorRT to achieve real-time inference.
[0048] S420. Model Inference Calculation: The model performs feature extraction, multi-scale fusion, depth feature enhancement, anomaly classification, and anomaly localization on the input feature image, and outputs the probability distribution of the bag opening state together with the bounding box coordinates, confidence level, anomaly type, and severity level of each anomaly region. The category with the highest probability in the distribution is taken as the current main state of the bag opening. The confidence level of an anomaly region must reach the preset confidence threshold (≥0.95); otherwise it is judged as no anomaly.
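The post-processing in S420 reduces to an argmax over six probabilities and a confidence filter on detections. In the sketch below, the state names follow the later determination rules, their ordering within the output vector is an assumption, and the detection dictionaries with a `"confidence"` key are a hypothetical representation of the localization output.

```python
# Assumed class order; the source lists the six states but not their output indices.
STATES = ("normal", "lifting", "loosening", "slipping", "detachment", "overflow")
CONF_THRESHOLD = 0.95  # value stated in S420


def main_state(probs):
    """Return the state with the highest probability."""
    if len(probs) != len(STATES):
        raise ValueError("expected one probability per state")
    best = max(range(len(probs)), key=probs.__getitem__)
    return STATES[best]


def confirmed_anomalies(detections, threshold=CONF_THRESHOLD):
    """Keep only detected regions whose confidence reaches the threshold."""
    return [d for d in detections if d["confidence"] >= threshold]
```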
[0049] S430. Multi-level determination of bag opening status: Based on the model output and a preset monitoring accuracy threshold, the bag opening status is determined in multiple levels, including normal status, abnormal warning status, and fault shutdown status. The specific determination rules are as follows:
[0050] S431. Normal state determination: If the main state output by the model is "normal" and there are no abnormal area detection results, or the confidence of the abnormal area is lower than the confidence threshold, it is determined that the bag opening is in a normal state of fully covering the feed tube.
[0051] S432. Abnormal Warning Status Determination: If the main status output by the model is "lifting", "loosening" or "slipping" and the severity of the abnormality is "slight" or "moderate", the bag opening is determined to be in an abnormal warning status of partial detachment. Here, "slight lifting" means that the lifting height of the bag opening edge is ≤5mm and the lifting range is ≤1/8 of the bag opening circumference; "moderate loosening" means that the gap between the bag opening and the feeding tube is 3-8mm and the loosening range is ≤1/4 of the bag opening circumference; "slight slipping" means that the bag opening slides ≤10mm along the axial direction of the feeding tube.
[0052] S433. Fault Shutdown Status Determination: The bag opening is determined to be in a fault shutdown state if any of the following conditions is met: ① the main status output by the model is "detachment", that is, the bag opening is completely detached from the feed pipe; ② the main status output by the model is "lifting", "loosening" or "slipping", and the severity of the abnormality is "severe"; ③ the main status output by the model is "overflow", that is, material is detected leaking outward from the gap between the bag opening and the feed pipe. Here, "severe lifting" means that the lifting height of the bag opening edge is >5mm and the lifting range is >1/8 of the bag opening circumference; "severe loosening" means that the gap between the bag opening and the feed pipe is >8mm and the loosening range is >1/4 of the bag opening circumference; "severe slipping" means that the bag opening slides >10mm along the axial direction of the feed pipe; "material overflow" means that the outline of the detected material particles occupies ≥0.5% of the monitored area.
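The rules in S431 to S433 can be condensed into one function over the model's main state and severity label. This is a simplified sketch: it assumes low-confidence detections were already filtered out upstream, it takes the severity label as given rather than deriving it from the millimeter thresholds, and the returned level names are hypothetical.

```python
PARTIAL_STATES = ("lifting", "loosening", "slipping")


def determine_level(main_state, severity=None):
    """Map (main state, severity) to 'normal', 'abnormal_warning' or 'fault_shutdown'."""
    if main_state == "normal":
        return "normal"
    if main_state in ("detachment", "overflow"):
        return "fault_shutdown"          # S433 conditions 1 and 3
    if main_state in PARTIAL_STATES:
        if severity == "severe":
            return "fault_shutdown"      # S433 condition 2
        if severity in ("slight", "moderate"):
            return "abnormal_warning"    # S432
        raise ValueError("partial-detachment state requires a severity label")
    raise ValueError("unknown state: %r" % (main_state,))
```

Raising on an unlabeled or unknown state, rather than defaulting to "normal", keeps a malformed model output from being silently treated as safe.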
[0053] S440. Judgment Result Storage and Update: The edge computing unit associates and stores the judgment result, model output data, timestamp and camera number of each frame image in JSON format. At the same time, it updates the real-time status cache and retains the monitoring data of the most recent 5 minutes for subsequent traceability and analysis.
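The storage step in S440 can be sketched as a JSON record per frame plus a rolling five-minute cache. The record field names, the class name, and the use of frame timestamps in seconds to drive eviction are assumptions; the source specifies only the JSON format, the associated fields, and the 5-minute retention.

```python
import json
from collections import deque


class RecentResults:
    """Keeps per-frame determination records from the most recent time window."""

    def __init__(self, window_s=300.0):   # 5 minutes, as stated in S440
        self.window_s = window_s
        self._items = deque()

    def add(self, timestamp, camera_id, level, model_output):
        """Store one record and return its JSON serialization."""
        record = {
            "timestamp": timestamp,
            "camera_id": camera_id,
            "level": level,
            "model_output": model_output,
        }
        self._items.append(record)
        self._evict(timestamp)
        return json.dumps(record)

    def _evict(self, now):
        # Records arrive in time order, so the oldest are always at the left.
        while self._items and now - self._items[0]["timestamp"] > self.window_s:
            self._items.popleft()

    def __len__(self):
        return len(self._items)
```

A deque makes eviction constant-time per frame, which matters at the 30-60 fps acquisition rates given in S230.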
[0054] S500. Graded response and execution control for abnormal states.
[0055] S510. Response Level Matching: Based on the determination result of the bag opening status, match the preset abnormal response level, where the abnormal warning status corresponds to the level 2 response and the fault shutdown status corresponds to the level 1 response; at the same time, set the level 3 response to the manual review response, which is used for abnormal confirmation in special scenarios.
[0056] S520. Level 1 Response (Fault Shutdown Response): When a fault shutdown state is determined, the edge computing unit immediately sends a Level 1 response command to the execution control unit. The execution control unit completes the following actions according to the preset priority, with an action execution timing error ≤50ms:
[0057] S521. Emergency Stop Control: Sends a stop signal to the feeding drive device to immediately stop the ton bag feeding process and cut off the material conveying power; sends a locking signal to the ton bag hoisting mechanism to lock the hoisting position of the ton bag and prevent the ton bag from continuing to fall or shake.
[0058] S522. Audible and visual alarm trigger: Control the audible and visual alarm device to activate the first-level alarm, the red warning light flashes continuously, the buzzer sounds continuously at a frequency of 1Hz, and the alarm information is simultaneously pushed to the human-machine interaction unit and the production line central control system.
[0059] S523. Abnormal Information Reporting: The time of the downtime, the type of abnormality, the image of the abnormal area, the judgment basis, and the workstation information of the production line are reported in real time to the production line central control system and the enterprise MES (Manufacturing Execution System) to generate a fault alarm work order;
[0060] S524. Actuator Locking: Send a locking signal to the bag opening fastening mechanism to prevent the bag opening fastening mechanism from performing any action until the lock is manually released;
[0061] S530. Level 2 Response (Anomaly Warning Response): When an anomaly warning state is detected, the edge computing unit sends a Level 2 response command to the execution control unit, which then performs the following actions:
[0062] S531. Warning prompt trigger: Control the audible and visual alarm device to activate the second-level alarm, the yellow warning light flashes at a frequency of 2Hz, and the buzzer sounds intermittently at a frequency of 2Hz; in the real-time monitoring interface of the human-machine interaction unit, the abnormal area is marked with a yellow border, and the abnormality type and severity are displayed;
[0063] S532. Production deceleration control: Send a deceleration signal to the feeding drive device to reduce the feeding speed to 50%-70% of the normal speed, reduce the amount of material conveyed, and reduce the risk of abnormal conditions worsening;
[0064] S533. Tightening command issuance: Send an adaptive tightening command to the bag opening fastening mechanism. The bag opening fastening mechanism adjusts the tightening force and tightening position according to the bounding box coordinates of the abnormal area and the abnormality type. For a lifting abnormality, control the corresponding fastening grippers to press down and tighten. For a loosening abnormality, control the circumferential fastening belt to tighten. For a slipping abnormality, control the axial positioning mechanism to reset.
[0065] S534. Real-time tracking and monitoring: The AI vision system maintains high-frequency acquisition and inference to track the bag opening status after tightening in real time; if the bag opening status returns to normal within 10 seconds after tightening, the Level 2 response is automatically lifted and normal production speed is restored; if the bag opening is still in an abnormal warning state 10 seconds after tightening, or the status is upgraded to a fault shutdown state, the system immediately switches to the Level 1 response.
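The follow-up logic of S534 can be sketched in Python as below. The function name, the status strings, and the returned action labels are hypothetical; the sketch also assumes that the decision is evaluated once the 10-second window has closed or an earlier normal or fault observation has arrived.

```python
RECOVERY_WINDOW_S = 10.0  # window after tightening, as stated in S534


def level2_outcome(observations, tightened_at):
    """Decide how a Level 2 response ends.

    observations: (timestamp_s, status) pairs in time order, where status is
    "normal", "warning" or "fault" (labels assumed for this sketch).
    """
    for timestamp, status in observations:
        if timestamp < tightened_at:
            continue  # ignore frames captured before tightening
        if timestamp - tightened_at > RECOVERY_WINDOW_S:
            break     # window elapsed without recovery
        if status == "fault":
            return "escalate_to_level_1"
        if status == "normal":
            return "release_level_2"
    # Still in the warning state when the window closes.
    return "escalate_to_level_1"
```

For example, a warning frame at 0.5 s followed by a normal frame at 3 s releases the Level 2 response, while warning frames that persist past 10 s escalate to Level 1.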
[0066] S540. Level 3 Response (Manual Review Response): A Level 3 response is triggered when the following situations occur: ① The anomaly confidence level of the model output is between 0.90 and 0.95, and cannot be clearly determined; ② The production line operator manually initiates a review request through the human-machine interaction unit; ③ The same abnormal state lasts for more than 30 seconds and the remedial action is ineffective.
[0067] S541. Review prompt push: Push abnormal images, judgment results and review requests to the handheld terminal and human-machine interaction unit of the on-site operator to prompt the operator to conduct on-site review;
[0068] S542. Manual Intervention Operation: Based on the actual situation on site, the operator sends operation instructions through a handheld terminal or human-machine interaction unit, including "confirm normal", "confirm abnormal", "manual stop", and "manual tightening"; the execution control unit executes the corresponding actions according to the manual instructions and updates the response record;
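The three trigger conditions for the Level 3 response in S540 can be expressed as a single predicate. The sketch below is illustrative only: the function and parameter names are hypothetical, and because the source does not say whether the 0.90 and 0.95 bounds are inclusive, the sketch assumes the half-open interval 0.90 ≤ confidence < 0.95.

```python
def needs_manual_review(confidence, operator_requested,
                        abnormal_duration_s, remedy_effective):
    """Return True when any S540 trigger for a Level 3 response is met."""
    # Condition 1: confidence too low for a clear automatic decision
    # (interval bounds are an assumption of this sketch).
    ambiguous = 0.90 <= confidence < 0.95
    # Condition 3: same abnormal state for more than 30 s with no effect
    # from the remedial action.
    persistent = abnormal_duration_s > 30 and not remedy_effective
    # Condition 2 is the operator's explicit request.
    return ambiguous or operator_requested or persistent
```

A confidence of 0.92 triggers the review on its own, whereas a confidence of 0.97 does so only if the operator requests it or the abnormal state persists beyond 30 seconds without the remedy taking effect.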
[0069] S550. Response Result Feedback: The execution control unit feeds back the execution status, execution time and execution result of all response actions to the edge computing unit in real time. The edge computing unit associates and stores the response result with the judgment result to form a complete closed-loop record of exception handling.
[0070] S600. Data management, model iteration and system maintenance.
[0071] S610. Monitoring Data Management: The edge computing unit manages monitoring data in layers, including real-time cached data, short-term storage data, and long-term archived data. Real-time cached data is retained for the most recent 5 minutes for real-time monitoring and response. Short-term storage data is retained for 30 days and stored on the local hard drive of the edge computing unit for daily production traceability. Long-term archived data is retained for more than 1 year and transmitted to the enterprise cloud server via industrial Ethernet for big data analysis. The data management module supports data query, data export, data deletion, and data backup functions. Query conditions include time, workstation, anomaly type, and response level.
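The layered retention in S610 can be illustrated with a small Python lookup that returns the fastest storage layer still holding a record of a given age. The tier labels and the function `storage_tier` are hypothetical; the source states only the retention periods (5 minutes, 30 days, more than 1 year) and where each layer is stored.

```python
from datetime import timedelta

# (tier label, retention limit); labels are assumptions of this sketch.
TIERS = [
    ("realtime_cache", timedelta(minutes=5)),
    ("short_term_local_disk", timedelta(days=30)),
]


def storage_tier(age):
    """Return the fastest tier that still retains a record of this age."""
    for name, limit in TIERS:
        if age <= limit:
            return name
    # Older records are only available from the cloud archive.
    return "long_term_cloud_archive"
```

A query module built on this idea would first consult `storage_tier` and then apply the filters named in S610 (time, workstation, anomaly type, response level) within the selected layer.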
[0072] S620. Online Iterative Optimization of the Model: Based on long-term archived monitoring data and anomaly handling records, a model iteration sample set is constructed, and the improved CNN bag opening state recognition model is periodically iterated and optimized online, specifically including:
[0073] S621. Iterative Sample Screening: Each month, samples that the model misidentifies, newly added abnormal scenarios, and frequently occurring abnormal samples are screened from the archived data and added to the model iteration sample set;
[0074] S622. Model Retraining: The existing model is retrained using the model iteration sample set to adjust the model's weight parameters and classification threshold, and update the model's feature extraction capabilities.
[0075] S623. Model Version Update: The iteratively optimized model is marked with a version number, and the model management module of the edge computing unit enables seamless switching between the old and new models without affecting the normal operation of the production line.
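One possible way to achieve the seamless model switching of S623 is to swap a reference to the active model under a lock, so that every inference call sees either the old or the new version in full. The Python sketch below is an assumption about how such a model management module could be organized; the class `ModelManager` and its methods are hypothetical, and models are represented as plain callables.

```python
import threading


class ModelManager:
    """Holds the active model and its version label (hypothetical helper)."""

    def __init__(self, model, version):
        self._lock = threading.Lock()
        self._active = (version, model)

    def predict(self, features):
        # Read version and model together so they always belong to each other.
        with self._lock:
            version, model = self._active
        return version, model(features)

    def switch(self, model, version):
        # Replace the active model in one step; in-flight calls that already
        # read the old reference finish on the old model.
        with self._lock:
            self._active = (version, model)
```

Because the lock is held only while the reference is read or replaced, inference on the old model is never blocked for the duration of a model load.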
[0076] S630. System Routine Maintenance and Self-Check:
[0077] S631. Periodic self-check: During the daily production line shutdown period in the early morning, the system automatically performs a full-process self-check, including hardware self-check, software self-check, model self-check and communication self-check; after the self-check is completed, a self-check report is generated, and if a problem is found, a maintenance alarm is triggered;
[0078] S632. Hardware maintenance prompts: Based on the runtime and maintenance cycle of each hardware device, the system automatically generates hardware maintenance prompts, including camera lens cleaning, light source replacement, edge computing unit heat dissipation check, and actuator lubrication, etc.
[0079] S633. Software Upgrade: The system supports remote online upgrades. Software update packages are pushed through the enterprise cloud server, and the edge computing unit automatically completes the software upgrade. After the upgrade is completed, functional verification is performed.
[0080] S640. Human-Machine Interaction and Report Generation: The human-machine interaction unit displays the bag opening status, monitoring data, response records, and equipment status of the production line in real time. It supports real-time viewing of abnormal images, historical data curves, and fault statistics reports. The system automatically generates daily, weekly, and monthly production monitoring reports, including key indicators such as the number of abnormal occurrences, the distribution of abnormal types, response execution efficiency, and model recognition accuracy, providing data support for the optimization and management of the production line.
[0081] Furthermore, in step S321, the sampling offset of the deformable convolutional kernel is predicted by the convolutional neural network, specifically including: performing a convolution operation on the feature map output by the feature extraction module to generate an offset field. The offset field contains the x-direction and y-direction offsets of each convolutional kernel sampling point. The sampling point adjusts its sampling position according to the offset field to achieve adaptive feature extraction for irregular shapes of the bag opening.
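The offset-based sampling described above can be sketched in plain Python for a single 3×3 kernel position. In the described model the offset field is predicted by a convolution over the feature map; in this sketch the offsets are simply passed in, and the function names are hypothetical. Fractional sampling positions are resolved by bilinear interpolation, which is the usual choice for deformable convolution and is assumed here.

```python
import math


def bilinear(feature, y, x):
    """Sample a 2-D list at fractional (y, x); outside the map counts as 0."""
    h, w = len(feature), len(feature[0])
    y0, x0 = math.floor(y), math.floor(x)
    value = 0.0
    for yy in (y0, y0 + 1):
        for xx in (x0, x0 + 1):
            if 0 <= yy < h and 0 <= xx < w:
                weight = (1 - abs(y - yy)) * (1 - abs(x - xx))
                value += weight * feature[yy][xx]
    return value


def deformable_conv_at(feature, kernel, offsets, cy, cx):
    """3x3 deformable convolution response centered at (cy, cx).

    offsets[k] = (dy, dx) is the y- and x-direction offset of the k-th
    sampling point, in row-major order over the 3x3 grid.
    """
    out = 0.0
    k = 0
    for i in (-1, 0, 1):
        for j in (-1, 0, 1):
            dy, dx = offsets[k]
            out += kernel[i + 1][j + 1] * bilinear(feature, cy + i + dy, cx + j + dx)
            k += 1
    return out
```

With all offsets set to zero the function reduces to an ordinary 3×3 convolution response, which gives a simple sanity check for the sampling logic.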
[0082] Furthermore, in step S533, the bag opening fastening mechanism includes a circumferential fastening belt, an axial positioning mechanism, and multiple sets of distributed fastening grippers; the circumferential fastening belt adopts a servo motor driven synchronous belt structure, and the fastening force can be adaptively adjusted within the range of 0-500N; the axial positioning mechanism includes an electric push rod and a positioning chuck, with a positioning accuracy of ±0.5mm; the number of distributed fastening grippers is consistent with the number of side cameras deployed, and they can independently complete the pressing and clamping actions.
[0083] Furthermore, in step S620, the online iterative optimization of the model also includes cross-production line transfer learning: the iterative sample sets of multiple ton bag feeding production lines with different specifications are merged to construct a cross-scenario sample set, and the transfer learning algorithm is used to train the model to improve the model's generalization ability in different production scenarios.
[0084] The real-time visual monitoring method for bag opening detachment during the ton bag feeding process of the present invention, as described in any of the above schemes, is applied to an automated ton bag feeding production line and is implemented on an AI vision system with an improved CNN architecture. The core steps are system initialization and parameter configuration, visual perception and image preprocessing, training and optimization of the improved CNN model, real-time identification and judgment of bag opening status, hierarchical response and execution control of abnormal status, and data management and system maintenance.
[0085] During system initialization, hardware self-checks, parameter input, and model calibration are completed to ensure the system is in normal working condition. In the image acquisition and preprocessing phase, color and depth images of the bag opening are acquired using multi-view cameras. These images undergo denoising, enhancement, registration, cropping, and normalization to obtain standardized feature images. In the model training phase, a large-scale sample dataset is constructed, and an improved CNN model incorporating deformable convolution, SE attention mechanism, multi-scale fusion, and deep feature enhancement is designed. Through training, fine-tuning, and lightweight optimization, an inference model meeting real-time and accuracy requirements is obtained. In the state recognition phase, feature images are input into the model to classify, locate, and determine the severity of the bag opening state, establishing a three-level state determination rule. In the response control phase, a graded response mechanism is matched based on the determination results to achieve automatic tightening, production slowdown, emergency shutdown, and manual review, forming a closed loop for anomaly handling. In the data management and system maintenance phase, hierarchical storage of monitoring data, online model iteration, and daily system self-checks are implemented to ensure long-term stable system operation.
[0086] Beneficial effects. Compared with the prior art, the present invention has the following significant beneficial effects:
[0087] 1. High monitoring accuracy and strong real-time performance. Utilizing an improved CNN architecture, it integrates multimodal features from 2D color images and 3D depth images, introducing deformable convolutional kernels and SE attention mechanisms to enhance the ability to identify minor anomalies at the bag opening (slight lifting, moderate loosening), resulting in high monitoring accuracy. Accelerated by edge computing units combined with the TensorRT inference engine, inference latency is low, enabling real-time identification and response to bag opening status, thus solving the problems of low accuracy and slow response in traditional monitoring methods.
[0088] 2. Strong adaptability and outstanding anti-interference capability. The system hardware adopts a modular and adjustable design. The camera bracket and bag mouth fastening mechanism can be adaptively adjusted according to the ton bag specifications and feed pipe size to adapt to different industries and ton bag feeding scenarios of different specifications. In the image preprocessing stage, multiple noise reduction and enhancement algorithms are used, combined with a dust protection module, to effectively resist interference from dust, vibration, uneven lighting and other factors in industrial scenarios, ensuring monitoring stability.
[0089] 3. High degree of intelligence and closed-loop control. A three-level state judgment rule of "normal - early warning - fault" and a graded response mechanism of "Level 1 - Level 2 - Level 3" are constructed to achieve automatic early warning, automatic emergency response, emergency shutdown, and manual verification of abnormal states, forming a complete closed loop of "perception - identification - judgment - response - optimization"; the online iterative optimization mechanism of the model can continuously improve the model's generalization ability based on actual production line operating data, without the need for frequent manual parameter adjustments.
[0090] 4. Non-contact monitoring and reduced production costs. Utilizing AI visual monitoring technology, direct contact with the bag opening is eliminated, preventing bag wear and extending bag lifespan. It replaces traditional manual monitoring, significantly reducing labor intensity and costs, while also minimizing material waste, equipment damage, and environmental costs caused by bag opening detachment, thus improving the production line's economic efficiency.
[0091] 5. High compatibility and engineering practicality. The system adopts a standardized communication protocol, which can be seamlessly integrated into existing automated ton bag feeding production lines without the need for large-scale modifications to the production line. The design of redundant backup modules and dust protection modules enhances the reliability of the system in harsh industrial environments. It can be widely used in multiple industries such as chemical, building materials, food, and mining, and has extremely strong engineering practicality and promotional value.
[0092] To better understand and implement this application, the following detailed description is provided in conjunction with the accompanying drawings.
Description of the Drawings
[0093] Figure 1 is an overall flowchart of the real-time visual monitoring method for bag opening detachment during the ton bag feeding process of the present invention;
[0094] Figure 2 is a schematic diagram of the architecture of the improved CNN bag opening state recognition model of the present invention.
Detailed Implementation
[0095] In the description of this application, it should be understood that the terms "center," "longitudinal," "lateral," "upper," "lower," "front," "rear," "left," "right," "vertical," "horizontal," "top," "bottom," "inner," and "outer," etc., indicating orientation or positional relationships based on the orientation or positional relationships shown in the accompanying drawings, are used only for the convenience of describing this application and simplifying the description, and do not indicate or imply that the device or element referred to must have a specific orientation, or be constructed and operated in a specific orientation, and therefore should not be construed as a limitation on this application. In the description of this application, unless otherwise stated, "a plurality of" means two or more.
[0096] The core inventive concept of this invention lies in addressing the technical pain points of bag opening status monitoring during ton bag feeding. It involves deeply integrating an improved CNN deep learning algorithm with industrial machine vision technology to construct a full-process monitoring system encompassing "multi-view perception, multi-modal fusion, refined recognition, hierarchical response, and closed-loop optimization." Specifically, this includes the following core concepts:
[0097] First, the concept of multi-view + multi-modal visual perception: breaking through the limitations of existing single-camera two-dimensional visual monitoring, a multi-view layout of "main camera + side camera + laser supplementary camera" is adopted to achieve blind-spot-free monitoring of the global and local states of the bag opening; at the same time, multi-modal information of two-dimensional color images and three-dimensional depth images are integrated. Color images are used to extract the texture, color and contour features of the bag opening, while depth images are used to extract three-dimensional morphological features such as the covering gap, sliding distance and lifting height of the bag opening and the feeding tube, to make up for the lack of information of single-modal images and improve the recognition accuracy of abnormal states.
[0098] Second, the improved CNN feature extraction concept: Based on the ResNet50 architecture, deformable convolutional kernels and SE attention mechanism are introduced to improve the model's adaptive feature extraction capability for irregular shapes of bag openings (such as upturns and wrinkles), highlighting key features and suppressing noise features; a multi-scale feature fusion module is constructed to achieve the fusion of shallow detailed features, mid-level local features and deep global features, solving the problem of difficulty in identifying small abnormal features; a deep feature enhancement module is added to splice deep features with color features to form multi-modal fusion features, improving the accuracy of state determination.
[0099] Third, the control concept of multi-level judgment + graded response: Based on the severity of bag opening abnormalities, a three-level judgment rule is constructed: "normal state - abnormal warning state - fault shutdown state". The semi-detachment state is further subdivided into three levels: slight, moderate and severe. Matched with a graded execution mechanism of "Level 2 response - Level 1 response - Level 3 response", automatic tightening and production slowdown are applied to slight and moderate abnormalities, emergency shutdown and alarm are applied to severe abnormalities, and manual review is applied to ambiguous abnormalities, forming a closed-loop management of "prevention - control - disposal" that prevents abnormal states from deteriorating.
[0100] Fourth, the intelligent optimization concept of edge computing + online iteration: edge computing units are used to realize localized computing for image preprocessing and model inference, combined with TensorRT inference engine acceleration to meet the real-time requirements of the production line; an online iterative optimization mechanism for the model is constructed, based on historical monitoring data of the production line, error samples and new samples are regularly screened, and the model is retrained and updated to improve the model's generalization ability; at the same time, through the big data management module, the hierarchical storage and analysis of monitoring data is realized, providing data support for the optimization of the production line.
[0101] Fifth, a highly adaptable and highly reliable system design concept: The system hardware adopts a modular and adjustable design. The camera bracket and fastening mechanism can be adaptively adjusted according to the ton bag specifications and the size of the feeding pipe to adapt to multiple application scenarios; a dust protection module and a redundant backup module are set up to improve the reliability of the system in harsh industrial environments; a standardized communication protocol is adopted, which can be seamlessly integrated into the existing production line's PLC and MES systems, reducing the difficulty of engineering transformation.
[0102] This embodiment uses an automated ton bag feeding production line of a chemical enterprise as an application scenario. The production line is mainly used to transport powdered chemical raw materials. The ton bags are made of polypropylene. The production environment has a high dust concentration and the light intensity fluctuates between 500-2000 lux. The response delay for bag opening detachment is required to be ≤100ms, and the timing error of abnormal response actions is required to be ≤100ms.
[0103] 1. System setup and initialization.
[0104] 1.1 Hardware Deployment: Each unit is deployed according to the structure of the AI visual monitoring system described in this invention. In the visual perception unit, the main camera is a global shutter color industrial camera deployed directly above the feed pipe axis; four side cameras, also global shutter color industrial cameras, are distributed symmetrically at 90° intervals; the laser supplementary camera is a structured light depth industrial camera deployed outside the feed pipe. In the light source module, a ring-shaped constant-on LED light source is deployed below the main camera lens, high-frequency strobe LED light sources are deployed beside each side camera, and the laser structured light source is integrated with the laser supplementary camera. The camera mounting brackets are adjustable alloy brackets; the main camera bracket height is adjusted to 800mm, and the tilt angle of the laser supplementary camera is adjusted to 30°. The air purging device automatically runs a purge cycle once every 10 minutes.
[0105] The edge computing unit uses an embedded industrial server; the execution control unit is equipped with a high-speed counting module, an analog input / output module, and a digital input / output module. The servo driver is matched with the servo motors of the feeding drive device and the bag mouth fastening mechanism. The audible and visual alarm device uses red, yellow, and green warning lights and an industrial buzzer. The bag mouth fastening mechanism includes a circumferential fastening belt, an axial positioning mechanism, and two sets of distributed fastening grippers. The human-machine interaction unit uses an interactive display screen. The communication unit uses a gigabit industrial Ethernet switch, a Profinet communication module, a Modbus TCP communication module, and an RS485 communication module to realize communication between each unit and the production line central control system and the enterprise system. The redundant backup module deploys a backup edge computing unit and a backup PLC controller. The backup communication link uses industrial Ethernet or 5G communication for mutual backup.
[0106] 1.2 Software Parameter Configuration: Monitoring task parameters are entered through the human-machine interaction unit. Ton bag specifications: bag opening diameter, bag opening material (polypropylene), bag opening color (gray), ton bag thickness. Feed pipe dimensions: feed pipe outer diameter, feed pipe length, feed pipe material (stainless steel, surface without obvious texture). Production environment parameters: light intensity range 500-2000 lux, dust concentration threshold 1.5 mg/m³, shooting distance 800 mm, shooting angle 0° (main camera), 90° / 180° / 270° / 0° (side cameras), 30° (laser supplementary camera). Monitoring accuracy thresholds: detachment judgment threshold (bag opening completely detached from the feed pipe), partial detachment judgment threshold (lifting / loosening / slipping), lifting judgment threshold (slight ≤5mm, severe >5mm), slipping judgment threshold (slight ≤10mm, severe >10mm), material overflow judgment threshold (material particle outline ≥0.5% of the monitored area). Abnormal response levels: Level 1 response (fault shutdown), Level 2 response (abnormal warning), Level 3 response (manual verification).
[0107] 1.3 Model Loading and Calibration: The pre-trained improved CNN bag opening state recognition model (optimized by pruning, quantization, and knowledge distillation) is loaded into the inference engine of the edge computing unit. A standard calibration image set of 1000 images (containing images of different bag opening states of the production line) is input for model calibration. The feature extraction weights and classification thresholds of the model are adjusted. After calibration, the model recognition accuracy is improved and meets the monitoring accuracy threshold requirements, thus completing the initialization.
[0108] 2. Image acquisition and preprocessing.
[0109] 2.1 Multi-view image acquisition: The main camera, four side cameras, and the laser supplementary camera start acquisition according to the preset frame rate; the main camera acquires a global top view of the connection between the bag opening and the feeding tube, the side cameras acquire local side views of the bag opening in four directions around it, and the laser supplementary camera acquires depth information maps of the bag opening and the feeding tube; each image data carries a timestamp, camera number, and acquisition parameter label, and is transmitted to the edge computing unit in real time through the PCIe interface industrial image acquisition card.
[0110] 2.2 Adaptive Light Source Control: The light source module automatically switches the supplementary lighting mode according to the light intensity of the production environment. When the light intensity is below 500 lux, the ring LED constant light source is activated for supplementary lighting. When the production line vibrates and causes the bag opening to shake slightly (high-speed motion interference), the high-frequency strobe LED light source is activated, and the strobe frequency is synchronized with the camera acquisition frame rate. When it is necessary to obtain three-dimensional morphological information such as the height of the bag opening and the covering gap, the laser structured light mode is activated to project laser stripes onto the surface of the bag opening.
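The switching rules in 2.2 can be summarized as a small Python decision function. The function name, the mode labels, and the boolean inputs are hypothetical; the source does not state whether the modes are mutually exclusive, so the sketch assumes they can be active at the same time.

```python
LOW_LIGHT_LUX = 500  # lower bound of the stated 500-2000 lux range


def select_light_modes(lux, vibration_detected, need_depth, frame_rate_hz):
    """Return the supplementary lighting modes to enable for the next frames."""
    modes = {}
    if lux < LOW_LIGHT_LUX:
        modes["ring_led_constant"] = True
    if vibration_detected:
        # Strobe frequency follows the camera acquisition frame rate.
        modes["strobe_led_hz"] = frame_rate_hz
    if need_depth:
        modes["laser_structured_light"] = True
    return modes
```

For example, at 400 lux with vibration detected and a 60 Hz acquisition rate (a value chosen only for illustration), the function enables the ring LED and a 60 Hz strobe.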
[0111] 2.3 Image Preprocessing: The edge computing unit performs real-time preprocessing on the received image data, completing the following operations in sequence:
[0112] (1) Image denoising: Adaptive median filtering combined with wavelet threshold denoising algorithm is used to remove salt and pepper noise and Gaussian noise in global color image and local side view color image; bilateral filtering algorithm is used to remove depth noise in depth information image and retain depth edge features;
[0113] (2) Image enhancement: The CLAHE algorithm is used to perform brightness equalization processing on the color image, and the Laplacian operator is used to enhance the edge features; the histogram stretching algorithm is used to improve the dynamic range of the depth information map and highlight the depth difference between the bag opening and the feed tube;
[0114] (3) Image registration: The SIFT algorithm is used to extract the feature points of each image, and the FLANN matcher is used to match the feature points. Based on the resulting homography matrix, the local side-view color images of the four side cameras are registered with the global color image of the main camera, and the depth information map is registered with the color image at the pixel level to obtain the fused multimodal image.
[0115] (4) Image cropping: Based on the preset ROI area (the core area of the feeding tube and bag opening, the coordinates are automatically generated according to the size of the feeding tube and the camera layout), the registered multimodal image is cropped to remove irrelevant background areas and reduce the amount of model inference calculation.
[0116] (5) Image normalization: The cropped main camera image is scaled to 512×512 pixels, the side camera image is scaled to 256×256 pixels, and the depth information map is scaled to 512×512 pixels; the image pixel values are normalized to the [0,1] range, and processed according to the RGB channel and the depth channel respectively to obtain a standardized feature image, which is stored in the cache of the edge computing unit.
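The last two preprocessing operations, ROI cropping and normalization, can be sketched in plain Python on a single-channel image stored as a list of rows. The function names are hypothetical, nearest-neighbor scaling is used only to keep the example short (the source does not name an interpolation method), and the maximum pixel value of 255 is an assumption that would differ for a depth map.

```python
def crop_roi(image, top, left, height, width):
    """Keep only the ROI rows and columns of a 2-D image."""
    return [row[left:left + width] for row in image[top:top + height]]


def resize_nearest(image, out_h, out_w):
    """Scale a 2-D image to out_h x out_w using nearest-neighbor sampling."""
    in_h, in_w = len(image), len(image[0])
    return [
        [image[r * in_h // out_h][c * in_w // out_w] for c in range(out_w)]
        for r in range(out_h)
    ]


def normalize(image, max_value=255.0):
    """Map pixel values to the [0, 1] range."""
    return [[px / max_value for px in row] for row in image]


def standardize(image, roi, out_size):
    """Crop, scale and normalize one channel; roi = (top, left, height, width)."""
    cropped = crop_roi(image, *roi)
    return normalize(resize_nearest(cropped, out_size, out_size))
```

In the embodiment, `out_size` would be 512 for the main camera image and the depth information map and 256 for the side camera images, with the RGB channels and the depth channel processed separately.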
[0117] 3. Bag opening status recognition and determination.
[0118] 3.1 Model Inference: The edge computing unit feeds the preprocessed standardized feature images in batches into the inference engine of the lightweight improved CNN bag opening state recognition model. With inference accelerated by TensorRT, the model performs feature extraction, multi-scale fusion, deep feature enhancement, anomaly classification, and anomaly localization on the feature images, and outputs the probability distribution of the bag opening state, the bounding box coordinates of the anomaly region, the confidence, the anomaly type, and the severity level.
[0119] 3.2 Status Determination: Based on the model output and the preset monitoring accuracy threshold, a three-level status determination is performed:
[0120] (1) Normal state: When the model outputs the main state as "normal" and there are no abnormal area detection results, or the confidence of the abnormal area is <0.95, it is determined that the bag mouth completely covers the feeding tube and is in a normal state. The system maintains the normal production rhythm.
[0121] (2) Abnormal warning status: When the main output status of the model is "lifting", "loosening" or "slipping", and the severity of the abnormality is "slight" or "moderate", it is judged as an abnormal warning status. For example, if the edge of the bag opening is detected to be lifted by 3mm and the lifting range is 1 / 10 of the bag opening circumference, it is judged as slight lifting; if the gap between the bag opening and the feed tube is detected to be 5mm and the loosening range is 1 / 5 of the bag opening circumference, it is judged as moderate loosening; if the bag opening is detected to slip along the feed tube axis by 8mm, it is judged as slight slipping.
[0122] (3) Fault shutdown state: When the main output state of the model is "detachment", or the main state is "lifting", "loosening" or "slipping" and the severity is "severe", or the main state is "overflow", it is judged as a fault shutdown state. For example, if the edge of the bag opening is detected to be lifted by 6mm and the lifting range is 1/7 of the bag opening circumference, it is judged as severe lifting; if the gap between the bag opening and the feed pipe is detected to be 9mm and the loosening range is 1/3 of the bag opening circumference, it is judged as severe loosening; if the bag opening is detected to slip 12mm along the feed pipe axis, it is judged as severe slipping; if the material particle outline is detected to occupy 0.6% of the monitored area, it is judged as material overflow; if the bag opening is detected to be completely detached from the feed pipe, it is judged as detachment.
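The three-level determination in 3.2, together with the matching response level, can be sketched in Python as follows. The function name and the status labels are hypothetical. The source applies the 0.95 confidence threshold to the abnormal area without saying whether it also gates "detachment" and "overflow"; this sketch assumes it applies only to the partial-detachment states, so that a detected detachment or overflow is never downgraded to normal.

```python
PARTIAL_STATES = ("lifting", "loosening", "slipping")
CONFIDENCE_THRESHOLD = 0.95

# Response level matched to each abnormal status (Level 1 = fault shutdown).
RESPONSE_LEVEL = {"abnormal_warning": 2, "fault_shutdown": 1}


def determine_status(main_state, severity=None, confidence=1.0):
    """Map the model output to normal / abnormal_warning / fault_shutdown."""
    if main_state in ("detachment", "overflow"):
        return "fault_shutdown"
    if main_state in PARTIAL_STATES:
        if confidence < CONFIDENCE_THRESHOLD:
            return "normal"  # abnormal area not confident enough
        if severity == "severe":
            return "fault_shutdown"
        if severity in ("slight", "moderate"):
            return "abnormal_warning"
    if main_state == "normal":
        return "normal"
    raise ValueError(f"unexpected model output: {main_state!r}, {severity!r}")
```

Using the examples from the text, a slight lifting reported with confidence 0.97 yields the abnormal warning status (Level 2 response), and a severe loosening yields the fault shutdown status (Level 1 response).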
[0123] 3.3 Result Storage: The edge computing unit stores the judgment result, model output data, timestamp and camera number of each frame of image together, and updates the real-time status cache to retain the monitoring data of the most recent 5 minutes for subsequent traceability and analysis.
[0124] 4. Graded response and execution for abnormal states.
[0125] 4.1 Level 2 Response (Anomaly Warning Response): When an anomaly warning state is detected, the edge computing unit sends a level 2 response command to the execution control unit, which then performs the following actions:
[0126] (1) Warning prompt: The sound and light alarm device starts the second-level alarm: the yellow warning light flashes at a frequency of 2Hz and the buzzer sounds intermittently at a frequency of 2Hz. The human-machine interaction unit highlights the abnormal area in real time with a yellow border and displays the abnormality type (such as "slight lifting") and severity.
[0127] (2) Production slowdown: A slowdown signal is sent to the feeding drive device to reduce the feeding speed from 10m³ / h to 6m³ / h (60% of the normal speed), which reduces the amount of material conveyed and the risk of the abnormality worsening.
[0128] (3) Adaptive tightening: Send a tightening command to the bag opening fastening mechanism and adjust the action according to the coordinates of the boundary box of the abnormal area and the abnormality type: For slight lifting, control the distributed fastening claws at the corresponding position to press down and tighten (clamping force 50N); for moderate loosening, control the circumferential fastening band to tighten; for slight slippage, control the axial positioning mechanism to reset.
[0129] (4) Real-time tracking: The AI vision system maintains high-frequency acquisition and inference and tracks the bag opening state after tightening in real time. If the bag opening returns to the normal state within 10 seconds after tightening, the Level 2 response is automatically released and the normal feeding speed is restored; if the abnormal warning state persists 10 seconds after tightening, or if the state escalates to a fault shutdown state, the system immediately switches to the Level 1 response.
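The tightening selection in item (3) and the 10-second tracking logic in item (4) can be sketched as follows. The function names, the string labels and the representation of observations as (seconds since tightening, judgment) pairs are all assumptions made for illustration; only the three listed tightening actions and the 10-second window come from the text.

```python
from typing import Iterable, Optional, Tuple

# (main state, severity) -> tightening action, as listed in item (3).
TIGHTENING_ACTIONS = {
    ("lifting", "slight"): "press down fastening claws at anomaly position (50 N)",
    ("loosening", "moderate"): "tighten circumferential fastening band",
    ("slipping", "slight"): "reset axial positioning mechanism",
}


def select_tightening(main_state: str, severity: str) -> Optional[str]:
    """Return the listed action, or None for combinations the text does not list."""
    return TIGHTENING_ACTIONS.get((main_state, severity))


def level2_outcome(observations: Iterable[Tuple[float, str]],
                   window_s: float = 10.0) -> str:
    """Decide how a Level 2 response ends.

    observations: (seconds since tightening, judgment) pairs in time order.
    Returns "release" if the state returns to normal inside the window,
    otherwise "escalate" (switch to the Level 1 response).
    """
    for elapsed, judgment in observations:
        if elapsed > window_s:
            break  # the window has expired without recovery
        if judgment == "fault shutdown":
            return "escalate"
        if judgment == "normal":
            return "release"
    return "escalate"
```

In this sketch a recovery observed after the window has closed does not release the response, which matches the rule that a warning persisting for 10 seconds is escalated.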
[0130] 4.2 Level 1 Response (Fault Shutdown Response): When a fault shutdown state is determined, the edge computing unit immediately sends a Level 1 response command to the execution control unit. The execution control unit then performs the following actions according to priority:
[0131] (1) Emergency stop: Send a stop signal to the feeding drive device to immediately stop the feeding process and cut off the material conveying power; send a locking signal to the ton bag hoisting mechanism to lock the ton bag hoisting position and prevent the ton bag from falling or shaking.
[0132] (2) Audible and visual alarm: The audible and visual alarm device is activated to start the first-level alarm. The red warning light flashes continuously, the buzzer sounds continuously, and the alarm information is pushed to the human-machine interaction unit, the production line central control system and the operator's handheld terminal simultaneously.
[0133] (3) Anomaly reporting: The downtime, anomaly type, anomaly area image, judgment criteria and workstation information are reported to the production line central control system and the enterprise MES system in real time to generate an alarm work order.
[0134] (4) Mechanism locking: Send a locking signal to the bag opening fastening mechanism to prevent it from making any movement until the locking is manually released.
[0135] 4.3 Level 3 Response (Manual Review Response): A Level 3 response is triggered when the model output anomaly confidence level is between 0.90 and 0.95, the operator manually initiates a review request, or the same anomaly persists for more than 30 seconds and remedial actions are ineffective.
[0136] (1) Review prompt: The abnormal image, judgment result and review request are pushed to the operator's handheld terminal and the human-machine interaction unit to prompt the operator to review on site.
[0137] (2) Manual intervention: After checking the actual state of the bag opening on site, the operator sends operation instructions through the handheld terminal or the human-machine interaction unit, such as "confirm normal", "confirm abnormal", "manual stop" or "manual tightening"; the execution control unit executes the corresponding action according to the manual instruction and updates the response record.
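The three trigger conditions for the Level 3 response can be expressed as a single predicate. This is a sketch under assumptions: the function and parameter names are hypothetical, and treating both ends of the 0.90 to 0.95 range as inclusive is a choice the text does not specify.

```python
from typing import Optional, Tuple


def needs_manual_review(confidence: Optional[float] = None,
                        operator_requested: bool = False,
                        anomaly_duration_s: float = 0.0,
                        remediation_effective: bool = True,
                        fuzzy_range: Tuple[float, float] = (0.90, 0.95),
                        persistence_s: float = 30.0) -> bool:
    """Return True when any Level 3 trigger condition holds."""
    low, high = fuzzy_range
    # Condition 1: anomaly confidence falls in the ambiguous range
    # (inclusive bounds are an assumption).
    in_fuzzy_range = confidence is not None and low <= confidence <= high
    # Condition 3: the same anomaly persists and remedial actions did not help.
    persistent = anomaly_duration_s > persistence_s and not remediation_effective
    # Condition 2 is the operator's explicit request.
    return in_fuzzy_range or operator_requested or persistent
```

For example, a detection with confidence 0.92 requests a review, as does an anomaly that has lasted 45 seconds with ineffective tightening.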
[0138] 4.4 Response Feedback: The execution control unit feeds back the execution status, execution time and execution result of all response actions to the edge computing unit in real time. The edge computing unit associates and stores the response result with the judgment result to form a complete closed-loop record of anomaly handling.
[0139] 5. Data management, model iteration, and system maintenance.
[0140] 5.1 Data Management: The edge computing unit manages monitoring data in layers. Real-time cached data is used for real-time monitoring and response. Short-term storage data is stored on local industrial-grade solid-state drives for daily traceability. Long-term archived data is transmitted to the enterprise cloud server via industrial Ethernet for big data analysis. The data management module supports querying data by time, workstation, anomaly type, and response level, and supports data export, deletion, and backup functions.
[0141] 5.2 Online Model Iterative Optimization: Each month, samples with model recognition errors, newly added abnormal scenarios, and high-frequency abnormal samples are selected from long-term archived data and added to the model iteration sample set. The existing model is retrained using the iteration sample set, and the model weight parameters and classification threshold are adjusted. The iteratively optimized model is version-marked, and the seamless switching between the old and new models is achieved through the model management module, without affecting the normal operation of the production line. At the same time, the iteration sample sets of the company's other three ton bag feeding production lines with different specifications are integrated, and the transfer learning algorithm is used to train the model to improve the model's generalization ability.
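One way a model management module could switch versions without interrupting inference is sketched below. The text does not describe the module's internals, so everything here is an assumption: the `ModelManager` class, its methods, and the idea that a model is any callable taking a frame. The switch is a single assignment under a lock, so each inference call sees either the old or the new version in full.

```python
import threading
from typing import Any, Callable, Dict, Optional, Tuple


class ModelManager:
    """Hypothetical registry that holds versioned models and one active version."""

    def __init__(self) -> None:
        self._lock = threading.Lock()
        self._models: Dict[str, Callable[[Any], Any]] = {}
        self._active: Optional[str] = None

    def register(self, version: str, model: Callable[[Any], Any]) -> None:
        with self._lock:
            self._models[version] = model

    def activate(self, version: str) -> None:
        with self._lock:
            if version not in self._models:
                raise KeyError(version)
            self._active = version

    def infer(self, frame: Any) -> Tuple[str, Any]:
        # Pick the active model under the lock, then run it outside the lock
        # so a slow inference does not block a version switch.
        with self._lock:
            if self._active is None:
                raise RuntimeError("no active model")
            version = self._active
            model = self._models[version]
        return version, model(frame)
```

Returning the version alongside the result lets each stored judgment record which model produced it, which supports the version marking described above.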
[0142] 5.3 System Daily Maintenance and Self-Check: During the daily production line downtime in the early morning, the system automatically performs a full-process self-check, including hardware self-check, software self-check, model self-check, and communication self-check. After the self-check is completed, a self-check report is generated. If problems such as camera failure, light source damage, or communication abnormality are found, a maintenance alarm is triggered. Based on the runtime and maintenance cycle of each hardware device, maintenance prompts are automatically generated, such as cleaning camera lenses, replacing light sources, checking the heat dissipation of edge computing units, and lubricating actuators. The system supports remote online upgrades, pushing software update packages through the enterprise cloud server. The edge computing unit automatically completes the upgrade and performs functional verification.
[0143] 5.4 Report Generation: The human-machine interaction unit displays the bag opening status, monitoring data, response records, and equipment status of the production line in real time. It supports viewing abnormal images, historical data curves, and fault statistics reports. The system automatically generates daily, weekly, and monthly monitoring reports, including key indicators such as the number of abnormal occurrences, the distribution of abnormal types, response execution efficiency, and model recognition accuracy, providing data support for production line optimization and management.
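Some of the report indicators named above can be computed with a simple aggregation, sketched here under assumptions. The record layout (an `anomaly_type` field that is None for normal frames, and a `correct` field that is True or False only when the recognition result has been checked, for example by manual review) is hypothetical, as is the function name `summarize`.

```python
from collections import Counter
from typing import List, Optional


def summarize(records: List[dict]) -> dict:
    """Aggregate anomaly count, type distribution and recognition accuracy."""
    anomalies = [r["anomaly_type"] for r in records
                 if r["anomaly_type"] is not None]
    # Accuracy is computed only over records whose result has been checked.
    checked = [r["correct"] for r in records if r.get("correct") is not None]
    accuracy: Optional[float] = (
        sum(checked) / len(checked) if checked else None)
    return {
        "anomaly_count": len(anomalies),
        "type_distribution": dict(Counter(anomalies)),
        "recognition_accuracy": accuracy,
    }
```

The same function could be applied to a day's, a week's or a month's records to produce the corresponding periodic report.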
[0144] The embodiments described above are merely examples of several implementation methods of this application, and while the descriptions are relatively specific and detailed, they should not be construed as limiting the scope of the patent application. It should be noted that those skilled in the art can make various modifications and improvements without departing from the concept of this application, and these modifications and improvements all fall within the protection scope of this application.
Claims
1. A method for real-time visual monitoring of bag mouth drop-off for a ton bag feeding process, characterized in that the method is applied to a ton bag automatic feeding production line, the production line comprising a ton bag hoisting mechanism, a discharging pipe, a feeding driving device and a bag opening fastening mechanism; the method is realized based on an AI vision system adopting an improved CNN architecture, and comprises the following steps: S100. initialization and parameter configuration of a vision monitoring system; S200. a multi-view image acquisition unit acquires images and pre-processes the acquired images to obtain standardized feature images; S300. an improved CNN bag opening state recognition model is constructed and trained; the model is improved based on a ResNet50 architecture, introduces a deformable convolution kernel and an SE attention mechanism, and fuses depth information features; S400. the standardized feature images are input into the trained improved CNN bag opening state recognition model, which outputs in real time a bag opening state probability distribution and abnormal area positioning information, and a three-level judgment of the bag opening state is made based on a preset threshold value; S500. according to the result of the three-level judgment, corresponding graded responses and control actions are executed, the graded responses including a first-level fault shutdown response, a second-level abnormal early warning response and a third-level manual review response; S600. monitoring data management and online iterative optimization of the model.
2. The method for real-time visual monitoring of bag mouth drop-off for a ton bag feeding process of claim 1, wherein, in the step S300, the structure of the improved CNN bag opening state recognition model comprises: a feature extraction module for extracting multi-scale features of an image, wherein a deformable convolution kernel is used in a convolution layer to adapt to irregular shapes of a bag opening, and an SE attention mechanism is embedded in a residual block to suppress background noise; a multi-scale feature fusion module for fusing shallow detail features, middle local features and deep global features; a depth feature enhancement module for extracting a depth feature vector of the depth information map and concatenating the depth feature vector with the fusion feature map output by the multi-scale feature fusion module to generate a multi-modal fusion feature; and an abnormal classification and positioning module for outputting a bag opening state category, an abnormal area bounding box coordinate and a confidence level based on the multi-modal fusion feature.
3. The method for real-time visual monitoring of bag mouth drop-off for a ton bag feeding process of claim 2, wherein the specific action process of the SE attention mechanism is as follows: the spatial dimension is compressed by a global average pooling operation to obtain a global feature description at the channel level; a fully connected layer is used to build the dependency relationship between channels to generate weight coefficients for each channel; the weight coefficients are multiplied with the original feature map channel by channel to recalibrate the feature response, thereby enhancing the channel weight of key features such as the bag opening edge, wrinkles and gaps, while suppressing the noise weight of background dust and device texture.
4. The method for real-time visual monitoring of bag mouth drop-off for a ton bag feeding process of claim 3, wherein the rules of the three-level judgment in the step S400 are as follows: normal state: the model output main state is normal, or the abnormal area confidence is lower than the preset confidence threshold; abnormal early warning state: the model output main state is lifting, loosening or slipping, and the abnormal severity is determined to be slight or moderate, wherein slight lifting refers to a lifting height of less than or equal to 5mm, and moderate loosening refers to a covering gap of 3-8mm; fault shutdown state: the model output main state is complete detachment, or the main state is lifting, loosening or slipping with the abnormal severity determined to be severe, or the main state is material overflow, wherein severe lifting refers to a lifting height greater than 5mm, and severe loosening refers to a covering gap greater than 8mm.
5. The method for real-time visual monitoring of bag mouth drop-off for a ton bag feeding process of claim 3, wherein the graded responses and control actions in step S500 specifically include: when a fault shutdown state is determined, a Level 1 response is executed: a shutdown signal is immediately sent to cut off the power of the feeding drive device, the ton bag hoisting mechanism is locked, an audible and visual alarm is triggered, and the bag opening fastening mechanism is locked; when an abnormal early warning state is determined, a Level 2 response is executed: a warning prompt is triggered, the feeding drive device is controlled to decelerate to 50%-70% of the normal speed, and an adaptive tightening command is sent to the bag opening fastening mechanism, wherein the adaptive tightening command adjusts the pressing position of the fastening claws or tightens the circumferential fastening band according to the coordinates of the abnormal area, and if the bag opening state is not restored within a preset time, a Level 1 response is executed; when the model output confidence level is in the fuzzy range or the same anomaly persists for more than a preset time, a Level 3 response is executed: a review request is pushed to the human-machine interaction terminal, and corresponding operations are performed according to the manual intervention instructions.
6. The method for real-time visual monitoring of bag mouth drop-off for a ton bag feeding process as claimed in claim 4, wherein the online iterative optimization of the model in step S600 specifically includes: periodically selecting misrecognized samples, newly added scene samples and high-frequency abnormal samples from historical monitoring data to construct an iterative dataset; retraining the improved CNN bag opening state recognition model and fine-tuning its weights using the iterative dataset; and versioning the fine-tuned model and seamlessly switching to the new version during production line downtime or low-load periods.
7. The method for real-time visual monitoring of bag mouth drop-off for a ton bag feeding process of claim 3, wherein the method also includes an edge computing deployment step: the trained improved CNN bag opening state recognition model is lightweighted, including model pruning and quantization operations; the lightweight model is deployed on an industrial edge computing server using the TensorRT inference engine; through the above deployment, the inference latency of a single frame image is less than or equal to 100ms, which meets the real-time requirements of the ton bag feeding process.
8. The method for real-time visual monitoring of bag mouth drop-off for a ton bag feeding process as claimed in claim 3, wherein the method also includes a dust protection step: an air purging device is provided, and high-pressure gas purging is activated periodically or as needed to clean the camera lens and light source surface in the vision sensing unit and remove adhering dust.
9. The method for real-time visual monitoring of bag mouth drop-off for a ton bag feeding process as claimed in claim 3, wherein the parameter configuration in step S100 includes adaptive illumination adjustment: during the initialization phase, the adaptive light source module is controlled to conduct test shots at different brightness levels; the histogram distribution and signal-to-noise ratio of the test images are analyzed, and the light source brightness parameters that make the gradient of the bag opening edge clearest, without reflection or overexposure, are automatically selected as the working parameters under the current working conditions.
10. The method for real-time visual monitoring of bag mouth drop-off for a ton bag feeding process according to any one of claims 3-9, characterized in that the method also includes a dynamic stabilization step to address dynamic shaking conditions during the lifting of ton bags; the dynamic stabilization step includes: in the preprocessing stage of step S200, the small displacement vector of the camera is estimated using feature matching of consecutive multi-frame images; electronic image stabilization compensation is performed on the current frame image based on the displacement vector to eliminate image blurring and position shift caused by ton bag shaking or mechanical vibration; the current frame image is input into the improved CNN bag opening state recognition model for inference only when the image sharpness index is higher than the preset threshold and the motion blur is lower than the preset limit; otherwise, the frame is discarded and the valid judgment result of the previous frame is used.