Packaging defect detection method, system, device, and medium based on the YOLO model
An improved defect detection method based on the YOLOv11 model addresses the insufficient detection capability for multiple defect types and the poor material adaptability of existing edible oil packaging inspection. It achieves high-precision, interference-resistant, real-time defect detection suitable for edible oil production lines.
Patent Information
- Authority / Receiving Office
- CN · China
- Patent Type
- Applications(China)
- Current Assignee / Owner
- SHANGHAI STRATOSPHERE INFORMATION TECH CO LTD
- Filing Date
- 2026-03-25
- Publication Date
- 2026-06-19
AI Technical Summary
Existing technologies for detecting defects in edible oil packaging suffer from several problems, including insufficient ability to detect small defects, limited ability to detect multiple types of defects, weak ability to distinguish subtle differences, poor material adaptability, and insufficient model robustness. In particular, the accuracy rate is low and the false detection rate is high under complex lighting conditions.
A packaging defect detection method based on the YOLOv11 model is adopted. It corrects reflection by adaptive histogram equalization, embeds edge information to enhance the front-end module, constructs a multi-scale feature pyramid, enhances the detail capture of the detection head, and optimizes the loss function to achieve multi-scale extraction and adaptive weight allocation of defect features, thereby improving detection accuracy and robustness.
The method significantly improves detection accuracy across the defect types found in edible oil packaging, with an overall defect detection accuracy of ≥93% and a recall rate of ≥91%. Anti-interference capability is significantly enhanced, the detection speed is ≥30 FPS, and the positioning error for irregular defects is reduced from ≥3 mm to ≤1 mm, meeting the needs of real-time industrial inspection.
Figure CN122243979A_ABST
Abstract
Description
Technical Field
[0001] This invention belongs to the field of food packaging quality inspection technology, and specifically relates to an automated detection method for multiple types of defects in edible oil packaging. It is applicable to real-time detection of defects in key parts such as oil drums, bottle caps, labels, and liquid levels on edible oil production lines. It can quickly identify and locate defects such as label errors, abnormal liquid levels (high / low), holes, internal foreign objects, and damaged bottle caps, providing technical support for food packaging quality control. In particular, it is a packaging defect detection method, system, device, and medium based on the YOLO11 model.
Background Technology
[0002] Existing technologies related to the detection of defects in edible oil packaging mainly include the following patents and products: Patent CN220181955U (An Online Bottle Cap Inspection Machine for Edible Oil): This patent uses a sorting and feeding mechanism and three cameras to detect the bottle cap's QR code, height, and production code respectively. It replaces the traditional drop-feeding method with a smooth displacement feeding method, solving the problem of bottle caps easily falling off during feeding. This device mainly targets bottle cap inspection and does not cover the inspection of other packaging parts such as the bottle body, label, or liquid level. Furthermore, it uses a fixed template matching method, limiting its adaptability to different batches of products.
[0003] Patent CN116500052A (A Visual Detection System for Edible Oil and its Application Method): This patent acquires images of the bottom of an oil bottle using a dynamic visual sensor. Combining bilateral filtering, morphological processing, and image algebraic operations, the processed image is input into a YOLO network for detection. The system can detect impurities as small as 0.2 mm in diameter. This method primarily targets the detection of internal impurities in the oil and has limited ability to detect packaging defects (such as label misalignment and bottle cracks).
[0004] Patent CN117831026A (Food Quality Inspection Method and Device Based on Machine Learning): This patent is based on the YOLO-v5 model. It collects historical data on food quality inspection, publicly available datasets, and other relevant data, integrates and annotates this data, and then inputs it into the YOLO-v5 model for training to achieve food quality inspection. Although this method uses deep learning technology, it is primarily designed for food quality inspection and lacks adaptability to the special materials (transparent, reflective) of edible oil packaging, and it is not specifically optimized for various types of packaging defects.
[0005] The existing technology has the following technical defects: Insufficient detection capability for minor defects: Existing technologies such as CN220181955U mainly employ traditional template matching and image processing techniques, resulting in low detection accuracy for minor defects such as tiny holes and minor defects in bottle caps, with a recall rate of only 40-60% for small target defects. Although patent CN116500052A can detect impurities as small as 0.2mm, it mainly targets impurities inside liquids and has limited ability to detect minor defects on the packaging appearance.
[0006] Limitations in multi-type defect detection capabilities: Existing patents mostly target the detection of single-type defects. For example, CN220181955U mainly detects bottle cap defects, and CN116500052A mainly detects internal impurities. They lack the ability to simultaneously detect multiple defects such as label misalignment, abnormal liquid level, and bottle cracks. When multiple defects coexist, feature interference is severe.
[0007] Weak ability to distinguish subtle differences: Existing technologies struggle to differentiate between subtle differences such as liquid level variations or slight label misalignments, resulting in low accuracy in identifying minor defects. Traditional methods primarily rely on threshold segmentation and template matching, which are insensitive to subtle visual differences and cannot meet the demands of high-precision detection.
[0008] Poor material adaptability: Edible oil packaging materials are transparent and reflective. Although existing technologies such as CN117831026A use the YOLO-v5 model, they have not been specifically optimized for the characteristics of packaging materials. Under complex lighting conditions, feature extraction is insufficient, and the detection accuracy will decrease.
[0009] Insufficient model robustness: Existing deep learning-based methods such as CN117831026A, although they use YOLO-v5, have not been trained for complex scenarios such as label errors. The model has low tolerance for label noise and a high false detection rate when it encounters label styles not covered by the training data.
Summary of the Invention
[0010] To address the aforementioned technical problems, this invention provides a packaging defect detection method, system, apparatus, and medium based on the YOLO model, comprising:
S1, acquiring sample images and corresponding defect labels, standardizing the sample images, and constructing a packaging image dataset; wherein the packaging image dataset includes at least packaging images, defect ground truth bounding boxes, and label types;
S2, constructing a packaging defect detection model based on YOLO11, extracting edge feature maps and spatial feature maps of the packaging images, fusing the edge feature maps and spatial feature maps to obtain enhanced feature maps, and performing multi-scale feature extraction to obtain multi-scale feature maps C2 to C5;
S3, performing boundary similarity calculation and adaptive weight allocation on the multi-scale feature maps C2 to C5 to obtain recalibrated multi-scale feature maps P2 to P6; wherein the multi-scale feature map P2 is a newly added small defect feature map, and the defects handled by P2 to P6 are arranged from small to large;
S4, performing detail enhancement on the recalibrated multi-scale feature maps P2 to P6, and outputting the confidence of the defects and the coordinate offset of the defect bounding boxes through classification and regression;
S5, obtaining the similarity between the defect bounding boxes and the true defect bounding boxes through the localization loss, and iteratively training the packaging defect detection model until it converges and training is completed;
S6, inputting the packaging image to be detected into the trained packaging defect detection model and outputting the result.
[0011] Step S1 includes: correcting the reflection of the packaging image using adaptive histogram equalization, and outputting the corrected packaging image. The correction formula is:
[0012] $I'(x, y) = \mathrm{round}\left(\frac{I(x, y) - I_{\min}}{I_{\max} - I_{\min}} \times 255\right)$, where $I(x, y)$ is the grayscale value of the input image at coordinates $(x, y)$; $I_{\min}$ and $I_{\max}$ are the minimum and maximum gray values of a local region of the image, respectively; $\mathrm{round}(\cdot)$ is a rounding function; and $I'(x, y)$ is the grayscale value of the output image at coordinates $(x, y)$;
constructing a defect labeling system, storing the true bounding box coordinates and label type of each defect in XML format;
dividing the packaging image dataset into a training set, a validation set, and a test set, and performing data augmentation on the training set images;
wherein the label types include at least: label error, high liquid level, low liquid level, hole, internal foreign matter, and damaged bottle cap.
[0013] Step S2 includes: S2.1, inserting an edge information enhancement front-end module at the input end of the backbone network; S2.2, performing edge detection on the input packaging image $I$ using the Sobel operator, with the horizontal and vertical Sobel convolution kernels being respectively $K_x = \begin{bmatrix} -1 & 0 & 1 \\ -2 & 0 & 2 \\ -1 & 0 & 1 \end{bmatrix}$ and $K_y = \begin{bmatrix} -1 & -2 & -1 \\ 0 & 0 & 0 \\ 1 & 2 & 1 \end{bmatrix}$; S2.3, calculating the edge feature map of the packaging image. The formula is:
[0014] $G_x = K_x * I, \quad G_y = K_y * I$
[0015] where $*$ denotes the convolution operation, and $G_x$ and $G_y$ are the horizontal and vertical edge feature maps, respectively, which are fused into the edge feature map $F_e$; S2.4, extracting the spatial features of the packaging image to form the spatial feature map. The formula is:
[0016] $F_s = \delta(\mathrm{BN}(W_{3\times3} * I))$, where $W_{3\times3}$ is a randomly initialized 3×3 convolution kernel, $\delta$ is the activation function, and $\mathrm{BN}$ is the batch normalization operation; S2.5, concatenating the edge feature map $F_e$ with the spatial feature map $F_s$ along the channel dimension and outputting a fused feature map. The formula is: $F_{out} = \mathrm{Conv}_{1\times1}(\mathrm{Concat}(F_e, F_s))$, where $\mathrm{Concat}$ is the channel concatenation operation and $\mathrm{Conv}_{1\times1}$ is a 1×1 convolutional layer; S2.6, inputting the fused feature map $F_{out}$ into the core feature extraction module, which performs four downsampling and feature aggregation operations to output the multi-scale feature maps C2 to C5.
[0017] Step S3 includes: S3.1, upsampling the multi-scale feature maps C2 to C5 to obtain enlarged feature maps. The formula is:
[0018] $F_{up}(x, y) = \sum_{m,n} w_{mn}\, F(x_m, y_n)$, where $F_{up}$ is the enlarged feature map, $F$ is the original feature map, $(x_m, y_n)$ are coordinates in the original feature map, and $w_{mn}$ are the bilinear interpolation weights; S3.2, fusing the enlarged feature maps with the multi-scale feature maps C2 to C5 to obtain the multi-scale feature maps P2 to P6; S3.3, performing boundary aggregation calibration on the multi-scale feature maps P2 to P6 to obtain the boundary similarity of features at different scales, using the following formula:
[0019] where the formula defines the boundary similarity between the features at scale i and scale j in terms of the bounding box of the i-th scale feature, the bounding box of the j-th scale feature, their intersection over union (IoU), the areas of the bounding boxes, and a function that takes the larger of two values; S3.4, assigning adaptive weights based on the boundary similarity and computing the weighted fusion of the features at each scale, using the following formula:
[0020] where the formula involves the feature map at scale i, the adaptive weight at the i-th scale, and the maximum boundary similarity at the i-th scale, with the maximum taken over all values of the index j; the weighted fusion yields the recalibrated multi-scale feature maps P2 to P6.
[0021] Step S4 includes: S4.1, extracting prior detail features from the recalibrated multi-scale feature maps P2 to P6 using a 1×1 convolution. The formula is:
[0022] S4.2, fusing the prior detail features with the original 3×3 convolution kernel of the lightweight shared convolutional detection head to obtain a detail-enhanced convolution kernel, which satisfies:
[0023] where the formula involves the prior feature weight coefficient, a local-region mean operation, and the spatial coordinates of the convolution kernel; S4.3, enhancing the details of the defect feature map, using the following formula:
[0024] where the formula involves the residual connection weight, the DEConv convolution operation, and the detail-enhanced feature map; S4.4, the classification branch of the detection head transforms the detail-enhanced input feature map using the sigmoid function and outputs the confidence score of the defect, as shown in the formula:
[0025] $p_c = \sigma(W_{cls} F_{cls} + b_{cls})$, where $p_c$ is the predicted confidence for the c-th type of defect, c = 0, 1, ..., 5; $\sigma$ is the sigmoid function; $W_{cls}$ and $b_{cls}$ are the parameters of the fully connected layer in the classification branch; and $F_{cls}$ is the detail-enhanced feature map of the classification branch; S4.5, the regression branch of the detection head transforms the detail-enhanced input feature map and outputs the coordinate offset of the defect bounding box used for target localization, as shown in the formula:
[0026] $\Delta = W_{reg} F_{reg} + b_{reg}$, where $\Delta$ is the predicted bounding box coordinate offset; $W_{reg}$ and $b_{reg}$ are the parameters of the fully connected layer in the regression branch; and $F_{reg}$ is the detail-enhanced feature map of the regression branch.
[0027] Step S5 includes: S5.1, constructing a loss function based on the confidence and the coordinate offset of the defect bounding box, and obtaining the total loss value between the defect bounding box and the true defect bounding box. The formula is:
[0028] $L = L_{cls} + \lambda L_{loc} + L_{conf}$, where $\lambda$ is the localization loss weight coefficient, $L_{cls}$ is the classification loss, $L_{loc}$ is the localization loss, and $L_{conf}$ is the confidence loss; S5.2, obtaining the shape similarity between the defect bounding box and the true defect bounding box. The formula is:
[0029] where the formula involves the shape tolerance coefficient, the shape similarity, the aspect ratio of the defect bounding box, and the aspect ratio of the true defect bounding box; S5.3, obtaining the localization loss value between the defect bounding box and the true defect bounding box using the shape-aware intersection over union (ShapeIoU) function. The formula is:
[0030] where the formula involves the traditional intersection over union (IoU), the defect bounding box, and the true defect bounding box; S5.4, converting the similarity score into an improved localization loss value. The formula is:
[0031] where the result is the improved localization loss value; S5.5, using an optimizer to iteratively update the model parameters until the model converges, thus completing the training.
[0032] Step S6 includes: S6.1, loading the trained model, inputting the image to be detected into the model for processing, and generating the confidence score of the defect and the coordinate offset of the defect bounding box; S6.2, filtering by confidence to remove prediction boxes with low confidence, retaining only the defect bounding boxes with high classification confidence; S6.3, using a non-maximum suppression function based on smooth intersection over union (SIoU) to remove duplicate defect bounding boxes. The formula is as follows:
[0033] where the formula involves the non-maximum suppression function, the n-th defect bounding box among all candidate defect bounding boxes, and a maximum taken over the index i of the candidate defect bounding boxes; outputting the defect type, bounding box coordinates, and confidence score for each valid predicted bounding box, and marking them on the original image. At the same time, an alarm is issued to remind workers to handle the issue.
[0034] A packaging defect detection system based on the YOLO model includes: The acquisition module acquires sample images and corresponding defect labels, performs standardization processing on the sample images, and constructs a packaging image dataset; wherein, the packaging image dataset includes at least packaging images, defect ground truth bounding boxes, and label types; The learning module constructs a packaging defect detection model based on YOLO11. It extracts edge and spatial feature maps from the packaging image and fuses them to obtain an enhanced feature map. Multi-scale feature extraction is then performed to obtain multi-scale feature maps C2–C5. Boundary similarity calculation and adaptive weight allocation are applied to the multi-scale feature maps C2–C5 to obtain recalibrated multi-scale feature maps P2–P6. Here, multi-scale feature map P2 is a newly added small defect feature map, and defects in P2–P6 are arranged from smallest to largest. Detail enhancement is performed on the recalibrated multi-scale feature maps P2–P6, and the confidence score and coordinate offset of the defect bounding box are output through classification and regression. The training module obtains the similarity between the defect bounding box and the true defect bounding box through localization loss, and iteratively trains the packaging defect detection model until it converges and training is completed. The output module inputs the packaging image to be detected into the trained packaging defect detection model and outputs the result.
[0035] A packaging defect detection device based on the YOLO model includes: a memory storing a packaging defect detection method program based on the YOLO model and a processor for running the packaging defect detection method program based on the YOLO model, wherein the packaging defect detection method program based on the YOLO model is configured to implement the steps of the packaging defect detection method based on the YOLO model.
[0036] A computer-readable storage medium storing a program for a packaging defect detection method based on the YOLO model, wherein when executed by a processor, the program implements the steps of a packaging defect detection method based on the YOLO model.
[0037] This invention, based on the YOLOv11 architecture, achieves a comprehensive improvement in detection performance, anti-interference capability, real-time performance, and cost-effectiveness compared to existing technologies through targeted improvements: "strengthening edge features in the backbone network, optimizing multi-scale fusion in the neck network, enhancing detail capture in the detection head, and improving localization accuracy using the loss function." It effectively addresses the core pain points in detecting various types of defects in edible oil packaging.
Detection accuracy is significantly improved: By filtering reflective noise using the EIEStem module, strengthening small defect features using Re-CalibrationFPN, capturing subtle differences using the LSDECD module, and optimizing localization calculation using ShapeIoU, the overall defect detection accuracy is ≥93%, and the recall rate is ≥91%. It comprehensively covers the detection needs of scenarios such as label errors, abnormal liquid levels, holes, foreign objects, and damaged bottle caps.
Anti-interference capability is significantly enhanced: Addressing the reflectivity and texture interference of edible oil packaging, as well as label errors in the training data, this invention improves the robustness of defect detection through feature fusion and adaptive weight allocation mechanisms.
Real-time performance is balanced with industrial applicability: The number of model parameters increases by only 8% compared to the original YOLOv10, with a detection speed of ≥30 FPS (detection time per image ≤33 ms). It can be directly deployed on edible oil production lines to replace manual inspection (manual inspection time per image ≥150 ms), improving detection efficiency by more than 5 times. At the same time, the positioning error of irregular defects is reduced from ≥3 mm to ≤1 mm, providing accurate data support for quality traceability and subsequent processing.
Attached Figure Description
[0038] Other features, objects, and advantages of the present invention will become more apparent from the following detailed description of non-limiting embodiments with reference to the accompanying drawings.
[0039] Figure 1 is a schematic diagram of the packaging defect detection method based on the YOLO model of the present invention.
Detailed Implementation
[0040] To make the objectives, technical solutions, and beneficial effects of this invention clearer, the invention will be further described in detail below with reference to the accompanying drawings and embodiments. It should be understood that the specific embodiments described herein are merely illustrative of the invention and are not intended to limit the invention.
[0041] For ease of description, the terms "first" and "second" are used for descriptive purposes only and should not be construed as indicating or implying relative importance or implicitly specifying the number of indicated technical features. Thus, a feature defined as "first" or "second" may explicitly or implicitly include one or more of that feature. In the description of this invention, "a plurality of" means two or more, unless otherwise explicitly specified. In this application, unless otherwise explicitly stated and limited, the terms "installed," "connected," "linked," "fixed," etc., should be interpreted broadly. For example, they may refer to a fixed connection, a detachable connection, or an integral connection; a mechanical connection or an electrical connection; a direct connection or an indirect connection through an intermediate medium; or a connection within two components. Those skilled in the art can understand the specific meaning of the above terms in this invention according to the specific circumstances.
[0042] Unless otherwise specified, the terms used herein have the same meaning as commonly understood by one of ordinary skill in the art to which this invention pertains, and the terms should be understood to have the meaning consistent with the meaning in the context of the relevant art, and should not be interpreted in an idealized or over-formalized manner, except as expressly defined in this invention.
[0043] As shown in Figure 1, the steps of the packaging defect detection method based on the YOLO model of this invention are as follows:
Step 1: Data preprocessing and standardization
Reflection correction: To address reflections from the metal / plastic packaging of edible oils, contrast-limited adaptive histogram equalization (CLAHE) is employed, as shown in the following formula: $I'(x, y) = \mathrm{round}\left(\frac{I(x, y) - I_{\min}}{I_{\max} - I_{\min}} \times 255\right)$
[0044] where $I(x, y)$ is the grayscale value of the input image at coordinates $(x, y)$; $I_{\min}$ and $I_{\max}$ are the minimum and maximum gray values of a local region of the image, respectively; and $\mathrm{round}(\cdot)$ is a rounding function that ensures the output grayscale value $I'(x, y)$ is within the range 0-255.
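The local min-max correction described above can be illustrated with a minimal pure-Python sketch. It assumes an 8-bit grayscale image stored as a list of rows and a hypothetical `tile_size` parameter for the local region. It is not the patent's implementation, and it omits the clip limit and inter-tile interpolation that a full CLAHE routine (for example OpenCV's `cv2.createCLAHE`) would apply.

```python
def stretch_tile(tile):
    """Min-max stretch one tile of gray values to the 0-255 range."""
    flat = [v for row in tile for v in row]
    lo, hi = min(flat), max(flat)
    if hi == lo:  # flat tile: nothing to stretch, return a copy
        return [row[:] for row in tile]
    return [[round((v - lo) / (hi - lo) * 255) for v in row] for row in tile]


def correct_reflection(image, tile_size=8):
    """Apply stretch_tile independently to each tile_size x tile_size block.

    tile_size is an illustrative assumption; the source does not state
    the size of the local region.
    """
    h, w = len(image), len(image[0])
    out = [row[:] for row in image]
    for top in range(0, h, tile_size):
        for left in range(0, w, tile_size):
            tile = [row[left:left + tile_size]
                    for row in image[top:top + tile_size]]
            for dy, row in enumerate(stretch_tile(tile)):
                out[top + dy][left:left + len(row)] = row
    return out


if __name__ == "__main__":
    print(correct_reflection([[100, 110], [120, 150]], tile_size=2))
```

Processing tiles independently keeps the sketch short, but it can leave visible seams at tile borders, which is one reason production CLAHE implementations interpolate between neighbouring tiles.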
[0045] Labeling system construction: Define 6 types of defect labels (label error - 0, high liquid level - 1, low liquid level - 2, hole - 3, internal foreign matter - 4, cap damage - 5), and store the bounding box coordinates of each defect (the coordinates of the top left and bottom right corners) and its label type in XML format.
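As an illustration of the labeling system, the sketch below parses one annotation into `(class_id, xmin, ymin, xmax, ymax)` tuples. The class IDs follow the text. The XML schema (Pascal VOC style `object` / `bndbox` elements) and the class name strings are assumptions, since the source only states that XML is used.

```python
import xml.etree.ElementTree as ET

# Class IDs as listed in the text; the string names are hypothetical.
CLASS_IDS = {
    "label_error": 0,
    "high_liquid_level": 1,
    "low_liquid_level": 2,
    "hole": 3,
    "internal_foreign_matter": 4,
    "cap_damage": 5,
}

# Hypothetical annotation in an assumed Pascal VOC style layout.
SAMPLE = """
<annotation>
  <object>
    <name>hole</name>
    <bndbox><xmin>120</xmin><ymin>80</ymin><xmax>140</xmax><ymax>100</ymax></bndbox>
  </object>
</annotation>
"""


def parse_annotation(xml_text):
    """Return a list of (class_id, xmin, ymin, xmax, ymax) tuples."""
    root = ET.fromstring(xml_text)
    boxes = []
    for obj in root.iter("object"):
        class_id = CLASS_IDS[obj.findtext("name")]
        bndbox = obj.find("bndbox")
        coords = tuple(int(bndbox.findtext(tag))
                       for tag in ("xmin", "ymin", "xmax", "ymax"))
        boxes.append((class_id, *coords))
    return boxes


if __name__ == "__main__":
    print(parse_annotation(SAMPLE))
```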
[0046] Data partitioning: The dataset is divided into a training set, a validation set, and a test set in a 7:2:1 ratio. The training set is augmented with Mosaic (4 images are stitched together), while the validation set and test set keep the original image size (640×640 pixels).
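A 7:2:1 partition can be sketched as follows. The shuffling, the fixed seed, and the function name `split_dataset` are assumptions for illustration; the source specifies only the ratio.

```python
import random


def split_dataset(items, ratios=(7, 2, 1), seed=0):
    """Shuffle items and split them into train / validation / test lists."""
    items = list(items)
    random.Random(seed).shuffle(items)  # seeded for reproducibility
    total = sum(ratios)
    n_train = len(items) * ratios[0] // total
    n_val = len(items) * ratios[1] // total
    train = items[:n_train]
    val = items[n_train:n_train + n_val]
    test = items[n_train + n_val:]  # remainder goes to the test set
    return train, val, test


if __name__ == "__main__":
    train, val, test = split_dataset(range(100))
    print(len(train), len(val), len(test))
```

Only the training list would then be passed to Mosaic augmentation, so that validation and test images stay unmodified.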
[0047] Step 2: Improved backbone network feature extraction: embedding the EIEStem module to enhance edge features
Basic backbone network: The native C2f backbone network of YOLOv11 is selected, retaining its multi-scale feature extraction capability, and outputting feature maps at four scales: C2, C3, C4, and C5 (with 128, 256, 512, and 1024 channels, respectively).
[0048] EIEStem module embedding: An EIEStem module is inserted at the input of the backbone network (after the first convolutional layer) to specifically enhance the edge features of edible oil packaging (such as the edges of holes, the outline of the bottle cap, and the label boundary). The module structure and calculation process are as follows:
Branch 1: SobelConv edge extraction branch. A 3×3 Sobel operator is applied to the input image $I$ for edge detection. The Sobel convolution kernels in the horizontal and vertical directions are respectively $K_x = \begin{bmatrix} -1 & 0 & 1 \\ -2 & 0 & 2 \\ -1 & 0 & 1 \end{bmatrix}$ and $K_y = \begin{bmatrix} -1 & -2 & -1 \\ 0 & 0 & 0 \\ 1 & 2 & 1 \end{bmatrix}$.
[0049] Edge feature map calculation:
[0050] $G_x = K_x * I, \quad G_y = K_y * I$
[0051] where $*$ denotes the convolution operation, $G_x$ and $G_y$ are the horizontal and vertical edge feature maps, respectively, and $F_e$ is the edge feature map obtained by fusing $G_x$ and $G_y$.
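The edge extraction branch can be sketched in pure Python as below. The Sobel kernels are the standard ones. Combining the two directional maps by gradient magnitude is an assumption, since the text states only that the horizontal and vertical maps are fused. The filter is applied as a sliding-window correlation without padding, as is usual in deep learning frameworks.

```python
import math

KX = [[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]]    # horizontal Sobel kernel
KY = [[-1, -2, -1], [0, 0, 0], [1, 2, 1]]    # vertical Sobel kernel


def filter3x3(image, kernel):
    """Slide a 3x3 kernel over the image (no padding, stride 1)."""
    h, w = len(image), len(image[0])
    return [[sum(kernel[i][j] * image[y + i][x + j]
                 for i in range(3) for j in range(3))
             for x in range(w - 2)]
            for y in range(h - 2)]


def edge_map(image):
    """Fuse the two directional responses by gradient magnitude (assumed)."""
    gx = filter3x3(image, KX)
    gy = filter3x3(image, KY)
    return [[math.hypot(a, b) for a, b in zip(row_x, row_y)]
            for row_x, row_y in zip(gx, gy)]


if __name__ == "__main__":
    # A vertical step edge: dark on the left, bright on the right.
    step = [[0, 0, 10, 10] for _ in range(4)]
    print(edge_map(step))
```

For the step image, only the horizontal kernel responds, which matches the intuition that a vertical boundary such as a label edge shows up in the horizontal gradient.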
[0052] Branch 2: Spatial feature extraction branch. An ordinary 3×3 convolutional layer (kernel count = 64, stride = 1, padding = 1) is used to extract spatial detail features: $F_s = \delta(\mathrm{BN}(W_{3\times3} * I))$
[0053] where $W_{3\times3}$ is a randomly initialized 3×3 convolution kernel, $\delta$ is the activation function, and $\mathrm{BN}$ is the batch normalization operation.
[0054] Feature fusion: The edge feature map $F_e$ and the spatial feature map $F_s$ are concatenated along the channel dimension, and the number of channels is then compressed to 64 using a 1×1 convolution to obtain the output of the EIEStem module: $F_{out} = \mathrm{Conv}_{1\times1}(\mathrm{Concat}(F_e, F_s))$
[0055] where $\mathrm{Concat}$ is the channel concatenation operation and $\mathrm{Conv}_{1\times1}$ is a 1×1 convolutional layer (the number of channels is compressed from 128 to 64).
[0056] Backbone network forward propagation: The input is then processed by the subsequent C2f module, undergoing four downsampling and feature aggregation steps to output C2-C5 feature maps, which are used for subsequent neck network fusion.
[0057] Step 3: Improved neck network feature fusion: using Re-CalibrationFPN to enhance small defect features
Feature pyramid basic structure: Based on the PAFPN structure of YOLOv10, a new P2 feature layer is added (for small defects: holes, bottle cap defects), forming a multi-scale feature pyramid of P2-P3-P4-P5-P6 (scales of 320×320, 160×160, 80×80, 40×40, and 20×20 pixels, respectively).
[0058] The core improvement of Re-CalibrationFPN is that it recalibrates cross-scale features through the SBA (Scale-aware Boundary Aggregation) module, solving the problem of small-defect features being masked by large-scale features. The specific process is as follows:
Upsampling: The C5 feature map is upsampled by a factor of 2 using bilinear interpolation, concatenated with the C4 feature map, and then fused using the C2f module to obtain the P4 feature map; the P3 and P2 feature maps are generated in the same way in sequence. The upsampling formula is: $F_{up}(x, y) = \sum_{m,n} w_{mn}\, F(x_m, y_n)$
[0059] where $F_{up}$ is the feature map after upsampling, $F$ is the original feature map, $(x_m, y_n)$ are coordinates in the original feature map, and $w_{mn}$ are the bilinear interpolation weights.
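The 2x bilinear upsampling step can be sketched for a single-channel feature map as follows. The half-pixel coordinate mapping with border clamping is an assumed convention, since the source does not say which alignment rule is used.

```python
def bilinear_upsample(fmap, scale=2):
    """Upsample a 2-D feature map (list of rows) by bilinear interpolation."""
    h, w = len(fmap), len(fmap[0])
    out = []
    for oy in range(h * scale):
        # Map the output pixel centre back to source coordinates and clamp.
        sy = min(max((oy + 0.5) / scale - 0.5, 0.0), h - 1)
        y0 = int(sy)
        y1 = min(y0 + 1, h - 1)
        wy = sy - y0
        row = []
        for ox in range(w * scale):
            sx = min(max((ox + 0.5) / scale - 0.5, 0.0), w - 1)
            x0 = int(sx)
            x1 = min(x0 + 1, w - 1)
            wx = sx - x0
            # Interpolate horizontally on both rows, then vertically.
            top = fmap[y0][x0] * (1 - wx) + fmap[y0][x1] * wx
            bottom = fmap[y1][x0] * (1 - wx) + fmap[y1][x1] * wx
            row.append(top * (1 - wy) + bottom * wy)
        out.append(row)
    return out


if __name__ == "__main__":
    print(bilinear_upsample([[0.0, 4.0]]))
```

Each output value is a weighted sum of at most four neighbouring source values, which corresponds to the interpolation weights in the upsampling formula.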
[0060] Scale calibration: Boundary aggregation calibration is performed on the feature maps from P2 (small defects) to P6 (large defects), and the boundary similarity of features at different scales is calculated.
[0061] where the formula defines the boundary similarity between the features at scale i and scale j in terms of the bounding box of the i-th scale feature, the intersection over union (IoU), and the area of the bounding box.
[0062] Feature-weighted fusion: Adaptive weights are assigned based on the boundary similarity, and the features at each scale are fused by weighting:
[0063] where the formula involves the feature map at scale i, the adaptive weight at the i-th scale, and the maximum boundary similarity at the i-th scale.
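The sketch below is only a hedged illustration of the idea of similarity-driven weighting, because the exact similarity and weight formulas are not recoverable from the text. It assumes that plain IoU stands in for the boundary similarity, and that each scale's weight is its maximum similarity to any other scale, normalized to sum to 1. Boxes are `(x1, y1, x2, y2)` tuples and features are flat lists.

```python
def iou(a, b):
    """Intersection over union of two (x1, y1, x2, y2) boxes."""
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    union = ((a[2] - a[0]) * (a[3] - a[1])
             + (b[2] - b[0]) * (b[3] - b[1]) - inter)
    return inter / union if union > 0 else 0.0


def adaptive_weights(boxes):
    """Assumed rule: weight_i is proportional to max over j != i of IoU(i, j)."""
    n = len(boxes)
    max_sim = [max(iou(boxes[i], boxes[j]) for j in range(n) if j != i)
               for i in range(n)]
    total = sum(max_sim)
    if total == 0:  # no overlap anywhere: fall back to uniform weights
        return [1.0 / n] * n
    return [s / total for s in max_sim]


def weighted_fusion(features, weights):
    """Weighted sum of equally sized feature vectors, one per scale."""
    return [sum(w * f[k] for w, f in zip(weights, features))
            for k in range(len(features[0]))]


if __name__ == "__main__":
    boxes = [(0, 0, 2, 2), (0, 0, 2, 2), (10, 10, 12, 12)]
    weights = adaptive_weights(boxes)
    print(weights, weighted_fusion([[1.0], [3.0], [100.0]], weights))
```

Under this assumed rule, a scale whose boundary agrees with no other scale contributes nothing to the fused feature, which is one plausible way to keep an outlier scale from dominating.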
[0064] Output features: The fused P2-P6 feature maps are input into the detection head, where the P2 feature map is specifically used for the detection of small defects (holes, bottle cap defects), P3-P4 is used for medium defects (label errors, internal foreign objects), and P5-P6 is used for large-scale defects (abnormal liquid level).
[0065] Step 4: Improved detection head prediction: embedding the LSDECD module to enhance detail capture
Detection head basic structure: The YOLOv11 detection head architecture is adopted, which includes a classification branch and a regression branch, each consisting of 3 convolutional layers (3×3 convolution kernels, ReLU activation function).
[0066] LSDECD module embedding: An LSDECD module is inserted after the second convolutional layer in both the classification and regression branches. It enhances the detail features of subtle defects (such as liquid level differences and label misalignment) through detail-enhanced convolution (DEConv). The specific process is as follows:
Prior detail feature extraction: Prior detail features are extracted from the input feature map using a 1×1 convolution:
[0067] DEConv convolution kernel construction: The prior features are fused with the original 3×3 convolution kernel to construct the detail-enhanced convolution kernel:
[0068] where the formula involves the prior feature weight coefficient and a mean taken over a local region, which ensures that the detail-enhanced kernel preserves the characteristics of the original convolution kernel while incorporating the detail priors.
[0069] Detail enhancement feature calculation:
[0070] where the formula involves the residual connection weight, the DEConv convolution operation, and the resulting detail-enhanced feature map.
[0071] Prediction output: The classification branch outputs the confidence scores for the six types of defects using the sigmoid activation function, and the regression branch outputs the bounding box coordinate offsets, as shown in the following formulas:
Classification prediction: $p_c = \sigma(W_{cls} F_{cls} + b_{cls})$
[0072] where $p_c$ is the confidence for the c-th type of defect (c = 0, 1, ..., 5), $\sigma$ is the sigmoid function, $W_{cls}$ and $b_{cls}$ are the parameters of the fully connected layer in the classification branch, and $F_{cls}$ is the detail-enhanced feature map of the classification branch.
[0073] Regression prediction: $\Delta = W_{reg} F_{reg} + b_{reg}$
[0074] where $\Delta$ is the bounding box coordinate offset, $W_{reg}$ and $b_{reg}$ are the parameters of the fully connected layer in the regression branch, and $F_{reg}$ is the detail-enhanced feature map of the regression branch.
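The two prediction branches can be sketched as a fully connected layer, followed by a sigmoid for classification. The weights, biases, and feature vector used here are toy values chosen for illustration, not trained parameters.

```python
import math


def sigmoid(z):
    """Logistic function mapping a real score to (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def linear(weights, bias, features):
    """Fully connected layer: one output per weight row."""
    return [sum(w * f for w, f in zip(row, features)) + b
            for row, b in zip(weights, bias)]


def classify(weights, bias, features):
    """Per-class confidence: sigmoid applied to each class score."""
    return [sigmoid(z) for z in linear(weights, bias, features)]


def regress(weights, bias, features):
    """Bounding box coordinate offsets: a plain linear output."""
    return linear(weights, bias, features)


if __name__ == "__main__":
    feats = [2.0, 3.0]
    # Six classes with zero weights give a confidence of 0.5 each.
    print(classify([[0.0, 0.0]] * 6, [0.0] * 6, feats))
    print(regress([[1.0, 0.0], [0.0, 1.0]], [0.5, -0.5], feats))
```

Using an independent sigmoid per class rather than a softmax lets several defect types be reported for the same region, which fits the multi-defect setting the text describes.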
[0075] Step 5: Loss function improvement: optimizing the localization loss using ShapeIoU
Loss function composition: The total loss comprises the classification loss $L_{cls}$, the localization loss $L_{loc}$, and the confidence loss $L_{conf}$: $L = L_{cls} + \lambda L_{loc} + L_{conf}$
[0076] where $\lambda$ is the localization loss weight coefficient.
[0077] ShapeIoU localization loss improvement: For the irregular shapes of defects in edible oil packaging (such as holes and damaged caps), ShapeIoU is used instead of traditional IoU to improve localization accuracy. The ShapeIoU calculation process is as follows:
Bounding box shape parameter extraction: The aspect ratios of the defect bounding box and of the true defect bounding box are extracted.
[0078] Shape similarity calculation:
[0079] where the formula involves the shape tolerance coefficient and yields the shape similarity (range 0-1).
[0080] ShapeIoU calculation:
[0081] where the formula involves the traditional intersection over union (IoU).
[0082] Localization loss:
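Because the ShapeIoU formulas themselves are not recoverable from the text, the sketch below is a hypothetical stand-in that only follows the described ingredients: a shape similarity in the range 0-1 computed from the two aspect ratios and a tolerance coefficient, combined with ordinary IoU. The exponential form, the product combination, and the `1 - score` loss are all assumptions, not the patent's definitions.

```python
import math


def iou(a, b):
    """Intersection over union of two (x1, y1, x2, y2) boxes."""
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    union = ((a[2] - a[0]) * (a[3] - a[1])
             + (b[2] - b[0]) * (b[3] - b[1]) - inter)
    return inter / union if union > 0 else 0.0


def aspect_ratio(box):
    """Width divided by height of an (x1, y1, x2, y2) box."""
    return (box[2] - box[0]) / (box[3] - box[1])


def shape_similarity(pred, truth, tolerance=1.0):
    """Assumed form: exp(-|r_pred - r_truth| / tolerance), in (0, 1]."""
    return math.exp(-abs(aspect_ratio(pred) - aspect_ratio(truth)) / tolerance)


def shape_iou_loss(pred, truth, tolerance=1.0):
    """Assumed loss: 1 - IoU * shape similarity."""
    return 1.0 - iou(pred, truth) * shape_similarity(pred, truth, tolerance)


if __name__ == "__main__":
    wide, tall = (0, 0, 4, 2), (0, 0, 2, 4)
    print(shape_iou_loss(wide, wide), shape_iou_loss(wide, tall))
```

With this assumed form, two boxes with the same overlap are penalized more when their aspect ratios differ, which is the behaviour the text attributes to the shape-aware loss.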
[0083] Model training: The AdamW optimizer (learning rate = 0.001, weight decay = 0.0005) was used to train for 100 epochs. The first 10 epochs used a warm-up learning rate (linearly increasing from 0.0001 to 0.001), and the last 90 epochs used a cosine annealing learning rate decay.
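The learning rate schedule described above (linear warm-up for 10 epochs, then cosine annealing for 90 epochs) can be sketched as a function of the epoch index. The final learning rate `lr_min = 0.0` and the zero-based epoch indexing are assumptions, as the source does not state them.

```python
import math


def learning_rate(epoch, total_epochs=100, warmup_epochs=10,
                  lr_start=1e-4, lr_peak=1e-3, lr_min=0.0):
    """Learning rate for a zero-based epoch index."""
    if epoch < warmup_epochs:
        # Linear warm-up: lr_start at epoch 0, lr_peak at the last warm-up epoch.
        frac = epoch / (warmup_epochs - 1)
        return lr_start + frac * (lr_peak - lr_start)
    # Cosine annealing from lr_peak towards lr_min over the remaining epochs.
    progress = (epoch - warmup_epochs) / (total_epochs - warmup_epochs)
    return lr_min + 0.5 * (lr_peak - lr_min) * (1 + math.cos(math.pi * progress))


if __name__ == "__main__":
    for epoch in (0, 5, 9, 10, 55, 99):
        print(epoch, learning_rate(epoch))
```

In a training framework the same curve would normally come from the framework's own scheduler classes; the standalone function is only meant to make the shape of the schedule explicit.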
[0084] Step 6: Post-processing and result output
Confidence filtering: Predicted bounding boxes whose classification confidence falls below the confidence threshold are removed to filter out low-confidence noise.
[0085] Non-maximum suppression (NMS): SIoU-NMS is used to remove duplicate predicted boxes, with a threshold set to 0.3. The formula is:
[0086] Output results: The defect type, bounding box coordinates, and confidence score of each valid predicted bounding box are output. The information is then marked on the original image, and an alarm is triggered to alert workers to take action.
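The post-processing stage can be sketched as confidence filtering followed by greedy non-maximum suppression. This sketch substitutes plain IoU for the SIoU measure named in the text, whose formula is not recoverable here. The 0.3 suppression threshold comes from the source, while the 0.5 confidence threshold, the per-class suppression, and the dictionary layout of a detection are assumptions.

```python
def iou(a, b):
    """Intersection over union of two (x1, y1, x2, y2) boxes."""
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    union = ((a[2] - a[0]) * (a[3] - a[1])
             + (b[2] - b[0]) * (b[3] - b[1]) - inter)
    return inter / union if union > 0 else 0.0


def postprocess(detections, conf_threshold=0.5, nms_threshold=0.3):
    """Drop low-confidence boxes, then suppress same-class duplicates."""
    candidates = sorted(
        (d for d in detections if d["score"] >= conf_threshold),
        key=lambda d: d["score"], reverse=True)
    kept = []
    for det in candidates:
        # Keep a box only if it does not overlap a higher-scoring kept box
        # of the same class by more than the suppression threshold.
        if all(det["cls"] != k["cls"]
               or iou(det["box"], k["box"]) <= nms_threshold
               for k in kept):
            kept.append(det)
    return kept


if __name__ == "__main__":
    dets = [
        {"cls": 3, "score": 0.9, "box": (10, 10, 50, 50)},
        {"cls": 3, "score": 0.8, "box": (12, 12, 52, 52)},
        {"cls": 5, "score": 0.7, "box": (100, 100, 140, 140)},
        {"cls": 0, "score": 0.2, "box": (0, 0, 5, 5)},
    ]
    for d in postprocess(dets):
        print(d["cls"], d["score"], d["box"])
```

The kept detections carry the defect type, box coordinates, and confidence score that the output step marks on the original image.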
[0087] Those skilled in the art will understand that all or part of the processes in the methods of the above embodiments can be implemented by a computer program instructing related hardware. The computer program can be stored in a non-volatile computer-readable storage medium. When executed, the computer program can include the processes of the embodiments of the above methods. Any references to memory, storage, databases, or other media used in the embodiments provided in this application can include non-volatile and / or volatile memory. Non-volatile memory may include read-only memory (ROM), programmable ROM (PROM), electrically programmable ROM (EPROM), electrically erasable programmable ROM (EEPROM), or flash memory. Volatile memory may include random access memory (RAM) or external cache memory. By way of illustration and not limitation, RAM is available in a variety of forms, such as static RAM (SRAM), dynamic RAM (DRAM), synchronous DRAM (SDRAM), double data rate SDRAM (DDR SDRAM), enhanced SDRAM (ESDRAM), synchronous link DRAM (SLDRAM), Rambus direct RAM (RDRAM), direct Rambus dynamic RAM (DRDRAM), and Rambus dynamic RAM (RDRAM), etc.
[0088] The above description of the embodiments is provided to enable those skilled in the art to understand and use the present invention. It will be apparent to those skilled in the art that various modifications can be made to these embodiments, and the general principles described herein can be applied to other embodiments without inventive effort. Therefore, the present invention is not limited to the above embodiments, and any improvements and modifications made by those skilled in the art based on the disclosure of the present invention without departing from the scope of the invention should be within the protection scope of the present invention.
Claims
1. A packaging defect detection method based on the YOLO model, characterized in that it comprises:
S1, acquire sample images and corresponding defect labels, standardize the sample images, and construct a packaging image dataset; wherein the packaging image dataset includes at least packaging images, defect ground-truth bounding boxes, and label types;
S2, construct a packaging defect detection model based on YOLO11, extract the edge feature map and the spatial feature map of the packaging image, fuse the edge feature map and the spatial feature map to obtain an enhanced feature map, and perform multi-scale feature extraction to obtain multi-scale feature maps C2 to C5;
S3, perform boundary similarity calculation and adaptive weight allocation on the multi-scale feature maps C2 to C5 to obtain recalibrated multi-scale feature maps P2 to P6; wherein the multi-scale feature map P2 is a newly added small-defect feature map, and the defects represented in P2 to P6 are ordered from small to large;
S4, perform detail enhancement on the recalibrated multi-scale feature maps P2 to P6, and output the confidence of the defect and the coordinate offset of the defect bounding box through classification and regression;
S5, obtain the similarity between the defect bounding box and the ground-truth defect bounding box through the localization loss, and iteratively train the packaging defect detection model until it converges and training is completed;
S6, input the packaging image to be detected into the trained packaging defect detection model and output the result.
2. The packaging defect detection method based on the YOLO model according to claim 1, characterized in that step S1 includes:
adaptive histogram equalization is used to correct reflections in the packaging image, and the corrected packaging image is output; the correction formula is defined in terms of the grayscale value of the input image at given coordinates, the minimum and maximum grayscale values of a local region of the image, a rounding function, and the grayscale value of the output image at the corresponding coordinates;
a defect labeling system is constructed, in which the ground-truth bounding box coordinates and the label type of each defect are stored in XML format;
the packaging image dataset is divided into a training set, a validation set, and a test set, and data augmentation is performed on the training set images;
wherein the label types include at least: label error, high liquid level, low liquid level, hole, internal foreign matter, and damaged bottle cap.
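The reflection correction of claim 2 is described through a local minimum, a local maximum and a rounding function. One computation that uses exactly these quantities is a local min-max contrast stretch, sketched below. The stretch itself, the window `radius`, the output range `levels = 255`, and the function name `local_stretch` are assumptions made for illustration and are not taken from the claim.

```python
def local_stretch(image, radius=1, levels=255):
    """Stretch each pixel against the min/max of its local window.

    image: list of rows of grayscale values. radius and levels are
    hypothetical parameters chosen for this sketch.
    """
    h, w = len(image), len(image[0])
    out = [[0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            # Collect the local region around (x, y), clipped at the borders.
            window = [image[j][i]
                      for j in range(max(0, y - radius), min(h, y + radius + 1))
                      for i in range(max(0, x - radius), min(w, x + radius + 1))]
            lo, hi = min(window), max(window)
            if hi == lo:
                # Flat region: nothing to stretch, keep the original value.
                out[y][x] = image[y][x]
            else:
                out[y][x] = round((image[y][x] - lo) * levels / (hi - lo))
    return out


if __name__ == "__main__":
    print(local_stretch([[100, 110], [120, 130]]))
```

For production use, a library routine such as contrast-limited adaptive histogram equalization would usually be preferred over this per-pixel loop.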
3. The packaging defect detection method based on the YOLO model according to claim 2, characterized in that step S2 includes:
S2.1, insert an edge information enhancement front-end module at the input end of the backbone network;
S2.2, perform edge detection on the input packaging image using the Sobel operator, with a horizontal Sobel convolution kernel and a vertical Sobel convolution kernel;
S2.3, calculate the edge feature map of the packaging image, where a convolution operation with the two kernels yields the horizontal and vertical edge feature maps, respectively;
S2.4, extract the spatial features of the packaging image to form the spatial feature map, using a randomly initialized 3×3 convolution kernel together with a batch normalization operation and an activation function;
S2.5, concatenate the edge feature map with the spatial feature map along the channel dimension and output a fused feature map, using a channel concatenation operation and a 1×1 convolutional layer;
S2.6, input the fused feature map into the core feature extraction module, which performs four downsampling and feature aggregation operations to output the multi-scale feature maps C2 to C5.
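The Sobel edge extraction of S2.2 and S2.3 can be sketched with the commonly used 3×3 Sobel kernels. The kernel sign convention, the combination of the two responses into a gradient magnitude, and the names `SOBEL_X`, `SOBEL_Y`, `correlate3x3` and `edge_magnitude` are assumptions for illustration; the claim only states that horizontal and vertical Sobel kernels are convolved with the image.

```python
import math

# Commonly used Sobel kernels (assumed here; sign conventions vary).
SOBEL_X = [[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]]   # responds to horizontal change
SOBEL_Y = [[-1, -2, -1], [0, 0, 0], [1, 2, 1]]   # responds to vertical change


def correlate3x3(image, kernel):
    """Slide a 3x3 kernel over the image without padding ("valid" output)."""
    h, w = len(image), len(image[0])
    return [[sum(kernel[j][i] * image[y + j][x + i]
                 for j in range(3) for i in range(3))
             for x in range(w - 2)]
            for y in range(h - 2)]


def edge_magnitude(image):
    """Combine horizontal and vertical responses into a gradient magnitude."""
    gx = correlate3x3(image, SOBEL_X)
    gy = correlate3x3(image, SOBEL_Y)
    return [[math.hypot(a, b) for a, b in zip(row_x, row_y)]
            for row_x, row_y in zip(gx, gy)]


if __name__ == "__main__":
    # A vertical step edge: dark left half, bright right half.
    img = [[0, 0, 10, 10] for _ in range(4)]
    print(edge_magnitude(img))
```

In the claimed model the resulting edge map would then be concatenated with the learned spatial feature map along the channel dimension and passed through a 1×1 convolution.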
4. The packaging defect detection method based on the YOLO model according to claim 3, characterized in that step S3 includes:
S3.1, upsample the multi-scale feature maps C2 to C5 to obtain enlarged feature maps, where the upsampling formula is defined in terms of the original feature map, the coordinates in the original feature map, and the bilinear interpolation weights;
S3.2, fuse the enlarged feature maps with the multi-scale feature maps C2 to C5 to obtain the multi-scale feature maps P2 to P6;
S3.3, perform boundary aggregation calibration on the multi-scale feature maps P2 to P6 to obtain the boundary similarity between features at different scales, where the formula is defined in terms of the boundary similarity between the features at scale i and scale j, the bounding box of the feature at scale i, the bounding box of the feature at scale j, the intersection over union, the area of a bounding box, and a function that takes the maximum value;
S3.4, assign adaptive weights based on the boundary similarity and compute the weighted fusion of the features at each scale, where the formula is defined in terms of the feature map at scale i, the adaptive weight at scale i, and the maximum boundary similarity at scale i, the maximum being taken over all values of the index j; the weighted fusion yields the recalibrated multi-scale feature maps P2 to P6.
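The bilinear upsampling of S3.1 can be sketched on a single-channel feature map. This is a minimal illustration: the 2× scale factor, the coordinate mapping (output index divided by two, clamped at the border) and the function names are assumptions, since the claim only states that bilinear interpolation weights are applied to the original feature map.

```python
def bilinear_sample(fmap, y, x):
    """Interpolate fmap at fractional coordinates (y, x), both assumed >= 0."""
    h, w = len(fmap), len(fmap[0])
    y0, x0 = min(int(y), h - 1), min(int(x), w - 1)
    y1, x1 = min(y0 + 1, h - 1), min(x0 + 1, w - 1)
    dy, dx = y - y0, x - x0
    # Interpolate along x on the two neighbouring rows, then along y.
    top = (1 - dx) * fmap[y0][x0] + dx * fmap[y0][x1]
    bottom = (1 - dx) * fmap[y1][x0] + dx * fmap[y1][x1]
    return (1 - dy) * top + dy * bottom


def upsample2x(fmap):
    """Double the height and width of a 2-D feature map."""
    h, w = len(fmap), len(fmap[0])
    return [[bilinear_sample(fmap, min(j / 2, h - 1), min(i / 2, w - 1))
             for i in range(2 * w)]
            for j in range(2 * h)]


if __name__ == "__main__":
    for row in upsample2x([[0, 10], [20, 30]]):
        print(row)
```

Deep-learning frameworks offer several alignment conventions for bilinear resizing, so the exact border values of a framework implementation may differ from this sketch.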
5. The packaging defect detection method based on the YOLO model according to claim 4, characterized in that step S4 includes:
S4.1, use a 1×1 convolution to extract prior detail features from the recalibrated multi-scale feature maps P2 to P6;
S4.2, fuse the prior detail features with the original 3×3 convolution kernel in the lightweight shared convolutional detection head to obtain a detail-enhanced convolution kernel, where the formula is defined in terms of the prior feature weight coefficient, the mean computed over a local region, and the spatial coordinates of the convolution kernel;
S4.3, enhance the details of the defect feature map, where the formula is defined in terms of the residual connection weight, the DEConv convolution operation, and the detail-enhanced feature map;
S4.4, the classification branch of the detection head transforms the detail-enhanced input feature map using the Sigmoid function and outputs the confidence of the defect, where the formula is defined in terms of the predicted confidence of the c-th defect type (c = 0, 1, ..., 5), the Sigmoid function, the parameters of the fully connected layer in the classification branch, and the detail-enhanced feature map of the classification branch;
S4.5, the regression branch of the detection head transforms the detail-enhanced input feature map and outputs the coordinate offset of the defect bounding box used for target localization, where the formula is defined in terms of the predicted bounding box coordinate offset, the parameters of the fully connected layer in the regression branch, and the detail-enhanced feature map of the regression branch.
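The classification branch of S4.4 applies a fully connected layer followed by the Sigmoid function to produce one confidence per defect type. A minimal sketch, assuming a flattened feature vector and hypothetical names `class_confidences`, `weights` and `biases`; the example uses two classes for brevity, whereas the claim indexes six (c = 0, 1, ..., 5).

```python
import math


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def class_confidences(features, weights, biases):
    """One independent confidence per defect type: sigmoid(w_c . f + b_c)."""
    return [sigmoid(sum(w * f for w, f in zip(row, features)) + b)
            for row, b in zip(weights, biases)]


if __name__ == "__main__":
    feats = [1.0, 2.0]
    W = [[0.0, 0.0], [1.0, -0.5]]   # illustrative parameters, one row per class
    b = [0.0, 0.0]
    print(class_confidences(feats, W, b))
```

Because each class passes through its own Sigmoid, the confidences are independent and need not sum to one, which matches a multi-label detection head.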
6. The packaging defect detection method based on the YOLO model according to claim 5, characterized in that step S5 includes:
S5.1, construct a loss function based on the confidence and the coordinate offset of the defect bounding box, and obtain the total loss value between the defect bounding box and the ground-truth defect bounding box, where the formula is defined in terms of the localization loss weight coefficient, the classification loss, the localization loss, and the confidence loss;
S5.2, obtain the shape similarity between the defect bounding box and the ground-truth defect bounding box, where the formula is defined in terms of the shape tolerance coefficient, the shape similarity, the aspect ratio of the defect bounding box, and the aspect ratio of the ground-truth defect bounding box;
S5.3, obtain the localization loss value between the defect bounding box and the ground-truth defect bounding box using the shape-aware intersection-over-union function, where the formula is defined in terms of the traditional intersection over union, the defect bounding box, and the ground-truth defect bounding box;
S5.4, convert the similarity score into the improved localization loss value;
S5.5, use an optimizer to iteratively update the model parameters until the model converges, thereby completing the training.
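One way a shape-aware localization loss of the kind described in S5.2 to S5.4 might be computed is sketched below. The functional form is hypothetical: the sketch assumes that the shape similarity decays exponentially with the difference in aspect ratio, that it multiplies the traditional intersection over union, and that the loss is one minus the result. The tolerance `tau` and all function names are illustrative assumptions and are not the claimed formulas.

```python
import math


def iou(a, b):
    """Traditional intersection over union of (x1, y1, x2, y2) boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def aspect_ratio(box):
    """Width divided by height; boxes are assumed to have positive height."""
    return (box[2] - box[0]) / (box[3] - box[1])


def shape_similarity(pred, truth, tau=1.0):
    """Hypothetical form: 1.0 for equal aspect ratios, decaying with their gap."""
    return math.exp(-abs(aspect_ratio(pred) - aspect_ratio(truth)) / tau)


def localization_loss(pred, truth, tau=1.0):
    """Hypothetical conversion of the shape-aware similarity into a loss."""
    return 1.0 - iou(pred, truth) * shape_similarity(pred, truth, tau)


if __name__ == "__main__":
    print(localization_loss((0, 0, 10, 10), (0, 0, 10, 10)))
    print(localization_loss((0, 0, 10, 10), (0, 0, 10, 5)))
    print(localization_loss((0, 0, 10, 10), (20, 20, 30, 30)))
```

Under these assumptions a larger `tau` makes the loss more tolerant of aspect-ratio mismatch, which is the role the claim assigns to the shape tolerance coefficient.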
7. The packaging defect detection method based on the YOLO model according to claim 6, characterized in that step S6 includes:
S6.1, load the trained model, input the image to be detected into the model for processing, and generate the confidence of the defect and the coordinate offset of the defect bounding box;
S6.2, filter by confidence to remove predicted boxes with low confidence, retaining only the defect bounding boxes with high classification confidence;
S6.3, use a non-maximum suppression function based on the smooth intersection over union to remove duplicate defect bounding boxes, where the formula is defined in terms of the non-maximum suppression function, the n-th defect bounding box among all candidate defect bounding boxes, and a function that returns the index i of the candidate defect bounding box that attains the maximum value;
output the defect type, bounding box coordinates, and confidence of each valid predicted bounding box, mark them on the original image, and at the same time issue an alarm to remind workers to handle the defect.
8. A packaging defect detection system based on the YOLO model, characterized in that it comprises:
an acquisition module, which acquires sample images and corresponding defect labels, standardizes the sample images, and constructs a packaging image dataset; wherein the packaging image dataset includes at least packaging images, defect ground-truth bounding boxes, and label types;
a learning module, which constructs a packaging defect detection model based on YOLO11, extracts the edge feature map and the spatial feature map of the packaging image and fuses them to obtain an enhanced feature map, performs multi-scale feature extraction to obtain multi-scale feature maps C2 to C5, and applies boundary similarity calculation and adaptive weight allocation to the multi-scale feature maps C2 to C5 to obtain recalibrated multi-scale feature maps P2 to P6; wherein the multi-scale feature map P2 is a newly added small-defect feature map, and the defects represented in P2 to P6 are ordered from small to large; the learning module further performs detail enhancement on the recalibrated multi-scale feature maps P2 to P6 and outputs the confidence of the defect and the coordinate offset of the defect bounding box through classification and regression;
a training module, which obtains the similarity between the defect bounding box and the ground-truth defect bounding box through the localization loss, and iteratively trains the packaging defect detection model until it converges and training is completed;
an output module, which inputs the packaging image to be detected into the trained packaging defect detection model and outputs the result.
9. A packaging defect detection device based on the YOLO model, characterized in that it comprises: a memory storing a program for a packaging defect detection method based on the YOLO model, and a processor for running the program, the program being configured to implement the steps of the packaging defect detection method based on the YOLO model as described in any one of claims 1 to 7.
10. A computer-readable storage medium, characterized in that the computer-readable storage medium stores a program for a packaging defect detection method based on the YOLO model, which, when executed by a processor, implements the steps of the packaging defect detection method based on the YOLO model as described in any one of claims 1 to 7.
Citation Information
Patent Citations
Edible oil impurity visual detection system and use method thereof (CN116500052A)
Food quality detection method and device based on machine learning (CN117831026A)
Online edible oil bottle cap detection machine (CN220181955U)