Industrial part image defect detection method and system, electronic device and storage medium

By constructing a balanced defect sample dataset and an improved Mask RCNN model, the problems of missed detection and false detection in the surface defect detection of industrial parts were solved, and high-precision defect detection was achieved.

CN116012291B · Active · Publication Date: 2026-06-26 · NANJING TECH UNIV

Patent Information

Authority / Receiving Office: CN · China
Patent Type: Patents(China)
Current Assignee / Owner: NANJING TECH UNIV
Filing Date: 2022-11-21
Publication Date: 2026-06-26

Application Information

IPC: G06T7/00; G06V10/77; G06V10/80; G06V10/82; G06N3/08; G06V10/774; G06N3/0464

CPC: Y02P90/30

AI Tagging

Technology Topics

Pattern recognition · Data set

AI Technical Summary

Technical Problem

Existing technologies suffer from high rates of missed and false detections when inspecting industrial parts for surface defects. Furthermore, deep learning-based detection methods perform poorly on extremely small targets, and limited datasets and inappropriate anchor box design lead to low detection accuracy.

Method used

We construct a balanced defect sample dataset using SSD data augmentation and semi-supervised data augmentation. We combine a deep learning model with a feature extraction layer, a feature fusion layer, and a defect recognition layer. We improve detection accuracy through a feature pyramid network and an attention mechanism. Finally, we use an improved Mask RCNN model for defect detection.

Benefits of technology

The method reduces the false negative and false positive rates, improves the accuracy of surface defect detection for industrial parts, enhances the model's judgment and reasoning capabilities, and improves the detection accuracy for extremely small targets.

Abstract

The application provides an industrial part image defect detection method and system, an electronic device and a storage medium, wherein the method comprises the following steps: based on a plurality of part surface defect images, determining a part defect sample data set according to an SSD data enhancement method and a semi-supervised data enhancement method; wherein the number of each defect category in the part defect sample data set is balanced; training a defect detection model according to the part defect sample data set to determine a target defect detection model; inputting a target part image to be tested into the target defect detection model to determine the defect category and position information of the target part to be tested. The accuracy of the industrial part surface defect detection can be effectively improved, and the missed detection rate and the false detection rate can be reduced.

Description

Technical Field

[0001] This invention relates to the field of machine vision technology, and in particular to a method and system for detecting defects in images of industrial parts, an electronic device, and a storage medium.

Background Technology

[0002] With the development of intelligent manufacturing, the demand for industrial parts is increasing, and the processing of industrial parts has basically achieved full automation in mechanical production. In machining, especially in the manufacturing of core components for aerospace parts, the surface quality requirements for industrial parts are high, typically requiring that their surfaces be free of pits or cracks wider than 1mm. However, in actual processing and production, due to problems with processing equipment, human factors during the processing, and environmental factors, various defects inevitably occur, such as scratches, pits, cracks, peeling, and spots on the surface of steel plates.

[0003] Traditional surface defect detection for industrial parts is mainly done manually, relying on skilled technicians to visually inspect the parts and determine whether defects exist. However, fatigue and decreased concentration easily lead to numerous false detections and missed detections, wasting financial, material, and human resources.

[0004] Detection methods based on image processing or shallow machine learning techniques can only detect defects under specific conditions, such as at a certain scale or under specific lighting, where the defect contours are obvious, the contrast is high, and the noise is low. Deep learning-based detection methods, due to issues such as extremely small detection targets, limited datasets, and inappropriate anchor box design, exhibit unsatisfactory detection performance and low accuracy.

[0005] Therefore, how to provide a method, system, electronic device, and storage medium for detecting defects in industrial parts images to effectively improve the accuracy of surface defect detection and reduce the false negative and false positive rates has become an urgent problem to be solved.

Summary of the Invention

[0006] To address the shortcomings of existing technologies, embodiments of the present invention provide a method and system for detecting defects in images of industrial parts, an electronic device, and a storage medium.

[0007] This invention provides a method for detecting defects in images of industrial parts, comprising:

[0008] Based on several surface defect images of parts, a defect sample dataset is determined using SSD data augmentation and semi-supervised data augmentation methods; the number of each defect category is balanced in the defect sample dataset.

[0009] A defect detection model is trained based on a dataset of defect samples from parts, and a target defect detection model is determined.

[0010] The image of the target part to be tested is input into the target defect detection model to determine the defect category and location information of the target part to be tested.

[0011] The industrial part image defect detection method provided by the present invention determines a part defect sample dataset based on several part surface defect images, using SSD data augmentation and semi-supervised data augmentation, specifically including:

[0012] Based on surface defect images of several parts, a label-extended dataset is determined according to the optimized SSD data augmentation method; wherein, the optimized SSD data augmentation method incorporates the Mosaic algorithm into the original SSD data augmentation method.

[0013] A semi-supervised data augmentation method was used to process the label-expanded dataset and determine the sample dataset of part defects.

[0014] According to the industrial part image defect detection method provided by the present invention, the defect detection model includes: a feature extraction layer, a feature fusion layer, and a defect recognition layer; the feature extraction layer includes multiple sub-feature extraction layers; the sub-feature extraction layers are connected sequentially, and the output of the previous sub-feature extraction layer is the input of the next sub-feature extraction layer;

[0015] Correspondingly, the image of the target part to be tested is input into the target defect detection model to determine the defect category and location information of the target part to be tested, specifically including:

[0016] The image of the target part to be tested is input into the feature extraction layer of the target defect detection model;

[0017] Based on the feature extraction layer, the feature information of the target part image is extracted, and the target part feature map output by each sub-feature extraction layer is obtained.

[0018] Based on the feature fusion layer, the part feature maps output by each sub-feature extraction layer are fused according to the feature pyramid network to determine the target fused feature map.

[0019] Based on the defect identification layer, the defect category and location information of the target defect are determined in the image of the target part to be tested according to the target fusion feature map.

[0020] The industrial part image defect detection method provided by the present invention further includes an initial feature extraction layer; the initial feature extraction layer is disposed before all sub-feature extraction layers.

[0021] Correspondingly, based on the feature extraction layer, the feature information of the target part image is extracted, and the target part feature map output by each sub-feature extraction layer is obtained, specifically including:

[0022] Based on the initial feature extraction layer, feature information of the target part image is extracted to determine the initial part feature map;

[0023] Based on the first sub-feature extraction layer, the initial part feature map is transformed according to the channel attention mechanism and the spatial attention mechanism to determine the first target part feature map;

[0024] Input the feature map of the first target part into the second sub-feature extraction layer;

[0025] Repeat the above steps of transforming according to the channel attention mechanism and spatial attention mechanism to determine the target part feature map, and input the target part feature map into the next sub-feature extraction layer, until the target part feature map output by each sub-feature extraction layer is obtained.
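As an illustration of the attention step described in [0023] to [0025], the following is a minimal NumPy sketch of channel attention followed by spatial attention, in the style of a convolutional block attention module. It is not the patent's implementation: the weight matrices `w1` and `w2`, the function names, and the scalar mixing used in place of a spatial convolution are all assumptions made to keep the example short.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def channel_attention(x, w1, w2):
    """Rescale each channel of x (C, H, W) by a weight in (0, 1).

    w1 (C//r, C) and w2 (C, C//r) are hypothetical weights of a shared
    two-layer MLP applied to the average- and max-pooled descriptors.
    """
    avg = x.mean(axis=(1, 2))            # (C,) average-pooled descriptor
    mx = x.max(axis=(1, 2))              # (C,) max-pooled descriptor

    def mlp(v):
        return w2 @ np.maximum(w1 @ v, 0.0)   # linear, ReLU, linear

    scale = sigmoid(mlp(avg) + mlp(mx))  # (C,)
    return x * scale[:, None, None]

def spatial_attention(x, w_avg=1.0, w_max=1.0):
    """Rescale each spatial position of x (C, H, W) by a weight in (0, 1).

    Simplification (assumption): the channel-wise mean and max maps are
    mixed with two scalars rather than with a learned convolution.
    """
    avg = x.mean(axis=0)                 # (H, W)
    mx = x.max(axis=0)                   # (H, W)
    scale = sigmoid(w_avg * avg + w_max * mx)
    return x * scale[None, :, :]

def attention_block(x, w1, w2):
    """Channel attention first, then spatial attention."""
    return spatial_attention(channel_attention(x, w1, w2))
```

In a sub-feature extraction layer, the output of `attention_block` would be passed on as the input of the next sub-feature extraction layer; both attention steps only rescale the feature map, so its shape is unchanged.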

[0026] The industrial part image defect detection method provided by the present invention, based on a feature fusion layer, fuses the part feature maps output by each sub-feature extraction layer according to a feature pyramid network to determine the target fused feature map, specifically including:

[0027] Based on the feature fusion layer, according to the feature pyramid network, the part feature maps output by each sub-feature extraction layer are fused according to the preset fusion rules to determine multiple fused feature maps at different levels.

[0028] The local response normalization process is used to process multiple fusion feature maps at different levels. The processed fusion feature maps are then fused to determine the target fusion feature map.
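For readers unfamiliar with local response normalization, the sketch below shows one common cross-channel form, in which each activation is divided by a power of the summed squared activations of neighbouring channels. The patent text does not give the window size or constants, so `size`, `alpha`, `beta`, and `k` here are assumed defaults, and the function name is hypothetical.

```python
import numpy as np

def local_response_norm(x, size=5, alpha=1e-4, beta=0.75, k=2.0):
    """Cross-channel local response normalization of a (C, H, W) feature map.

    out[i] = x[i] / (k + alpha * sum_j x[j]**2) ** beta,
    where j runs over the channels within size // 2 of channel i.
    """
    x = np.asarray(x, dtype=np.float64)
    c = x.shape[0]
    sq = x ** 2
    half = size // 2
    out = np.empty_like(x)
    for i in range(c):
        lo, hi = max(0, i - half), min(c, i + half + 1)
        denom = (k + alpha * sq[lo:hi].sum(axis=0)) ** beta
        out[i] = x[i] / denom
    return out
```

Applying this to each fused feature map puts the levels on a comparable scale before they are combined into the target fused feature map; how the normalized maps are then combined is not specified in this passage.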

[0029] According to the industrial part image defect detection method provided by the present invention, the defect recognition layer includes: a bounding box determination layer and a defect recognition layer;

[0030] Correspondingly, based on the defect recognition layer, the defect category and location information of the target defect are determined in the target part image according to the target fusion feature map, specifically including:

[0031] Based on the bounding box determination layer, the bounding box of the target defect is determined by the RoIAlign region feature aggregation algorithm according to the target fusion feature map, and the location information of the target defect is obtained.

[0032] Based on the feature information of the target defect bounding box region, the target defect is classified to determine the defect category.

[0033] The industrial part image defect detection method provided by the present invention trains a defect detection model based on a part defect sample dataset and determines a target defect detection model, specifically including:

[0034] A defect detection model is trained based on a dataset of defective parts samples.

[0035] Based on the target loss function, the network parameters of the defect detection model are updated according to the AdaGrad algorithm, and the defect detection model is iteratively trained based on the updated network parameters until the defect detection model converges, thus determining the target defect detection model.
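The AdaGrad update mentioned above can be sketched as follows. This is a generic illustration of the algorithm on NumPy arrays, not the training code of the patent; the learning rate `lr` and the stabilizer `eps` are assumed values.

```python
import numpy as np

def adagrad_step(params, grads, accum, lr=0.01, eps=1e-8):
    """One AdaGrad update.

    accum holds the running sum of squared gradients per parameter, so
    parameters with large past gradients receive smaller steps.
    """
    accum = accum + grads ** 2
    params = params - lr * grads / (np.sqrt(accum) + eps)
    return params, accum

def minimize_quadratic(start, steps=100, lr=0.5):
    """Toy usage: minimize f(p) = sum(p**2), whose gradient is 2 * p."""
    params = np.asarray(start, dtype=np.float64)
    accum = np.zeros_like(params)
    for _ in range(steps):
        params, accum = adagrad_step(params, 2.0 * params, accum, lr=lr)
    return params
```

In actual training, `grads` would be the gradient of the target loss function with respect to the network parameters, and the loop would run until the model converges.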

[0036] The formula for the target loss function is as follows:

[0037]

[0038] In the formula, $P$ is the set of positive samples; $\ell_{RS}(i)$ is the sum of the current rank error and the current sort error; $\ell^{*}_{RS}(i)$ is the sum of the target rank error and the target sort error.
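The formula referenced in [0037] is an image that is not reproduced in this text. As a hedged reconstruction only: the symbol definitions in [0038] match the form of the Rank & Sort (RS) loss used in object detection, which averages, over the positive samples, the gap between the current error and the target error. Assuming that is the loss intended here, it would read:

```latex
\mathcal{L}_{RS} \;=\; \frac{1}{|P|} \sum_{i \in P} \left( \ell_{RS}(i) - \ell^{*}_{RS}(i) \right)
```

Here $P$ is the set of positive samples, $\ell_{RS}(i)$ is the sum of the current rank error and the current sort error of sample $i$, and $\ell^{*}_{RS}(i)$ is the sum of the corresponding target errors. The normalization by $|P|$ is part of the assumption and should be checked against the original patent figure.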

[0039] The present invention also provides an industrial parts image defect detection system, comprising: a sample determination unit, a model determination unit, and a defect detection unit;

[0040] The sample determination unit is used to determine the part defect sample dataset based on several part surface defect images, according to the SSD data augmentation method and the semi-supervised data augmentation method; wherein the number of each defect category in the part defect sample dataset is balanced.

[0041] The model determination unit is used to train a defect detection model based on a part defect sample dataset and determine the target defect detection model.

[0042] The defect detection unit is used to input the image of the target part to be tested into the target defect detection model to determine the defect category and location information of the target part to be tested.

[0043] The present invention also provides an electronic device, including a memory, a processor, and a computer program stored in the memory and executable on the processor, wherein the processor executes the program to implement the steps of any of the above-described methods for detecting defects in images of industrial parts.

[0044] The present invention also provides a non-transitory computer-readable storage medium having a computer program stored thereon, which, when executed by a processor, implements the steps of any of the above-described industrial part image defect detection methods.

[0045] The present invention provides an industrial parts image defect detection method, system, electronic device, and storage medium. By expanding the sample data using SSD data augmentation and semi-supervised data augmentation, a sample dataset of parts defects is determined, ensuring a balanced number of defect category samples in the dataset. This effectively solves the problem of poor model training performance caused by small sample size and extreme imbalance between positive and negative samples, effectively preventing overfitting during model training and enhancing the model's judgment and reasoning capabilities. The trained target defect detection model is used to detect images of target parts, determining the defect category and location information, effectively improving the accuracy of surface defect detection in industrial parts and reducing the false negative and false positive rates.

Attached Figure Description

[0046] To more clearly illustrate the technical solutions in this invention or the prior art, the drawings used in the description of the embodiments or the prior art will be briefly introduced below. Obviously, the drawings described below are some embodiments of this invention. For those skilled in the art, other drawings can be obtained from these drawings without creative effort.

[0047] Figure 1 is a flowchart of the industrial parts image defect detection method provided by the present invention;

[0048] Figure 2 is a schematic diagram of the industrial part defect detection model provided by the present invention;

[0049] Figure 3 is a structure diagram of the feature extraction network provided by the present invention;

[0050] Figure 4 is a schematic diagram of the dilated BottleNeck residual structure provided by the present invention;

[0051] Figure 5 is a schematic diagram of the hybrid-domain convolutional block attention module provided by the present invention;

[0052] Figure 6 is a schematic diagram of the structure of the industrial parts image defect detection system provided by the present invention;

[0053] Figure 7 is a schematic diagram of the physical structure of the electronic device provided by the present invention.

Detailed Implementation

[0054] To make the objectives, technical solutions, and advantages of this invention clearer, the technical solutions of this invention will be clearly and completely described below with reference to the accompanying drawings. Obviously, the described embodiments are only some, not all, of the embodiments of this invention. All other embodiments obtained by those skilled in the art based on the embodiments of this invention without creative effort are within the scope of protection of this invention.

[0055] With the continuous development of deep learning, more and more deep learning networks are being used to solve the problem of defect detection and classification of industrial parts images, as well as the problem of limited datasets of actual industrial defect parts images.

[0056] However, the following problems still exist:

[0057] Insufficient learning ability of the network framework can easily lead to problems such as low model accuracy and overfitting;

[0058] Inappropriate anchor box design leads to inaccurate localization and missed detections;

[0059] Defects on the surface of parts are often very small, and detection performance on such extremely small targets is not ideal.

[0060] Therefore, in view of the shortcomings of the prior art, the present invention provides an image defect detection method for industrial parts. It addresses the detection of defects in industrial parts from small amounts of data, which traditional vision methods cannot solve, and can significantly improve both the localization accuracy and the classification accuracy of surface defect detection in industrial parts.

[0061] Figure 1 is a flowchart of the industrial part image defect detection method provided by the present invention. As shown in Figure 1, the present invention provides a method for detecting defects in images of industrial parts, comprising:

[0062] Step S1: Based on several surface defect images of parts, determine the part defect sample dataset according to the SSD data augmentation method and the semi-supervised data augmentation method; wherein, the number of each defect category in the part defect sample dataset is balanced.

[0063] Step S2: Train a defect detection model based on the part defect sample dataset and determine the target defect detection model;

[0064] Step S3: Input the image of the target part to be tested into the target defect detection model to determine the defect category and location information of the target part to be tested.

[0065] Specifically, before training the defect detection model, it is necessary to collect images of surface defects on the parts and construct a sample dataset of part defects.

[0066] Taking a metal commutator as an example, the steps for acquiring surface defect images of the part are explained. A CCD high-definition digital camera (16 million pixels, 60fps acquisition frame rate, and 1080P acquisition resolution) is used in an industrial production workshop to acquire various defect images of the metal commutator surface under different conditions and magnifications. The images are stored according to the naming method of the defect array, and a set of images for each type of defect is constructed. The images are grouped, feature points are extracted from each image, and each pixel is segmented to finally obtain several high-precision surface defect images of the part.

[0067] It is understood that the above-mentioned types of industrial parts and specific methods for obtaining images of surface defects are only used as specific examples to illustrate the present invention. In the actual application of the present invention, the types of industrial parts and the methods for obtaining images can be adjusted according to actual needs, and the present invention does not limit them.

[0068] After acquiring several surface defect images of parts, the images need to be preprocessed and expanded. In step S1, based on several surface defect images of parts, the images are expanded using SSD (Single Shot Multibox Detector) data augmentation method and semi-supervised data augmentation method to balance the number of each defect category. The images in the augmented dataset are then labeled to determine the part defect sample dataset.

[0069] SSD data augmentation performs optical transformations (randomly adjusting brightness, contrast, hue, saturation, and channels) and geometric transformations (random expansion, cropping, mirroring, and scaling to a fixed ratio) on the input image data, followed by mean removal. This effectively increases the diversity of scale samples, improves the network's robustness to target scales, and enhances the detection accuracy of small targets and the detection performance of occluded objects.
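The transformations listed above can be sketched in a few lines of NumPy. This is a reduced illustration, not the augmentation pipeline of the patent: it covers only brightness, contrast, channel swapping, mirroring, and mean removal, and the probabilities, ranges, and per-channel mean values are assumptions.

```python
import numpy as np

def photometric_distort(img, rng, brightness=32.0, contrast=(0.5, 1.5)):
    """Optical transforms on a float HxWx3 image with values in [0, 255]."""
    out = img.astype(np.float32).copy()
    if rng.random() < 0.5:
        out += rng.uniform(-brightness, brightness)   # brightness shift
    if rng.random() < 0.5:
        out *= rng.uniform(*contrast)                 # contrast scaling
    if rng.random() < 0.5:
        out = out[..., rng.permutation(3)]            # random channel order
    return np.clip(out, 0.0, 255.0)

def random_mirror(img, boxes, rng):
    """Geometric transform: horizontal flip with probability 0.5.

    boxes is an (n, 4) array of [x1, y1, x2, y2] in pixels; the x
    coordinates are mirrored so the labels stay aligned with the image.
    """
    if rng.random() < 0.5:
        w = img.shape[1]
        img = img[:, ::-1]
        boxes = boxes.copy()
        boxes[:, [0, 2]] = w - boxes[:, [2, 0]]
    return img, boxes

def subtract_mean(img, mean=(104.0, 117.0, 123.0)):
    """Mean removal; the per-channel mean values are assumed, not from the source."""
    return img - np.asarray(mean, dtype=np.float32)

def augment(img, boxes, seed=None):
    rng = np.random.default_rng(seed)
    img = photometric_distort(img, rng)
    img, boxes = random_mirror(img, boxes, rng)
    return subtract_mean(img), boxes
```

Calling `augment` several times on the same image with different seeds yields different training samples whose bounding boxes remain valid.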

[0070] Semi-supervised data augmentation analyzes information such as the aspect ratio and area of the defect annotation boxes in each image of the data sample. This preserves the original label information, enhances the diversity of the data, effectively prevents overfitting, and improves the model's judgment and reasoning ability.

[0071] In practical applications, the open-source LabelImg software can be compiled and generated under the Windows 10 64-bit operating system environment. This software can be used to manually annotate defects, ensuring that each defect is centered within the annotation box. After annotation, a txt or xml file is saved, containing the center coordinates of the defect image and its relative width and height, thus annotating the augmented dataset. In addition, other annotation methods can be selected according to the actual situation; this invention does not limit these methods.
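As a small illustration of the annotation format described above, the sketch below converts a pixel-coordinate bounding box into a center point plus relative width and height. It assumes that the center coordinates are also normalized by the image size, as in common txt annotation exports; the source does not state this, and the function name is hypothetical.

```python
def to_center_format(box, img_w, img_h):
    """Convert (x1, y1, x2, y2) in pixels to (cx, cy, w, h) relative to the image size."""
    x1, y1, x2, y2 = box
    cx = (x1 + x2) / 2.0 / img_w
    cy = (y1 + y2) / 2.0 / img_h
    w = (x2 - x1) / img_w
    h = (y2 - y1) / img_h
    return cx, cy, w, h

def to_txt_line(class_id, box, img_w, img_h):
    """Format one annotation as a text line: class id followed by the four values."""
    cx, cy, w, h = to_center_format(box, img_w, img_h)
    return f"{class_id} {cx:.6f} {cy:.6f} {w:.6f} {h:.6f}"
```

For example, a box spanning x from 100 to 300 and y from 200 to 400 in a 500×1000 image has its center at 40% of the width and 30% of the height.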

[0072] It should be noted that the specific steps of data augmentation achieved by the SSD data augmentation method and semi-supervised data augmentation method in this invention, as well as the specific methods for balancing sample categories (such as setting the augmentation weight according to the number of sample categories during data expansion, or expanding samples of different categories to a preset number, etc.), can be adjusted according to the actual situation, and this invention does not limit them.
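One of the balancing options mentioned above, expanding every category to a preset number of samples, can be sketched as a simple counting step. The function name and the default of using the largest class as the target are assumptions for illustration.

```python
from collections import Counter

def augmentation_plan(labels, target_per_class=None):
    """Return how many augmented samples each class needs to reach the target.

    labels: iterable of class names, one per existing sample.
    target_per_class: preset count; defaults to the size of the largest class.
    """
    counts = Counter(labels)
    target = target_per_class or max(counts.values())
    return {cls: max(target - n, 0) for cls, n in counts.items()}
```

With 5 scratch, 2 pit, and 1 crack sample and no preset target, the plan asks for 0, 3, and 4 additional samples respectively, so that every class ends with 5.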

[0073] After determining the part defect sample dataset, in step S2, a defect detection model is trained based on the part defect sample dataset to determine the target defect detection model.

[0074] It is understandable that the model structure needs to be determined before model training. The specific structure and training method of the defect detection model can be set according to actual needs, and this invention does not limit this.

[0075] After obtaining the trained target defect detection model, in step S3, the image of the target part to be tested is input into the target defect detection model to detect relevant information of industrial part defects and determine the defect category and location information of the target part to be tested.

[0076] It is understood that the specific classification of part defect categories in this invention can be set according to actual needs, and this invention does not limit this.

[0077] The industrial part image defect detection method provided by this invention expands the sample data using SSD data augmentation and semi-supervised data augmentation methods to determine the part defect sample dataset. This ensures a balanced number of defect category samples in the dataset, effectively solving the problem of poor model training performance caused by low sample size and extreme imbalance between positive and negative samples. It also effectively prevents overfitting during model training and enhances the model's judgment and reasoning capabilities. The trained target defect detection model is then used to detect the target part image, determining the defect category and location information, effectively improving the accuracy of surface defect detection in industrial parts and reducing the false negative and false positive rates.

[0078] Optionally, according to the industrial part image defect detection method provided by the present invention, based on several part surface defect images, a part defect sample dataset is determined according to SSD data augmentation and semi-supervised data augmentation, specifically including:

[0079] Based on surface defect images of several parts, a label-extended dataset is determined according to the optimized SSD data augmentation method; wherein, the optimized SSD data augmentation method incorporates the Mosaic algorithm into the original SSD data augmentation method.

[0080] A semi-supervised data augmentation method was used to process the label-expanded dataset and determine the sample dataset of part defects.

[0081] Specifically, to address the problem of imbalanced samples, this invention employs an optimized SSD data augmentation method to initially expand a dataset of surface defect images of several parts, thereby obtaining a labeled extended dataset.

[0082] The optimized SSD data augmentation method incorporates the Mosaic algorithm into the original SSD data augmentation method. Optical transformation can adjust the size of image pixel values without changing the image size, while geometric transformation is mainly responsible for scale changes. Then, the mean is removed. By incorporating the Mosaic method, four images are randomly cropped and stitched together into one image for training. This enriches the image background and indirectly increases the batch size, which can not only speed up the training process but also improve accuracy.
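The Mosaic step described above, in which four images are randomly cropped and stitched into one, can be sketched as follows. This is a simplified version under stated assumptions: each source image is at least as large as the output canvas, the split point is drawn from the middle half of the canvas, and bounding-box labels (which would need to be shifted and clipped in the same way) are omitted.

```python
import numpy as np

def mosaic(images, out_size=256, seed=None):
    """Stitch four HxWxC images into one out_size x out_size training image.

    Returns the canvas and the (cx, cy) split point that separates the tiles.
    """
    assert len(images) == 4
    rng = np.random.default_rng(seed)
    # Random split point, kept away from the canvas borders.
    cx = int(rng.integers(out_size // 4, 3 * out_size // 4))
    cy = int(rng.integers(out_size // 4, 3 * out_size // 4))
    canvas = np.zeros((out_size, out_size, images[0].shape[2]), dtype=images[0].dtype)
    # (y0, y1, x0, x1) for top-left, top-right, bottom-left, bottom-right tiles.
    tiles = [(0, cy, 0, cx), (0, cy, cx, out_size),
             (cy, out_size, 0, cx), (cy, out_size, cx, out_size)]
    for img, (y0, y1, x0, x1) in zip(images, tiles):
        th, tw = y1 - y0, x1 - x0
        # Random crop of exactly the tile size from the source image.
        sy = int(rng.integers(0, img.shape[0] - th + 1))
        sx = int(rng.integers(0, img.shape[1] - tw + 1))
        canvas[y0:y1, x0:x1] = img[sy:sy + th, sx:sx + tw]
    return canvas, (cx, cy)
```

Because every stitched image contains content from four sources, each training batch effectively covers four times as many backgrounds, which is the sense in which the text says the batch size is indirectly increased.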

[0083] Most operations are random processes, ensuring data richness, effectively preventing overfitting, and improving the model's judgment and reasoning abilities.

[0084] Then, a semi-supervised data augmentation method was used to expand the label extended dataset a second time to determine the part defect sample dataset.

[0085] Taking a metal commutator as an example, the method of determining a sample dataset of defective parts based on data augmentation according to the present invention will be described.

[0086] For example, in a real industrial environment, all components are inspected by expert personnel beforehand and labeled with defect areas and their categories. Because defective parts are very rare in actual industrial production, the number of positive and negative samples is extremely unbalanced. Several images of surface defects on metal commutators were collected, labeled, and compiled into a dataset. This dataset includes 399 images, of which 52 have obvious defects and 347 have no defects. The original images are 500 pixels wide and range in height from 1240 to 1270 pixels.

[0087] To facilitate the detection algorithm, the original images are preprocessed by using polar coordinate-Cartesian transformation to convert all images into rectangles. To ensure the quality of the generated images, bilinear interpolation is used to obtain appropriate pixel values. A sliding window is then used to crop the rectangular industrial images into smaller images suitable for CNN training and prediction. Next, the cropped images are selected and labeled, and the dataset is initially augmented using an optimized SSD data augmentation method to obtain a labeled expanded dataset. Further, a semi-supervised data augmentation method is used to augment the labeled expanded dataset a second time, determining the part defect sample dataset.
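The sliding-window cropping step in the preprocessing above can be illustrated with the sketch below. The window size and stride are assumed values (the source does not state them for this step), and the extra border-aligned window is a design choice of this example so that no strip of the image is skipped.

```python
import numpy as np

def sliding_window_crops(image, window=227, stride=113):
    """Cut an HxW(xC) image into overlapping window x window patches.

    Returns a list of ((y, x), patch) pairs, where (y, x) is the top-left
    corner of the patch in the original image.
    """
    h, w = image.shape[:2]
    ys = list(range(0, max(h - window, 0) + 1, stride))
    xs = list(range(0, max(w - window, 0) + 1, stride))
    # Add a final window flush with the border if the stride left a gap.
    if ys[-1] + window < h:
        ys.append(h - window)
    if xs[-1] + window < w:
        xs.append(w - window)
    return [((y, x), image[y:y + window, x:x + window]) for y in ys for x in xs]
```

Keeping the top-left corner with each patch makes it possible to map a defect detected in a patch back to coordinates in the full image.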

[0088] Differences in background color and lighting have little bearing on whether a defect is present. Therefore, to reduce the influence of color and illumination, the images are uniformly binarized, and all grayscale images of defect areas are resized to 227×227 for unified input. The CNN used contains 5 convolutional layers and 3 max-pooling layers. Each convolutional layer is followed by a rectified linear unit (ReLU). A batch normalization layer is added after the first two convolutional layers to speed up the training process; it normalizes the data of each channel to zero mean and unit variance.

[0089] Furthermore, since a captured image of a metal surface typically contains far more background pixels than defect pixels, the imbalanced classes are reweighted when training the network (e.g., by setting $w_{\text{defects}} = 0.8$ and $w_{\text{background}} = 0.2$ in the loss function) to emphasize defects and weaken the background. The improved pixel-wise cross-entropy loss function, weighted by the class weights $w_k$, is defined as follows:

[0090]

[0091] In the formula, $w_k$ is the weight of class $k$; $K = 2$ is the number of classes (background and defects); $M$ is the number of training samples in a mini-batch; $N$ is the number of pixels in each image patch; $\mathbb{1}(y = k)$ is an indicator function that takes the value 1 when $y = k$ and 0 otherwise; $x_j^i$ is the $j$-th pixel in the $i$-th image patch; $y_j^i$ is the true label of $x_j^i$; and $p_k(x_j^i)$ is the probability that pixel $x_j^i$ belongs to class $k$, given by the output of the softmax layer.
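Since the formula in [0090] is not reproduced in this text, the sketch below shows one standard form of a class-weighted pixel-wise cross-entropy that is consistent with the symbols defined in [0091]: the negative log-probability of each pixel's true class, multiplied by that class's weight and averaged over all M×N pixels. The averaging factor and the function name are assumptions.

```python
import numpy as np

def weighted_pixel_cross_entropy(probs, labels, class_weights):
    """Class-weighted pixel-wise cross-entropy.

    probs: (M, N, K) softmax outputs for M patches of N pixels and K classes.
    labels: (M, N) integer class ids (here 0 = background, 1 = defect).
    class_weights: (K,) weights w_k, e.g. [0.2, 0.8].
    """
    probs = np.asarray(probs, dtype=np.float64)
    labels = np.asarray(labels)
    w = np.asarray(class_weights, dtype=np.float64)
    m, n, _ = probs.shape
    # Probability the model assigns to the true class of each pixel.
    p_true = np.take_along_axis(probs, labels[..., None], axis=2)[..., 0]
    # Clip to avoid log(0); weight each pixel by the weight of its true class.
    loss = -(w[labels] * np.log(np.clip(p_true, 1e-12, 1.0))).sum() / (m * n)
    return float(loss)
```

With the example weights from the text, a misclassified defect pixel contributes four times as much to the loss as an equally misclassified background pixel, which is how the reweighting emphasizes defects.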

[0092] It is understood that when determining the sample dataset of part defects in this invention, the sample data can be preprocessed using the above loss function to weight the background and defects in the image, strengthen the defects, weaken the background, improve the ability to extract features during model training, accelerate network training, and improve model accuracy. This can be considered a preferred solution of this invention.

[0093] The industrial parts image defect detection method provided by this invention first expands the data using an optimized SSD data augmentation method to balance the sample data, and then uses a semi-supervised data augmentation method to further expand the sample data, increasing the diversity of the sample data. This two-step data augmentation method ensures a balanced number of defect category samples in the dataset, effectively solving the problem of poor model training performance caused by low sample size and extreme imbalance between positive and negative samples. It achieves relatively high classification performance with a small amount of training data, effectively preventing overfitting during model training and enhancing the model's judgment and reasoning abilities.

[0094] Optionally, according to the industrial part image defect detection method provided by the present invention, the defect detection model includes: a feature extraction layer, a feature fusion layer, and a defect recognition layer; the feature extraction layer includes multiple sub-feature extraction layers; the sub-feature extraction layers are connected sequentially, and the output of the previous sub-feature extraction layer is the input of the next sub-feature extraction layer;

[0095] Correspondingly, the image of the target part to be tested is input into the target defect detection model to determine the defect category and location information of the target part to be tested, specifically including:

[0096] The image of the target part to be tested is input into the feature extraction layer of the target defect detection model;

[0097] Based on the feature extraction layer, the feature information of the target part image is extracted, and the target part feature map output by each sub-feature extraction layer is obtained.

[0098] Based on the feature fusion layer, the part feature maps output by each sub-feature extraction layer are fused according to the feature pyramid network to determine the target fused feature map.

[0099] Based on the defect identification layer, the defect category and location information of the target defect are determined in the image of the target part to be tested according to the target fusion feature map.

[0100] Specifically, Figure 2 is a schematic diagram of the industrial part defect detection model provided by the present invention. As shown in Figure 2, the defect detection model includes a feature extraction layer, a feature fusion layer, and a defect recognition layer; the feature extraction layer comprises multiple sub-feature extraction layers. Each sub-feature extraction layer outputs a feature map, and the sub-feature extraction layers are connected sequentially, with the output of the previous sub-feature extraction layer serving as the input of the next sub-feature extraction layer.

[0101] Understandably, the specific number of sub-feature extraction layers in the feature extraction layer can be set according to actual needs.

[0102] The following takes an improved Mask R-CNN object detection model as an example: the ResNet101 backbone, which has fewer parameters and better performance, is selected as the feature extraction network; a feature pyramid structure is designed for feature fusion; and dilated convolution is introduced into the model structure. Using this example, the specific steps of inputting the image of the target part to be tested into the target defect detection model to determine the defect category and location information of the target part are explained.

[0103] The ResNet101 network model (feature extraction layer) consists of 100 convolutional layers and one fully connected layer. The image scale is changed by the convolution stride. All the traditional 3x3 convolutions in the last three stages of ResNet101 are replaced with dilated convolutions.

[0104] In real-world industrial inspection environments, defects vary widely in size and shape. Different receptive field sizes are used to accommodate this. Dilated convolutions are employed to increase the network's receptive field, enabling the detection of large defects and improving inspection performance.

[0105] Figure 3 is the feature extraction network structure diagram provided by this invention. As shown in Figure 3, feature extraction is divided into five sub-feature extraction layers (stages). The image is downsampled by a factor of 2 after each stage, so five feature maps of the target part are output after the five stages. The sizes of the target part feature maps are 1/2, 1/4, 1/8, 1/16, and 1/32 of the original image, respectively.

[0106] The first sub-feature extraction layer consists of a 7x7 convolution. The other four sub-feature extraction layers introduce 3, 4, 23, and 3 residual structures composed of 1x1, 3x3, and 1x1 convolutions, respectively, and set the convolution-related parameters in the residual blocks.
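
The layer count and the feature-map sizes stated above can be checked with a few lines of arithmetic. The sketch below counts only main-path convolutions, and the 512×512 input size and the helper name `stage_sizes` are assumptions for illustration.

```python
# Residual blocks per stage for the last four stages, as listed in the text.
blocks_per_stage = [3, 4, 23, 3]
convs_per_block = 3  # 1x1, 3x3, 1x1

# One 7x7 stem convolution plus three convolutions per bottleneck block.
conv_layers = 1 + convs_per_block * sum(blocks_per_stage)

def stage_sizes(height, width, stages=5):
    """Feature-map size after each stage, halving the resolution every time."""
    sizes = []
    for s in range(1, stages + 1):
        factor = 2 ** s
        sizes.append((height // factor, width // factor))
    return sizes

sizes = stage_sizes(512, 512)  # 512x512 is an assumed input size
```

The count comes to 100 convolutional layers, matching the ResNet101 description, and the five sizes correspond to 1/2 through 1/32 of the input.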

[0107] Figure 4 is a schematic diagram of the dilated BottleNeck residual structure provided by the present invention. As shown in Figure 4, there are two variants: one uses a 3×3 convolution with a dilation rate of 2 (Dip BottleNeck A), and the other builds on Dip BottleNeck A by adding a 1×1 convolution to the identity mapping branch (Dip BottleNeck B). Introducing residual structures effectively avoids the problems of vanishing gradients and degradation in deep networks.

[0108] The feature pyramid structure fuses the five target part feature maps output by ResNet101 to determine the target fusion feature map, achieving complementary advantages between feature maps, enriching feature details, and effectively improving the detection effect of small targets.

[0109] Based on the defect recognition layer, relevant information about defects in industrial parts is detected according to the target fusion feature map, and the defect category and location information of the target defect are determined in the image of the target part to be tested.

[0110] It is understood that the specific structure of the above model is only a specific example of the present invention. In addition, the model structure can be adjusted according to actual needs, and the present invention does not limit it.

[0111] The industrial part image defect detection method provided by this invention extracts image features at different levels by setting up a feature extraction layer with multiple sub-feature extraction layers linked sequentially, thereby obtaining multiple feature maps of the target part. The multi-layer feature maps are beneficial for detecting objects at multiple scales and small objects. By fusing multiple target part feature maps according to a feature pyramid network, the advantages of the feature maps can be complemented, resulting in richer feature details and effectively improving the detection effect when detecting information related to defects in industrial parts.

[0112] Optionally, in the industrial part image defect detection method provided by the present invention, the feature extraction layer further includes: an initial feature extraction layer; the initial feature extraction layer is disposed before all sub-feature extraction layers;

[0113] Correspondingly, based on the feature extraction layer, the feature information of the target part image is extracted, and the target part feature map output by each sub-feature extraction layer is obtained, specifically including:

[0114] Based on the initial feature extraction layer, feature information of the target part image is extracted to determine the initial part feature map;

[0115] Based on the first sub-feature extraction layer, the initial part feature map is transformed according to the channel attention mechanism and the spatial attention mechanism to determine the first target part feature map;

[0116] Input the feature map of the first target part into the second sub-feature extraction layer;

[0117] Repeat the above steps of transforming according to the channel attention mechanism and spatial attention mechanism to determine the target part feature map, and input the target part feature map into the next sub-feature extraction layer, until the target part feature map output by each sub-feature extraction layer is obtained.

[0118] Specifically, an attention mechanism is added to the backbone feature extraction network to enhance the model's feature representation and improve its feature extraction and target localization capabilities in the detection of surface defects in industrial parts.

[0119] The feature extraction layer also includes an initial feature extraction layer, which is placed before all sub-feature extraction layers. Taking the improved Mask R-CNN object detection model as an example, the sub-feature extraction layers employ a hybrid-domain convolutional block attention module (CBAM) and add it to the improved Mask R-CNN model to enhance the neck of feature extraction.

[0120] During feature extraction, the image of the target part to be tested is input into the feature extraction layer. Based on the initial feature extraction layer, the feature information of the target part image is extracted to determine the initial part feature map. Then, the initial part feature map is input into the first sub-feature extraction layer, and the initial part feature map is transformed according to the channel attention mechanism and the spatial attention mechanism through CBAM.

[0121] CBAM is composed of a channel attention module (CAM) and a spatial attention module (SAM) connected in series, with the channel attention module preceding the spatial attention module.

[0122] Compared to the channel-domain attention SE (Squeeze-and-Excitation) module, CBAM adds a spatial domain attention mechanism, emphasizing meaningful features in both spatial and channel dimensions. At the same time, global max pooling is added during implementation, and using two different pooling methods means that higher-level and richer features can be extracted.

[0123] Figure 5 is a schematic diagram of the hybrid-domain convolutional block attention module provided by the present invention. As shown in Figure 5, the implementation of CBAM can be divided into two major steps. First, the input initial part feature map is processed by the channel attention module, which outputs the corresponding weight values $w_c$; these are multiplied by the input to obtain the input features for the second step.

[0124] In the CAM structure, global average pooling and global max pooling are first performed on each input feature layer. Then, two fully connected layers are used for processing. The output of the first fully connected layer has c / r channels (where c is the number of channels in the input feature map and r is the reduction ratio), and the output of the second fully connected layer has c channels. The number of output feature channels therefore remains unchanged, while the width and height dimensions are 1.

[0125] Finally, the processed average pooling features are added to the processed max pooling features, and the sum is normalized to (0,1) using the sigmoid function to obtain the channel attention module weight value $w_c$.

[0126] Then, the second weight $w_s$ is obtained through the spatial attention mechanism module.

[0127] In the SAM structure part, the average and maximum values of pixels at each spatial location (the same location in different feature layers) of the feature map output from the previous step are calculated respectively, so that the width and height dimensions of the two output features remain unchanged and the number of channels is 1.

[0128] Then, the two feature maps are concatenated along the channel dimension, and the concatenated feature map is passed through a regular convolutional layer with a kernel size of 7×7. Finally, in the same way as in the CAM part, the resulting feature map is normalized to (0,1) by the sigmoid function to obtain the weight values $w_s$ of the spatial attention module.

[0129] The formula for calculating the sigmoid function is as follows:

[0130] $\sigma(x) = \dfrac{1}{1 + e^{-x}}$

[0131] Finally, $w_s$ is multiplied by the input feature map of the second step to obtain the final first target part feature map.
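
The two CBAM steps described above can be sketched in NumPy as follows. This is a minimal illustration rather than the patented implementation: the shared two-layer MLP with a ReLU in between follows the commonly published CBAM design, and the random weights, the reduction ratio r = 4, and the function names are assumptions.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def channel_attention(x, w1, w2):
    """x: (C, H, W); w1: (C // r, C); w2: (C, C // r). Returns w_c, shape (C, 1, 1)."""
    avg = x.mean(axis=(1, 2))          # global average pooling -> (C,)
    mx = x.max(axis=(1, 2))            # global max pooling     -> (C,)
    mlp = lambda v: w2 @ np.maximum(w1 @ v, 0.0)  # two FC layers, ReLU assumed
    return sigmoid(mlp(avg) + mlp(mx))[:, None, None]

def spatial_attention(x, kernel):
    """x: (C, H, W); kernel: (2, 7, 7). Returns w_s, shape (1, H, W)."""
    stacked = np.stack([x.mean(axis=0), x.max(axis=0)])   # (2, H, W)
    pad = kernel.shape[-1] // 2
    padded = np.pad(stacked, ((0, 0), (pad, pad), (pad, pad)))
    h, w = x.shape[1:]
    out = np.empty((h, w))
    for i in range(h):
        for j in range(w):
            window = padded[:, i:i + 2 * pad + 1, j:j + 2 * pad + 1]
            out[i, j] = np.sum(window * kernel)   # 7x7 convolution, zero padded
    return sigmoid(out)[None]

def cbam(x, w1, w2, kernel):
    x = x * channel_attention(x, w1, w2)      # step 1: apply w_c
    return x * spatial_attention(x, kernel)   # step 2: apply w_s

rng = np.random.default_rng(0)
c, r = 8, 4                                   # assumed channel count and ratio
feat = rng.standard_normal((c, 6, 6))
out = cbam(feat,
           rng.standard_normal((c // r, c)),
           rng.standard_normal((c, c // r)),
           rng.standard_normal((2, 7, 7)))
```

Because both weights lie in (0,1), the module only rescales the input: the output keeps the input shape and no element grows in magnitude.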

[0132] The first target part feature map is input into the second sub-feature extraction layer. Each sub-feature extraction layer repeats the above steps of transforming according to the channel attention mechanism and spatial attention mechanism, determining the target part feature map, and inputting the target part feature map into the next sub-feature extraction layer, until the target part feature map output by each sub-feature extraction layer is obtained (the number is the same as the number of sub-feature extraction layers).

[0133] The industrial part image defect detection method provided by this invention extracts an initial part feature map by setting an initial feature extraction layer in the feature extraction layer, and then sequentially linking multiple sub-feature extraction layers. The sub-feature extraction layers transform the initial part feature map according to channel attention and spatial attention mechanisms. Compared with the traditional method of only extracting feature maps, the feature extraction layer provided by this invention can extract image features at different levels, and adds a spatial domain attention mechanism on this basis, emphasizing meaningful features in both spatial and channel dimensions, extracting higher-level and richer features, and obtaining multiple target part feature maps. This ensures that the feature maps can complement each other, making the feature details richer, and effectively improving the detection effect when detecting information related to defects in industrial parts.

[0134] Optionally, according to the industrial part image defect detection method provided by the present invention, based on the feature fusion layer, the part feature maps output by each sub-feature extraction layer are fused according to the feature pyramid network to determine the target fused feature map, specifically including:

[0135] Based on the feature fusion layer, according to the feature pyramid network, the part feature maps output by each sub-feature extraction layer are fused according to the preset fusion rules to determine multiple fused feature maps at different levels.

[0136] The local response normalization process is used to process multiple fusion feature maps at different levels. The processed fusion feature maps are then fused to determine the target fusion feature map.

[0137] Specifically, take the improved Mask R-CNN object detection model as an example. The feature pyramid DetNet network uses regular 3×3 convolution kernels together with dilated convolutions with a dilation rate of 2. A dilated convolution spreads apart the pixels being summed, but the number of summed pixels is the same as in a regular convolution. The weights of a dilated convolution in the blank positions are 0 and do not participate in the convolution operation. When the dilated 3×3 convolution follows a regular 3×3 convolution, the effective receptive field is 7×7.
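
The receptive-field figures can be verified with the standard formulas for stride-1 convolutions. In this sketch (helper names are illustrative), a single 3×3 kernel with dilation rate 2 spans a 5×5 window, and the 7×7 figure is obtained when it is stacked on a regular 3×3 convolution.

```python
def effective_kernel(kernel, dilation):
    """Span covered by one dilated kernel: k + (k - 1) * (d - 1)."""
    return kernel + (kernel - 1) * (dilation - 1)

def stacked_receptive_field(layers):
    """Receptive field of stride-1 convolutions given as (kernel, dilation) pairs."""
    rf = 1
    for kernel, dilation in layers:
        rf += effective_kernel(kernel, dilation) - 1
    return rf

single = effective_kernel(3, 2)                       # one 3x3, dilation 2
stacked = stacked_receptive_field([(3, 1), (3, 2)])   # regular 3x3, then dilated 3x3
```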

[0138] In the DetNet network architecture, dilated convolutions give the model a larger receptive field while avoiding the repeated upsampling steps of the FPN structure. In the Bottleneck structure, 3×3 convolutions with a stride of 2 are replaced with 3×3 convolutions with a dilation rate of 2, so that the feature map output by each Bottleneck keeps the same size as its input and the number of channels stays at 256, whereas traditional backbones typically reduce the feature map size and increase the number of channels stage by stage.

[0139] When constructing the feature pyramid, since the feature maps are all the same size, they can be passed directly from right to left and added together, avoiding upsampling. To further fuse the features of each channel, the output of each stage is passed through a 1×1 convolution and then added to the features passed back from the previous stage.

[0140] Based on the feature fusion layer, according to the feature pyramid network, the part feature maps output by each sub-feature extraction layer are fused according to the preset fusion rules to determine multiple fused feature maps at different levels.

[0141] The feature pyramid structure unifies the number of channels of the four feature maps output from the last four stages of ResNet101 through 1x1 convolutional kernels with 256 channels. Then, it designs a feature network with three layers: shallow, medium and deep, and integrates features from three different layers to achieve complementary advantages. It adopts the design principle of HyperNet, performs max pooling on the shallow features and deconvolution on the deep features, so that the resolution of both is half of the original image, the same as the resolution of the middle layer.

[0142] It is understood that in this example, the preset fusion rule is to sort the five output target part feature maps according to their size, and then use pooling and deconvolution to form a depthwise feature map for the first three feature maps, while the last two feature maps each form a depthwise feature map, resulting in three levels of fused feature maps: shallow, medium, and deep. Furthermore, in practical applications of this invention, the number of output feature maps can be adjusted according to actual needs, and the corresponding preset fusion rule can also be set according to actual needs; this invention does not impose any limitations on this.

[0143] After determining multiple fusion feature maps at different levels, a Local Response Normalization (LRN) layer is first used to process the fusion feature maps, increasing their generalization ability and smoothing them. The processed fusion feature maps are then fused to determine the target fusion feature map. This method of fusing shallow and deep features increases the mapping resolution of small targets, resulting in richer feature details and effectively improving the detection performance of small targets.

[0144] The present invention provides an industrial part image defect detection method. Based on a feature pyramid network, it fuses the part feature maps output from each sub-feature extraction layer by setting preset fusion rules to determine multiple fused feature maps at different levels. Local response normalization is then applied to these multiple fused feature maps, which are then fused together to determine the target fused feature map. This fusion method increases the receptive field while obtaining a larger feature map size, which is beneficial for object localization and improves the detection effect of minute defects.

[0145] Optionally, in the industrial part image defect detection method provided by the present invention, the defect recognition layer includes: a bounding box determination layer and a defect recognition layer;

[0146] Correspondingly, based on the defect recognition layer, the defect category and location information of the target defect are determined in the target part image according to the target fusion feature map, specifically including:

[0147] Based on the bounding box determination layer, the bounding box of the target defect is determined by the RoIAlign region feature aggregation algorithm according to the target fusion feature map, and the location information of the target defect is obtained.

[0148] Based on the feature information of the target defect bounding box region, the target defect is classified to determine the defect category.

[0149] Specifically, to improve the accuracy of surface defect segmentation and localization in industrial parts, the Region Proposal Network (RPN) is optimized, and the anchor point generation method is improved. The defect recognition layer includes a bounding box determination layer and a defect recognition layer. When recognizing a target defect, the bounding box must first be determined. Based on the target fusion feature map, the RoIAlign region feature aggregation algorithm is used to determine the bounding box and obtain the target defect location information.

[0150] To make the anchor boxes more consistent with the scale of the defect targets, the aspect ratios of the defect bounding boxes on the surface of each sample are analyzed, and the length, width, and scaling ratio settings of the anchor box generation method are improved. Using the image area size information, the generated anchor boxes match the defect objects more closely. The segmentation network treats the input defect image as a pixel-level prediction task, and a compact convolutional neural network (CNN) is used for classification.

[0151] After obtaining the segmentation results of all possible defects, blob analysis is further used to find the accurate defect contour. The minimum bounding rectangle region based on the defect contour is extracted from the final image. The minimum bounding rectangle accurately reflects the defect envelope region, making the input to the classification module more accurate and easier.

[0152] Because the minimum bounding rectangle has an arbitrary orientation, an affine transformation is used to convert the slanted rectangle into an upright rectangle. The upright rectangle is then treated as a region of interest (ROI), and the final defect region is defined by these ROIs. The ROI, i.e., the target defect bounding box, is then cropped from the original image to obtain the target defect location information.

[0153] In the improved Mask R-CNN, the region feature aggregation algorithm RoIAlign replaces the region-of-interest pooling (RoIPooling) algorithm. RoIPooling uses nearest-neighbor interpolation, whereas RoIAlign uses bilinear interpolation. The rounding operation in RoIPooling loses some precision, which reduces the accuracy of the segmentation task. RoIAlign eliminates the rounding operation and retains all floating-point coordinates, then obtains the values of multiple sampling points through bilinear interpolation. Finally, it performs max pooling over the two sampling points to obtain the final value of that point. Because it uses sampling points and retains floating-point values, RoIAlign achieves better performance.
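
The core of RoIAlign, sampling at fractional coordinates with bilinear interpolation and then max pooling the samples in a bin, can be sketched as below. The function names and the choice of 2 samples per axis in each bin are assumptions for illustration; the text itself speaks of two sampling points.

```python
import math

def bilinear_sample(feature, y, x):
    """Interpolate a 2-D feature map (list of rows) at a fractional location."""
    h, w = len(feature), len(feature[0])
    y = min(max(y, 0.0), h - 1.0)
    x = min(max(x, 0.0), w - 1.0)
    y0, x0 = int(math.floor(y)), int(math.floor(x))
    y1, x1 = min(y0 + 1, h - 1), min(x0 + 1, w - 1)
    dy, dx = y - y0, x - x0
    top = feature[y0][x0] * (1 - dx) + feature[y0][x1] * dx
    bottom = feature[y1][x0] * (1 - dx) + feature[y1][x1] * dx
    return top * (1 - dy) + bottom * dy

def roi_align_bin(feature, y_start, x_start, y_end, x_end, samples=2):
    """Max-pool samples x samples regularly spaced sampling points in one bin."""
    values = []
    for i in range(samples):
        for j in range(samples):
            # No rounding: sampling locations stay fractional.
            y = y_start + (i + 0.5) * (y_end - y_start) / samples
            x = x_start + (j + 0.5) * (x_end - x_start) / samples
            values.append(bilinear_sample(feature, y, x))
    return max(values)

feature = [[0.0, 1.0, 2.0],
           [3.0, 4.0, 5.0],
           [6.0, 7.0, 8.0]]
center = bilinear_sample(feature, 0.5, 0.5)
pooled = roi_align_bin(feature, 0.2, 0.2, 1.8, 1.8)
```

RoIPooling would first round the bin edges 0.2 and 1.8 to integer cells; keeping them fractional is what preserves the sub-pixel alignment.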

[0154] After determining the target defect bounding box, a Mask branch is added alongside the original classification and regression branches to predict the category of each pixel. An independent thresholding module is added at the end of the network to further refine the prediction results: pixel thresholding is turned into a probabilistic approach, and a given threshold $G_s$ is applied to the final prediction mask. The target defects are then classified according to the threshold $G_s$ to determine their defect categories.

[0155] Threshold formula:

[0156]

[0157] $I_f$ and $I_{pm}$ represent the final image after binarization and the image from the prediction task, respectively. $G_s$ is the refinement threshold; during network training, $G_s$ is the only threshold in the architecture that needs adjustment. In the $I_f$ image, pixels with a grayscale value of 0 represent defective areas, and pixels with a grayscale value of 1 represent non-defective areas.

[0158] Taking the improved Mask R-CNN object detection model as an example, in practical applications, when determining the target defect bounding box, Mask R-CNN by default generates a set of anchor boxes using combinations of three aspect ratios (0.5, 1, 2) and three scaling ratios (8, 16, 32), meaning a set of anchor boxes consists of 9 anchor boxes. However, since the metal surface defect dataset studied in this invention often contains defect targets with small scales and large aspect ratio differences, the anchor boxes generated by the default anchor point generation method have a large scale difference from the target defect, and cannot match the target scale well.

[0159] This invention uses data analysis tools to analyze the aspect ratio and other features of the defect targets in a metal defect dataset, obtaining anchor boxes that are close to the target scale. Finally, an anchor box generation method is determined to generate a set of anchor boxes (a set of anchor boxes consists of 10 anchor boxes) by combining five aspect ratios (0.2, 0.5, 1.0, 2.0, 5) and two scaling ratios (2, 8).
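
Generating an anchor set from aspect ratios and scaling ratios can be sketched as follows. The base size of 16 pixels, the height/width convention for the ratio, and the equal-area construction are assumptions borrowed from common detector implementations, not details given in the patent.

```python
import math

def make_anchors(base_size, ratios, scales):
    """Return (width, height) pairs, one per (ratio, scale) combination.

    ratio is taken as height / width; each anchor keeps the area
    (base_size * scale) ** 2. Both conventions are assumptions.
    """
    anchors = []
    for ratio in ratios:
        for scale in scales:
            side = base_size * scale
            width = side / math.sqrt(ratio)
            height = side * math.sqrt(ratio)
            anchors.append((width, height))
    return anchors

default_anchors = make_anchors(16, ratios=[0.5, 1, 2], scales=[8, 16, 32])
tuned_anchors = make_anchors(16, ratios=[0.2, 0.5, 1.0, 2.0, 5], scales=[2, 8])
```

The default configuration yields 9 anchors per location and the tuned one yields 10, with smaller scales and more extreme aspect ratios to suit small, elongated defects.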

[0160] It should be noted that the specific method for determining the target defect anchor boxes and the size settings of the anchor boxes can be adjusted according to actual needs, and this invention does not limit them.

[0161] The industrial part image defect detection method provided by this invention replaces region-of-interest pooling (RoIPooling) with the region feature aggregation algorithm RoIAlign, eliminating the rounding operation and retaining all floating-point values. The values of multiple sampling points are then obtained through bilinear interpolation, and the final value of each point is obtained by max pooling over two sampling points. Because it uses sampling points and retains floating-point values, RoIAlign achieves better performance and improves the accuracy of surface defect detection of industrial parts.

[0162] Optionally, according to the industrial part image defect detection method provided by the present invention, training a defect detection model based on a part defect sample dataset and determining a target defect detection model specifically includes:

[0163] A defect detection model is trained based on a dataset of defective parts samples.

[0164] Based on the target loss function, the network parameters of the defect detection model are updated according to the AdaGrad algorithm, and the defect detection model is iteratively trained based on the updated network parameters until the defect detection model converges, thus determining the target defect detection model.

[0165] The formula for the target loss function is as follows:

[0166] $L_{RS} = \dfrac{1}{|P|} \sum_{i \in P} \left( \ell_{RS}(i) - \ell_{RS}^{*}(i) \right)$

[0167] In the formula, $P$ is the set of positive samples; $\ell_{RS}(i)$ is the sum of the current rank error and the current sort error; and $\ell_{RS}^{*}(i)$ is the sum of the target rank error and the target sort error.

[0168] Specifically, after determining the part defect sample dataset and the network structure of the defect detection model, it is necessary to input the part defect sample dataset into the model and train the model.

[0169] Understandably, to further ensure the accuracy of the trained target defect detection model, the sample dataset can be divided into a training set, a validation set, and a test set according to a preset ratio (e.g., 8:1:1) during model training, and the model with the best performance and the best generalization ability can be selected.
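
A split by a preset ratio such as 8:1:1 can be sketched as follows. The function name `split_dataset`, the fixed seed, and the choice to assign any remainder to the test set are illustrative assumptions.

```python
import random

def split_dataset(samples, ratios=(8, 1, 1), seed=0):
    """Shuffle and split samples into training, validation, and test subsets."""
    items = list(samples)
    random.Random(seed).shuffle(items)
    total = sum(ratios)
    n_train = len(items) * ratios[0] // total
    n_val = len(items) * ratios[1] // total
    train = items[:n_train]
    val = items[n_train:n_train + n_val]
    test = items[n_train + n_val:]  # remainder goes to the test set
    return train, val, test

train, val, test = split_dataset(range(100))
```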

[0170] Taking the improved Mask R-CNN object detection model as an example, the training process of the model is explained. The model training includes: network parameter initialization, setting training parameters, loading training data, and iterative training.

[0171] In network parameter initialization, the ResNet101 model is used to extract feature information from the input defect images. Training parameters are set, such as an initial learning rate of 0.005, learning momentum of 0.9, and weight decay coefficient of 0.0005, using the AdaGrad optimizer. The model is then trained using a dataset of defective parts.
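
The AdaGrad update rule used for training can be sketched in plain Python as below. This is the textbook form with L2 weight decay folded into the gradient; classic AdaGrad has no momentum term, so the momentum setting listed above is not modelled here. The function name and the toy objective are assumptions for illustration.

```python
import math

def adagrad_step(params, grads, accum, lr=0.005, weight_decay=0.0005, eps=1e-10):
    """One AdaGrad update; modifies params and accum in place.

    accum holds the running sum of squared gradients per parameter.
    """
    for i, (p, g) in enumerate(zip(params, grads)):
        g = g + weight_decay * p            # L2 weight decay folded into the gradient
        accum[i] += g * g
        params[i] = p - lr * g / (math.sqrt(accum[i]) + eps)
    return params

# Toy check: minimise f(w) = (w - 3)^2 starting from w = 0.
w, acc = [0.0], [0.0]
for _ in range(2000):
    grad = [2.0 * (w[0] - 3.0)]
    adagrad_step(w, grad, acc, lr=0.5, weight_decay=0.0)
```

Each parameter's step is scaled by the history of its own squared gradients, so frequently updated parameters receive smaller steps over time.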

[0172] In iterative training, based on the target loss function, the AdaGrad (Adaptive Gradient) algorithm is used to iteratively train the improved network structure. Through continuous iteration, the optimal solution of the network is obtained.

[0173] It is understood that the condition for stopping iteration can be that the number of iterations exceeds a preset iteration threshold, or that the objective function meets a preset condition. This can be set according to actual needs, and the present invention does not limit this.

[0174] To improve the training performance of the model, this invention optimizes the loss function. The total loss of the Mask R-CNN model is a weighted sum of the classification loss and the regression loss. Specifically, the cross-entropy loss in the classification loss is replaced with Rank & Sort (RS) Loss, and GIoU loss is used for the regression loss. The weighting parameter of the total loss of the improved model is the RS Loss divided by the regression loss. The formula for calculating RS Loss is as follows:
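
For the regression term, a GIoU loss between a predicted box and a ground-truth box can be sketched as follows, assuming axis-aligned boxes in (x1, y1, x2, y2) form with positive area and the usual definition loss = 1 − GIoU. The function name is illustrative.

```python
def giou_loss(box_a, box_b):
    """GIoU loss (1 - GIoU) for two boxes given as (x1, y1, x2, y2)."""
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b
    area_a = (ax2 - ax1) * (ay2 - ay1)
    area_b = (bx2 - bx1) * (by2 - by1)

    # Intersection area (zero when the boxes do not overlap).
    iw = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    ih = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = iw * ih
    union = area_a + area_b - inter
    iou = inter / union

    # Smallest axis-aligned box enclosing both inputs.
    enclose = (max(ax2, bx2) - min(ax1, bx1)) * (max(ay2, by2) - min(ay1, by1))
    giou = iou - (enclose - union) / enclose
    return 1.0 - giou

same = giou_loss((0, 0, 2, 2), (0, 0, 2, 2))
shifted = giou_loss((0, 0, 2, 2), (1, 0, 3, 2))
disjoint = giou_loss((0, 0, 1, 1), (2, 0, 3, 1))
```

Unlike plain IoU, the loss keeps growing as disjoint boxes move apart, so non-overlapping predictions still receive a useful training signal.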

[0175] $L_{RS} = \dfrac{1}{|P|} \sum_{i \in P} \left( \ell_{RS}(i) - \ell_{RS}^{*}(i) \right)$

[0176] In the formula, $P$ is the set of positive samples; $\ell_{RS}(i)$ is the sum of the current rank error and the current sort error, i.e., the current RS error; and $\ell_{RS}^{*}(i)$ is the sum of the target rank error and the target sort error, i.e., the target RS error.

[0177] The formula for calculating $\ell_{RS}(i)$ is as follows:

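[0178] $\ell_{RS}(i) = \dfrac{N_{FP}(i)}{\mathrm{rank}(i)} + \dfrac{\sum_{j \in P} H(x_{ij})\,(1 - y_j)}{\sum_{j \in P} H(x_{ij})}$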

[0179] In the formula, $x_{ij}$ is the difference between the logits $s_i$ and $s_j$; $y_j$ is the continuous label (e.g., Intersection over Union (IoU)); $N_{FP}(i)$ is the number of negative samples with scores greater than that of positive sample $i$; $\mathrm{rank}(i)$ is the number of positive and negative samples with scores greater than or equal to that of positive sample $i$; and $H(x_{ij})$ is the unit step function.

[0180] The formula for calculating the target RS error is as follows:

[0181]

[0182] In the formula, the target rank error is 0; $H(x_{ij})$ is the unit step function; and $y$ denotes the continuous labels (e.g., Intersection over Union (IoU)).

[0183] RS Loss consists of two parts: Rank and Sort. Rank refers to distinguishing between positive and negative samples based on the classification score, so that all positive samples are ranked before negative samples. Sort refers to sorting the positive samples in descending order of their continuous labels, i.e., IoU values in the range 0 to 1, so that different positive samples have different priorities during training.
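
The rank and sort errors can be evaluated on a toy example as sketched below. The definitions used here are assumptions taken from the published Rank & Sort Loss formulation: H is a hard unit step applied to s_j − s_i, the target rank error is 0, and the target sort error averages 1 − IoU over the higher-scoring positives whose IoU is at least that of sample i. A trainable version would use a smoothed step and a dedicated gradient rule, which this sketch omits.

```python
def rs_loss(scores, ious, is_positive):
    """Value of the RS loss for one set of detections (no gradients)."""
    def step(x):
        return 1.0 if x >= 0 else 0.0

    pos = [i for i, p in enumerate(is_positive) if p]
    neg = [i for i, p in enumerate(is_positive) if not p]
    total = 0.0
    for i in pos:
        # h[j] = 1 when sample j scores at least as high as positive i.
        h = {j: step(scores[j] - scores[i]) for j in pos + neg}
        n_fp = sum(h[j] for j in neg)
        rank = sum(h.values())
        rank_pos = sum(h[j] for j in pos)
        rank_err = n_fp / rank
        sort_err = sum(h[j] * (1.0 - ious[j]) for j in pos) / rank_pos
        # Target sort error: only positives that deserve to be ranked above i.
        better = [j for j in pos if h[j] and ious[j] >= ious[i]]
        target_sort_err = sum(1.0 - ious[j] for j in better) / len(better)
        total += (rank_err + sort_err) - (0.0 + target_sort_err)
    return total / len(pos)

# Positives above the negative and ordered by IoU: nothing to correct.
perfect = rs_loss([0.9, 0.8, 0.1], [0.9, 0.6, 0.0], [True, True, False])
# A negative scores above the only positive: a rank error of 1/2 remains.
misranked = rs_loss([0.2, 0.9], [0.8, 0.0], [True, False])
```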

[0184] RS Loss sorts not only positive and negative samples, but also among positive samples. This feature eliminates the need for an additional quality evaluation branch for object detection boxes during training.

[0185] Since RS Loss balances subtasks such as classification and regression by considering the loss value, there is no need to repeatedly adjust hyperparameters during model training. Only the learning rate needs to be adjusted to continuously improve model performance.

[0186] After model training, the performance evaluation metrics of target detection algorithms, such as Average Precision (AP) and Mean Average Precision (mAP), can be used to evaluate the trained model and determine whether it meets the requirements. If it does, the target defect detection model is determined.

[0187] It is understood that the above training method is only used as an example to illustrate the training process of the model of the present invention. In the actual application of the present invention, the training method of the model can be adjusted according to actual needs, and the present invention does not limit it.

[0188] The industrial part image defect detection method provided by this invention expands the sample data using SSD data augmentation and semi-supervised data augmentation methods to determine the part defect sample dataset. This ensures a balanced number of defect category samples in the dataset, effectively solving the problem of poor model training performance caused by low sample size and extreme imbalance between positive and negative samples. It also effectively prevents overfitting during model training and enhances the model's judgment and reasoning capabilities. The trained target defect detection model is then used to detect the target part image, determining the defect category and location information, effectively improving the accuracy of surface defect detection in industrial parts and reducing the false negative and false positive rates.

[0189] Figure 6 is a schematic diagram of the industrial part image defect detection system provided by the present invention. As shown in Figure 6, the present invention also provides an industrial part image defect detection system, including: a sample determination unit 601, a model determination unit 602, and a defect detection unit 603;

[0190] The sample determination unit 601 is used to determine a part defect sample dataset based on several part surface defect images, according to the SSD data augmentation method and the semi-supervised data augmentation method; wherein the number of each defect category in the part defect sample dataset is balanced.

[0191] The model determination unit 602 is used to train a defect detection model based on a part defect sample dataset and determine the target defect detection model.

[0192] The defect detection unit 603 is used to input the image of the target part to be tested into the target defect detection model to determine the defect category and location information of the target part to be tested.

[0193] Specifically, before training the defect detection model, it is necessary to collect images of surface defects on the parts and construct a sample dataset of part defects.

[0194] Taking a metal commutator as an example, the steps for acquiring surface defect images of the part are explained. A CCD high-definition digital camera (16 million pixels, 60fps acquisition frame rate, and 1080P acquisition resolution) is used in an industrial production workshop to acquire various defect images of the metal commutator surface under different conditions and magnifications. The images are stored according to the naming method of the defect array, and a set of images for each type of defect is constructed. The images are grouped, feature points are extracted from each image, and each pixel is segmented to finally obtain several high-precision surface defect images of the part.

[0195] It is understood that the above-mentioned types of industrial parts and specific methods for obtaining images of surface defects are only used as specific examples to illustrate the present invention. In the actual application of the present invention, the types of industrial parts and the methods for obtaining images can be adjusted according to actual needs, and the present invention does not limit them.

[0196] After several part surface defect images have been acquired, the images need to be preprocessed and expanded. The sample determination unit 601 expands the part surface defect images using the SSD (Single Shot MultiBox Detector) data augmentation method and the semi-supervised data augmentation method, so as to balance the number of samples in each defect category. The images in the augmented dataset are then labeled to determine the part defect sample dataset.

[0197] SSD data augmentation performs optical transformations (randomly adjusting brightness, contrast, hue, saturation, and channels) and geometric transformations (random expansion, cropping, mirroring, and scaling to a fixed ratio) on the input image data, followed by mean removal. This effectively increases the diversity of scale samples, improves the network's robustness to target scales, and enhances the detection accuracy of small targets and the detection performance of occluded objects.
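As an illustration of the transformations named above, the following is a minimal sketch of one photometric step (brightness and contrast), one geometric step (mirroring, with the annotation boxes updated accordingly), and mean removal. It is not the implementation of the invention: the function names, the jitter ranges (brightness ±32, contrast 0.5 to 1.5), the mean value 104.0, the single-channel image stored as nested lists, and the (xmin, ymin, xmax, ymax) pixel box format are all assumptions chosen for the example.

```python
import random

def adjust_brightness_contrast(img, delta, alpha):
    """Photometric step: scale by alpha (contrast), shift by delta (brightness), clip to [0, 255]."""
    return [[min(255.0, max(0.0, alpha * p + delta)) for p in row] for row in img]

def mirror(img, boxes):
    """Geometric step: horizontal flip of the image and of its annotation boxes.

    Boxes are (xmin, ymin, xmax, ymax) in pixel units (assumed format).
    """
    width = len(img[0])
    flipped = [row[::-1] for row in img]
    new_boxes = [(width - xmax, ymin, width - xmin, ymax)
                 for xmin, ymin, xmax, ymax in boxes]
    return flipped, new_boxes

def subtract_mean(img, mean):
    """Mean removal: subtract a fixed mean from every pixel."""
    return [[p - mean for p in row] for row in img]

def augment(img, boxes, rng, mean=104.0):
    """Apply random brightness/contrast, a random mirror, then mean removal."""
    delta = rng.uniform(-32, 32)   # assumed brightness jitter range
    alpha = rng.uniform(0.5, 1.5)  # assumed contrast jitter range
    out = adjust_brightness_contrast(img, delta, alpha)
    if rng.random() < 0.5:
        out, boxes = mirror(out, boxes)
    return subtract_mean(out, mean), boxes
```

Calling `augment(img, boxes, random.Random(0))` repeatedly on the same image yields differently lit and oriented samples while keeping the annotation boxes aligned with the defects. The remaining transformations (hue, saturation, channel swapping, expansion, cropping, scaling) would follow the same pattern.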

[0198] Semi-supervised data augmentation analyzes information such as the aspect ratio and area of the defect annotation boxes in each image of the data sample. This preserves the original label information, enhances the diversity of the data, effectively prevents overfitting, and improves the model's judgment and reasoning ability.
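The box statistics mentioned above can be computed as in the following sketch. It only illustrates the analysis of aspect ratio and area; how these statistics are then used to drive the semi-supervised augmentation is not specified here and is not shown. The function name and the (xmin, ymin, xmax, ymax) pixel box format are assumptions, and boxes are assumed to have positive height.

```python
def box_stats(boxes):
    """Return the aspect ratio (width / height) and area of each annotation box.

    Boxes are (xmin, ymin, xmax, ymax) in pixels (assumed format).
    """
    stats = []
    for xmin, ymin, xmax, ymax in boxes:
        w, h = xmax - xmin, ymax - ymin
        stats.append({"aspect_ratio": w / h, "area": w * h})
    return stats
```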

[0199] In practical applications, the open-source LabelImg software can be compiled and generated under the Windows 10 64-bit operating system environment. This software can be used to manually annotate defects, ensuring that each defect is centered within the annotation box. After annotation, a txt or xml file is saved, containing the center coordinates of the defect image and its relative width and height, thus annotating the augmented dataset. In addition, other annotation methods can be selected according to the actual situation; this invention does not limit these methods.
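To make the stored annotation concrete, the sketch below converts a pixel-space annotation box into center coordinates and width and height relative to the image size, and formats one txt line. The line layout (class index followed by four relative values) is an assumption modeled on the YOLO-style txt export; the function names and the class index are hypothetical.

```python
def to_relative_center(box, img_w, img_h):
    """Convert (xmin, ymin, xmax, ymax) in pixels to (cx, cy, w, h) relative to image size."""
    xmin, ymin, xmax, ymax = box
    cx = (xmin + xmax) / 2 / img_w
    cy = (ymin + ymax) / 2 / img_h
    return cx, cy, (xmax - xmin) / img_w, (ymax - ymin) / img_h

def format_label_line(class_id, box, img_w, img_h):
    """Format one annotation as a txt line: class index, then the four relative values."""
    cx, cy, w, h = to_relative_center(box, img_w, img_h)
    return f"{class_id} {cx:.6f} {cy:.6f} {w:.6f} {h:.6f}"
```

Because the values are relative to the image size, the same label line remains valid when the image is rescaled to a fixed ratio during augmentation.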

[0200] It should be noted that the specific steps of data augmentation achieved by the SSD data augmentation method and semi-supervised data augmentation method in this invention, as well as the specific methods for balancing sample categories (such as setting the augmentation weight according to the number of sample categories during data expansion, or expanding samples of different categories to a preset number, etc.), can be adjusted according to the actual situation, and this invention does not limit them.
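One of the balancing options mentioned above, expanding the samples of each category to a preset number, can be sketched as follows. This is an illustrative sketch only; the function name, the returned dictionary layout, and the defect category names used in the usage note are hypothetical.

```python
import math
from collections import Counter

def augmentation_plan(labels, target_per_class):
    """For each defect category, compute how many augmented samples are needed
    to reach target_per_class, and how many copies per original image that implies."""
    counts = Counter(labels)
    plan = {}
    for cls, n in counts.items():
        missing = max(0, target_per_class - n)
        plan[cls] = {
            "original": n,
            "to_generate": missing,
            "copies_per_image": math.ceil(missing / n),
        }
    return plan
```

For example, with 50 "scratch", 10 "crack", and 200 "dent" samples and a preset target of 100 per category, the plan asks for 50, 90, and 0 additional samples respectively, so the rarest category receives the most augmentation.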

[0201] After determining the part defect sample dataset, the model determination unit 602 is used to train the defect detection model based on the part defect sample dataset and determine the target defect detection model.

[0202] It is understandable that the model structure needs to be determined before model training. The specific structure and training method of the defect detection model can be set according to actual needs, and this invention does not limit this.

[0203] After obtaining the trained target defect detection model, the defect detection unit 603 is used to input the image of the target part to be tested into the target defect detection model, detect relevant information of industrial part defects, and determine the defect category and location information of the target part to be tested.

[0204] It is understood that the specific classification of part defect categories in this invention can be set according to actual needs, and this invention does not limit this.

[0205] The industrial parts image defect detection system provided by this invention expands the sample data using SSD data augmentation and semi-supervised data augmentation methods to determine the part defect sample dataset. This ensures a balanced number of defect category samples in the dataset, effectively solving the problem of poor model training performance caused by low sample size and extreme imbalance between positive and negative samples. It also effectively prevents overfitting during model training and enhances the model's judgment and reasoning capabilities. The trained target defect detection model is then used to detect images of target parts to determine the defect category and location information, effectively improving the accuracy of surface defect detection in industrial parts and reducing the false negative and false positive rates.

[0206] It should be noted that the industrial part image defect detection system provided by the present invention is used to execute the above-mentioned industrial part image defect detection method; its specific implementation is the same as that of the method embodiments and will not be repeated here.

[0207] Figure 7 is a schematic diagram of the physical structure of the electronic device provided by the present invention. As shown in Figure 7, the electronic device may include a processor 701, a communication interface 702, a memory 703, and a communication bus 704. The processor 701, communication interface 702, and memory 703 communicate with each other via the communication bus 704. The processor 701 can call logical instructions in the memory 703 to execute an industrial part image defect detection method. This method includes: determining a part defect sample dataset based on several part surface defect images, using SSD data augmentation and semi-supervised data augmentation methods; wherein the number of each defect category in the part defect sample dataset is balanced; training a defect detection model based on the part defect sample dataset to determine a target defect detection model; and inputting the target part image into the target defect detection model to determine the defect category and location information of the target part.

[0208] Furthermore, the logical instructions in the aforementioned memory 703 can be implemented as software functional units and, when sold or used as independent products, can be stored in a computer-readable storage medium. Based on this understanding, the technical solution of the present invention, in essence, or the part that contributes to the prior art, or a part of the technical solution, can be embodied in the form of a software product. This computer software product is stored in a storage medium and includes several instructions to cause a computer device (which may be a personal computer, server, or network device, etc.) to execute all or part of the steps of the methods of the various embodiments of the present invention. The aforementioned storage medium includes various media capable of storing program code, such as USB flash drives, portable hard drives, read-only memory (ROM), random access memory (RAM), magnetic disks, or optical disks.

[0209] On the other hand, the present invention also provides a computer program product, which includes a computer program stored on a non-transitory computer-readable storage medium. The computer program includes program instructions, and when the program instructions are executed by a computer, the computer is able to execute the industrial part image defect detection method provided by the above methods. The method includes: determining a part defect sample dataset based on several part surface defect images, according to SSD data augmentation and semi-supervised data augmentation; wherein the number of each defect category in the part defect sample dataset is balanced; training a defect detection model based on the part defect sample dataset to determine a target defect detection model; and inputting a target part image to be tested into the target defect detection model to determine the defect category and location information of the target part.

[0210] In another aspect, the present invention also provides a non-transitory computer-readable storage medium storing a computer program thereon, which, when executed by a processor, implements the aforementioned methods for detecting defects in industrial part images. The method includes: determining a part defect sample dataset based on several part surface defect images, using SSD data augmentation and semi-supervised data augmentation methods; wherein the number of each defect category in the part defect sample dataset is balanced; training a defect detection model based on the part defect sample dataset to determine a target defect detection model; and inputting a target part image into the target defect detection model to determine the defect category and location information of the target part.

[0211] The device embodiments described above are merely illustrative. The units described as separate components may or may not be physically separate, and the components shown as units may or may not be physical units; that is, they may be located in one place or distributed across multiple network units. Some or all of the modules can be selected to achieve the purpose of this embodiment according to actual needs. Those skilled in the art can understand and implement this without any creative effort.

[0212] Through the above description of the embodiments, those skilled in the art can clearly understand that each embodiment can be implemented by means of software plus necessary general-purpose hardware platforms, and of course, it can also be implemented by hardware. Based on this understanding, the above technical solutions, in essence or the part that contributes to the prior art, can be embodied in the form of a software product. This computer software product can be stored in a computer-readable storage medium, such as ROM / RAM, magnetic disk, optical disk, etc., including several instructions to cause a computer device (which may be a personal computer, server, or network device, etc.) to execute the methods of various embodiments or some parts of embodiments.

[0213] Finally, it should be noted that the above embodiments are only used to illustrate the technical solutions of the present invention, and not to limit them; although the present invention has been described in detail with reference to the foregoing embodiments, those skilled in the art should understand that modifications can still be made to the technical solutions described in the foregoing embodiments, or equivalent substitutions can be made to some of the technical features; and these modifications or substitutions do not cause the essence of the corresponding technical solutions to deviate from the spirit and scope of the technical solutions of the embodiments of the present invention.

Claims

1. A method for detecting defects in images of industrial parts, characterized in that it includes:
Based on several part surface defect images, a part defect sample dataset is determined according to the SSD data augmentation method and the semi-supervised data augmentation method; wherein the number of each defect category in the part defect sample dataset is balanced;
The process of determining the part defect sample dataset based on several part surface defect images, according to the SSD data augmentation method and the semi-supervised data augmentation method, specifically includes:
Based on the several part surface defect images, a labeled extended dataset is determined according to an optimized SSD data augmentation method; wherein the optimized SSD data augmentation method incorporates the Mosaic algorithm into the original SSD data augmentation method;
The labeled extended dataset is processed using the semi-supervised data augmentation method to determine the part defect sample dataset;
A defect detection model is trained based on the part defect sample dataset, and a target defect detection model is determined;
The image of the target part to be tested is input into the target defect detection model to determine the defect category and location information of the target part to be tested;
The defect detection model includes: a feature extraction layer, a feature fusion layer, and a defect identification layer; the feature extraction layer includes multiple sub-feature extraction layers; the sub-feature extraction layers are connected sequentially, and the output of the previous sub-feature extraction layer is the input of the next sub-feature extraction layer;
Correspondingly, the step of inputting the image of the target part to be tested into the target defect detection model to determine the defect category and location information of the target part to be tested specifically includes:
The image of the target part to be tested is input into the feature extraction layer of the target defect detection model;
Based on the feature extraction layer, the image feature information of the target part to be tested is extracted, and the target part feature map output by each sub-feature extraction layer is obtained;
Based on the feature fusion layer, and according to the feature pyramid network, the target part feature maps output by each sub-feature extraction layer are fused to determine the target fused feature map;
Based on the defect identification layer, and according to the target fused feature map, the defect category and location information of the target defect are determined in the image of the target part to be tested.

2. The method for detecting defects in industrial parts images according to claim 1, characterized in that the feature extraction layer further includes: an initial feature extraction layer; the initial feature extraction layer is positioned before all sub-feature extraction layers;
Correspondingly, the step of extracting the image feature information of the target part to be tested based on the feature extraction layer and obtaining the target part feature map output by each sub-feature extraction layer specifically includes:
Based on the initial feature extraction layer, the image feature information of the target part to be tested is extracted to determine the initial part feature map;
Based on the first sub-feature extraction layer, the initial part feature map is transformed according to the channel attention mechanism and the spatial attention mechanism to determine the first target part feature map;
The first target part feature map is input into the second sub-feature extraction layer;
The above steps of transforming according to the channel attention mechanism and the spatial attention mechanism to determine the target part feature map, and inputting the target part feature map into the next sub-feature extraction layer, are repeated until the target part feature map output by each sub-feature extraction layer is obtained.

3. The method for detecting defects in industrial parts images according to claim 1, characterized in that the step of fusing, based on the feature fusion layer and according to the feature pyramid network, the part feature maps output by each sub-feature extraction layer to determine the target fused feature map specifically includes:
Based on the feature fusion layer, and according to the feature pyramid network, the part feature maps output by each sub-feature extraction layer are fused according to preset fusion rules to determine multiple fused feature maps at different levels;
Local response normalization is applied to the multiple fused feature maps at different levels, and the processed fused feature maps are then fused to determine the target fused feature map.

4. The method for detecting defects in industrial parts images according to claim 1, characterized in that the defect identification layer includes: a bounding box determination layer and a defect identification layer;
Correspondingly, the step of determining, based on the defect identification layer and according to the target fused feature map, the defect category and location information of the target defect in the image of the target part to be tested specifically includes:
Based on the bounding box determination layer, and according to the target fused feature map, the RoIAlign region feature aggregation algorithm is used to determine the target defect bounding box and obtain the target defect location information;
Based on the feature information of the target defect bounding box region, the target defect is classified to determine the defect category.

5. The method for detecting defects in industrial parts images according to claim 1, characterized in that the step of training a defect detection model based on the part defect sample dataset and determining a target defect detection model specifically includes:
The defect detection model is trained based on the part defect sample dataset; based on the target loss function, the network parameters of the defect detection model are updated according to the AdaGrad algorithm, and the defect detection model is iteratively trained based on the updated network parameters until the defect detection model converges, thereby determining the target defect detection model;
The formula for the target loss function is as follows:
$$L_{RS} = \frac{1}{|P|} \sum_{i \in P} \left( \ell_{RS}(i) - \ell_{RS}^{*}(i) \right)$$
In the formula, $P$ is the set of positive samples; $\ell_{RS}(i)$ represents the sum of the current rank error and the current sort error; $\ell_{RS}^{*}(i)$ represents the sum of the target rank error and the target sort error.

6. An industrial parts image defect detection system, characterized in that it includes: a sample determination unit, a model determination unit, and a defect detection unit;
The sample determination unit is used to determine a part defect sample dataset based on several part surface defect images, according to the SSD data augmentation method and the semi-supervised data augmentation method; wherein the number of each defect category in the part defect sample dataset is balanced;
The process of determining the part defect sample dataset based on several part surface defect images, according to the SSD data augmentation method and the semi-supervised data augmentation method, specifically includes:
Based on the several part surface defect images, a labeled extended dataset is determined according to an optimized SSD data augmentation method; wherein the optimized SSD data augmentation method incorporates the Mosaic algorithm into the original SSD data augmentation method;
The labeled extended dataset is processed using the semi-supervised data augmentation method to determine the part defect sample dataset;
The model determination unit is used to train a defect detection model based on the part defect sample dataset and determine the target defect detection model;
The defect detection unit is used to input the image of the target part to be tested into the target defect detection model to determine the defect category and location information of the target part to be tested;
The defect detection model includes: a feature extraction layer, a feature fusion layer, and a defect identification layer; the feature extraction layer includes multiple sub-feature extraction layers; the sub-feature extraction layers are connected sequentially, and the output of the previous sub-feature extraction layer is the input of the next sub-feature extraction layer;
Correspondingly, the step of inputting the image of the target part to be tested into the target defect detection model to determine the defect category and location information of the target part to be tested specifically includes:
The image of the target part to be tested is input into the feature extraction layer of the target defect detection model;
Based on the feature extraction layer, the image feature information of the target part to be tested is extracted, and the target part feature map output by each sub-feature extraction layer is obtained;
Based on the feature fusion layer, and according to the feature pyramid network, the target part feature maps output by each sub-feature extraction layer are fused to determine the target fused feature map;
Based on the defect identification layer, and according to the target fused feature map, the defect category and location information of the target defect are determined in the image of the target part to be tested.

7. An electronic device, characterized in that the device includes a memory and a processor, which communicate with each other via a bus; the memory stores program instructions that can be executed by the processor, and the processor can execute the industrial part image defect detection method as described in any one of claims 1 to 5 by calling the program instructions.

8. A non-transitory computer-readable storage medium having a computer program stored thereon, characterized in that, when executed by a processor, the computer program implements the industrial part image defect detection method as described in any one of claims 1 to 5.