Image cube and deep learning-based VR lens defect detection method and system

By constructing a lightweight convolutional neural network based on image cubes and deep learning, the accuracy and efficiency issues of detecting subtle defects in VR lens modules were solved, achieving efficient defect detection and classification with an accuracy rate of 98%.

CN117197550B · Active · Publication Date: 2026-06-12 · HEFEI SHANGJU IND EQUIP +2

Patent Information

Authority / Receiving Office: CN · China
Patent Type: Patents (China)
Current Assignee / Owner: HEFEI SHANGJU IND EQUIP
Filing Date: 2023-08-29
Publication Date: 2026-06-12

AI Technical Summary

Technical Problem

Existing technologies cannot effectively detect subtle defects in VR lenses, and manual inspection is costly and inefficient, while automated optical inspection equipment cannot clearly image and classify defects.

Method used

A lightweight convolutional neural network is constructed by optimizing a YOLOv8 network and is trained on defect images extracted from an image cube. The sharpest image of each defect is selected by combining a gray-level variance algorithm with a gradient-based sharpness measure. Focal loss is used as the loss function, enabling the detection and classification of subtle defects.

Benefits of technology

It enables the detection and classification of minute defects in VR lens modules of varying thicknesses with high accuracy, achieving a classification accuracy rate of 98%, reducing the cost of manual inspection and improving inspection efficiency.



Abstract

The application provides a VR lens defect detection method and system based on an image cube and deep learning. An offline library of defect samples is collected; taking the YOLOv8 network as a basis, the network is optimized into a lightweight convolutional neural network, which is trained on the samples to produce an efficient algorithm for image classification, anchor-free defect detection, and instance segmentation. During each inspection, the extracted defect slice images are fed to the trained deep learning classification model, which outputs the detection and classification results. The application supports detection and classification of fine defects in VR lens modules of different thicknesses with high accuracy.

Description

Technical Field

[0001] This invention relates to the field of 3D visual inspection, and more specifically, to a method and system for detecting defects in VR lenses based on image cubes and deep learning.

Background Technology

[0002] VR optical modules are optical components in Virtual Reality (VR) devices, used to adjust and focus images to provide users with a clearer and more realistic visual experience. The entire lens module is made of multiple layers of different optical materials bonded to lenses. During the manufacturing process, the film materials are easily scratched by improper operation of equipment and personnel, and can also be contaminated by environmental foreign objects. Such products will display defective images, resulting in a poor user experience when worn.

[0003] Currently, there are two main inspection methods for VR lens modules. The first is for workers at the downstream end of the production line to inspect for defects by eye under bright light. However, this method is labor-intensive, and continuous work under high brightness can cause irreversible damage to eyesight. Furthermore, eye fatigue sets in after more than two hours of work in a high-brightness environment, so that numerous minute defects are missed. The second method is automated optical inspection. VR optical modules are complex, consisting of multiple layers of different materials, including films and lenses, with curved surfaces and considerable thickness. Because of camera limitations, existing automated optical inspection equipment often cannot image all defects clearly, especially minute ones. Moreover, there are many defect types, and minute defects imaged with low clarity look alike, making it difficult for traditional algorithms to classify the different types correctly. Therefore, there is an urgent need for a new defect detection method that solves these problems.

Summary of the Invention

[0004] To address the aforementioned shortcomings, this invention provides a VR lens defect detection method and system based on image cubes and deep learning. During detection, the extracted defect slice image is fed into a trained deep learning classification model to output the detection and classification results. This invention can accurately detect and classify minute defects in VR lens modules of varying thicknesses.

[0005] In a first aspect, the present invention provides a VR lens defect detection method based on image cubes and deep learning, which includes the following steps:

[0006] S1. Select standard images of VR lenses and defect images of different categories of the VR lenses under test as the training sample set. Based on the YOLOv8 network, optimize the network and construct a lightweight convolutional neural network for detecting defects in the VR lenses under test. Then feed the training sample set into the lightweight convolutional neural network for training.

[0007] S2. Based on a lightweight convolutional neural network, the location and type of defects in the VR lens under test are detected online.

[0008] In one embodiment of the present invention, step S1 includes the following process:

[0009] S11. Place the VR lens to be tested into the machine and capture N images with the autofocus system, lowering the focal plane step by step (depth-of-field descent), to form an image cube whose length, width, and height are X, Y, and N respectively.

[0010] S12. Based on the first image, extract the region of interest of the VR lens to be tested, calculate the origin of the VR lens to be tested, and save it as initialization information to the recipe file.

[0011] S13. Obtain N images of the VR lens to be tested, extract the clearest defects, and save them as defect images;

[0012] S14. Repeat steps S11 and S13 to store no fewer than 1,000 images of various defects as a training sample set and no fewer than 200 images of various defects as a validation sample set.

[0013] S15. Optimization and construction of a lightweight convolutional neural network based on YOLOv8;

[0014] S16. After initializing the network parameters, the training sample set is fed into the lightweight convolutional neural network for training, and the validation sample set is used for validation.

[0015] In one embodiment of the present invention, in step S12, the edge contrast is enhanced by contrast stretching, and the region of interest of the VR lens to be tested is extracted by threshold segmentation. At the same time, the VR lens contour is fitted into an ellipse by fitting, and the intersection of the major axis and the minor axis is recorded as the origin of the VR lens, which is the origin of the VR lens defect coordinates.

[0016] In one embodiment of the present invention, in step S12, the recipe file includes one or more of the following: image pixel accuracy, autofocus system descent height for each shot, coordinate origin position, vertex coordinate position, binarization parameters, and region of interest.

[0017] In one embodiment of the present invention, in step S13, the region to be detected is segmented on each image using the region of interest information in the recipe file. The contrast of the image is enhanced using a combination of adaptive histogram equalization and a Laplacian pyramid. Defects are then extracted using feature point clustering. Each defect image is cropped with the center of the extracted defect as its center coordinate, and the detected defect information is saved in a structure. Finally, a binary search method is used to traverse all extracted defects $D_{all}$. When the traversal reaches $D_i$, defects in other layers that have the same center point coordinates are found, and a sharpness algorithm is used to determine the layer number in which defect $D_i$ is sharpest. The information on $D_i$ from the other layers is then removed and $D_{all}$ is updated. After the traversal, the final $D_{all}$ is the set of defects in the VR lens module.

[0018] In one embodiment of the present invention, in step S13, the sharpness algorithm combines a pixel-based technique with a gradient-based technique. First, a gray-level variance algorithm is used, as shown in the following formula:

[0019]

[0020]

[0021] The sequence number of the anchor point image is determined, and the gradient function is then used to calculate the horizontal, vertical, and diagonal gradient values of the anchor point image and of the four images above and below it;

[0022] Template operator:

[0023]

[0024] Convolution result:

[0025] $f_x(x, y) = f(x, y) * K_x$, $f_y(x, y) = f(x, y) * K_y$,

[0026] $f_\alpha(x, y) = f(x, y) * K_\alpha$, $f_\beta(x, y) = f(x, y) * K_\beta$;

[0027] Resolution value:

[0028]

[0029] In one embodiment of the present invention, in step S14, the type of defect includes one or more of scratches, dirt, punctures, lint, foreign matter, and black spots.

[0030] In one embodiment of the present invention, in step S15: a focal loss function is used as the loss function.

[0031] $L_{fl} = -(1 - p_t)^{\gamma} \log(p_t)$.

[0032] In one embodiment of the present invention, in step S16, before the training sample set is fed into the lightweight convolutional neural network, the dataset needs to be augmented by image stretching, rotation, translation, or Gaussian filtering. Multiple iterations are set to train multiple models with a classification accuracy of over 98% in the validation sample set. The models are loaded into memory, and the input image can be used for classification prediction. When the classification results output by multiple training models are different, the prediction result with the highest score is selected as the final result.

[0033] In a second aspect, the present invention provides a VR lens defect detection system based on image cubes and deep learning, comprising: an image acquisition component for acquiring a standard image of a VR lens and defect images of different categories of the VR lens to be tested; and a detection component for constructing a lightweight convolutional neural network for detecting defects in the VR lens to be tested, and determining the location and category of the defects in the VR lens to be tested.

[0034] In one embodiment of the present invention, the detection component includes: an image preprocessing module, which processes defect images of different categories of VR lenses through image stretching, rotation, translation, or Gaussian filtering to augment the training data; a backbone network module, which uses CSPDarknet53 as the base network, prunes each layer of the network, optimizes the parameters, and prunes the same parameter repeatedly through iterative pruning, thereby optimizing feature extraction; a Neck network module, which uses CSPN-FPN as the neck network to process feature maps of different scales; a Head network module, which uses the YOLOv8 head and includes a classification head and a localization head, where the classification head adopts a decoupled head design containing three 3x3 convolutional layers and the localization head adopts an anchor-free design that directly predicts the center point and bounding box of the target and provides a confidence score; and a comprehensive judgment module, which feeds the training sample set into the constructed network and sets multiple iterations to train multiple models with a classification accuracy of over 98% on the validation sample set. When the image acquisition component feeds the target defect into the multiple networks, the prediction result with the highest confidence score among the different networks is selected as the final result.

[0035] In summary, this invention provides a VR lens defect detection method and system based on image cubes and deep learning. The beneficial effects of this invention are:

[0036] This invention collects an offline library of defect samples, optimizes the YOLOv8 network, and constructs a lightweight convolutional neural network. The samples are then used to train the network, generating efficient algorithms for image classification, anchor-free defect detection, and instance segmentation. During each inspection, the extracted defect slice image is fed into the trained deep learning classification model to output the detection and classification results. This invention can accurately detect and classify subtle defects in VR lens modules of varying thicknesses.

Attached Figure Description

[0037] Figure 1 is a flowchart of the VR lens defect detection method based on image cubes and deep learning provided in Example 1.

[0038] Figure 2 is the cube fusion image obtained in Example 1 by online detection of the VR lens under test using the lightweight convolutional neural network.

Detailed Implementation

[0039] To make the objectives, technical solutions, and advantages of the embodiments of the present invention clearer, the technical solutions of the embodiments are described clearly and completely below with reference to the accompanying drawings. The described embodiments are only a part of the embodiments of the present invention, not all of them. The following detailed description of the embodiments shown in the accompanying drawings is therefore not intended to limit the scope of the claimed invention, but merely represents selected embodiments of the invention. Based on the embodiments of the present invention, all other embodiments obtained by those skilled in the art without creative effort are within the scope of protection of the present invention.

[0040] Example 1

[0041] As shown in Figure 1, this embodiment provides a VR lens defect detection method based on image cubes and deep learning, including the following steps:

[0042] S1. Select standard images of VR lenses and defect images of different categories of the VR lenses under test as the training sample set. Based on the YOLOv8 network, optimize the network and construct a lightweight convolutional neural network for detecting defects in the VR lenses under test. Then feed the training sample set into the lightweight convolutional neural network for training.

[0043] Step S1 includes the following process:

[0044] S11. Place the VR lenses to be tested into the machine and take N images using the autofocus system by using a depth-of-field descent method.

[0045] S12. Based on the first image, extract the region of interest and the non-detection region of the VR lens to be tested, calculate the origin of the VR lens to be tested, and save it as initialization information in the recipe file for use in subsequent batch testing.

[0046] Specifically, in step S12, edge contrast is enhanced by contrast stretching, and the region of interest and non-detection region of the VR lens under test are extracted by threshold segmentation.

[0047] Furthermore, the VR lens profile is fitted into an ellipse through fitting, and the intersection of the major axis and the minor axis is calculated and recorded as the origin of the VR lens, which is also the origin of the VR lens defect coordinates.
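The initialization in step S12 can be illustrated with a minimal NumPy sketch. It is not the patented implementation: the percentile-based stretch, the fixed threshold of 128, and the moment-based ellipse estimate (used here in place of the unspecified ellipse fit) are all assumptions, and the function names are hypothetical.

```python
import numpy as np

def contrast_stretch(img, low_pct=1.0, high_pct=99.0):
    """Linearly stretch gray levels between two percentiles to [0, 255]."""
    lo, hi = np.percentile(img, [low_pct, high_pct])
    if hi <= lo:  # flat image: nothing to stretch
        return np.zeros_like(img, dtype=np.uint8)
    out = (img.astype(np.float64) - lo) / (hi - lo)
    return (np.clip(out, 0.0, 1.0) * 255).astype(np.uint8)

def lens_origin(img, threshold=128):
    """Return (mask, (cx, cy), (semi_major, semi_minor)) for the lens region.

    The ellipse is estimated from second-order moments of the thresholded
    mask, an assumed stand-in for the patent's unspecified ellipse fit.
    """
    mask = contrast_stretch(img) >= threshold
    ys, xs = np.nonzero(mask)
    if xs.size == 0:
        raise ValueError("no lens region above threshold")
    # The major and minor axes of an ellipse intersect at its center.
    cx, cy = xs.mean(), ys.mean()
    cov = np.cov(np.vstack([xs - cx, ys - cy]))
    evals = np.linalg.eigvalsh(cov)  # eigenvalues in ascending order
    # For a uniformly filled ellipse, variance along an axis = (semi-axis)^2 / 4.
    semi_minor, semi_major = 2.0 * np.sqrt(evals)
    return mask, (cx, cy), (semi_major, semi_minor)
```

The returned center plays the role of the lens origin to which defect coordinates are referred; a contour-based least-squares ellipse fit would be a reasonable alternative when the lens region is only partially visible.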

[0048] Furthermore, the recipe file includes parameters such as image pixel accuracy, autofocus system descent height for each shot, origin position, vertex coordinate position, binarization parameters, and region of interest.
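As an illustration of how such a recipe file might be persisted, the following sketch stores the listed parameters as JSON. The file format, field names, and units are assumptions made for this example; the patent does not specify them.

```python
import json
from dataclasses import dataclass, asdict

@dataclass
class Recipe:
    pixel_size_um: float         # image pixel accuracy (assumed unit: micrometres)
    z_step_um: float             # autofocus descent height per shot (assumed unit)
    origin_xy: tuple             # lens origin in pixel coordinates
    vertex_xy: tuple             # vertex coordinate position
    binarization_threshold: int  # threshold used for ROI segmentation
    roi: tuple                   # (x, y, width, height) of the region of interest

def save_recipe(recipe, path):
    """Write the recipe to a JSON file."""
    with open(path, "w", encoding="utf-8") as fh:
        json.dump(asdict(recipe), fh, indent=2)

def load_recipe(path):
    """Read a recipe back from a JSON file."""
    with open(path, encoding="utf-8") as fh:
        data = json.load(fh)
    # JSON has no tuple type, so lists are converted back explicitly.
    for key in ("origin_xy", "vertex_xy", "roi"):
        data[key] = tuple(data[key])
    return Recipe(**data)
```

Writing the recipe once after step S12 and loading it for every subsequent lens matches the stated purpose of reusing the initialization in batch testing.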

[0049] S13. Obtain N images of the VR lens to be tested, extract the clearest defects, and save them as defect images with a size of 640*640 pixels.

[0050] Specifically, in step S13, the region to be detected is segmented on each image using the region of interest information in the recipe file. The contrast of the image is enhanced by combining adaptive histogram equalization and Laplacian pyramid. Then, the defect is extracted using the feature point clustering method. The defect image is cropped with the center of the extracted defect as the center coordinate. The detected defect information, such as length, width, layer number, and image, is saved in a structure.

[0051] Furthermore, a binary search method is used to traverse all extracted defects $D_{all}$. When the traversal reaches $D_i$, defects in other layers that have the same center point coordinates are found, and a sharpness algorithm is used to determine the layer number in which defect $D_i$ is sharpest. The information on $D_i$ from the other layers is then removed and $D_{all}$ is updated. After the traversal, the final $D_{all}$ is the set of defects in the VR lens module.
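The cross-layer merging described above can be sketched as follows. This is a simplified greedy version, not the binary search traversal of the patent: defects are visited from sharpest to least sharp, and a record is kept only if no sharper record exists at the same location. The `Defect` structure and the pixel tolerance `tol` are assumptions.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Defect:
    cx: int           # center x in pixels
    cy: int           # center y in pixels
    layer: int        # index of the image within the cube
    sharpness: float  # score from the sharpness algorithm (higher = sharper)

def merge_across_layers(defects, tol=2):
    """Keep, for each defect location, only the record from the sharpest layer.

    Centers that differ by at most `tol` pixels on both axes are treated as
    the same physical defect (the tolerance is an assumption).
    """
    kept = []
    for d in sorted(defects, key=lambda d: -d.sharpness):
        duplicate = any(abs(d.cx - k.cx) <= tol and abs(d.cy - k.cy) <= tol
                        for k in kept)
        if not duplicate:
            kept.append(d)
    return sorted(kept, key=lambda d: (d.cy, d.cx))
```

For large defect lists, sorting the centers and bisecting on the coordinates would avoid the quadratic scan and is closer in spirit to the binary search named in the text.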

[0052] In step S13, the sharpness algorithm combines a pixel-based technique with a gradient-based technique. First, the gray-level variance algorithm is used to determine the sequence number of the anchor point image. The sharpest image has the most high-frequency components. This algorithm takes the average gray level of all pixels in the image as a reference, sums the squared differences between each pixel's gray level and that average, and normalizes the sum by the total number of pixels. The result represents the average degree of gray-level variation in the image: the greater it is, the sharper the image; the smaller it is, the blurrier the image. The gray-level variance algorithm is as follows:

[0053]

[0054]
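The formulas for paragraphs [0053] and [0054] are not reproduced in this text. Based solely on the verbal description in [0052], the gray-level variance of an $X \times Y$ image $f(x, y)$ would take the following standard form; the symbols $\mu$ and $V(f)$ are assumed here and may differ from the notation of the original filing.

```latex
\mu = \frac{1}{XY}\sum_{x=1}^{X}\sum_{y=1}^{Y} f(x, y),
\qquad
V(f) = \frac{1}{XY}\sum_{x=1}^{X}\sum_{y=1}^{Y}\bigl(f(x, y) - \mu\bigr)^{2}
```

Under this reading, the anchor point image is the layer of the cube with the largest $V(f)$.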

[0055] Then, the gradient function is used to calculate the horizontal, vertical, and diagonal gradient values of the anchor point image and of the four images above and below it. The larger the gradient value, the greater the sharpness.

[0056] Template operator:

[0057]

[0058] Convolution result:

[0059] $f_x(x, y) = f(x, y) * K_x$, $f_y(x, y) = f(x, y) * K_y$,

[0060] $f_\alpha(x, y) = f(x, y) * K_\alpha$, $f_\beta(x, y) = f(x, y) * K_\beta$;

[0061] Resolution value:

[0062]
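The two-stage sharpness selection can be sketched in NumPy as below. Because the template operators and the final value formula are not reproduced in this text, the sketch assumes Sobel-style 3x3 kernels for the four directions and takes the mean squared filter response as the sharpness score; both choices are illustrative assumptions, as are all function names.

```python
import numpy as np

# Assumed Sobel-style templates for the horizontal, vertical and two diagonal
# directions; the patent's actual K_x, K_y, K_alpha, K_beta are not shown here.
K_X = np.array([[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]], dtype=np.float64)
K_Y = K_X.T
K_A = np.array([[0, 1, 2], [-1, 0, 1], [-2, -1, 0]], dtype=np.float64)
K_B = np.array([[-2, -1, 0], [-1, 0, 1], [0, 1, 2]], dtype=np.float64)

def filter3x3(img, kernel):
    """Valid-mode 3x3 correlation built from shifted views of the image.

    These kernels are antisymmetric, so convolution differs from correlation
    only in sign, which the squared response below ignores.
    """
    h, w = img.shape
    out = np.zeros((h - 2, w - 2), dtype=np.float64)
    for i in range(3):
        for j in range(3):
            out += kernel[i, j] * img[i:i + h - 2, j:j + w - 2]
    return out

def gray_variance(img):
    """Mean squared deviation of the gray levels from their average."""
    return float(np.mean((img - img.mean()) ** 2))

def gradient_sharpness(img):
    """Sum over the four directions of the mean squared gradient response."""
    img = img.astype(np.float64)
    return float(sum(np.mean(filter3x3(img, k) ** 2) for k in (K_X, K_Y, K_A, K_B)))

def sharpest_layer(cube, radius=4):
    """cube has shape (N, Y, X); returns the index of the sharpest layer."""
    cube = cube.astype(np.float64)
    # Stage 1: the gray-level variance picks the anchor point image.
    anchor = int(np.argmax([gray_variance(layer) for layer in cube]))
    # Stage 2: gradients are compared on the anchor and `radius` layers either side.
    lo, hi = max(0, anchor - radius), min(len(cube), anchor + radius + 1)
    scores = [gradient_sharpness(cube[k]) for k in range(lo, hi)]
    return lo + int(np.argmax(scores))
```

Restricting the gradient computation to the neighbourhood of the anchor keeps the more expensive four-direction filtering to at most nine layers per defect, regardless of N.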

[0063] S14. Repeat steps S11 and S13 to store no fewer than 1,000 images of various defects as a training sample set and no fewer than 200 images of various defects as a validation sample set.

[0064] Specifically, in step S14, the types of defects include one or more of the following: scratches, dirt, punctures, lint, foreign matter, and black spots.

[0065] S15. Optimize and build a lightweight convolutional neural network based on YOLOv8.

[0066] Specifically, in step S15: Due to the small size and variety of defects in the VR lens module, and the imbalanced sample size, a suitable loss function was customized, and the focal loss function was adopted as the loss function.

[0067] $L_{fl} = -(1 - p_t)^{\gamma} \log(p_t)$, where $(1 - p_t)^{\gamma}$ is an adjustment factor and $\gamma \ge 0$ is an adjustable focusing parameter.
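A small numerical sketch shows how the adjustment factor behaves. It evaluates the focal loss for a single sample; the default value of 2 for the focusing parameter is an assumption, since the text does not state the value used.

```python
import math

def focal_loss(p_t, gamma=2.0):
    """Focal loss -(1 - p_t)^gamma * log(p_t) for one sample.

    p_t is the predicted probability of the true class, 0 < p_t <= 1.
    gamma=2.0 is an assumed default, not a value taken from the patent.
    """
    if not 0.0 < p_t <= 1.0:
        raise ValueError("p_t must lie in (0, 1]")
    if gamma < 0:
        raise ValueError("gamma must be non-negative")
    return -((1.0 - p_t) ** gamma) * math.log(p_t)

# Compare plain cross-entropy (gamma = 0) with the focal loss.
for p in (0.9, 0.5, 0.1):
    print(f"p_t={p}: CE={focal_loss(p, 0.0):.4f}  FL={focal_loss(p):.4f}")
```

With the focusing parameter set to 0 the expression reduces to ordinary cross-entropy; as it grows, well-classified samples (large $p_t$) contribute less, which is why the loss suits the imbalanced defect classes mentioned in step S15.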

[0068]

[0069] S16. After initializing the network parameters, the training sample set is fed into the lightweight convolutional neural network for training, and the network is validated using a validation sample set. The network achieved an accuracy of 98% on the validation sample set.

[0070] Specifically, in step S16, before feeding the training sample set into the lightweight convolutional neural network, the dataset needs to be augmented by image stretching, rotation, translation, or Gaussian filtering. The augmented data volume is twenty times that of the original dataset. Multiple iterations are set to train various models that achieve a classification accuracy of over 98% on the validation sample set.
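The twenty-fold augmentation can be illustrated with a NumPy-only sketch. The particular combination used here (four 90-degree rotations times five translations, with a 3-tap Gaussian blur applied to every other variant) is an assumption chosen to reach twenty variants per image; the patent does not specify the parameters, and image stretching is omitted for brevity.

```python
import numpy as np

def translate(img, dx, dy):
    """Shift by (dx, dy) pixels, filling uncovered pixels with zeros."""
    h, w = img.shape
    out = np.zeros_like(img)
    src = img[max(0, -dy):h - max(0, dy), max(0, -dx):w - max(0, dx)]
    out[max(0, dy):max(0, dy) + src.shape[0],
        max(0, dx):max(0, dx) + src.shape[1]] = src
    return out

def gaussian_blur3(img):
    """Separable 3-tap Gaussian (1, 2, 1)/4 with edge replication."""
    p = np.pad(img.astype(np.float64), 1, mode="edge")
    rows = (p[:, :-2] + 2 * p[:, 1:-1] + p[:, 2:]) / 4.0
    out = (rows[:-2] + 2 * rows[1:-1] + rows[2:]) / 4.0
    return out.astype(img.dtype)

def augment(img, shifts=((0, 0), (8, 0), (-8, 0), (0, 8), (0, -8))):
    """Yield 4 rotations x 5 shifts = 20 variants; every other one is blurred."""
    n = 0
    for k in range(4):
        rotated = np.rot90(img, k)
        for dx, dy in shifts:
            variant = translate(rotated, dx, dy)
            yield gaussian_blur3(variant) if n % 2 else variant
            n += 1
```

The first variant produced is the unmodified image, so the twenty outputs include the original sample.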

[0071] S17. Save the trained network as a .pth file and write the model calling code to deploy it to the C++ environment.

[0072] In this embodiment, three training models with different iteration counts are generated. The models are saved as .pth files, and an interface function is written in C++. This function accepts input data and returns the model's prediction result. During online testing, the project code only needs to call the corresponding function in the libtorch library to load the model into memory. The input image can then be used for classification prediction. When the classification results output by the three models are different, the prediction result with the highest score is selected as the final result, thus effectively improving the classification accuracy again.
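The selection rule for the three models can be written in a few lines. The sketch below is in Python for illustration only (the embodiment deploys through C++ and libtorch), and the `(label, score)` representation of each model's output is an assumption.

```python
def ensemble_predict(predictions):
    """Combine per-model outputs given as (label, score) pairs.

    If all models agree, that label is returned; otherwise the label of the
    prediction with the highest score wins.
    """
    if not predictions:
        raise ValueError("need at least one prediction")
    labels = {label for label, _ in predictions}
    if len(labels) == 1:
        return predictions[0][0]
    return max(predictions, key=lambda p: p[1])[0]
```

For example, `ensemble_predict([("scratch", 0.6), ("dirt", 0.95), ("lint", 0.7)])` selects `"dirt"` because its score is the highest of the three.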

[0073] S2. Based on a lightweight convolutional neural network, the location and type of defects in the VR lens under test are detected online.

[0074] For example, the lightweight convolutional neural network is used online to inspect the VR lens under test, producing the cube fusion image shown in Figure 2. The location and type of the defects were detected, and the detection results are shown in Table 1.

[0075] Table 1 Detection Results

[0076]

[0077] Example 2

[0078] A VR lens defect detection system based on image cubes and deep learning includes: an image acquisition component and a detection component.

[0079] The image acquisition component is used to acquire standard images of the VR lens and defect images of different categories of the VR lens under test.

[0080] The detection component is used to construct a lightweight convolutional neural network for detecting defects in the VR lens under test, and to determine the location and type of the defects.

[0081] The detection components include: an image preprocessing module, a backbone network module, a Neck network module, a Head network module, and a comprehensive judgment module.

[0082] The image preprocessing module processes defect images of different categories of VR lenses by image stretching, rotation, translation, or Gaussian filtering to augment the training data, expanding the training set to 10 times the number of original images.

[0083] The backbone network module uses CSPDarknet53 as the base network. Each layer of the network is pruned and its parameters are optimized; the same parameter is pruned repeatedly through iterative pruning until it reaches its optimum, thereby achieving optimal feature extraction.
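The idea of iterative pruning can be illustrated on a single weight array. The sketch assumes magnitude-based pruning (zeroing the smallest weights in each round), which is one common criterion; the patent does not state which criterion or schedule is used, and the function names are hypothetical.

```python
import numpy as np

def prune_step(weights, fraction):
    """Zero the smallest-magnitude `fraction` of the still-nonzero weights."""
    w = weights.copy()
    alive = np.flatnonzero(w)  # flat indices of weights not yet pruned
    n_drop = int(len(alive) * fraction)
    if n_drop:
        order = np.argsort(np.abs(w.flat[alive]))
        w.flat[alive[order[:n_drop]]] = 0.0
    return w

def iterative_prune(weights, fraction=0.2, rounds=3):
    """Apply several pruning rounds to one layer's weights."""
    w = weights
    for _ in range(rounds):
        w = prune_step(w, fraction)
        # In a real pipeline the network would be fine-tuned here
        # before the next round, so accuracy can recover.
    return w
```

Pruning in several small rounds with retraining in between usually preserves accuracy better than removing the same total fraction in one step, which is consistent with the iterative approach described for the backbone.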

[0084] The Neck network module uses CSPN-FPN as the neck network to fuse feature maps of different scales;

[0085] The Head network module uses the YOLOv8 head, which includes a classification head and a localization head. The classification head adopts a decoupled head design and contains three 3*3 convolutional layers. The localization head adopts an anchor-free design, directly predicting the center point and bounding box of the target and providing the confidence score.

[0086] The comprehensive judgment module feeds the training sample set into a lightweight convolutional neural network for the defects of the VR lens to be tested, sets multiple iterations, and trains multiple models with a classification accuracy of over 98% in the validation sample set. When the image acquisition component feeds the target defect into the lightweight convolutional neural network for the defects of the VR lens to be tested, the prediction result with the highest confidence score given by different networks is selected as the final result.

[0087] The above are merely preferred embodiments of the present invention and are not intended to limit the present invention. For those skilled in the art, the present invention can have various modifications and variations. Any modifications, equivalent substitutions, improvements, etc., made within the spirit and principles of the present invention should be included within the scope of protection of the present invention.

Claims

1. A VR lens defect detection method based on image cubes and deep learning, characterized in that it includes the following steps: S1. Select standard images of VR lenses and defect images of different categories of the VR lenses under test as the training sample set. Based on the YOLOv8 network, optimize the network and construct a lightweight convolutional neural network for detecting defects in the VR lenses under test. Then feed the training sample set into the lightweight convolutional neural network for training. Step S1 includes the following process: S11. Place the VR lens to be tested into the machine, and use the autofocus system to capture N images by descending the depth of field along the Z-axis. Stack and align the images to form an image cube with X and Y as the camera resolution and Z as N. S12. Based on the first image, extract the region of interest of the VR lens to be tested, calculate the origin of the VR lens to be tested, and save it as initialization information to the recipe file. S13. Obtain N images of the VR lens to be tested, extract the clearest defects, and save them as defect images. S14. Repeat steps S11 and S13 to store no fewer than 1,000 images of various defects as a training sample set and no fewer than 200 images of various defects as a validation sample set. S15. Optimize and construct a lightweight convolutional neural network based on YOLOv8. S16. After initializing the network parameters, the training sample set is fed into the lightweight convolutional neural network for training, and the validation sample set is used for validation. S2. Based on the lightweight convolutional neural network, the location and type of defects in the VR lens under test are detected online.

2. The VR lens defect detection method based on image cubes and deep learning as described in claim 1, characterized in that, in step S12, the edge contrast is enhanced by contrast stretching, and the region of interest of the VR lens under test is extracted by threshold segmentation. At the same time, the VR lens contour is fitted to an ellipse, and the intersection of the major axis and the minor axis is recorded as the origin of the VR lens, which is also the origin of the VR lens defect coordinates. In step S12, the recipe file includes one or more of the following: image single-pixel precision, image Z-axis descent height for each image capture, coordinate origin position, vertex coordinate position, binarization parameters, and region of interest.

3. The VR lens defect detection method based on image cubes and deep learning as described in claim 1, characterized in that, in step S13, the region to be detected is segmented on each image using the region of interest information in the recipe file. The contrast of the image is enhanced by combining adaptive histogram equalization and a Laplacian pyramid. The defect is then extracted using the feature point clustering method. The defect image is cropped with the center of the extracted defect as the center coordinate, and the detected defect information is saved in a structure. A binary search method is used to traverse all extracted defects $D_{all}$. When the traversal reaches $D_i$, defects in other layers that have the same center point coordinates are found, and a sharpness algorithm is used to determine the layer number in which defect $D_i$ is sharpest. The information on $D_i$ from the other layers is then removed and $D_{all}$ is updated. After the traversal, the final $D_{all}$ is the set of defects in the VR lens module.

4. The VR lens defect detection method based on image cubes and deep learning as described in claim 1 or 3, characterized in that, in step S13, the sharpness algorithm combines a pixel-based technique with a gradient-based technique. First, the gray-level variance algorithm is used, as shown in the following formula: ; the sequence number of the anchor point image is determined, and the gradient function is then used to calculate the horizontal, vertical, and diagonal gradient values of the anchor point image and of the four images above and below it; Template operator: ; Convolution result: $f_x(x, y) = f(x, y) * K_x$, $f_y(x, y) = f(x, y) * K_y$, $f_\alpha(x, y) = f(x, y) * K_\alpha$, $f_\beta(x, y) = f(x, y) * K_\beta$; Resolution value: .

5. The VR lens defect detection method based on image cubes and deep learning as described in claim 1, characterized in that, in step S14, the type of defect includes one or more of the following: scratches, dirt, punctures, lint, foreign matter, and black spots.

6. The VR lens defect detection method based on image cubes and deep learning as described in claim 1, characterized in that, in step S15, the focal loss function is used as the loss function: $L_{fl} = -(1 - p_t)^{\gamma} \log(p_t)$, where $(1 - p_t)^{\gamma}$ is an adjustment factor and $\gamma \ge 0$ is an adjustable focusing parameter.

7. The VR lens defect detection method based on image cubes and deep learning as described in claim 1, characterized in that, in step S16, before the training sample set is fed into the lightweight convolutional neural network, the dataset is augmented by image stretching, rotation, translation, or Gaussian filtering. Multiple iterations are set to train multiple models with a classification accuracy higher than 98% on the validation sample set. The models are loaded into memory, and input images can then be classified. When the classification results output by the multiple trained models differ, the prediction result with the highest score is selected as the final result.

8. A VR lens defect detection system based on image cubes and deep learning, characterized in that it includes: an image acquisition component, used to acquire standard images of VR lenses and defect images of different categories of VR lenses under test; and a detection component, used to construct a lightweight convolutional neural network for detecting defects in the VR lens under test, and to determine the location and type of the defects in the VR lens under test. The detection component includes: an image preprocessing module, used to process defect images of different categories of VR lenses through image stretching, rotation, translation, or Gaussian filtering to augment the training data; a backbone network module, which uses CSPDarknet53 as the base network, prunes each layer of the network, optimizes the parameters, and prunes the same parameter repeatedly through iterative pruning, thereby optimizing feature extraction; a Neck network module, which uses CSPN-FPN as the neck network to fuse feature maps of different scales; a Head network module, which uses the YOLOv8 head and includes a classification head and a localization head, where the classification head adopts a decoupled head design containing three 3*3 convolutional layers and the localization head adopts an anchor-free design, directly predicting the center point and bounding box of the target and providing the confidence score; and a comprehensive judgment module, which feeds the training sample set into the lightweight convolutional neural network for the defects of the VR lens to be tested, sets multiple iterations, and trains multiple models with a classification accuracy of over 98% on the validation sample set. When the image acquisition component feeds the target defect into the lightweight convolutional neural network for the defects of the VR lens to be tested, the prediction result with the highest confidence score among the different networks is selected as the final result.