A method and system for semi-supervised transformation detection of wafer defects
By employing a semi-supervised transformation detection method with multiple perturbation consistency regularization, which uses unlabeled data for strong and weak data augmentation and for feature perturbation consistency learning, the problems of high false detection rates and high manual annotation costs in wafer defect detection are addressed, achieving efficient wafer defect detection.
Patent Information
- Authority / Receiving Office
- CN · China
- Patent Type
- Patents(China)
- Current Assignee / Owner
- GUANGDONG SOLUDA TECHNOLOGY CO LTD
- Filing Date
- 2025-05-27
- Publication Date
- 2026-06-23
AI Technical Summary
Existing wafer defect detection methods based on image processing and deep learning suffer from high false detection rates and high manual annotation costs.
A semi-supervised transformation detection method based on multiple perturbation consistency regularization is adopted. Through semi-supervised training with a small amount of labeled data and unlabeled data, strong and weak data augmentation and feature perturbation consistency learning are performed using unlabeled data to generate perturbed feature maps and calculate loss functions, thereby improving the robustness and generalization ability of the model.
It significantly reduces the cost of manual labeling and improves the accuracy and robustness of wafer defect detection.
Figure CN120198441B_ABST
Abstract
Description
Technical Field
[0001] This invention relates to the field of wafer defect detection technology, and specifically to a method and system for detecting wafer defects based on a semi-supervised transformation with multiple perturbation consistency regularization.
Background Technology
[0002] Template-based wafer defect detection is mainly divided into image processing-based techniques and deep learning-based methods.
[0003] Image processing-based techniques mainly involve processing interference signals such as lighting and noise in an image, then comparing the template and defect images to determine the location of the defect.
[0004] Deep learning-based methods extract high-dimensional features from template and defect images and compare them to output the precise location of the defect.
[0005] Image processing-based methods struggle to completely eliminate the effects of lighting, noise, and other interferences, resulting in numerous false positives and limited practicality. Deep learning-based methods, on the other hand, require labeling a large number of defects, leading to higher costs.
Summary of the Invention
[0006] The objective of this invention is achieved through the following technical solutions.
[0007] To address the above shortcomings, this invention proposes a semi-supervised transformation method for detecting wafer defects based on multiple perturbation consistency regularization. By labeling only a small amount of data and training on the remaining unlabeled data in a semi-supervised manner, the method reduces manual labeling costs and improves algorithm performance.
[0008] Specifically, according to a first aspect of the present invention, a method for detecting wafer defects using a semi-supervised transformation is provided, based on multiple perturbation consistency regularization, comprising:
[0009] The wafer image is processed to obtain labeled wafer data and unlabeled wafer data;
[0010] A change detection network is constructed, and the labeled wafer data is input into the change detection network for supervised training to obtain the trained model;
[0011] Strong and weak data augmentation is performed on the unlabeled wafer data, and strong-to-weak image consistency learning is performed;
[0012] Multiple perturbations are applied to the intermediate features of the unlabeled wafer data to generate perturbed feature maps, and the feature perturbation consistency loss function is calculated;
[0013] The total loss function of the trained model is calculated to obtain the final model;
[0014] Unlabeled wafer data is input into the final model to obtain wafer defect detection results.
[0015] According to a second aspect of the present invention, a system for detecting wafer defects using a semi-supervised transform, based on multi-perturbation consistency regularization, is also provided, comprising:
[0016] The annotation module is used to process wafer images to obtain annotated wafer data and unannotated wafer data;
[0017] The training module is used to construct a change detection network. The labeled wafer data is input into the change detection network for supervised training to obtain the trained model.
[0018] The data augmentation module is used to perform strong and weak data augmentation on unlabeled wafer data and to perform strong-to-weak image consistency learning.
[0019] The perturbation consistency module is used to perform various perturbations on the intermediate features of unlabeled wafer data, generate perturbed feature maps, and calculate the feature perturbation consistency loss function.
[0020] The final model module is used to calculate the total loss function of the trained model to obtain the final model;
[0021] The result acquisition module is used to input unlabeled wafer data into the final model to obtain wafer defect detection results.
[0022] The advantages of this invention are: it introduces semi-supervised learning technology, which significantly reduces the cost of manual labeling; and through multi-perturbation consistency regularization, it can more effectively utilize unlabeled data, thereby improving the accuracy and robustness of wafer change detection.
Attached Figure Description
[0023] Various other advantages and benefits will become apparent to those skilled in the art upon reading the following detailed description of preferred embodiments. The accompanying drawings are for illustrative purposes only and are not intended to limit the invention. Furthermore, the same reference numerals denote the same parts throughout the drawings. In the drawings:
[0024] Figure 1 shows a schematic diagram of a supervised training process according to an embodiment of the present invention.
[0025] Figure 2 shows a schematic diagram of the process of calculating the strong-to-weak image consistency loss according to an embodiment of the present invention.
[0026] Figure 3 shows a schematic diagram of the feature perturbation consistency loss calculation process according to an embodiment of the present invention.
[0027] Figure 4 shows a system configuration diagram for detecting wafer defects using a semi-supervised transform according to an embodiment of the present invention.
[0028] Figure 5 shows a schematic diagram of the structure of an electronic device provided in an embodiment of the present invention.
[0029] Figure 6 shows a schematic diagram of a storage medium provided in an embodiment of the present invention.
Detailed Implementation
[0030] Exemplary embodiments of the present disclosure will now be described in more detail with reference to the accompanying drawings. While exemplary embodiments of the present disclosure are shown in the drawings, it should be understood that the present disclosure may be implemented in various forms and should not be limited to the embodiments set forth herein. Rather, these embodiments are provided so that this disclosure will be thorough and complete, and will fully convey the scope of the disclosure to those skilled in the art.
[0031] This invention performs strong and weak data augmentation on unlabeled wafer data and performs strong-to-weak image consistency learning; it applies various perturbations to the intermediate features of the unlabeled wafer data to generate perturbed feature maps and calculates the feature perturbation consistency loss function. Specifically, this invention includes the following steps:
[0032] The wafer image is processed to obtain labeled wafer data and unlabeled wafer data;
[0033] A change detection network is constructed, and the labeled wafer data is input into the change detection network for supervised training to obtain the trained model;
[0034] Strong and weak data augmentation is performed on the unlabeled wafer data, and strong-to-weak image consistency learning is performed;
[0035] Multiple perturbations are applied to the intermediate features of the unlabeled wafer data to generate perturbed feature maps, and the feature perturbation consistency loss function is calculated;
[0036] The total loss function of the trained model is calculated to obtain the final model;
[0037] Unlabeled wafer data is input into the final model to obtain wafer defect detection results.
Example
[0038] For a specific example, one embodiment of the present invention is as follows:
[0039] S1. Scan the wafer image into a grayscale image, and align multiple wafer images to extract the median image $I_{ref}$, which is used as the template. Then slice the defect image into 256*256-pixel tiles, and annotate the defective parts of some of the tiles using annotation software.
[0040] S2. Divide the data into labeled data pairs and unlabeled data pairs, each of the form (A, B), where A represents the defect image and B represents the corresponding slice of the template image.
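Steps S1 and S2 can be illustrated with a minimal NumPy sketch. It assumes the wafer images are already aligned grayscale arrays of equal size, and that edge regions that do not fill a whole tile are discarded; the function names `median_template` and `slice_pairs` are hypothetical and do not come from the patent.

```python
import numpy as np

def median_template(aligned_images):
    """Pixel-wise median of already-aligned grayscale images (assumed template I_ref)."""
    stack = np.stack(aligned_images, axis=0)
    return np.median(stack, axis=0)

def slice_pairs(defect_image, template, tile=256):
    """Cut a defect image and the template into matching tile x tile pairs (A, B).

    Assumption: partial tiles at the right/bottom edges are dropped.
    """
    h, w = defect_image.shape
    pairs = []
    for y in range(0, h - tile + 1, tile):
        for x in range(0, w - tile + 1, tile):
            a = defect_image[y:y + tile, x:x + tile]  # slice of the defect image
            b = template[y:y + tile, x:x + tile]      # matching slice of the template
            pairs.append((a, b))
    return pairs

# Synthetic stand-ins for aligned wafer scans.
rng = np.random.default_rng(0)
images = [rng.integers(0, 256, size=(512, 768)).astype(np.float32) for _ in range(5)]
i_ref = median_template(images)
pairs = slice_pairs(images[0], i_ref)
```

Which of the resulting pairs are annotated (labeled) and which remain unlabeled is a manual decision in the source and is not modeled here.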
[0041] S3. Construct a change detection network N, which consists of a feature extractor (encoder) and a decoder. The encoder can use the feature extraction layer of ResNet50, and the decoder consists of upsampling operations and multiple convolutions.
[0042] S4. Input the labeled data pairs into N to obtain the final feature map $P_l$, and calculate the cross-entropy between $P_l$ and the corresponding annotation as the loss function for supervised training. After supervised training is completed, a pre-trained model M is obtained, as shown in Figure 1.
[0043] [formula not preserved in the source text]
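The supervised loss in S4 is described only as a cross-entropy between the output feature map and the annotation. The sketch below shows one way such a loss could be computed, assuming the network outputs a per-pixel defect probability in [0, 1] and the annotation is a binary mask; the binary form and the function name `binary_cross_entropy` are assumptions, not details stated in the patent.

```python
import numpy as np

def binary_cross_entropy(prob, target, eps=1e-7):
    """Mean per-pixel binary cross-entropy between probabilities and a 0/1 mask."""
    prob = np.clip(prob, eps, 1.0 - eps)  # avoid log(0)
    loss = -(target * np.log(prob) + (1.0 - target) * np.log(1.0 - prob))
    return float(np.mean(loss))

# Hypothetical 4x4 prediction P_l and annotation mask.
p_l = np.full((4, 4), 0.5)
mask = np.zeros((4, 4))
mask[1:3, 1:3] = 1.0
supervised_loss = binary_cross_entropy(p_l, mask)
```

With an uninformative prediction of 0.5 everywhere, the loss equals ln 2 regardless of the mask, which is a convenient sanity check.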
[0044] S5. Apply weak augmentation to the unlabeled data pairs to obtain weakly augmented image pairs (weak augmentation includes random flipping, random scaling, and random cropping).
[0045] S6. Apply two different strong augmentations to the image pairs to obtain two strongly augmented image pairs (strong augmentation includes CutMix and Gaussian blur).
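The augmentations named in S5 and S6 can be sketched as follows. This is an illustration under stated assumptions rather than the patent's implementation: random scaling is omitted, geometric operations are applied identically to both images of a pair so that A and B stay registered, and the crop size, patch size, and blur parameters are arbitrary. The function names are hypothetical.

```python
import numpy as np

def weak_augment(a, b, rng, crop=224):
    """Random horizontal flip plus random crop, applied identically to A and B."""
    if rng.random() < 0.5:
        a, b = a[:, ::-1], b[:, ::-1]
    h, w = a.shape
    y = int(rng.integers(0, h - crop + 1))
    x = int(rng.integers(0, w - crop + 1))
    return a[y:y + crop, x:x + crop], b[y:y + crop, x:x + crop]

def cutmix(arrays, others, rng, size=64):
    """Paste the same random size x size patch from each `others[i]` into `arrays[i]`."""
    h, w = arrays[0].shape
    y = int(rng.integers(0, h - size + 1))
    x = int(rng.integers(0, w - size + 1))
    mixed = []
    for arr, other in zip(arrays, others):
        out = arr.copy()
        out[y:y + size, x:x + size] = other[y:y + size, x:x + size]
        mixed.append(out)
    return mixed

def gaussian_blur(img, sigma=1.0, radius=2):
    """Separable Gaussian blur with edge padding (output has the input's shape)."""
    offsets = np.arange(-radius, radius + 1)
    kernel = np.exp(-(offsets ** 2) / (2.0 * sigma ** 2))
    kernel /= kernel.sum()
    padded = np.pad(img, radius, mode="edge")
    rows = np.apply_along_axis(lambda r: np.convolve(r, kernel, mode="valid"), 1, padded)
    return np.apply_along_axis(lambda c: np.convolve(c, kernel, mode="valid"), 0, rows)

rng = np.random.default_rng(0)
a = rng.random((256, 256))
b = rng.random((256, 256))
a_w, b_w = weak_augment(a, b, rng)
a_s1, b_s1 = cutmix([a_w, b_w], [b_w, a_w], rng)  # first strong view
a_s2, b_s2 = gaussian_blur(a_w), gaussian_blur(b_w)  # second strong view
```

If CutMix is used during consistency training, the pseudo-label would normally need the same patch pasted into it; that bookkeeping is left out of this sketch.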
[0046] S7. Input the weakly augmented image pairs into the pre-trained model M with fixed weights. The output $D_w$ of the feature extractor (encoder) is taken as the intermediate feature, and the final output feature is $P_w$. $P_w$ is binarized with the threshold t to obtain the pseudo-label:
[0047] [formula not preserved in the source text]
[0048] A value of 1 indicates a defective part, and a value of 0 indicates a non-defective part.
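The thresholding in S7 can be sketched in a few lines. The strict comparison and the default value of `t` are assumptions, since the source does not preserve the formula or state a threshold value; `pseudo_label` is a hypothetical name.

```python
import numpy as np

def pseudo_label(p_w, t=0.5):
    """Binarize the weak-branch output: 1 = defective, 0 = non-defective."""
    return (p_w > t).astype(np.uint8)

p_w = np.array([[0.2, 0.7],
                [0.5, 0.9]])
labels = pseudo_label(p_w, t=0.5)
```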
[0049] S8. Input the two strongly augmented image pairs into M to obtain the output feature maps $P_{s1}$ and $P_{s2}$.
[0050] S9. Calculate the strong-to-weak image consistency loss function, as shown in Figure 2:
[0051] [formula not preserved in the source text]
[0052] where [symbol not preserved] represents the cross-entropy loss function.
[0053] This method aims to enhance the robustness of the model to various data transformations. By applying strong-to-weak consistency constraints at the image level, the model is able to produce consistent feature maps even after the input image has undergone different enhancement operations.
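The formula for the loss in S9 is not preserved in this text. As a hedged reconstruction only, a common way to write a strong-to-weak consistency loss uses the thresholded weak prediction as the target for both strong predictions. The symbols $\hat{Y}$ and $\mathcal{L}_{s}$, the strict inequality, and the equal weighting of the two strong branches are assumptions and are not taken from the patent:

```latex
\hat{Y}(i,j) =
\begin{cases}
1, & P_w(i,j) > t \\
0, & \text{otherwise}
\end{cases}
\qquad
\mathcal{L}_{s} = \tfrac{1}{2}\left[ \mathrm{CE}\big(P_{s1}, \hat{Y}\big) + \mathrm{CE}\big(P_{s2}, \hat{Y}\big) \right]
```

Here $\mathrm{CE}$ denotes the cross-entropy loss mentioned in the text, and $P_w$, $P_{s1}$, $P_{s2}$, and $t$ are the quantities named in S7 and S8.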
[0054] S10. Apply feature perturbation to $D_w$, including random noise, random dropping, and feature block dropping, to obtain $D_f$. Input $D_f$ into the decoder of M to obtain the final output feature $P_f$.
[0055] S11. Calculate the feature perturbation consistency loss function, as shown in Figure 3:
[0056] [formula not preserved in the source text]
[0057] This strategy focuses on improving the model's generalization and robustness. It does this by introducing various perturbations at the feature level and forcing these perturbed features to maintain consistency.
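The three feature perturbations named in S10 can be sketched on a channel-first feature map. The noise scale, drop probability, block size, and the choice to rescale surviving activations after dropping are all assumptions; the source does not say whether the perturbations are applied separately or together, so the sketch keeps them as independent functions with hypothetical names.

```python
import numpy as np

def add_noise(feat, rng, scale=0.1):
    """Add zero-mean Gaussian noise to every activation."""
    return feat + rng.normal(0.0, scale, size=feat.shape)

def random_drop(feat, rng, p=0.3):
    """Zero each activation with probability p and rescale the rest (inverted dropout)."""
    keep = rng.random(feat.shape) >= p
    return feat * keep / (1.0 - p)

def block_drop(feat, rng, block=4):
    """Zero one random block x block spatial region across all channels."""
    _, h, w = feat.shape
    y = int(rng.integers(0, h - block + 1))
    x = int(rng.integers(0, w - block + 1))
    out = feat.copy()
    out[:, y:y + block, x:x + block] = 0.0
    return out

rng = np.random.default_rng(0)
d_w = rng.normal(size=(8, 16, 16))  # stand-in for the encoder output D_w (C, H, W)
d_f_variants = [add_noise(d_w, rng), random_drop(d_w, rng), block_drop(d_w, rng)]
```

Each perturbed map would then be passed through the decoder of M, and its output compared with the pseudo-label derived from the unperturbed weak branch.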
[0058] S12. The total loss function of the model is:
[0059] [formula not preserved in the source text]
[0060] The strong-to-weak image consistency loss function and the feature perturbation consistency loss function together improve the robustness and generalization of the model.
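Because the total loss formula in S12 is not preserved, the following is only one plausible form, written as a weighted sum of the supervised loss and the two consistency losses described in the text. The symbols $\mathcal{L}_{sup}$, $\mathcal{L}_{s}$, $\mathcal{L}_{f}$ and the weights $\lambda_{s}$, $\lambda_{f}$ are assumptions; the patent may combine the terms differently or without weights:

```latex
\mathcal{L}_{total} = \mathcal{L}_{sup} + \lambda_{s}\,\mathcal{L}_{s} + \lambda_{f}\,\mathcal{L}_{f}
```

Here $\mathcal{L}_{sup}$ stands for the supervised cross-entropy of S4, $\mathcal{L}_{s}$ for the strong-to-weak image consistency loss of S9, and $\mathcal{L}_{f}$ for the feature perturbation consistency loss of S11.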
[0061] S13. After obtaining the final model F, input the unlabeled image pairs into F to obtain the output feature P, which is then binarized using a threshold T:
[0062] [formula not preserved in the source text]
[0063] This yields the final result, in which the portion with a value of 1 represents a defect in the wafer image.
[0064] As shown in Figure 4, a semi-supervised transformation system for detecting wafer defects, based on multi-perturbation consistency regularization, includes:
[0065] The annotation module 401 is used to process the wafer image to obtain annotated wafer data and unannotated wafer data;
[0066] Training module 402 is used to construct a change detection network. The labeled wafer data is input into the change detection network for supervised training to obtain the trained model.
[0067] The data augmentation module 403 is used to perform strong and weak data augmentation on unlabeled wafer data and to perform strong-to-weak image consistency learning.
[0068] The perturbation consistency module 404 is used to perform various perturbations on the intermediate features of the unlabeled wafer data, generate the perturbed feature map, and calculate the feature perturbation consistency loss function.
[0069] The final model module 405 is used to calculate the total loss function of the trained model to obtain the final model;
[0070] The result acquisition module 406 is used to input unlabeled wafer data into the final model to obtain wafer defect detection results.
[0071] The semi-supervised transformation detection system for wafer defects provided in the above embodiments of the present invention and the semi-supervised transformation detection method for wafer defects provided in the embodiments of the present invention are based on the same inventive concept and have the same beneficial effects as the methods used, run or implemented by their stored applications.
[0072] This invention also provides an electronic device corresponding to the semi-supervised transformation method for detecting wafer defects provided in the foregoing embodiments, for performing the semi-supervised transformation method for detecting wafer defects. This invention is not limited in its embodiments.
[0073] Please refer to Figure 5, which shows a schematic diagram of an electronic device provided by some embodiments of the present invention. As shown in Figure 5, the electronic device 20 includes: a processor 200, a memory 201, a bus 202, and a communication interface 203. The processor 200, the communication interface 203, and the memory 201 are connected via the bus 202. The memory 201 stores a computer program that can run on the processor 200. When the processor 200 runs the computer program, it executes the semi-supervised transformation method for detecting wafer defects provided in any of the foregoing embodiments of the present invention.
[0074] The memory 201 may include high-speed random access memory (RAM) or non-volatile memory, such as at least one disk storage device. Communication between this system network element and at least one other network element is achieved through at least one communication interface 203 (which can be wired or wireless), such as the Internet, wide area network, local area network, or metropolitan area network.
[0075] Bus 202 can be an ISA bus, PCI bus, or EISA bus, etc. The bus can be divided into an address bus, a data bus, a control bus, etc. Memory 201 is used to store programs. After receiving an execution instruction, processor 200 executes the program. The semi-supervised transformation method for detecting wafer defects disclosed in any of the foregoing embodiments of the present invention can be applied to processor 200, or implemented by processor 200.
[0076] The processor 200 may be an integrated circuit chip with signal processing capabilities. In implementation, each step of the above method can be completed by the integrated logic circuitry in the hardware of the processor 200 or by instructions in software form. The processor 200 may be a general-purpose processor, including a central processing unit (CPU), a network processor (NP), etc.; it may also be a digital signal processor (DSP), an application-specific integrated circuit (ASIC), a field-programmable gate array (FPGA), or other programmable logic devices, discrete gate or transistor logic devices, or discrete hardware components. It can implement or execute the methods, steps, and logic block diagrams disclosed in the embodiments of this invention. The general-purpose processor may be a microprocessor or any conventional processor. The steps of the methods disclosed in the embodiments of this invention can be directly embodied in the execution of a hardware decoding processor, or executed by a combination of hardware and software modules in the decoding processor. The software modules may reside in random access memory, flash memory, read-only memory, programmable read-only memory, electrically erasable programmable memory, registers, or other mature storage media in the art. The storage medium is located in memory 201. The processor 200 reads the information in memory 201 and, in conjunction with its hardware, completes the steps of the above method.
[0077] The electronic device provided in this embodiment of the invention and the method for detecting wafer defects by semi-supervised transformation provided in this embodiment of the invention are based on the same inventive concept and have the same beneficial effects as the methods they employ, operate or implement.
[0078] This invention also provides a computer-readable storage medium corresponding to the semi-supervised transformation method for detecting wafer defects provided in the foregoing embodiments. Please refer to Figure 6: the computer-readable storage medium shown is an optical disc 30, on which a computer program (i.e., a program product) is stored. When the computer program is run by a processor, it executes the semi-supervised transformation method for detecting wafer defects provided in any of the foregoing embodiments.
[0079] It should be noted that examples of the computer-readable storage medium may also include, but are not limited to, phase-change memory (PRAM), static random access memory (SRAM), dynamic random access memory (DRAM), other types of random access memory (RAM), read-only memory (ROM), electrically erasable programmable read-only memory (EEPROM), flash memory or other optical and magnetic storage media, which will not be elaborated here.
[0080] The computer-readable storage medium provided in the above embodiments of the present invention and the semi-supervised transformation method for detecting wafer defects provided in the embodiments of the present invention are based on the same inventive concept and have the same beneficial effects as the methods used, run or implemented by the applications stored therein.
[0081] It should be noted that:
[0082] The algorithms and displays provided herein are not inherently related to any particular computer, virtual system, or other device. Various general-purpose systems can also be used in conjunction with the teachings herein. The required structure for constructing such systems is apparent from the above description. Furthermore, this invention is not directed to any particular programming language. It should be understood that the contents of the invention described herein can be implemented using various programming languages, and the above description of specific languages is for the purpose of disclosing the best mode of implementation of the invention.
[0083] Numerous specific details are set forth in the specification provided herein. However, it will be understood that embodiments of the invention may be practiced without these specific details. In some instances, well-known methods, structures, and techniques have not been shown in detail so as not to obscure the understanding of this specification.
[0084] Similarly, it should be understood that, in order to simplify the invention and aid in understanding one or more of the various inventive aspects, in the above description of exemplary embodiments of the invention, various features of the invention are sometimes grouped together in a single embodiment, figure, or description thereof. However, this disclosure should not be construed as reflecting an intention that the claimed invention requires more features than are expressly recited in each claim. Rather, as reflected in the following claims, inventive aspects lie in fewer than all features of a single foregoing disclosed embodiment. Therefore, the claims following the detailed description are hereby expressly incorporated into this detailed description, wherein each claim itself is a separate embodiment of the invention.
[0085] Those skilled in the art will understand that modules in the device of the embodiments can be adaptively changed and placed in one or more devices different from that embodiment. Modules, units, or components in the embodiments can be combined into a single module, unit, or component, and further, they can be divided into multiple sub-modules, sub-units, or sub-components. Except where at least some of such features and / or processes or units are mutually exclusive, any combination can be used to combine all features disclosed in this specification (including the accompanying claims, abstract, and drawings) and all processes or units of any method or device so disclosed. Unless expressly stated otherwise, each feature disclosed in this specification (including the accompanying claims, abstract, and drawings) may be replaced by an alternative feature that serves the same, equivalent, or similar purpose.
[0086] Furthermore, those skilled in the art will understand that although some embodiments described herein include certain features but not others included in other embodiments, combinations of features from different embodiments are intended to be within the scope of the invention and form different embodiments. For example, in the following claims, any of the claimed embodiments can be used in any combination.
[0087] The various component embodiments of the present invention can be implemented in hardware, or as software modules running on one or more processors, or a combination thereof. Those skilled in the art will understand that microprocessors or digital signal processors (DSPs) can be used in practice to implement some or all of the functions of some or all of the components in the system according to embodiments of the present invention. The present invention can also be implemented as a device or system program (e.g., a computer program and computer program product) for performing part or all of the methods described herein. Such programs implementing the present invention can be stored on a computer-readable medium or can be in the form of one or more signals. Such signals can be downloaded from an Internet website, provided on a carrier signal, or provided in any other form.
[0088] It should be noted that the above embodiments are illustrative of the invention and not restrictive, and that those skilled in the art can devise alternative embodiments without departing from the scope of the appended claims. In the claims, any reference signs placed between parentheses should not be construed as limiting the claims. The word "comprising" does not exclude the presence of elements or steps not listed in the claims. The word "a" or "an" preceding an element does not exclude the presence of a plurality of such elements. The invention can be implemented by means of hardware comprising several different elements and by means of a suitably programmed computer. In the unit claims enumerating several systems, several of these systems may be embodied by the same item of hardware. The use of the words first, second, and third, etc., does not indicate any order. These words can be interpreted as names.
[0089] The above description is merely a preferred embodiment of the present invention, but the scope of protection of the present invention is not limited thereto. Any variations or substitutions that can be easily conceived by those skilled in the art within the technical scope disclosed in the present invention should be included within the scope of protection of the present invention. Therefore, the scope of protection of the present invention should be determined by the scope of the claims.
Claims
1. A method for detecting wafer defects using semi-supervised transformation, based on multi-perturbation consistency regularization, characterized in that it comprises:
processing the wafer image to obtain labeled and unlabeled wafer data, including: S1, scanning the wafer image into a grayscale image, and aligning multiple wafer images to obtain the median image $I_{ref}$, which is used as the template; then slicing the defect image into 256*256-pixel tiles, and annotating the defective parts of some of the tiles using annotation software; S2, dividing the data into labeled data pairs and unlabeled data pairs, each of the form (A, B), where A represents the defect image and B represents the corresponding slice of the template image;
constructing a change detection network, and inputting the labeled wafer data into the change detection network for supervised training to obtain the trained model, including: S3, constructing a change detection network N, which consists of a feature extractor and a decoder, wherein the feature extractor uses the feature extraction layer of ResNet50, and the decoder consists of upsampling operations and multiple convolutions; S4, inputting the labeled data pairs into N to obtain the feature map $P_l$, and calculating the cross-entropy between $P_l$ and the corresponding annotation as the loss function for supervised training, the pre-trained model M being obtained after supervised training is completed;
performing strong and weak data augmentation on the unlabeled wafer data to obtain intermediate features of the unlabeled wafer data, and performing strong-to-weak image consistency learning, including: S5, applying weak augmentation to the unlabeled data pairs to obtain weakly augmented image pairs, where weak augmentation includes random flipping, random scaling, and / or random cropping; S6, applying two different strong augmentations to the image pairs to obtain two strongly augmented image pairs, where strong augmentation includes CutMix and / or Gaussian blur; S7, inputting the weakly augmented image pairs into the pre-trained model M with fixed weights, taking the output $D_w$ of the feature extractor as the intermediate feature and $P_w$ as the final output feature, and binarizing $P_w$ with the threshold t to obtain the pseudo-label [formula not preserved in the source text], where a value of 1 indicates a defective part and a value of 0 indicates a non-defective part; S8, inputting the two strongly augmented image pairs into M to obtain the output feature maps $P_{s1}$ and $P_{s2}$; S9, calculating the strong-to-weak image consistency loss function [formula not preserved in the source text], in which [symbol not preserved] represents the cross-entropy loss function;
applying multiple perturbations to the intermediate features of the unlabeled wafer data to generate perturbed feature maps, and calculating the feature perturbation consistency loss function;
calculating the total loss function of the trained model to obtain the final model; and
inputting the unlabeled wafer data into the final model to obtain the wafer defect detection results;
wherein applying multiple perturbations to the intermediate features of the unlabeled wafer data to generate perturbed feature maps and calculating the feature perturbation consistency loss function includes: S10, applying feature perturbation to $D_w$, including random noise, random dropping, and / or feature block dropping, to obtain $D_f$, and inputting $D_f$ into the decoder of M to obtain the final output feature $P_f$; S11, calculating the feature perturbation consistency loss function [formula not preserved in the source text];
wherein calculating the total loss function of the trained model to obtain the final model includes: S12, the total loss function of the model is [formula not preserved in the source text];
wherein inputting the unlabeled wafer data into the final model to obtain the wafer defect detection results includes: S13, after obtaining the final model F, inputting the unlabeled image pairs into F to obtain the output feature P, which is then binarized using a threshold T [formula not preserved in the source text] to obtain the final result, in which the portion with a value of 1 represents a defect in the wafer image.
2. A system for detecting wafer defects using semi-supervised transformation, employing the method of claim 1, based on multi-perturbation consistency regularization, characterized in that it comprises: an annotation module, used to process wafer images to obtain annotated wafer data and unannotated wafer data; a training module, used to construct a change detection network and to input the labeled wafer data into the change detection network for supervised training to obtain the trained model; a data augmentation module, used to perform strong and weak data augmentation on unlabeled wafer data and to perform strong-to-weak image consistency learning; a perturbation consistency module, used to perform various perturbations on the intermediate features of unlabeled wafer data, generate perturbed feature maps, and calculate the feature perturbation consistency loss function; a final model module, used to calculate the total loss function of the trained model to obtain the final model; and a result acquisition module, used to input unlabeled wafer data into the final model to obtain wafer defect detection results.
3. An electronic device, comprising a memory, a processor, and a computer program stored in the memory and executable on the processor, characterized in that the processor runs the computer program to implement the method as described in claim 1.
4. A computer-readable storage medium having a computer program stored thereon, characterized in that the program is executed by a processor to implement the method as described in claim 1.