Defect detection method and apparatus, and electronic device and storage medium

By combining a language model with image features and reference image features for wafer die defect detection, the shortcomings of deep learning models in detecting complex defect types are overcome, achieving greater detection flexibility and accuracy.

WO2026129805A1 · PCT designated stage · Publication Date: 2026-06-25 · SUZHOU MEGAROBO TECH CO LTD


Patent Information

Authority / Receiving Office: WO · WO
Patent Type: Applications
Current Assignee / Owner: SUZHOU MEGAROBO TECH CO LTD
Filing Date: 2025-09-29
Publication Date: 2026-06-25

Application Information


IPC: G06T7/00; G06V10/40; G06V10/75

CPC: Y02P90/30

AI Tagging

Application Domain

Image analysis; Character and pattern recognition

Technology Topics

Feature extraction; Linguistic model


AI Technical Summary

Technical Problem

Existing deep learning models struggle to adapt to the complex and diverse defect types found in wafers or wafer dies, which reduces their detection capability and makes it difficult to accurately represent the complex and numerous defects present.

Method used

A language model is used for detection. Image features are extracted from the image of the wafer die to be tested and combined with reference image features and query information to generate the detection result. The language model is trained by adjusting its parameters based on the difference between its predictions and the actual defect information.

Benefits of technology

It improves the flexibility and accuracy of detecting complex and numerous defects in wafers or wafer dies, and makes the detection results more adaptable.

✦ Generated by Eureka AI based on patent content.


Patent Text Reader

Abstract

A defect detection method and apparatus, and an electronic device and a storage medium. The method comprises: performing a feature extraction operation on an image to be subjected to detection of a die to be tested, in order to obtain a first image feature (S110); and on the basis of the first image feature, a reference image feature and first query information, generating a detection result of said die by means of a trained language model (S120). In the method, compared with the approach of using prediction labels outputted by a deep learning model to represent detection results, detection results outputted by a language model are more flexible, thereby facilitating adaptation to scenarios involving complex and numerous defects in wafers or dies. In addition, in the method, a reference image feature is also used as one of the inputs to a language model, so that the language model can focus on differences between a die to be tested and a reference die, thereby facilitating improvement to the accuracy of detection results.


Description

Defect detection methods, devices, electronic equipment, and storage media

[0001] This application claims priority to Chinese Patent Application No. 202411880336.1, filed on December 19, 2024, entitled "Method, Apparatus, Electronic Device and Storage Medium for Defect Detection", the entire contents of which are incorporated herein by reference.

Technical Field

[0002] This application relates to the field of data processing technology, and more specifically to a defect detection method, a defect detection device, an electronic device, a storage medium, and a computer program product.

Background Technology

[0003] In the industrial sector, with the continuous development of production technology, people have placed higher demands on the production quality and efficiency of industrial products. Furthermore, with the development of image processing technology, an increasing number of product inspections are based on images.

[0004] Because wafers or wafer dies may contain a wide variety of defect types, defect detection based on wafer or wafer die images is quite complex. If deep learning models from the related art are used for wafer or wafer die defect detection, their ability to detect any single defect type decreases as the number of defect types they must detect increases. Furthermore, such deep learning models can only output defect types, which makes it difficult to accurately represent the complex and numerous defects present in wafers or wafer dies. Therefore, how to detect defects in wafers or wafer dies more accurately is a technical problem that urgently needs to be solved by those skilled in the art.

Summary of the Invention

[0005] This application is made in view of the above-mentioned problems. This application provides a defect detection method, a defect detection device, an electronic device, a storage medium, and a computer program product.

[0006] According to a first aspect of this application, a method for detecting defects is provided, the method comprising:

[0007] A feature extraction operation is performed on the image to be tested of the wafer die to be tested to obtain the first image features;

[0008] Based on the first image features, the reference image features, and the first query information, the detection result of the wafer die to be tested is generated through the trained language model. The reference image features are obtained by feature extraction from the reference image. The reference wafer die in the reference image does not have any defective parts. The first query information is used to guide the generation of the detection result, and the detection result is used to describe the defective parts of the wafer die to be tested.

[0009] In one possible implementation, the detection method further includes:

[0010] A reference image is compared with the image to be tested to determine the region of interest in the image to be tested, wherein the difference between the pixel value of at least one pixel in the region of interest and the pixel value of the corresponding pixel in the reference image is higher than a preset difference.

[0011] The feature points corresponding to the region of interest in the first image feature are sampled to obtain the first defect feature;

[0012] Generating the detection result of the wafer die to be tested through the trained language model, based on the first image features, the reference image features, and the first query information, includes:

[0013] Based on the first image features, the reference image features, the first query information, and the first defect features, the detection result of the wafer die to be tested is generated through the trained language model.
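The region-of-interest comparison and feature-point sampling described in this implementation can be sketched as follows. This is a minimal illustration, not the applicant's implementation: it assumes grayscale images stored as nested lists, a feature map whose cells each cover a `stride` x `stride` block of pixels, and it treats every pixel whose difference exceeds the preset difference as part of the region of interest. The names `region_of_interest`, `sample_defect_features`, and `stride` are hypothetical.

```python
from typing import List, Tuple

Image = List[List[int]]                # grayscale image, rows of pixel values
FeatureMap = List[List[List[float]]]   # feature_map[row][col] is a feature vector


def region_of_interest(test_img: Image, ref_img: Image,
                       preset_diff: int) -> List[Tuple[int, int]]:
    """Return (row, col) of pixels that differ from the reference by more than preset_diff."""
    roi = []
    for r, (test_row, ref_row) in enumerate(zip(test_img, ref_img)):
        for c, (t, f) in enumerate(zip(test_row, ref_row)):
            if abs(t - f) > preset_diff:
                roi.append((r, c))
    return roi


def sample_defect_features(feature_map: FeatureMap,
                           roi: List[Tuple[int, int]],
                           stride: int) -> List[List[float]]:
    """Collect the feature vectors of the feature-map cells that cover the ROI pixels.

    Assumption: pixel (r, c) is covered by cell (r // stride, c // stride).
    """
    cells: List[Tuple[int, int]] = []
    for r, c in roi:
        cell = (r // stride, c // stride)
        if cell not in cells:
            cells.append(cell)
    return [feature_map[fr][fc] for fr, fc in cells]


# Usage: a 4x4 reference, a test image with one bright pixel, a 2x2 feature map.
reference = [[10] * 4 for _ in range(4)]
test = [row[:] for row in reference]
test[2][3] = 200
features = [[[0.1, 0.2], [0.3, 0.4]],
            [[0.5, 0.6], [0.7, 0.8]]]
roi = region_of_interest(test, reference, preset_diff=50)
first_defect_feature = sample_defect_features(features, roi, stride=2)
```

In this sketch the sampled vectors play the role of the first defect feature that is passed to the language model alongside the other inputs.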

[0014] In one possible implementation, the trained language model is obtained through the following steps:

[0015] Feature extraction is performed on the training images of the training wafer dies to obtain the second image features;

[0016] Based on the second image features, the reference image features, and the second query information, a prediction result for the training wafer die is generated using an untrained language model. The second query information is used to guide the generation of the prediction result, and the prediction result is used to describe the defective parts of the training wafer die.

[0017] Based on the difference between the prediction result and the actual defect information of the training wafer die, the model parameters of the untrained language model are adjusted.

[0018] In one possible implementation, the defective portion of the training wafer die corresponds to a defect region in the training image. Generating a prediction result for the training wafer die using an untrained language model, based on the second image features, the reference image features, and the second query information, includes:

[0019] The feature points corresponding to the defect region in the second image features are sampled to obtain the second defect feature;

[0020] A defect description feature is determined based on the actual defect information;

[0021] The second defect feature and the defect description feature are fused to obtain the fused feature;

[0022] Based on the second image features, the reference image features, the second query information, and the fused feature, the prediction result for the training wafer die is generated using the untrained language model.
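The fusion step is not specified further in the text. One simple possibility, shown here purely as an assumption, is to mean-pool the sampled defect feature points into a single vector and concatenate it with the defect description feature. The names `mean_pool` and `fuse` are hypothetical.

```python
from typing import List

Vector = List[float]


def mean_pool(vectors: List[Vector]) -> Vector:
    """Average a list of equal-length feature vectors into one vector."""
    count = len(vectors)
    return [sum(column) / count for column in zip(*vectors)]


def fuse(defect_features: List[Vector], description_feature: Vector) -> Vector:
    """Assumed fusion: pooled defect feature concatenated with the description feature."""
    return mean_pool(defect_features) + list(description_feature)


# Usage: two sampled defect feature points and a one-dimensional description feature.
fused_feature = fuse([[1.0, 3.0], [3.0, 5.0]], [0.5])
```

Other fusion operators (element-wise addition, attention-based mixing) would fit the same description; the choice here is only for illustration.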

[0023] In one possible implementation, the defective region is obtained through any of the following steps:

[0024] A circle centered on the defect pixel in the training image that corresponds to the defective part of the training wafer die, with a preset radius, is taken as the defect region;

[0025] The bounding rectangle of the defect pixels in the training image that correspond to the defective part of the training wafer die is determined as the defect region;

[0026] The defect pixels in the training image that correspond to the defective part of the training wafer die are used directly as the pixels of the defect region.
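The three alternatives above can be sketched as follows, assuming defect pixels are given as (row, column) coordinates. The function names are hypothetical, and the rectangle is assumed to be axis-aligned, which the text does not state.

```python
from typing import Iterable, Set, Tuple

Pixel = Tuple[int, int]  # (row, col)


def circle_region(center: Pixel, radius: int, height: int, width: int) -> Set[Pixel]:
    """Pixels inside a circle of the preset radius centered on a defect pixel."""
    center_row, center_col = center
    return {(r, c)
            for r in range(height)
            for c in range(width)
            if (r - center_row) ** 2 + (c - center_col) ** 2 <= radius ** 2}


def bounding_rectangle(defect_pixels: Iterable[Pixel]) -> Tuple[int, int, int, int]:
    """Axis-aligned bounding rectangle as (top, left, bottom, right), inclusive."""
    pixels = list(defect_pixels)
    rows = [r for r, _ in pixels]
    cols = [c for _, c in pixels]
    return min(rows), min(cols), max(rows), max(cols)


def direct_region(defect_pixels: Iterable[Pixel]) -> Set[Pixel]:
    """Use the annotated defect pixels themselves as the defect region."""
    return set(defect_pixels)


# Usage on a 5x5 training image.
circle = circle_region((2, 2), radius=1, height=5, width=5)
rectangle = bounding_rectangle([(1, 2), (3, 1), (2, 4)])
```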

[0027] In one possible implementation, the second query information is used to query the defect position, in the training image, of the defective portion of a first wafer die, and the actual defect information is used to describe that the first wafer die does not have a defective portion, wherein the first wafer die is a training wafer die without a defective portion; and / or

[0028] The second query information is used to query whether a first position, given as the position of the defective portion of the training wafer die in the training image, is correct. The actual defect information is used to describe that the first position is incorrect, together with the actual defect position of the defective portion of the training wafer die in the training image. The first position is the position of a normal portion of the training wafer die in the training image; and / or

[0029] The second query information is used to query whether a first handling method for the defective portion of the training wafer die is correct. The actual defect information is used to describe that the first method is incorrect, together with the correct method for the defective portion of the training wafer die. The first method is an incorrect handling method for the defective portion, and the correct method is the correct handling method for the defective portion.

[0030] In one possible implementation, the second query information is used to query the defect type of the defective portion of the training wafer, and the actual defect information is used to describe the defect type of the defective portion of the training wafer; and / or

[0031] The second query information is used to query the location of the defective portion of the training wafer in the training image, and the true defect information is used to describe the location of the defective portion of the training wafer in the training image; and / or

[0032] The second query information is used to query the positional relationship between the defective portion of the training wafer and the functional region within the training wafer; the actual defect information is used to describe the positional relationship between the defective portion of the training wafer and the functional region within the training wafer; and / or

[0033] The second query information is used to query the visual characteristics of the defective parts of the training wafer, while the actual defect information is used to describe the visual characteristics of the defective parts of the training wafer.

[0034] In one possible implementation, the second query information is used to query defect information of the training wafer dies in a preset area, and the actual defect information is used to describe the defect information of the training wafer dies in the preset area; and / or

[0035] The second query information includes defect information. The second query information is used to query the position of the part of the training wafer that matches the defect information in the training image. The real defect information is used to describe the position of the part of the training wafer that matches the defect information in the training image.

[0036] In one possible implementation, the second query information is used to query the defect type and cause of the defective portion of the training wafer, and the actual defect information is used to describe the defect type and cause of the defective portion of the training wafer; and / or

[0037] The second query information is used to query the processing method for defective parts of the training wafer, and the actual defect information is used to describe the processing method for defective parts of the training wafer; and / or

[0038] The second query information is used to query wafer processing improvement suggestions for defective parts of training wafer dies, and the actual defect information is used to describe the wafer processing improvement suggestions for defective parts of training wafer dies.
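The pairing of second query information with actual defect information can be pictured as question and answer training samples. The sketch below is illustrative only: the `DefectAnnotation` fields, the question wording, and the answer templates are assumptions, not text taken from the application.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class QueryAnswerPair:
    query: str   # second query information
    answer: str  # actual defect information, used as the training target


@dataclass(frozen=True)
class DefectAnnotation:
    # Hypothetical annotation fields for one training die.
    defect_type: str
    location: str
    handling: str


def build_pairs(annotation: Optional[DefectAnnotation]) -> List[QueryAnswerPair]:
    """Build training pairs; None stands for a training die without a defective part."""
    if annotation is None:
        # Negative sample: the query asks for a defect position on a defect-free die.
        return [QueryAnswerPair("Where is the defective part of this die?",
                                "This die has no defective part.")]
    return [
        QueryAnswerPair("What type of defect does this die have?",
                        f"The defect type is {annotation.defect_type}."),
        QueryAnswerPair("Where is the defective part of this die?",
                        f"The defective part is located at {annotation.location}."),
        QueryAnswerPair("How should the defective part be handled?",
                        f"The defective part should be handled by {annotation.handling}."),
    ]


# Usage: one defective die and one defect-free die.
pairs = build_pairs(DefectAnnotation("scratch", "the pad area", "rework"))
negative_pairs = build_pairs(None)
```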

[0039] According to a second aspect of this application, a defect detection device is also provided, the device comprising: a first image feature extraction module and a detection result generation module.

[0040] The first image feature extraction module is used to perform a feature extraction operation on the image to be tested of the wafer die to be tested in order to obtain the first image features;

[0041] The detection result generation module is used to generate the detection result of the wafer die to be tested through a trained language model, based on the first image features, the reference image features, and the first query information. The reference image features are obtained by feature extraction from the reference image. The reference wafer die in the reference image does not have defective parts. The first query information is used to guide the generation of the detection result. The detection result is used to describe the defective parts of the wafer die to be tested.

[0042] According to a third aspect of this application, an electronic device is also provided. The electronic device includes a processor and a memory. The memory stores computer program instructions, which, when executed by the processor, are used to perform the aforementioned defect detection method.

[0043] According to a fourth aspect of this application, a storage medium is also provided. Program instructions are stored on this storage medium, which, when executed, are used to perform the aforementioned defect detection method.

[0044] According to a fifth aspect of this application, a computer program product is also provided. This computer program product includes computer program instructions that, when executed, perform the aforementioned defect detection method.

[0045] According to the above-described scheme of the embodiments of this application, feature extraction can be performed on the image to be tested of the wafer die to be tested to obtain first image features. Then, based on the first image features, the reference image features, and the first query information, the detection result of the wafer die to be tested is generated through a trained language model. On the one hand, compared with using the predicted labels output by a deep learning model to represent the detection result, the detection result of the above scheme is more flexible, which helps it adapt to scenarios where wafers or wafer dies have complex and numerous defects. On the other hand, the above scheme also uses the reference image features as one of the inputs of the language model, which allows the language model to focus on the differences between the wafer die to be tested and the reference wafer die, and this helps improve the accuracy of the detection result.

Attached Figure Description

[0046] The above and other objects, features, and advantages of this application will become more apparent from the more detailed description of the embodiments of this application in conjunction with the accompanying drawings. The accompanying drawings are used to provide a further understanding of the embodiments of this application and form part of the specification. They are used together with the embodiments of this application to explain this application and do not constitute a limitation thereof. In the accompanying drawings, the same reference numerals generally represent the same components or steps.

[0047] Figure 1 shows a schematic flowchart of a defect detection method according to an embodiment of this application;

[0048] Figure 2 illustrates a schematic diagram of the training process of a language model according to an embodiment of this application;

[0049] Figure 3 illustrates a schematic diagram of the training process of a language model according to an embodiment of this application;

[0050] Figure 4 shows a schematic block diagram of a defect detection apparatus according to an embodiment of this application; and

[0051] Figure 5 shows a schematic block diagram of an electronic device according to an embodiment of the present application.

Detailed Implementation

[0052] To make the objectives, technical solutions, and advantages of this application more apparent, exemplary embodiments according to this application will be described in detail below with reference to the accompanying drawings. Obviously, the described embodiments are merely some embodiments of this application, and not all embodiments of this application. It should be understood that this application is not limited to the exemplary embodiments described herein. Based on the embodiments of this application described herein, all other embodiments obtained by those skilled in the art without inventive effort should fall within the protection scope of this application.

[0053] In the field of component manufacturing, packaging and testing defective industrial components (such as wafer dies) would be a waste of manpower and resources. Therefore, it is necessary to inspect industrial components to detect and reject unqualified ones. However, industrial components may have various defects. For example, wafers may have defects such as foreign objects or cracks. Similarly, wafer dies may have defects such as out-of-bounds pin marks, excessive pin marks, or foreign objects. Existing defect detection models can only output a limited number of defect types, making them unsuitable for real-world scenarios with complex and numerous defects in wafers or wafer dies.

[0054] To at least partially address the aforementioned problems, according to a first aspect of this application, embodiments of this application provide a method for detecting defects. Figure 1 shows a schematic flowchart of a defect detection method according to an embodiment of this application. As shown in Figure 1, the method may include the following steps S110 to S120.

[0055] In step S110, feature extraction is performed on the image to be tested of the wafer die to be tested to obtain the first image features.

[0056] The image to be tested for the wafer die can be a static image or any frame from a dynamic video. The image to be tested for the wafer die can be a raw image captured by an image acquisition device (e.g., a raw image captured by an image sensor in a camera), or an image obtained after preprocessing the raw image (such as digitization, normalization, smoothing, etc.). In one example, a wafer image of the wafer to be tested can be acquired, and then the wafer image can be divided into images of multiple wafer dies to be tested. For example, the region of the wafer die to be tested in the wafer image can be determined by matching the wafer image with a reference wafer die image of a reference wafer die. Then, the region of the wafer die to be tested in the wafer image can be cropped to obtain the image to be tested for the wafer die.
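The matching and cropping step can be sketched as follows. The text does not name a matching algorithm, so this example assumes a brute-force sliding-window search scored by the sum of absolute differences (SAD) on grayscale images stored as nested lists. The names `find_die_regions`, `crop`, and `max_sad` are hypothetical, and overlapping matches are not suppressed.

```python
from typing import List, Tuple

Image = List[List[int]]


def sad(wafer: Image, template: Image, top: int, left: int) -> int:
    """Sum of absolute differences between the template and a wafer window."""
    return sum(abs(wafer[top + r][left + c] - template[r][c])
               for r in range(len(template))
               for c in range(len(template[0])))


def find_die_regions(wafer: Image, template: Image, max_sad: int) -> List[Tuple[int, int]]:
    """Return (top, left) of every window whose SAD against the reference die is <= max_sad."""
    t_height, t_width = len(template), len(template[0])
    hits = []
    for top in range(len(wafer) - t_height + 1):
        for left in range(len(wafer[0]) - t_width + 1):
            if sad(wafer, template, top, left) <= max_sad:
                hits.append((top, left))
    return hits


def crop(wafer: Image, top: int, left: int, height: int, width: int) -> Image:
    """Cut one die image out of the wafer image."""
    return [row[left:left + width] for row in wafer[top:top + height]]


# Usage: a wafer image holding two 2x2 dies, the second one slightly damaged.
reference_die = [[9, 1],
                 [1, 9]]
wafer_image = [[9, 1, 9, 1],
               [1, 9, 1, 7]]
regions = find_die_regions(wafer_image, reference_die, max_sad=4)
die_images = [crop(wafer_image, top, left, 2, 2) for top, left in regions]
```

The tolerance `max_sad` has to be loose enough that a die with a defect is still located, since the cropped image is what the later detection step examines.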

[0057] Feature extraction algorithms or models can be used to extract features from the image of the wafer under test to obtain the first image features. For example, Haar-like feature extraction algorithms, feature extraction modules in convolutional neural networks, and feature extraction modules in recurrent neural networks can be used to extract features from the image under test to obtain the first image features. It should be understood that the image under test can also be input into an image encoder to obtain an image embedding, which can then be used as the first image feature. The aforementioned image encoder can also be pre-trained.
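As an illustration of one of the listed options, the sketch below computes a single two-rectangle Haar-like feature using an integral image. It is a minimal example under stated assumptions (grayscale nested-list images, a horizontal left-minus-right feature with even width); the function names are hypothetical, and a practical extractor would evaluate many such features or use a learned encoder instead.

```python
from typing import List

Image = List[List[int]]


def integral_image(img: Image) -> List[List[int]]:
    """Summed-area table with one extra zero row and column at the top and left."""
    height, width = len(img), len(img[0])
    table = [[0] * (width + 1) for _ in range(height + 1)]
    for r in range(height):
        row_sum = 0
        for c in range(width):
            row_sum += img[r][c]
            table[r + 1][c + 1] = table[r][c + 1] + row_sum
    return table


def rect_sum(table: List[List[int]], top: int, left: int, height: int, width: int) -> int:
    """Sum of the pixels in a rectangle, read from the integral image in constant time."""
    bottom, right = top + height, left + width
    return table[bottom][right] - table[top][right] - table[bottom][left] + table[top][left]


def haar_two_rect_horizontal(table: List[List[int]], top: int, left: int,
                             height: int, width: int) -> int:
    """Left-half sum minus right-half sum; width is assumed to be even."""
    half = width // 2
    return (rect_sum(table, top, left, height, half)
            - rect_sum(table, top, left + half, height, half))


# Usage: a dark left half next to a bright right half gives a negative response.
image = [[1, 1, 5, 5],
         [1, 1, 5, 5]]
table = integral_image(image)
response = haar_two_rect_horizontal(table, top=0, left=0, height=2, width=4)
```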

[0058] In step S120, based on the first image features, the reference image features, and the first query information, the detection result of the wafer die to be tested is generated through the trained language model.

[0059] The reference image features are obtained by feature extraction from the reference image. The reference wafer die in the reference image may have no defective parts. For example, an image acquisition device can be used to capture an image of a reference wafer die without defective parts to obtain the reference image. The feature extraction algorithms or models described above can then be applied to the reference image to obtain the reference image features, for example Haar-like feature extraction algorithms, feature extraction modules in convolutional neural networks, or feature extraction modules in recurrent neural networks. It should be understood that the reference image can also be input into an image encoder to obtain an image embedding of the reference image, and this image embedding can be used as the reference image features.

[0060] The first query information guides the generation of the detection results, which describe the defective parts of the wafer under test. This first query information can be input by a user through an input device, such as a keyboard or microphone. The first query information can be directly input via the keyboard or by inputting a speech record via the microphone, which is then converted into the first query information. The first query information can be used to perform defect-related queries on the image under test, and the detection results can be used to describe the relevant content in the image under test based on the semantics of the first query information.

[0061] The language model in this application embodiment can be any language model with language question-answering capabilities. For example, the language model can be BERT (Bidirectional Encoder Representations from Transformers), GPT (Generative Pre-trained Transformer), LLaMA (Large Language Model Meta AI), etc.

[0062] The aforementioned language model may include at least an encoder and a decoder. The encoder can be used to convert the first query information into a first query code. Then, the first query code, the first image features, and the reference image features can be processed (e.g., feature concatenation, feature fusion, feature enhancement, etc.) and input together into the decoder to generate a detection result in natural language. In a practical scenario, the first query information could be "Does the wafer die in this image contain any abnormal regions, and what is the type of abnormality?" The detection result could be "The wafer die in this image contains an abnormal region, and the type of abnormality is a scratch."
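Of the processing options mentioned, feature concatenation is the simplest to illustrate. The sketch below assumes that the query code and both sets of image features are sequences of vectors with the same embedding width, and that they are joined along the sequence axis in the order reference, test image, query. Both the ordering and the name `build_decoder_input` are assumptions.

```python
from typing import List

Vector = List[float]


def build_decoder_input(query_code: List[Vector],
                        image_features: List[Vector],
                        reference_features: List[Vector]) -> List[Vector]:
    """Concatenate the three inputs into one sequence for the decoder."""
    parts = {
        "query_code": query_code,
        "image_features": image_features,
        "reference_features": reference_features,
    }
    dim = len(query_code[0])
    for name, sequence in parts.items():
        if any(len(vector) != dim for vector in sequence):
            raise ValueError(f"{name} must use embedding width {dim}")
    # Assumed order: reference first, so later positions can be compared against it.
    return reference_features + image_features + query_code


# Usage with two-dimensional embeddings.
decoder_input = build_decoder_input(
    query_code=[[0.1, 0.2]],
    image_features=[[1.0, 2.0], [3.0, 4.0]],
    reference_features=[[5.0, 6.0]],
)
```

In a real model the features would usually pass through a projection layer first so that all three inputs share the decoder's embedding width; the width check above stands in for that requirement.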

[0063] In one example, other relevant information about the wafer die under test can also be determined and converted into an encoding. This encoding, along with the first query encoding, the first image features, and the reference image features, undergoes feature processing and is then input into the decoder to generate the aforementioned detection result. The specific content of the relevant information can be determined depending on the actual situation. For example, the relevant information may include: text descriptions related to the visual characteristics of the wafer die under test, text descriptions related to the processing stage of the wafer die under test, etc.

[0064] According to the above-described scheme of the embodiments of this application, feature extraction can be performed on the image to be tested of the wafer die to be tested to obtain first image features. Then, based on the first image features, the reference image features, and the first query information, the detection result of the wafer die to be tested is generated through a trained language model. On the one hand, compared with using the predicted labels output by a deep learning model to represent the detection result, the detection result of the above scheme is more flexible, which helps it adapt to scenarios where wafers or wafer dies have complex and numerous defects. On the other hand, the above scheme also uses the reference image features as one of the inputs of the language model, which allows the language model to focus on the differences between the wafer die to be tested and the reference wafer die, and this helps improve the accuracy of the detection result.

[0065] In one possible implementation, the trained language model is obtained through the following steps S210 to S230.

[0066] In step S210, feature extraction is performed on the training image of the training wafer die to obtain the second image features.

[0067] The training image of a training wafer die can be a static image or any video frame from a dynamic video. It can be a raw image captured by an image acquisition device (e.g., a raw image captured by an image sensor in a camera), or an image obtained after preprocessing the raw image (such as digitization, normalization, smoothing, etc.). In one example, a wafer image containing training wafer dies can be acquired, and then the wafer image can be divided into the training images of multiple training wafer dies. For example, the training wafer die region in the wafer image can be determined by matching the wafer image with a reference wafer die image of a reference wafer die, and then the training wafer die region in the wafer image can be cropped to obtain the training image of the training wafer die.

[0068] Feature extraction algorithms or models can be used to extract features from training images of training wafer dies to obtain second image features. For example, Haar-like feature extraction algorithms, feature extraction modules in convolutional neural networks, and feature extraction modules in recurrent neural networks can be used to extract features from training images to obtain second image features. It should be understood that training images can also be input into an image encoder to obtain image embeddings of the training images, and these image embeddings can be used as second image features.

[0069] In step S220, based on the second image features, the reference image features, and the second query information, a prediction result for the training wafer die is generated using an untrained language model.

[0070] The second query information guides the generation of the prediction results, which describe the defective parts of the training wafer dies. This second query information can be input by a user via an input device, such as a keyboard or microphone: it can be typed directly on the keyboard, or spoken into the microphone and then converted into the second query information. The second query information can be used to perform defect-related queries on the training images, and the prediction results can be used to semantically describe the relevant content in the training images.

[0071] The aforementioned untrained language model may include at least an encoder and a decoder. It should be understood that developers can also adjust the structure of the language model according to actual needs. For example, a prompt learning module can be added to the language model to enable prompt learning.

[0072] The second query information can be converted into a second query code (which can be generated using existing encoding algorithms or other encoders, or by the encoder in the untrained language model). Then, the second image features, the reference image features, and the second query code can be processed (e.g., feature concatenation, feature fusion, feature enhancement, etc.) and input together into the decoder to generate a natural-language prediction result that is used for training.

[0073] In one example, other relevant information about the training wafer die can also be determined and converted into an encoding. This encoding, along with the second image features, the reference image features, and the second query code, undergoes feature processing and is then input into the decoder to generate the aforementioned prediction result. The specific content of the relevant information can be determined according to the actual situation. For example, the relevant information may include textual descriptions of the visual characteristics of the training wafer die, textual descriptions of the processing stage of the training wafer die, etc.

[0074] In step S230, the model parameters of the untrained language model are adjusted based on the difference between the prediction result and the actual defect information of the training wafer die.

[0075] The actual defect information of training wafer dies can represent the semantics of actual defects present in the training wafer dies. For example, if there are scratches in the pad area of the training wafer die, the actual defect information of the training wafer die can be the semantics indicating that there are scratches in the pad area of the training wafer die. As another example, if there are transparent foreign objects in the surface area of the training wafer die, the actual defect information of the training wafer die can be the semantics indicating that there are transparent foreign objects in the surface area of the training wafer die.

[0076] In one example, the difference between the prediction result and the actual defect information of the training wafer can be represented by mean square error, cross-entropy, etc., or the above difference can be calculated based on the loss function in related technologies. This application embodiment does not limit this.
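As an illustration of the cross-entropy option mentioned above, the following minimal sketch computes the average token-level cross-entropy between predicted token probabilities and the tokens of the actual defect information. The three-token vocabulary, the probability values, and the function name `token_cross_entropy` are assumptions made for illustration and are not taken from this application.

```python
import math

def token_cross_entropy(pred_probs, target_ids):
    """Average negative log-likelihood of the target tokens.

    pred_probs: list of per-position probability lists (each sums to 1).
    target_ids: list of ground-truth token indices, one per position.
    """
    eps = 1e-12  # guards against log(0)
    losses = [-math.log(max(p[t], eps)) for p, t in zip(pred_probs, target_ids)]
    return sum(losses) / len(losses)

# Hypothetical vocabulary: 0 = "scratch", 1 = "pad", 2 = "crack"
target = [0, 1]  # stands in for the actual defect information

# A prediction close to the target and one far from it.
good = [[0.9, 0.05, 0.05], [0.1, 0.8, 0.1]]
bad = [[0.2, 0.2, 0.6], [0.3, 0.3, 0.4]]

loss_good = token_cross_entropy(good, target)
loss_bad = token_cross_entropy(bad, target)
```

A prediction that assigns higher probability to the tokens of the actual defect information yields a smaller value, which is the property the training objective relies on.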

[0077] The difference between the prediction result for the training wafer and the actual defect information of the training wafer can be used as part of the loss value of the untrained language model, and the loss value can also include loss values obtained from other loss functions in related technologies. This application embodiment does not impose any limitations on this. The training objective of the language model can include minimizing the loss value. Training under the above scheme therefore continuously reduces the difference between the prediction result and the actual defect information, so that the prediction result output in each iteration comes increasingly close to the relevant description of the actual defective part of the training wafer. Gradient descent, backpropagation, and other methods can be used to adjust the model parameters of the untrained language model to obtain the trained language model. In one example, supervised fine-tuning, reinforcement learning from human feedback, and other training methods can also be combined to train the untrained language model.
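The parameter adjustment described above can be sketched with a toy example, in which the logits of a single output token stand in for the model parameters and are updated by plain gradient descent on a cross-entropy loss. This is only an assumed, heavily simplified stand-in for training a language model; the names `softmax` and `train_step`, the learning rate, and the iteration count are hypothetical.

```python
import math

def softmax(logits):
    m = max(logits)  # subtract the maximum for numerical stability
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

def train_step(logits, target, lr):
    """One gradient-descent update on the logits of a single output token."""
    probs = softmax(logits)
    loss = -math.log(probs[target])
    # Gradient of cross-entropy w.r.t. the logits is (probs - one_hot(target)).
    grads = [p - (1.0 if i == target else 0.0) for i, p in enumerate(probs)]
    new_logits = [z - lr * g for z, g in zip(logits, grads)]
    return new_logits, loss

logits = [0.0, 0.0, 0.0]  # stand-in for trainable model parameters
history = []              # loss value recorded before each update
for _ in range(50):
    logits, loss = train_step(logits, target=2, lr=0.5)
    history.append(loss)
```

The recorded loss values decrease from one iteration to the next, mirroring how the prediction result moves toward the actual defect information as training proceeds.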

[0078] According to the above-described scheme of this application embodiment, feature extraction can be performed on the training image of the training wafer to obtain second image features. Then, based on the second image features, the reference image features, and the second query information, a prediction result for the training wafer can be generated using an incompletely trained language model. Finally, based on the difference between the prediction result and the actual defect information of the training wafer, the model parameters of the incompletely trained language model are adjusted. In the above scheme, the incompletely trained language model can continuously adjust its own model parameters based on the difference between the prediction result and the actual defect information. The trained language model obtained accordingly can generate a more accurate detection result for the wafer to be tested based on the first image features of the image to be tested, the reference image features, and the first query information.

[0079] In one possible implementation, the above-mentioned defect detection method may further include steps S310 to S320.

[0080] In step S310, the reference image is compared with the image to be tested to determine the region of interest in the image to be tested.

[0081] Pixels in the image under test can be compared with their corresponding pixels in the reference image to determine the region of interest in the image under test. Within a region of interest, at least one pixel has a pixel value that differs from the corresponding pixel value in the reference image by more than a preset difference. For example, the region of interest can be determined from the absolute value of the difference between each pixel value in the image under test and the corresponding pixel value in the reference image. Specifically, based on the position information of each pixel in the image under test, the pixel at the corresponding position in the reference image is determined. The difference between the pixel value of each pixel in the image under test and its corresponding pixel value in the reference image is calculated to obtain the pixel difference value for each pixel in the image under test. Pixels in the image under test whose absolute pixel difference value is greater than the preset difference are clustered to obtain at least one clustered region. In one example, this at least one clustered region can be taken as the region of interest in the image under test. In another example, the clustered regions within this at least one clustered region that meet preset conditions can be taken as the region of interest in the image under test. The preset difference and preset conditions can be determined based on actual circumstances. For example, the preset conditions could include the area of the clustered region being greater than a preset area value, or the clustered region being located in an area of particular concern (e.g., pad area, adjustment area) in the image under test. This disclosure does not limit this. As a specific example, the image under test includes pixels P1, P2, ..., Pn, where n is the total number of pixels in the image under test. If the pixel in the reference image corresponding to pixel P1 is P1', the pixel value of P1' is subtracted from the pixel value of P1 to obtain the pixel difference value X1 corresponding to pixel P1. This process is repeated to obtain the pixel difference values X1 to Xn corresponding to pixels P1 to Pn, respectively. Pixel difference values X1 to Xn are then traversed to determine those whose absolute value is greater than the preset difference. If the absolute values of pixel difference values X10 to X20 and X40 to X50 are all greater than the preset difference, the corresponding pixels P10 to P20 and P40 to P50 in the image under test are clustered separately to obtain clustered regions Q1 and Q2. If the preset condition is that the area of a clustered region is greater than a preset area value, and the area of clustered region Q1 is greater than the preset area value, then clustered region Q1 is taken as the region of interest in the image under test.
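The comparison, clustering, and area condition of step S310 can be sketched as follows. The sketch assumes single-channel images of equal size and uses 4-connected grouping as the clustering method; the application does not specify a clustering algorithm, so that choice, the function name `regions_of_interest`, and the parameter names are assumptions.

```python
import numpy as np
from collections import deque

def regions_of_interest(test_img, ref_img, preset_diff, preset_area):
    """Return a boolean mask of clustered regions that pass the area condition."""
    # Absolute pixel difference; int32 avoids uint8 wrap-around on subtraction.
    diff = np.abs(test_img.astype(np.int32) - ref_img.astype(np.int32))
    candidate = diff > preset_diff
    seen = np.zeros_like(candidate, dtype=bool)
    roi = np.zeros_like(candidate, dtype=bool)
    h, w = candidate.shape
    for y in range(h):
        for x in range(w):
            if not candidate[y, x] or seen[y, x]:
                continue
            # Breadth-first search collects one 4-connected cluster.
            cluster, queue = [], deque([(y, x)])
            seen[y, x] = True
            while queue:
                cy, cx = queue.popleft()
                cluster.append((cy, cx))
                for ny, nx in ((cy - 1, cx), (cy + 1, cx), (cy, cx - 1), (cy, cx + 1)):
                    if 0 <= ny < h and 0 <= nx < w and candidate[ny, nx] and not seen[ny, nx]:
                        seen[ny, nx] = True
                        queue.append((ny, nx))
            # Preset condition: keep only clusters larger than the preset area.
            if len(cluster) > preset_area:
                for cy, cx in cluster:
                    roi[cy, cx] = True
    return roi

ref = np.full((6, 6), 100, dtype=np.uint8)
img = ref.copy()
img[1:3, 1:4] = 180  # cluster Q1: 6 pixels
img[5, 5] = 30       # cluster Q2: 1 pixel
mask = regions_of_interest(img, ref, preset_diff=50, preset_area=3)
```

In this toy input both clusters exceed the preset difference, but only Q1 exceeds the preset area, so only Q1 is kept as the region of interest.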

[0082] In step S320, the feature points corresponding to the region of interest in the first image feature are sampled to obtain the first defect feature.

[0083] The regions of interest in the image under test can be marked as 1, and the other regions as 0, to generate a binary image of the image under test. Then, based on the binary image, the feature points corresponding to the regions of interest in the first image feature are determined. It should be understood that the first image feature is obtained by feature extraction from the image under test; therefore, there may be a correspondence between the feature points in the first image feature and the pixels in the image under test. In practical scenarios, if the total number of feature points in the first image feature is less than the total number of pixels in the image under test, each feature point in the first image feature can correspond to multiple pixels in the image under test. A sampling operation can be performed on the feature points in the first image feature corresponding to the regions of interest, and the first defect feature is obtained based on the sampling results. For example, the average or weighted average of the sampled feature points can be calculated, and the result used as the first defect feature. As another example, feature stitching or feature fusion operations can be performed on the sampled feature points, and the stitching or fusion result used as the first defect feature. In one possible implementation, the feature points corresponding to the region of interest can be sampled using random sampling, uniform sampling, max pooling, average pooling, importance-based sampling, clustering sampling, etc. This application does not impose any limitations on these methods. In one example, the first defect feature can be a feature with a preset feature size, so that the trained language model focuses on the feature distribution of the first defect feature, rather than on the influence of different feature sizes of the first defect feature on the feature distribution.
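One way to picture step S320 is the sketch below, which maps the binary image onto a coarser feature grid and averages the selected feature points into a fixed-size vector. It assumes the image height and width are integer multiples of the feature-map height and width, treats a feature point as selected if any pixel in its patch is marked 1, and uses averaging as the sampling operation; these choices and the name `first_defect_feature` are assumptions, not requirements of the application.

```python
import numpy as np

def first_defect_feature(image_features, roi_mask):
    """Average the feature points whose pixel patch overlaps the region of interest.

    image_features: array of shape (Hf, Wf, C).
    roi_mask: binary array of shape (H, W), 1 inside the region of interest.
    Assumes H and W are integer multiples of Hf and Wf.
    """
    hf, wf, channels = image_features.shape
    h, w = roi_mask.shape
    sy, sx = h // hf, w // wf
    # Each feature point covers an sy-by-sx pixel patch; select it if any pixel is 1.
    patches = roi_mask.reshape(hf, sy, wf, sx)
    selected = patches.max(axis=(1, 3)).astype(bool)
    if not selected.any():
        return np.zeros(channels)
    # The mean over selected feature points is a fixed-size vector of length C.
    return image_features[selected].mean(axis=0)

features = np.arange(2 * 2 * 3, dtype=float).reshape(2, 2, 3)
mask = np.zeros((4, 4), dtype=np.uint8)
mask[0, 3] = 1  # pixel covered by feature point (0, 1)
mask[3, 0] = 1  # pixel covered by feature point (1, 0)
defect_feature = first_defect_feature(features, mask)
```

Because the output length depends only on the channel count, the result has the same size regardless of how large the region of interest is, which matches the preset feature size mentioned above.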

[0084] In this example, step S120 above, in which the detection result of the wafer grain to be tested is generated through the trained language model based on the first image features, the reference image features and the first query information, may include step S121.

[0085] In step S121, based on the first image features, the reference image features, the first query information, and the first defect features, the detection result of the wafer grains to be tested is generated through the trained language model.

[0086] The trained language model may include at least an encoder and a decoder. The encoder can be used to convert the first query information into a first query code. Then, the first query code, the first image features, the reference image features, and the first defect features can be processed (e.g., feature concatenation, feature fusion, feature enhancement, etc.) and input together into the decoder to generate a detection result for the wafer grains under test that conforms to natural language.
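Of the processing options listed above, feature concatenation is the simplest to sketch. The example below assumes that all inputs have already been projected to a shared embedding width and that they are joined along the sequence axis before being passed to the decoder; the shapes, the random placeholder values, and the name `build_decoder_input` are hypothetical.

```python
import numpy as np

def build_decoder_input(query_code, image_feats, ref_feats, defect_feat):
    """Concatenate all inputs along the sequence axis (one of the listed options)."""
    # A single defect feature vector becomes a one-token sequence.
    defect_tokens = defect_feat.reshape(1, -1)
    return np.concatenate([query_code, image_feats, ref_feats, defect_tokens], axis=0)

dim = 8  # hypothetical shared embedding width
rng = np.random.default_rng(0)
query_code = rng.normal(size=(5, dim))    # first query code: 5 text tokens
image_feats = rng.normal(size=(16, dim))  # first image features: 4x4 points, flattened
ref_feats = rng.normal(size=(16, dim))    # reference image features
defect_feat = rng.normal(size=(dim,))     # first defect feature
decoder_input = build_decoder_input(query_code, image_feats, ref_feats, defect_feat)
```

Feature fusion or feature enhancement would replace the concatenation with a different combination rule; the decoder itself is outside the scope of this sketch.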

[0087] In one example, other relevant information about the wafer die under test can also be determined and converted into an encoding. This encoding, along with the first query encoding, first image features, reference image features, and first defect features, undergoes feature processing (e.g., feature concatenation, feature fusion, feature enhancement, etc.) before being input into the decoder to generate the aforementioned detection result. The specific content of the relevant information can be determined depending on the actual situation. For example, the relevant information may include: textual descriptions related to the visual characteristics of the wafer die under test, textual descriptions related to the processing stage of the wafer die under test, etc.

[0088] According to the above-described scheme of the embodiments of this application, a reference image and a test image can be compared to determine the region of interest in the test image. Then, the feature points corresponding to the region of interest in the first image features are sampled to obtain a first defect feature. Finally, based on the first image features, the reference image features, the first query information, and the first defect feature, a detection result for the wafer grain under test is generated using a trained language model. On the one hand, the above scheme can obtain the first defect feature by sampling the feature points corresponding to the region of interest in the first image features, reducing the amount of data that needs to be processed subsequently. On the other hand, the above scheme can integrate the first image features, the reference image features, the first query information, and the first defect feature, and generate the detection result for the wafer grain under test based on the trained language model, thereby improving the degree of attention the trained language model pays to the region of interest. Since this region of interest is usually related to anomalies in the wafer grain, the above scheme can improve the accuracy of the detection result.

[0089] In one possible implementation, the defective portion of the training wafer grains corresponds to a defective region in the training image.

[0090] The defective region can be determined through the relevant steps in step S310 above, or obtained through manual annotation, or generated based on the training image by a defective region detection model in related technologies. This embodiment of the application does not impose any limitations on this. The aforementioned defective region can be represented by the coordinates of its vertices in the training image.

[0091] In this example, step S220, in which the prediction results for the training wafer granules are generated using an untrained language model based on the second image features, the reference image features, and the second query information, may include steps S221 to S224.

[0092] In step S221, the feature points corresponding to the defect region in the second image feature are sampled to obtain the second defect feature.

[0093] The second image feature is obtained by feature extraction from the training image. Therefore, the feature points in the second image feature can correspond to the pixels in the training image. In practical scenarios, if the total number of feature points in the second image feature is less than the total number of pixels in the training image, then each feature point in the second image feature can correspond to multiple pixels in the training image.

[0094] In one possible implementation, feature points corresponding to defect regions in the training image can be sampled using random sampling, uniform sampling, max pooling, average pooling, importance-based sampling, and clustering sampling to obtain the second defect feature. It should be understood that the sampling operation in this step can be the same as the sampling operation used in step S320 above. In one example, the second defect feature can be a feature with a preset feature size mentioned above, so that the trained language model focuses on the feature distribution of the second defect feature, rather than on the influence of different feature sizes of the second defect feature on the feature distribution.

[0095] In step S222, the defect description features are determined based on the actual defect information.

[0096] In one example, the actual defect information can be converted into defect description features using an encoder in the untrained language model. In another example, a frozen encoder from another model can be used directly as the encoder in the language model to convert the actual defect information into defect description features. In this case, iterative tuning of the encoder parameters is not required. In yet another example, feature extraction algorithms from related technologies can be used to extract features from the actual defect information to obtain defect description features.

[0097] In step S223, the second defect feature and the defect description feature are fused together to obtain the fused feature.

[0098] The aforementioned feature fusion operations may include splicing, average summation, dot product, etc., and the embodiments of this application are not limited to these.
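The fusion options named above can be sketched as follows. The sketch assumes both features are one-dimensional vectors and, for the averaging and product modes, that they have the same length; it also interprets the product option as an element-wise product so that the result remains a vector, which is an assumption. The function name `fuse` and the mode labels are hypothetical.

```python
import numpy as np

def fuse(defect_feat, description_feat, mode="concat"):
    """Fuse the second defect feature with the defect description feature."""
    if mode == "concat":
        # Splicing: the two vectors are joined end to end.
        return np.concatenate([defect_feat, description_feat])
    if mode == "average":
        # Element-wise mean; requires equal lengths.
        return (defect_feat + description_feat) / 2.0
    if mode == "product":
        # Element-wise product (assumed reading of the product option).
        return defect_feat * description_feat
    raise ValueError(f"unknown fusion mode: {mode}")

a = np.array([1.0, 2.0])
b = np.array([3.0, 4.0])
fused_concat = fuse(a, b, "concat")
fused_average = fuse(a, b, "average")
fused_product = fuse(a, b, "product")
```

Concatenation preserves both inputs unchanged at the cost of a longer vector, while the other two modes keep the original length.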

[0099] In step S224, based on the second image features, the reference image features, the second query information, and the fusion features, the prediction results for the training wafer granules are generated using the language model that has not been fully trained.

[0100] An untrained language model may include at least an encoder and a decoder. The encoder can be used to convert the second query information into a second query code. Then, the second query code, second image features, reference image features, and fused features can be processed (e.g., feature concatenation, feature fusion, feature enhancement, etc.) and input together into the decoder to generate a prediction result that conforms to natural language and is used for training.

[0101] In one example, other relevant information about the training wafer can also be determined and converted into an encoding. This encoding, along with the second query code, second image features, reference image features, and fused features, undergoes feature processing (e.g., feature concatenation, feature fusion, feature enhancement, etc.) before being input into the decoder to generate the aforementioned prediction result. The specific content of the relevant information can be determined depending on the actual situation. For example, the relevant information may include: textual descriptions related to the visual characteristics of the training wafer, textual descriptions related to the processing stage of the training wafer, etc.

[0102] Figures 2 and 3 are schematic diagrams of the training process of a language model according to an embodiment of this application. As shown in Figures 2 and 3, feature extraction is performed on the reference image using an image encoder (the first image encoder in Figure 3) to obtain the reference image features. Feature extraction is performed on the training image using an image encoder (the second image encoder in Figure 3) to obtain the second image features; the reference image and the training image may use the same or different image encoders. The second query information is input to a text encoder to obtain the second query code. The second image features are sampled through the process described in step S221 above (by combining the defect regions of the training wafer dies with a region image encoder), and the fused features are obtained through the process described in step S223 above (refer to the connection between the region image encoder and the text encoder). Then, the second query code, second image features, reference image features, and fused features are subjected to feature fusion (refer to the feature fusion module in Figure 3), and the features obtained from the feature fusion operation are input into the untrained language model to obtain a prediction result.

[0103] According to the above-described scheme of the embodiments of this application, feature points corresponding to the defect region in the second image features can be sampled to obtain the second defect feature. Then, defect description features can be determined based on the real defect information. Next, the second defect feature and the defect description feature are fused to obtain the fused feature. Finally, based on the second image feature, the reference image feature, the second query information, and the fused feature, a prediction result for the training wafer granules can be generated using an untrained language model. On the one hand, the above scheme can obtain the second defect feature by sampling feature points corresponding to the defect region in the second image features, reducing the amount of data that needs to be processed subsequently; on the other hand, the above scheme can integrate the second image feature, the reference image feature, the second query information, and the fused feature to generate a prediction result for the training wafer granules based on an untrained language model. Then, by observing the difference between the prediction result and the real defect information, the model parameters can be continuously adjusted, which can improve the attention of the trained language model to the defect region and improve the accuracy of the detection results of the wafer granules to be tested.

[0104] In one possible implementation, the defect region can be obtained by the following steps: taking the defect pixel corresponding to the defect portion of the training wafer in the training image as the center and drawing a circle with a preset radius as the radius, so as to obtain the defect region.

[0105] Pixels whose pixel values differ from those of the corresponding pixels in the reference image by more than a preset difference value can be identified as defective pixels. In one example, a circle with a preset radius can be drawn centered on a single defective pixel, and this circular area can be taken as the defective region. In practical scenarios, defective pixels are usually concentrated, so a circle with a preset radius can cover most defective pixels, and this implementation requires less computing power. In another example, for each defective pixel, a circle is drawn with that pixel as the center and the preset radius as the radius, and the union of the circular areas corresponding to all defective pixels is taken as the defective region. In practical scenarios, this method ensures that the defective region includes all the defective pixels found by the comparison, thus improving the representativeness of the defective region. Furthermore, since the comparison may miss some defective pixels, and multiple defective pixels are usually concentrated, the circular areas drawn around the detected defective pixels can, to some extent, include the defective pixels missed in the comparison, further improving the representativeness of the defective region.
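The second example above, the union of circles drawn around every defective pixel, can be sketched as a boolean mask. Defective pixels are given here as (row, column) coordinates, and a pixel is counted as inside a circle when its Euclidean distance to the center does not exceed the preset radius; both conventions and the name `circular_defect_region` are assumptions for illustration.

```python
import numpy as np

def circular_defect_region(shape, defect_pixels, radius):
    """Union of discs of the preset radius centered on each defective pixel."""
    h, w = shape
    yy, xx = np.mgrid[0:h, 0:w]  # row and column index grids
    region = np.zeros(shape, dtype=bool)
    for cy, cx in defect_pixels:
        # Add every pixel within `radius` of this defective pixel.
        region |= (yy - cy) ** 2 + (xx - cx) ** 2 <= radius ** 2
    return region

# Two defective pixels two columns apart; their discs overlap at (3, 3).
region = circular_defect_region((7, 7), [(3, 2), (3, 4)], radius=1)
```

Passing a single defective pixel reproduces the first example, in which one circle is taken as the defective region.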

[0106] In one possible implementation, the defect region can also be obtained by the following steps: determining the bounding rectangle of the defect pixels corresponding to the defect portion of the training wafer in the training image, as the defect region.

[0107] Pixels whose pixel values differ from those of the corresponding pixels in the reference image by more than a preset difference value can be identified as defective pixels. In practical scenarios, this method ensures that the defective region includes all the defective pixels found by the comparison, thus improving the representativeness of the defective region. Furthermore, since the comparison may miss some defective pixels, and multiple defective pixels are often clustered together, using the bounding rectangle of all detected defective pixels as the defective region can, to some extent, include the defective pixels missed during the comparison, further enhancing the representativeness of the defective region.
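A bounding rectangle of the defective pixels can be computed from coordinate minima and maxima, as in the sketch below. It assumes an axis-aligned rectangle and (x, y) pixel coordinates, and it also lists the four vertices, since the defective region can be represented by its vertex coordinates in the training image. The function names are hypothetical.

```python
def bounding_rectangle(defect_pixels):
    """Return (x_min, y_min, x_max, y_max) enclosing all (x, y) defective pixels."""
    xs = [x for x, _ in defect_pixels]
    ys = [y for _, y in defect_pixels]
    return min(xs), min(ys), max(xs), max(ys)

def rectangle_vertices(rect):
    """List the four corners of the rectangle, starting at (x_min, y_min)."""
    x0, y0, x1, y1 = rect
    return [(x0, y0), (x1, y0), (x1, y1), (x0, y1)]

pixels = [(12, 7), (15, 9), (13, 4)]  # hypothetical defective pixel coordinates
rect = bounding_rectangle(pixels)
vertices = rectangle_vertices(rect)
```

Any defective pixel that the comparison missed but that lies between the detected ones falls inside the rectangle automatically.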

[0108] In one possible implementation, the defect region can also be obtained by the following steps: directly using the defect pixels corresponding to the defective portion of the training wafer in the training image as pixels within the defect region.

[0109] By comparison with the reference image, pixels whose pixel values differ from those of the corresponding pixels in the reference image by more than a preset difference value can be identified as defective pixels. These defective pixels can then be combined to obtain the defective region. This method requires less computing power, which helps improve the efficiency of generating detection results for the wafer under test.

[0110] It should be understood that the different methods for obtaining the defective region described above can be used individually or in combination, and the embodiments of this application are not limited in this respect.

[0111] According to the above-described scheme provided in the embodiments of this application, defect regions can be generated in multiple ways to improve the flexibility of defect region generation, thereby adapting to different types of defects in wafer granules. This can improve the language model's ability to recognize different types of defects in training wafer granules, which is beneficial to improving the accuracy of the detection results of the wafer granules under test.

[0112] In one possible implementation, the second query information can be used to query the defect type of the defective portion of the training wafer, and the real defect information is used to describe the defect type of the defective portion of the training wafer.

[0113] In a real-world scenario, the second query could be "What defects exist in this wafer grain?", and the actual defect information could be "There are cracks in the wafer grain." By using the second query and the actual defect information, the detection results output by the trained language model can be used to represent the defect type of the defective portion of the wafer grain under test.

[0114] In one possible implementation, the second query information can be used to query the defect location of the defective portion of the training wafer in the training image, and the real defect information can be used to describe the defect location of the defective portion of the training wafer in the training image.

[0115] The defect location can be represented by coordinates or by generalized text, such as top left corner or center. In a practical scenario, the second query could be "Where is the defect located on this wafer chip?", and the actual defect information could be "The defective part of this wafer chip is located in the top left corner of the image." In this example, the defect location is indeed the top left corner. By using the above second query information and actual defect information, the detection results output by the trained language model can be used to represent the defect location of the defective part of the wafer chip in the image being tested.

[0116] In one possible implementation, the second query information can be used to query the positional relationship between the defective portion of the training wafer and the functional region in the training wafer, and the real defect information can be used to describe the positional relationship between the defective portion of the training wafer and the functional region in the training wafer.

[0117] The aforementioned functional areas can be regions where wafer dies perform specific logical functions, such as pad areas, trimming areas, and surface areas. In practical scenarios, the second query information could be "whether the defective portion of the wafer die is located in the pad area," and the actual defect information could be "the defective portion of the wafer die is located in the pad area." For example, the second query information could be "where is the defective portion of the wafer die relative to the pad area," and the actual defect information could be "the defective portion is located above the pad area." By using the above second query information and actual defect information, the detection results output by the trained language model can be used to represent the positional relationship between the defective portion of the wafer die under test and the functional areas of the wafer die under test.

[0118] In one possible implementation, the second query information can be used to query the visual characteristics of the defective portions of the training wafer, and the actual defect information can be used to describe the visual characteristics of the defective portions of the training wafer.

[0119] In a real-world scenario, the second query could be "What are the visual characteristics of the defective portion in the wafer grains?", and the actual defect information could be "The defective portion in the wafer grains is a strip-shaped defect." By using the second query and the actual defect information, the detection results output by the trained language model can be used to represent the visual characteristics of the defective portion of the wafer grains under test.

[0120] It should be understood that the different examples of second query information and real defect information described above can be used individually or in combination, and the embodiments of this application are not limited in this respect.

[0121] According to the above-described scheme of the embodiments of this application, the language model that has not been trained can be trained on the second query information and the real defect labels corresponding to the second query information on different query dimensions, so that the trained language model can generate detection results for the first query information on different query dimensions, which is beneficial to adapt to different application scenarios.
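The pairing of second query information with actual defect information across query dimensions can be pictured as a small data structure. The sketch below reuses the example sentences given in the preceding paragraphs; the class name `TrainingSample`, the `dimension` labels, and the helper `by_dimension` are hypothetical and only illustrate how such samples might be organized.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class TrainingSample:
    dimension: str  # hypothetical label for the query dimension
    query: str      # second query information
    answer: str     # actual defect information (training target)

SAMPLES = [
    TrainingSample("defect_type",
                   "What defects exist in this wafer grain?",
                   "There are cracks in the wafer grain."),
    TrainingSample("defect_location",
                   "Where is the defect located on this wafer chip?",
                   "The defective part of this wafer chip is located in the top left corner of the image."),
    TrainingSample("functional_area",
                   "Where is the defective portion of the wafer die relative to the pad area?",
                   "The defective portion is located above the pad area."),
    TrainingSample("visual_characteristics",
                   "What are the visual characteristics of the defective portion in the wafer grains?",
                   "The defective portion in the wafer grains is a strip-shaped defect."),
]

def by_dimension(samples):
    """Group samples by query dimension, e.g. to balance a training batch."""
    grouped = {}
    for sample in samples:
        grouped.setdefault(sample.dimension, []).append(sample)
    return grouped

grouped = by_dimension(SAMPLES)
```

Each sample would additionally be associated with a training image and a reference image, which are omitted here for brevity.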

[0122] In one possible implementation, the second query information is used to query the defect information of the training wafer in a preset area, and the real defect information is used to describe the defect information of the training wafer in the preset area.

[0123] The aforementioned preset region can be specific coordinates or a generalized text description, such as the top left corner or center of an image. It should be understood that the preset region can differ in each second query. Defect information can be used to represent defect-related information, such as whether a defect exists or the type of defect. In a practical scenario, the second query could be "whether a defect exists in the top left corner," and the actual defect information could be "a defect exists in the top left corner." By using the above second query information and actual defect information to train an untrained language model, the detection capability of the trained language model for the preset region can be improved, which is beneficial to improving the accuracy of the detection results for the wafer under test. Combined with the detection scenario of the wafer under test, the trained language model can output defect information in the preset region of the image under test when the first query includes the preset region.

[0124] In one possible implementation, the second query information includes defect information. This second query information is used to query the location of the portion of the training wafer that matches the defect information in the training image. The actual defect information describes the location of the portion of the training wafer that matches the defect information in the training image. In a practical scenario, the second query information could be "Where is the crack located?", and the actual defect information could be "The crack is located in the upper left corner." In this example, the location of the portion matching the defect information in the training image is the upper left corner. By using the above second query information and actual defect information to train an untrained language model, the ability of the trained language model to query the location of the defect information in the image can be improved, which is beneficial to improving the accuracy of the detection results for the wafer under test. Considering the detection scenario of the wafer under test, the trained language model can output the location of the portion matching the defect information in the image under test when the first query information includes defect information.

[0125] It should be understood that the different examples of second query information and real defect information described above can be used individually or in combination, and the embodiments of this application are not limited in this respect.

[0126] According to the above-described scheme of the embodiments of this application, the language model that has not been trained can be trained on the second query information and the real defect labels corresponding to the second query information on different query dimensions, so that the trained language model can generate detection results for the first query information on different query dimensions, which is beneficial to adapt to different application scenarios.

[0127] In one possible implementation, the second query information is used to query the defect type and cause of the defective portion of the training wafer, and the real defect information is used to describe the defect type and cause of the defective portion of the training wafer.

[0128] In a practical scenario, the second query could be "What is the type of defect in the defective part, and how was it generated?", while the actual defect information could be "This defect type is a foreign object, possibly caused by dust." By using this second query and the actual defect information to train the untrained language model, the ability of the trained language model to query the causes of defects can be improved, thus enhancing the accuracy of the detection results for the wafer under test. Combined with the detection scenario of the wafer under test, the trained language model can output the defect type and its cause when the first query is used to query the cause of the defect.

[0129] In one possible implementation, the second query information is used to query the processing method of the defective portion of the training wafer, and the real defect information is used to describe the processing method of the defective portion of the training wafer.

[0130] The above processing method can be used to indicate how to handle the defective part in order to repair it. In a practical scenario, the second query information could be "how to remove the foreign object," and the actual defect information could be "the foreign object can be removed by dry cleaning." By using the above second query information and actual defect information to train the untrained language model, the ability of the trained language model to query the processing method for the defective part can be improved, which is beneficial to improving the accuracy of the detection results for the wafer under test. Combined with the detection scenario of the wafer under test, the trained language model can output the processing method for the defective part when the first query information is used to query the processing method for the defective part, thus guiding the user to repair the defective part.

[0131] In one possible implementation, the second query information is used to query wafer processing improvement suggestions for defective portions of the training wafer, and the actual defect information is used to describe the wafer processing improvement suggestions for defective portions of the training wafer.

[0132] In a practical scenario, the second query information could be "How can the wafer die processing flow be improved to avoid foreign objects in the wafers?", and the actual defect information could be "Maintain cleanliness in the workshop during wafer die processing, or regularly clean the platform holding the wafers." By using the above second query information and actual defect information to train the untrained language model, the ability of the trained language model to answer queries about improving the wafer die processing flow can be improved, which is beneficial to improving the accuracy of the detection results for the wafer under test. Combined with the detection scenario of the wafer under test, the trained language model can output suggestions for improving the wafer die processing flow when the first query information is used to query for such suggestions, thus guiding users to improve the wafer die processing flow and preventing defects in further wafers.

[0133] It should be understood that the above examples of second query information and real defect information can be used individually or in any combination, and the embodiments of this application are not limited in this respect.

[0134] According to the above-described scheme of the embodiments of this application, the untrained language model can be trained with second query information in different query dimensions and the real defect information corresponding to that second query information, so that the trained language model can generate detection results for first query information in different query dimensions, which helps it adapt to different application scenarios.

[0135] In one possible implementation, the second query information can be used to query the defect location of the defect portion of the first wafer in the training image, and the real defect information is used to describe that the first wafer does not have a defect portion.

[0136] The first wafer is a training wafer without any defective portion. The defect location can be a specific coordinate or a general text description, such as the upper left corner or the center of the image. In a practical scenario, the second query information could be "Where is the crack located in the image?" Since the first wafer has no defects, the actual defect information could be "This wafer does not have any cracks or other defects." Because the second query information implicitly suggests that the wafer contains a crack, it may bias the language model's judgment of a wafer that actually has no defects. By using the above second query information and actual defect information to train the untrained language model, the ability of the trained language model to detect wafer defects independently of the wording of the query can be improved, which is beneficial to improving the accuracy of the detection results for the wafer under test. Combined with the detection scenario of the wafer under test, when the wafer under test is defect-free, the trained language model can output natural language text indicating that the wafer under test is defect-free, without being misled by first query information that presupposes a defect.

[0137] In one possible implementation, the second query information is used to query whether a first position is the correct position, in the training image, of the defective portion of the training wafer, and the real defect information is used to describe that the first position is incorrect and to give the defect position of the defective portion of the training wafer in the training image.

[0138] The first position is the position of a normal portion of the training wafer in the training image. In a practical scenario, the second query information could be "Is the foreign object in the wafer located in the upper left corner?", and the actual defect information could be "The foreign object in the wafer is not located in the upper left corner, but in the lower right corner." In this example, the first position is the upper left corner, and the actual defect position is the lower right corner. Since the second query information implicitly assumes that the defect is located in the upper left corner, it may cause the language model to misjudge the defect position. By using the above second query information and actual defect information to train the untrained language model, the ability of the trained language model to detect wafer defects independently of the wording of the query can be improved, which is beneficial to improving the accuracy of the detection results for the wafer under test. Combined with the detection scenario of the wafer under test, the trained language model can output the correct position of the defect in the image under test even if the first query information includes an incorrect defect position.

[0139] In one possible implementation, the second query information can be used to query a first method for the defective portion of the training wafer, and the real defect information is used to describe that the first method is incorrect and to give the correct method for the defective portion of the training wafer.

[0140] The first method is an incorrect handling method for the defective portion, and the correct method is a correct handling method for the defective portion. In a practical scenario, the second query information could be "Can foreign objects in the wafer be removed by soaking in chemical cleaning solution?", and the actual defect information could be "The above method is incorrect; dry cleaning should be used to remove them." In this example, the first method is soaking in chemical cleaning solution, and the correct method is dry cleaning. Because the second query information implicitly suggests using chemical cleaning solution to remove foreign objects, it may cause the language model to misjudge the defect handling method. By using the above second query information and actual defect information to train the untrained language model, the ability of the trained language model to select the appropriate handling method for wafer defects independently of the wording of the query can be improved, which is beneficial to improving the accuracy of the detection results for the wafer under test. Combined with the detection scenario of the wafer under test, the trained language model can output the correct handling method for defects in the wafer under test even when the first query information includes an incorrect handling method.

[0141] It should be understood that the above examples of second query information and real defect information can be used individually or in any combination, and the embodiments of this application are not limited in this respect.

[0142] According to the above-described scheme of the embodiments of this application, the untrained language model can be trained with second query information in different query dimensions and the real defect information corresponding to that second query information, so that the trained language model can generate detection results for first query information in different query dimensions, which helps it adapt to different application scenarios.
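
The three kinds of second query information that carry a misleading premise (a defect queried on a defect-free wafer, an incorrect position, and an incorrect handling method) can be illustrated as templated training pairs. The following is a minimal sketch only; the function names, dictionary keys, and templates are hypothetical, and the sentences follow the examples given above.

```python
from typing import Dict


def defect_free_sample(queried_defect: str, queried_defect_plural: str) -> Dict[str, str]:
    """Query presupposes a defect, but the first wafer has no defective portion."""
    return {
        "second_query": f"Where is the {queried_defect} located in the image?",
        "real_defect_info": f"This wafer does not have any {queried_defect_plural} or other defects.",
    }


def wrong_position_sample(defect: str, first_position: str, defect_position: str) -> Dict[str, str]:
    """Query proposes a first position that is actually a normal portion."""
    if first_position == defect_position:
        raise ValueError("the first position must differ from the defect position")
    return {
        "second_query": f"Is the {defect} in the wafer located in the {first_position}?",
        "real_defect_info": (
            f"The {defect} in the wafer is not located in the {first_position}, "
            f"but in the {defect_position}."
        ),
    }


def wrong_method_sample(defect_plural: str, first_method: str, correct_method: str) -> Dict[str, str]:
    """Query proposes an incorrect handling method; the target corrects it."""
    return {
        "second_query": f"Can {defect_plural} in the wafer be removed by {first_method}?",
        "real_defect_info": (
            f"The above method is incorrect; {correct_method} should be used to remove them."
        ),
    }


no_defect = defect_free_sample("crack", "cracks")
position = wrong_position_sample("foreign object", "upper left corner", "lower right corner")
method = wrong_method_sample("foreign objects", "soaking in chemical cleaning solution", "dry cleaning")
```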

[0143] According to a second aspect of this application, embodiments of this application also provide a defect detection apparatus. FIG. 4 shows a schematic block diagram of a defect detection apparatus 400 according to an embodiment of this application. As shown in FIG. 4, the defect detection apparatus 400 may include a first image feature extraction module 410 and a detection result generation module 420.

[0144] The first image feature extraction module 410 is used to perform feature extraction operations on the image of the wafer to be tested to obtain first image features. The detection result generation module 420 is used to generate the detection result of the wafer to be tested based on the first image features, the reference image features, and the first query information, through a trained language model. The reference image features are the features obtained by feature extraction operations on the reference image. The reference wafer in the reference image does not have defective parts. The first query information is used to guide the generation of the detection result, and the detection result is used to describe the defects of the wafer to be tested.
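
The cooperation of modules 410 and 420 can be sketched as follows. This is an illustrative sketch only: the class name, the callable signatures, and the two stand-in functions row_means and toy_model are hypothetical placeholders for a real feature extractor and a trained language model, and it is assumed here that the reference image features are obtained with the same extractor.

```python
from typing import Callable, List

Image = List[List[float]]
Features = List[float]


class DefectDetectionApparatus:
    """Hypothetical composition of modules 410 and 420."""

    def __init__(
        self,
        extract_features: Callable[[Image], Features],
        language_model: Callable[[Features, Features, str], str],
        reference_image: Image,
    ) -> None:
        self._extract_features = extract_features
        self._language_model = language_model
        # The reference image shows a defect-free wafer; its features are computed once.
        self._reference_features = extract_features(reference_image)

    def detect(self, image_under_test: Image, first_query: str) -> str:
        # Module 410: feature extraction on the image under test.
        first_image_features = self._extract_features(image_under_test)
        # Module 420: the language model receives both feature sets and the query.
        return self._language_model(first_image_features, self._reference_features, first_query)


def row_means(image: Image) -> Features:
    """Placeholder extractor: one mean value per image row."""
    return [sum(row) / len(row) for row in image]


def toy_model(first: Features, reference: Features, query: str) -> str:
    """Placeholder for the trained language model; it ignores the query text."""
    differs = any(abs(a - b) > 1e-9 for a, b in zip(first, reference))
    return "A defect is present." if differs else "No defect is present."


apparatus = DefectDetectionApparatus(row_means, toy_model, [[10.0, 10.0], [10.0, 10.0]])
result_clean = apparatus.detect([[10.0, 10.0], [10.0, 10.0]], "Is there a defect?")
result_defect = apparatus.detect([[10.0, 90.0], [10.0, 10.0]], "Is there a defect?")
```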

[0145] In one possible implementation, the detection device 400 further includes: a region of interest determination module and a first defect feature determination module.

[0146] The region of interest determination module compares the reference image with the image to be tested to determine the region of interest in the image to be tested. The difference between the pixel value of at least one pixel located in the region of interest and the pixel value of its corresponding pixel in the reference image is higher than a preset difference. The first defect feature determination module samples the feature points corresponding to the region of interest in the first image features to obtain the first defect feature.
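
The comparison and sampling performed by these two modules can be sketched as follows. This is a minimal sketch under stated assumptions: the images are aligned grayscale images of equal size, the region of interest is taken to be exactly the pixels whose difference exceeds the preset difference, and each feature vector is assumed to cover a stride x stride block of pixels. The function names and the stride parameter are hypothetical.

```python
from typing import List, Tuple

Pixel = Tuple[int, int]


def region_of_interest(
    test_image: List[List[float]],
    reference_image: List[List[float]],
    preset_difference: float,
) -> List[Pixel]:
    """Return (row, col) of pixels whose difference from the reference exceeds the threshold."""
    roi = []
    for r, (test_row, ref_row) in enumerate(zip(test_image, reference_image)):
        for c, (test_value, ref_value) in enumerate(zip(test_row, ref_row)):
            if abs(test_value - ref_value) > preset_difference:
                roi.append((r, c))
    return roi


def sample_defect_features(
    feature_map: List[List[List[float]]],
    roi: List[Pixel],
    stride: int,
) -> List[List[float]]:
    """Collect the feature vectors whose pixel blocks contain at least one ROI pixel."""
    cells = sorted({(r // stride, c // stride) for r, c in roi})
    return [feature_map[i][j] for i, j in cells]


reference = [[10.0] * 4 for _ in range(4)]
test_image = [row[:] for row in reference]
test_image[0][1] = 200.0  # simulated defect pixel
test_image[3][3] = 90.0   # simulated defect pixel

roi = region_of_interest(test_image, reference, preset_difference=50.0)

# A 2 x 2 grid of feature vectors, each covering a 2 x 2 block of pixels.
feature_map = [
    [[1.0, 0.0], [0.0, 1.0]],
    [[0.5, 0.5], [0.2, 0.8]],
]
first_defect_feature = sample_defect_features(feature_map, roi, stride=2)
```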

[0147] In this example, the detection result generation module 420 may include a first detection result generation module.

[0148] The first detection result generation module is used to generate the detection result of the wafer grains to be tested based on the first image features, the reference image features, the first query information, and the first defect features, through a trained language model.

[0149] In one possible implementation, the detection device 400 further includes: a second image feature extraction module, a first prediction result generation module, and a parameter adjustment module.

[0150] The second image feature extraction module performs feature extraction on the training images of the training wafer dies to obtain second image features. The first prediction result generation module generates prediction results for the training wafer dies based on the second image features, the reference image features, and the second query information, using an untrained language model. The second query information guides the generation of the prediction results, which describe the defects in the training wafer dies. The parameter adjustment module adjusts the model parameters of the untrained language model based on the difference between the prediction results and the actual defect information of the training wafer dies.
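
The parameter adjustment step can be illustrated with a toy example. The embodiments do not specify how the difference between the prediction result and the actual defect information is measured; the sketch below assumes a token-level cross-entropy loss and plain gradient descent, and treats the per-token logits themselves as the adjustable parameters in place of a real language model. All names are hypothetical.

```python
import math
from typing import List


def softmax(logits: List[float]) -> List[float]:
    """Numerically stable softmax over one token position."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def cross_entropy(logits_per_token: List[List[float]], target_ids: List[int]) -> float:
    """Mean negative log-likelihood of the target tokens (the real defect information)."""
    loss = 0.0
    for logits, target in zip(logits_per_token, target_ids):
        loss -= math.log(softmax(logits)[target])
    return loss / len(target_ids)


def adjust_step(
    logits_per_token: List[List[float]],
    target_ids: List[int],
    learning_rate: float,
) -> List[List[float]]:
    """One gradient-descent step on the mean cross-entropy with respect to the logits."""
    n = len(target_ids)
    updated = []
    for logits, target in zip(logits_per_token, target_ids):
        probs = softmax(logits)
        # Gradient of the mean loss: (softmax - one_hot(target)) / n.
        grad = [(p - (1.0 if i == target else 0.0)) / n for i, p in enumerate(probs)]
        updated.append([x - learning_rate * g for x, g in zip(logits, grad)])
    return updated


logits = [[0.0, 0.0, 0.0], [0.0, 0.0, 0.0]]  # two token positions, vocabulary of three
targets = [2, 0]                              # token ids of the real defect information
loss_before = cross_entropy(logits, targets)
loss_after = cross_entropy(adjust_step(logits, targets, learning_rate=1.0), targets)
```

In this toy setting the loss after the step is lower than before, which is the intended effect of adjusting the parameters toward the actual defect information.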

[0151] In one possible implementation, the defective portion of the training wafer corresponds to a defective region in the training image, and the first prediction result generation module may include: a second defect feature generation module, a defect description feature generation module, a fusion feature generation module, and a second prediction result generation module.

[0152] The second defect feature generation module samples the feature points corresponding to the defect region in the second image features to obtain the second defect feature. The defect description feature generation module determines the defect description feature based on the real defect information. The fusion feature generation module performs feature fusion on the second defect feature and the defect description feature to obtain the fused feature. The second prediction result generation module generates the prediction result for the training wafer dies based on the second image features, the reference image features, the second query information, and the fused feature, using an untrained language model.
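
The feature fusion can be illustrated as follows. The embodiments do not specify the fusion operation; the sketch below assumes, purely for illustration, that the sampled second defect features are mean-pooled into one vector and then concatenated with the defect description feature. The function names are hypothetical.

```python
from typing import List


def mean_pool(vectors: List[List[float]]) -> List[float]:
    """Average a list of equally sized feature vectors element by element."""
    count = len(vectors)
    return [sum(column) / count for column in zip(*vectors)]


def fuse(
    second_defect_features: List[List[float]],
    defect_description_feature: List[float],
) -> List[float]:
    """Assumed fusion: pooled visual defect feature followed by the description feature."""
    return mean_pool(second_defect_features) + list(defect_description_feature)


fused_feature = fuse([[1.0, 3.0], [3.0, 5.0]], [0.5])
```

Other fusion operations, such as element-wise addition of equally sized vectors, would fit the same interface.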

[0153] In one possible implementation, the detection device 400 further includes at least one of the following: a first region generation module, a second region generation module, and a third region generation module.

[0154] The first region generation module is used to draw a circle with the defect pixel corresponding to the defect portion of the training wafer in the training image as the center and a preset radius as the radius, to obtain the defect region. The second region generation module is used to determine the bounding rectangle of the defect pixel corresponding to the defect portion of the training wafer in the training image, to be used as the defect region. The third region generation module is used to directly use the defect pixel corresponding to the defect portion of the training wafer in the training image as the pixel within the defect region.
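
The three ways of obtaining the defect region can be sketched as follows. This is a minimal sketch: pixels are represented as (row, column) pairs, the circle is clipped to the image bounds, and the function names are hypothetical.

```python
from typing import Iterable, Set, Tuple

Pixel = Tuple[int, int]


def circle_region(center: Pixel, radius: int, height: int, width: int) -> Set[Pixel]:
    """Pixels within a preset radius of a defect pixel, clipped to the image."""
    center_row, center_col = center
    return {
        (r, c)
        for r in range(height)
        for c in range(width)
        if (r - center_row) ** 2 + (c - center_col) ** 2 <= radius ** 2
    }


def bounding_rectangle_region(defect_pixels: Iterable[Pixel]) -> Set[Pixel]:
    """All pixels inside the axis-aligned bounding rectangle of the defect pixels."""
    pixels = list(defect_pixels)
    rows = [r for r, _ in pixels]
    cols = [c for _, c in pixels]
    return {
        (r, c)
        for r in range(min(rows), max(rows) + 1)
        for c in range(min(cols), max(cols) + 1)
    }


def direct_region(defect_pixels: Iterable[Pixel]) -> Set[Pixel]:
    """Use the defect pixels themselves as the defect region."""
    return set(defect_pixels)


circle = circle_region(center=(2, 2), radius=1, height=5, width=5)
rectangle = bounding_rectangle_region([(1, 1), (2, 3)])
direct = direct_region([(1, 1), (2, 3)])
```

The circle and the rectangle both include some surrounding context around the defect, while the direct option keeps only the labeled defect pixels.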

[0155] In one possible implementation, the second query information is used to query the defect location of the defective portion of the first wafer grain in the training image, and the true defect information is used to describe that the first wafer grain does not have a defective portion, wherein the first wafer grain is a training wafer grain without a defective portion; and / or

[0156] The second query information is used to query whether a first position of the defective portion of the training wafer in the training image is correct. The actual defect information is used to describe that the first position is incorrect, and the defect position of the defective portion of the training wafer in the training image. The first position is the position of a normal portion of the training wafer in the training image; and / or

[0157] The second query information is used to query the first method of the defective part of the training wafer. The real defect information is used to describe the first method as incorrect and the correct method of the defective part of the training wafer. The first method is the incorrect handling method for the defective part, and the correct method is the correct handling method for the defective part.

[0158] In one possible implementation, the second query information is used to query the defect type of the defective portion of the training wafer, and the actual defect information is used to describe the defect type of the defective portion of the training wafer; and / or

[0159] The second query information is used to query the location of the defective portion of the training wafer in the training image, and the true defect information is used to describe the location of the defective portion of the training wafer in the training image; and / or

[0160] The second query information is used to query the positional relationship between the defective portion of the training wafer and the functional region within the training wafer; the actual defect information is used to describe the positional relationship between the defective portion of the training wafer and the functional region within the training wafer; and / or

[0161] The second query information is used to query the visual characteristics of the defective parts of the training wafer, while the actual defect information is used to describe the visual characteristics of the defective parts of the training wafer.

[0162] In one possible implementation, the second query information is used to query defect information of the training wafer dies in a preset area, and the actual defect information is used to describe the defect information of the training wafer dies in the preset area; and / or

[0163] The second query information includes defect information. The second query information is used to query the position of the part of the training wafer that matches the defect information in the training image. The real defect information is used to describe the position of the part of the training wafer that matches the defect information in the training image.

[0164] In one possible implementation, the second query information is used to query the defect type and cause of the defective portion of the training wafer, and the actual defect information is used to describe the defect type and cause of the defective portion of the training wafer; and / or

[0165] The second query information is used to query the processing method for defective parts of the training wafer, and the actual defect information is used to describe the processing method for defective parts of the training wafer; and / or

[0166] The second query information is used to query wafer processing improvement suggestions for defective parts of training wafer dies, and the actual defect information is used to describe the wafer processing improvement suggestions for defective parts of training wafer dies.

[0167] According to a third aspect of this application, an electronic device is also provided. FIG. 5 shows a schematic block diagram of an electronic device 500 according to an embodiment of this application. As shown in FIG. 5, the electronic device 500 includes a processor 510 and a memory 520, wherein the memory 520 stores computer program instructions, and the computer program instructions are executed by the processor 510 to perform the aforementioned defect detection method.

[0168] Furthermore, according to a fourth aspect of this application, a storage medium is provided, on which program instructions are stored. When the program instructions are executed by a computer or processor, the computer or processor performs corresponding steps of the defect detection method described in the embodiments of this application, and is used to implement corresponding modules in the defect detection apparatus or electronic device described in the embodiments of this application. The storage medium may, for example, include a memory card of a smartphone, a storage component of a tablet computer, a hard disk of a personal computer, a read-only memory (ROM), an erasable programmable read-only memory (EPROM), a portable compact disc read-only memory (CD-ROM), a USB memory, or any combination of the above storage media. The computer-readable storage medium may be any combination of one or more computer-readable storage media. According to a fifth aspect of this application, a computer program product is also provided, including computer program instructions. When the computer program instructions are executed by a computer or processor, the computer or processor performs corresponding steps of the defect detection method described in the embodiments of this application.

[0169] Those skilled in the art can understand the specific implementation schemes of the above-mentioned electronic devices and storage media by reading the relevant descriptions of the defect detection methods. For the sake of brevity, they will not be described in detail here.

[0170] Although exemplary embodiments have been described herein with reference to the accompanying drawings, it should be understood that the above exemplary embodiments are merely illustrative and are not intended to limit the scope of this application. Various changes and modifications can be made therein by those skilled in the art without departing from the scope and spirit of this application. All such changes and modifications are intended to be included within the scope of this application as claimed in the appended claims.

[0171] Those skilled in the art will recognize that the units and algorithm steps of the various examples described in conjunction with the embodiments disclosed herein can be implemented in electronic hardware, or a combination of computer software and electronic hardware. Whether these functions are implemented in hardware or software depends on the specific application and design constraints of the technical solution. Those skilled in the art can use different methods to implement the described functions for each specific application, but such implementation should not be considered beyond the scope of this application.

[0172] In the several embodiments provided in this application, it should be understood that the disclosed devices and methods can be implemented in other ways. For example, the device embodiments described above are merely illustrative. For instance, the division of units is only a logical functional division, and in actual implementation, there may be other division methods. For example, multiple units or components may be combined or integrated into another device, or some features may be ignored or not executed.

[0173] Numerous specific details are set forth in the specification provided herein. However, it will be understood that embodiments of this application may be practiced without these specific details. In some examples, well-known methods, structures, and techniques have not been shown in detail so as not to obscure the understanding of this specification.

[0174] Similarly, it should be understood that, in order to streamline this application and aid in understanding one or more of the various inventive aspects, features of this application may sometimes be grouped together in a single embodiment, figure, or description thereof in the description of exemplary embodiments of this application. However, this approach should not be construed as reflecting an intention that the claimed application requires more features than are expressly recited in each claim. Rather, as reflected in the corresponding claims, its inventive point lies in solving the corresponding technical problem with features fewer than all features of a single disclosed embodiment. Therefore, the claims following the detailed description are hereby expressly incorporated into that detailed description, wherein each claim itself is a separate embodiment of this application.

[0175] Those skilled in the art will understand that, apart from the mutual exclusion of features, all features disclosed in this specification (including the accompanying claims, abstract, and drawings) and all processes or units of any method or apparatus so disclosed can be combined in any combination. Unless otherwise expressly stated, each feature disclosed in this specification (including the accompanying claims, abstract, and drawings) may be replaced by an alternative feature that serves the same, equivalent, or similar purpose.

[0176] Furthermore, those skilled in the art will understand that although some embodiments described herein include certain features but not others included in other embodiments, combinations of features from different embodiments are intended to be within the scope of this application and form different embodiments. For example, in the claims, any one of the claimed embodiments can be used in any combination.

[0177] The various component embodiments of this application can be implemented in hardware, or as software modules running on one or more processors, or a combination thereof. Those skilled in the art will understand that microprocessors or digital signal processors (DSPs) can be used in practice to implement some or all of the functions of some modules in the defect detection apparatus according to embodiments of this application. This application can also be implemented as an apparatus program (e.g., a computer program and computer program product) for performing part or all of the methods described herein. Such an implementation of this application can be stored on a computer-readable medium, or can be in the form of one or more signals. Such signals can be downloaded from an Internet website, provided on a carrier signal, or provided in any other form.

[0178] It should be noted that the above embodiments are illustrative of this application and not restrictive, and that those skilled in the art can devise alternative embodiments without departing from the scope of the appended claims. In the claims, any reference signs placed between parentheses should not be construed as limiting the claims. The word "comprising" does not exclude the presence of elements or steps not listed in the claims. The word "a" or "an" preceding an element does not exclude the presence of a plurality of such elements. This application can be implemented by means of hardware comprising several different elements and by means of a suitably programmed computer. In the unit claims enumerating several means, several of these means may be embodied by the same item of hardware. The use of the words first, second, and third, etc., does not indicate any order. These words can be interpreted as names.

[0179] The above description is merely a specific embodiment or illustration of the embodiments of this application. The scope of protection of this application is not limited thereto. Any variations or substitutions that can be easily conceived by those skilled in the art within the scope of the technology disclosed in this application should be included within the scope of protection of this application. The scope of protection of this application shall be determined by the scope of the claims.

Claims

1. A method for detecting defects, characterized in that, The method includes: A feature extraction operation is performed on the image of the wafer grains to be tested to obtain the first image features; Based on the first image features, the reference image features, and the first query information, the detection result of the wafer to be tested is generated through the trained language model. The reference image features are features obtained by feature extraction from the reference image. The reference wafer in the reference image does not have any defective parts. The first query information is used to guide the generation of the detection result, and the detection result is used to describe the defective parts of the wafer to be tested.

2. The method as described in claim 1, characterized in that, The method further includes: A reference image is compared with the image to be tested to determine the region of interest in the image to be tested, wherein the difference between the pixel value of at least one pixel in the region of interest and the pixel value of the corresponding pixel in the reference image is higher than a preset difference. The feature points corresponding to the region of interest in the first image feature are sampled to obtain the first defect feature; The step of generating the detection result of the wafer grains to be tested based on the first image features, the reference image features, and the first query information, through a trained language model, includes: Based on the first image features, the reference image features, the first query information, and the first defect features, the detection results of the wafer grains to be tested are generated through the trained language model.

3. The method as described in claim 1, characterized in that, The trained language model is obtained through the following steps: Feature extraction is performed on the training images of the training wafer dies to obtain the second image features; Based on the second image features, the reference image features, and the second query information, a prediction result for the training wafer is generated using an untrained language model. The second query information is used to guide the generation of the prediction result, and the prediction result is used to describe the defective parts of the training wafer. Based on the difference between the prediction results and the actual defect information of the training wafer, the model parameters of the untrained language model are adjusted.

4. The method as described in claim 3, characterized in that, The defective portion of the training wafer corresponds to a defective region in the training image. The step of generating a prediction result for the training wafer based on the second image features, the reference image features, and the second query information, using an untrained language model, includes: The feature points corresponding to the defect region in the second image feature are sampled to obtain the second defect feature; Based on the actual defect information, the defect description features are determined; The second defect feature and the defect description feature are fused together to obtain a fused feature; Based on the second image features, the reference image features, the second query information, and the fused feature, the prediction result for the training wafer is generated using the untrained language model.

5. The method as described in claim 4, characterized in that, The defective region is obtained through any of the following steps: Using the defect pixel corresponding to the defect portion of the training wafer in the training image as the center, and drawing a circle with a preset radius, the defect region is obtained. The bounding rectangle of the defect pixels corresponding to the defective portion of the training wafer in the training image is determined as the defect region; The defective pixels corresponding to the defective portions of the training wafer in the training image are directly used as pixels within the defective region.

6. The method as described in claim 3, characterized in that, The second query information is used to query the location of the defective portion of the first wafer in the training image, and the true defect information is used to describe that the first wafer does not have a defective portion, wherein the first wafer is a training wafer without a defective portion; and / or The second query information is used to query whether a first position of the defective portion of the training wafer in the training image is correct. The actual defect information is used to describe that the first position is incorrect and the defect position of the defective portion of the training wafer in the training image, wherein the first position is the position of a normal portion of the training wafer in the training image; and / or The second query information is used to query the first method of the defective portion of the training wafer, and the real defect information is used to describe the first method as incorrect and the correct method of the defective portion of the training wafer, wherein the first method is an incorrect handling method for the defective portion, and the correct method is a correct handling method for the defective portion.

7. The method as described in claim 3, characterized in that, The second query information is used to query the defect type of the defective portion of the training wafer, and the actual defect information is used to describe the defect type of the defective portion of the training wafer; and / or The second query information is used to query the defect location of the defect portion of the training wafer in the training image, and the real defect information is used to describe the defect location of the defect portion of the training wafer in the training image; and / or The second query information is used to query the positional relationship between the defective portion of the training wafer and the functional region in the training wafer; the actual defect information is used to describe the positional relationship between the defective portion of the training wafer and the functional region in the training wafer; and / or The second query information is used to query the visual characteristics of the defective portion of the training wafer, and the actual defect information is used to describe the visual characteristics of the defective portion of the training wafer.

8. The method as described in claim 3, characterized in that, the second query information is used to query the defect information of the training wafer in a preset area, and the actual defect information is used to describe the defect information of the training wafer in the preset area; and/or the second query information includes defect information and is used to query the position, in the training image, of the portion of the training wafer that matches the defect information, and the actual defect information is used to describe the position, in the training image, of the portion of the training wafer that matches the defect information.
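
The two query directions in claim 8 (from an area to its defect information, and from defect information to matching positions) can be illustrated with annotated bounding boxes. This is a sketch only: the `DefectAnnotation` class, the box convention, and the function names are assumptions introduced for illustration, and the application does not state how ground-truth answers are derived.

```python
from dataclasses import dataclass
from typing import List, Tuple

# Hypothetical convention: (x_min, y_min, x_max, y_max) in pixels.
Box = Tuple[int, int, int, int]


@dataclass(frozen=True)
class DefectAnnotation:
    defect_type: str  # e.g. "scratch"; the labels here are illustrative
    box: Box          # position of the defective portion in the training image


def boxes_overlap(a: Box, b: Box) -> bool:
    """True if two axis-aligned boxes share interior area (touching edges do not count)."""
    return a[0] < b[2] and b[0] < a[2] and a[1] < b[3] and b[1] < a[3]


def defects_in_region(annotations: List[DefectAnnotation],
                      preset_area: Box) -> List[DefectAnnotation]:
    """Area -> defect information: annotations that overlap the preset area."""
    return [d for d in annotations if boxes_overlap(d.box, preset_area)]


def positions_matching(annotations: List[DefectAnnotation],
                       defect_type: str) -> List[Box]:
    """Defect information -> positions: boxes whose type matches the query."""
    return [d.box for d in annotations if d.defect_type == defect_type]
```

For example, with one "scratch" at `(10, 10, 20, 20)` and one "particle" at `(50, 50, 60, 60)`, querying the preset area `(0, 0, 30, 30)` yields only the scratch, while querying for "particle" yields the single box `(50, 50, 60, 60)`.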

9. The method as described in claim 3, characterized in that, the second query information is used to query the defect type and cause of the defective portion of the training wafer, and the actual defect information is used to describe the defect type and cause of the defective portion of the training wafer; and/or the second query information is used to query the processing method for the defective portion of the training wafer, and the actual defect information is used to describe the processing method for the defective portion of the training wafer; and/or the second query information is used to query wafer processing improvement suggestions for the defective portion of the training wafer, and the actual defect information is used to describe the wafer processing improvement suggestions for the defective portion of the training wafer.

10. A defect detection device, characterized in that, the device comprises: a first image feature extraction module, configured to perform a feature extraction operation on an image of a wafer grain to be tested to obtain first image features; and a detection result generation module, configured to generate, through a trained language model, a detection result for the wafer grain to be tested based on the first image features, reference image features, and first query information; wherein the reference image features are obtained by feature extraction from a reference image, the reference wafer in the reference image has no defective portion, the first query information is used to guide the generation of the detection result, and the detection result is used to describe the defects of the wafer grain to be tested.
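
The data flow between the two modules of claim 10 can be sketched as follows. Everything in this block is a stand-in chosen for illustration: `extract_features` is a toy row-mean extractor, `toy_language_model` is a threshold rule and not a language model, and the class and parameter names are hypothetical. Only the wiring (test-image features, reference-image features, and a query passed together to a model that returns a textual result) reflects the claim.

```python
from typing import Callable, List, Sequence

Image = Sequence[Sequence[float]]  # toy grayscale image as rows of pixel values
Features = List[float]


def extract_features(image: Image) -> Features:
    """Stand-in feature extractor: the mean intensity of each row."""
    return [sum(row) / len(row) for row in image]


class DefectDetector:
    """Wires the feature extraction module to the result generation module."""

    def __init__(self,
                 language_model: Callable[[Features, Features, str], str],
                 reference_image: Image) -> None:
        self._language_model = language_model
        # Reference features come from a defect-free reference image.
        self._reference_features = extract_features(reference_image)

    def detect(self, test_image: Image, first_query: str) -> str:
        first_features = extract_features(test_image)
        # The model receives both feature sets plus the guiding query.
        return self._language_model(
            first_features, self._reference_features, first_query)


def toy_language_model(first: Features, reference: Features, query: str) -> str:
    """Placeholder for the trained language model: a simple difference threshold."""
    largest_gap = max(abs(a - b) for a, b in zip(first, reference))
    return "defect suspected" if largest_gap > 0.1 else "no defect found"
```

Computing the reference features once in the constructor is a design choice of this sketch, reflecting that the same defect-free reference can be compared against many test images.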

11. An electronic device, characterized in that, the electronic device comprises a memory and a processor, wherein the memory is used to store a computer program, and the processor is used to execute the computer program to implement the defect detection method as described in any one of claims 1 to 9.

12. A storage medium storing a computer program/instructions, characterized in that, the computer program/instructions, when executed, perform the defect detection method as described in any one of claims 1 to 9.

13. A computer program product comprising computer program instructions, characterized in that, the computer program instructions, when executed, perform the defect detection method as described in any one of claims 1 to 9.