Prediction method, prediction device, and computer readable storage medium
By using machine learning models to process wafer acceptance testing and wafer probing data, wafer yield can be predicted, solving the problem of time-consuming and labor-intensive testing in existing technologies and achieving efficient wafer testing.
Patent Information
- Authority / Receiving Office
- CN · China
- Patent Type
- Applications(China)
- Current Assignee / Owner
- NUVOTON
- Filing Date
- 2025-06-05
- Publication Date
- 2026-06-12
AI Technical Summary
Existing technologies for testing wafers are time-consuming and labor-intensive, especially when testing each individual die, resulting in a waste of testing time and human resources.
By employing machine learning models that combine Wafer Acceptance Test (WAT) and Wafer Probe Test (CP) sample data, and through data processing during training and prediction, wafer yield can be predicted, reducing actual testing operations.
By using machine learning models to predict wafer yield, testing time and manpower consumption can be significantly reduced, thereby improving testing efficiency.
Figure CN122198300A_ABST (abstract drawing)
Description
Technical Field
[0001] This invention relates to a prediction method, and more particularly to a prediction method for wafer yield.
Background Technology
[0002] After wafer manufacturing is complete, the wafer manufacturer supplies the entire batch of wafers to the testing company. The testing company performs tests on all dies on each wafer to determine the wafer's yield. However, the testing process involves many tests. Testing each die individually would be extremely time-consuming and labor-intensive.
Summary of the Invention
[0003] This invention provides a prediction apparatus for predicting the yield of a batch of wafers, comprising a storage circuit, an input / output interface, and a processing circuit. The storage circuit stores a machine learning model. The input / output interface receives multiple Wafer Acceptance Test (WAT) sample data and multiple Wafer Probe (CP) sample data. The processing circuit reads from the storage circuit to load the machine learning model. During a training period, the processing circuit provides WAT sample data and CP sample data to the machine learning model. The machine learning model calculates a correlation between the WAT sample data and the CP sample data. During a prediction period, the input / output interface receives WAT measurement data, and the processing circuit inputs the WAT measurement data to the machine learning model. The machine learning model predicts the yield of the batch of wafers based on the correlation and the WAT measurement data.
[0004] This invention further provides a prediction method for predicting the yield of a batch of wafers. During a training period, multiple Wafer Acceptance Test (WAT) sample data from multiple wafers are received; a Wafer Probe Test (CP) is performed on the wafers to generate multiple CP sample data; and the WAT sample data and CP sample data are input into a machine learning model. The machine learning model calculates a correlation between the multiple WAT sample data and the multiple CP sample data. During a prediction period, WAT measurement data is input into the machine learning model. Based on the correlation and the WAT measurement data, the machine learning model predicts the yield of the batch of wafers.
[0005] The prediction method of the present invention can be implemented by the prediction device of the present invention, which is hardware or firmware capable of performing specific functions, or it can be implemented by code recorded in a recording medium and combined with specific hardware. When the code is loaded and executed by an electronic device, processor, computer or machine, the electronic device, processor, computer or machine becomes a prediction device for implementing the present invention.
Description of the Drawings
[0006] Figure 1 is a flowchart illustrating the prediction method of the present invention.
[0007] Figure 2 is a schematic diagram of the prediction device of the present invention.
[0008] Reference numerals
[0009] 111~113, 121~123: Steps
[0010] 110: Training period
[0011] 120: Prediction period
[0012] 200: Prediction device
[0013] 210: Processing circuit
[0014] 220: Storage circuit
[0015] 230, 240: Input / output interfaces
[0016] ML: Machine Learning Model
[0017] DTR: Training Data
[0018] SD_W1~SD_Wn: WAT sample data
[0019] SD_C1~SD_Cn: CP sample data
[0020] CORR: Correlation
[0021] PR: Production line data
[0022] NM: Product Data
[0023] DIN: Input data
[0024] YL: Yield Rate
[0025] SD_M: WAT measurement data
Detailed Implementation
[0026] To make the objectives, features, and advantages of this invention more apparent and understandable, embodiments are provided below, along with detailed descriptions in conjunction with the accompanying drawings. This specification provides different embodiments to illustrate the technical features of different implementations of the invention. The configuration of the elements in the embodiments is for illustrative purposes only and is not intended to limit the invention. Furthermore, the repetition of some reference numerals in the embodiments is for simplification and does not imply any correlation between different embodiments.
[0027] Figure 1 is a schematic flowchart of the prediction method of the present invention. The prediction method of the present invention is used to predict the yield of a batch of wafers. A batch contains multiple wafers. For example, each batch may have 25 wafers, and each wafer has multiple dies. In one possible embodiment, the prediction method of the present invention can predict the die yield of each wafer. The prediction method of the present invention can exist in the form of code. The code may be stored in a computer-readable storage medium. When the code on the computer-readable storage medium is loaded into and executed by a machine, the machine becomes a prediction apparatus for implementing the present invention.
[0028] During training 110, multiple wafer acceptance test (WAT) data are received (step 111). In one possible embodiment, the WAT data is provided by the wafer foundry. The wafer foundry tests each batch of wafers to generate a WAT data set.
[0029] For example, suppose each batch of wafers has 25 wafers, each wafer has 6 test points, and the wafer manufacturer performs 100 tests on each test point. In this example, each WAT data record contains the test results of 25 wafers, meaning each WAT data record contains 25 * 6 * 100 test results. In this embodiment, the multiple WAT data records in step 111 represent the test results of multiple batches of wafers. For ease of explanation, the WAT data recorded during training (110) is referred to as WAT sample data.
[0030] In some embodiments, the manufacturing specifications of each batch of wafers may differ. For example, the first batch of wafers may come from a first process route, which may be used to produce 55-nanometer wafers, and the second batch of wafers may come from a second process route, which may be used to produce 65-nanometer wafers. In this example, each WAT data record records the process route data for that batch of wafers.
[0031] During the training period 110, a chip probing (CP) test is performed on the multiple wafers recorded in the WAT sample data to generate multiple CP sample data (step 112). In one possible embodiment, if the first WAT sample data is the test result of the first batch of wafers, then step 112 performs a CP test on each wafer of the first batch of wafers. Since the first batch of wafers includes multiple wafers, step 112 generates multiple CP sample data after completing the CP test.
[0032] In one possible embodiment, the CP testing operation involves performing multiple electrical tests on each die of each wafer. Assume the CP testing operation includes 100 test items, each wafer has 1000 dies, and each batch has 25 wafers. After the CP testing operation, a data log is generated. Each data log includes the electrical test results for each wafer in each batch. For example, each data log might include 100*1000*25 electrical test results. In this example, each data log serves as a CP sample data.
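As a quick check of the data volumes in the two examples above, the following sketch multiplies out the counts given in the text. The constant names are illustrative and are not part of the original disclosure.

```python
# Counts taken from the examples in paragraphs [0029] and [0032].
# The names below are illustrative only.
WAFERS_PER_BATCH = 25
WAT_TEST_POINTS_PER_WAFER = 6
WAT_TESTS_PER_POINT = 100
CP_TEST_ITEMS = 100
DIES_PER_WAFER = 1000

# One WAT data record covers every test at every test point of every wafer.
wat_results_per_record = (
    WAFERS_PER_BATCH * WAT_TEST_POINTS_PER_WAFER * WAT_TESTS_PER_POINT
)

# One CP data log covers every test item on every die of every wafer.
cp_results_per_log = CP_TEST_ITEMS * DIES_PER_WAFER * WAFERS_PER_BATCH

print(wat_results_per_record)  # 15000
print(cp_results_per_log)      # 2500000
```

With these example numbers, a CP data log holds roughly 167 times as many results as a WAT data record, which illustrates why skipping CP testing saves so much effort.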
[0033] During the training period 110, the WAT sample data and the CP sample data are input into a machine learning model (step 113). In one possible embodiment, the machine learning model processes the multiple WAT sample data and the multiple CP sample data to generate a correlation. For example, this correlation may include the relationship between a specific parameter (e.g., Rs_P2) in the WAT sample data and a test item (e.g., RC22M) in the CP sample data: the larger the specific parameter (e.g., Rs_P2), the more likely the test item (e.g., RC22M) is to fail, that is, to produce a test result outside the normal range.
[0034] In some embodiments, the machine learning model generates a correlation based on abnormal electrical test results from WAT sample data and CP sample data. For example, suppose a specific test item (e.g., RC22M) in a portion of the CP sample data shows an abnormal result. In this example, the machine learning model determines which test result in the WAT sample data is correlated with the abnormal result of the specific test item. Suppose a specific parameter (e.g., Rs_P2) in the WAT sample data is correlated with the abnormal test result of a specific test item in the CP sample data. In this example, the machine learning model establishes a correlation between a specific parameter (e.g., Rs_P2) in the WAT sample data and a specific test item (e.g., RC22M) in the CP sample data.
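The text does not state how the correlation is computed. As one possible illustration, the sketch below uses a plain Pearson coefficient between a WAT parameter and a pass/fail indicator for a CP test item. The choice of Pearson, the function name `pearson`, and all sample values are assumptions made for this example; they are not taken from the disclosure.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0  # a constant series carries no correlation information
    return cov / sqrt(var_x * var_y)


# Made-up illustration data: one Rs_P2 value per wafer, and whether
# that wafer showed an abnormal RC22M result in CP testing (1 = abnormal).
rs_p2 = [10.1, 10.4, 10.2, 11.8, 12.1, 10.3, 12.4, 11.9]
rc22m_abnormal = [0, 0, 0, 1, 1, 0, 1, 1]

corr = pearson(rs_p2, rc22m_abnormal)
print(round(corr, 3))
```

A coefficient close to 1 for this made-up data would indicate that higher Rs_P2 values go together with abnormal RC22M results, which is the kind of relationship the paragraph describes.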
[0035] Next, during a prediction period 120, WAT data is received (step 121). In one possible embodiment, upon receiving a batch of wafers from the wafer manufacturer, the wafer manufacturer also provides the WAT data for that batch of wafers. For ease of explanation, the WAT data during the prediction period is referred to as WAT measurement data. In one possible embodiment, the WAT measurement data and the WAT sample data are provided by the same wafer manufacturer.
[0036] The WAT measurement data is input into the trained machine learning model (step 122). In some embodiments, it is assumed that during the training period 110, step 113 trains the corresponding machine learning model based on the production line data recorded in the WAT sample data. When the production line data recorded in a first WAT sample data relates to the 55-nanometer process, step 113 inputs the first WAT sample data and the corresponding first CP sample data into a first machine learning model. When the production line data recorded in a second WAT sample data relates to the 65-nanometer process, step 113 inputs the second WAT sample data and the corresponding second CP sample data into a second machine learning model. When the production line data recorded in a third WAT sample data relates to the 55-nanometer process, step 113 inputs the third WAT sample data and the corresponding third CP sample data into the first machine learning model. In this example, step 113 trains both the first and second machine learning models. Therefore, during the prediction period 120, step 122 selects the corresponding machine learning model based on the production line data recorded in the WAT measurement data. For example, when the production line data recorded in the WAT measurement data relates to the 55-nanometer process, step 122 selects the first machine learning model and inputs the WAT measurement data into the first machine learning model.
[0037] Next, the machine learning model applies the correlation to the WAT measurement data to predict the yield of the batch of wafers recorded in the WAT measurement data (step 123). Since the correlation represents the relationship, and the strength of that relationship, between the test items of the WAT sample data and the test items of the CP sample data, the yield of the wafers can be predicted based on the correlation.
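The routing of samples to per-process models described above can be sketched as follows. This is a minimal illustration only: each "model" is stood in for by the list of training pairs it would receive, and the names `group_by_route` and `select_model` as well as the string labels are hypothetical.

```python
from collections import defaultdict


def group_by_route(training_records):
    """Group (route, wat_sample, cp_sample) records by process route.

    Each group stands in for the training set of one model.
    """
    grouped = defaultdict(list)
    for route, wat_sample, cp_sample in training_records:
        grouped[route].append((wat_sample, cp_sample))
    return dict(grouped)


def select_model(models, route):
    """Pick the model that matches the route recorded in the WAT data."""
    if route not in models:
        raise KeyError(f"no model trained for process route {route!r}")
    return models[route]


# The three training samples from the example: 55 nm, 65 nm, 55 nm.
training_records = [
    ("55nm", "WAT_1", "CP_1"),
    ("65nm", "WAT_2", "CP_2"),
    ("55nm", "WAT_3", "CP_3"),
]

models = group_by_route(training_records)
selected = select_model(models, "55nm")
print(selected)  # [('WAT_1', 'CP_1'), ('WAT_3', 'CP_3')]
```

Keying the models by process route means a batch from an unseen route raises an error instead of being silently scored by a model trained on a different process.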
[0038] For example, suppose the correlation indicates that when a specific parameter (e.g., Rs_P2) in the WAT sample data exceeds a critical value, the corresponding wafer fails a specific test item (e.g., RC22M) in the CP testing operation. In this example, when the machine learning model learns that a specific parameter (e.g., Rs_P2) in the WAT measurement data exceeds a critical value, step 123 directly marks the specific test item of the corresponding wafer as failed. Since the wafer does not actually need to undergo CP testing, testing time and manpower can be significantly reduced.
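The critical-value rule in this example can be sketched as a simple comparison. The critical value, the wafer IDs, the measurements, and the function name `flag_failed_wafers` are all hypothetical; the disclosure gives no numeric values.

```python
def flag_failed_wafers(rs_p2_by_wafer, critical_value):
    """Return IDs of wafers whose Rs_P2 exceeds the critical value.

    These wafers are marked as failing the correlated CP test item
    (RC22M in the example) without running the CP test.
    """
    return sorted(
        wafer_id
        for wafer_id, value in rs_p2_by_wafer.items()
        if value > critical_value
    )


CRITICAL_RS_P2 = 11.5  # hypothetical critical value

# Hypothetical WAT measurement data: one Rs_P2 value per wafer.
wat_measurement = {"W01": 10.2, "W02": 12.3, "W03": 11.5, "W04": 11.9}

failed = flag_failed_wafers(wat_measurement, CRITICAL_RS_P2)
print(failed)  # ['W02', 'W04']
```

Wafer W03 sits exactly at the critical value and is not flagged, because the text speaks of the parameter exceeding the critical value.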
[0039] In other embodiments, step 123 further predicts the location of defective dies. For example, after performing a CP test, in addition to generating CP sample data, a CP map is also generated. This CP map shows the location of defective dies. Therefore, during training 110, if the CP map is also input into the machine learning model, the correlation generated by the machine learning model includes location information. During prediction 120, step 123 predicts not only the yield of each wafer in the batch but also the location of defective dies on each wafer.
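To make the location information concrete, the sketch below shows how defective-die positions and a wafer yield could be read from a CP map. The text-grid encoding ('P' for a passing die, 'F' for a failing die, '.' for no die) and the function names are assumptions for illustration; the disclosure does not define a CP map format.

```python
def defective_locations(cp_map):
    """Return (row, column) of every failing die in the map."""
    return [
        (r, c)
        for r, row in enumerate(cp_map)
        for c, cell in enumerate(row)
        if cell == "F"
    ]


def wafer_yield(cp_map):
    """Fraction of dies on the wafer that pass."""
    dies = [cell for row in cp_map for cell in row if cell in ("P", "F")]
    if not dies:
        raise ValueError("CP map contains no dies")
    return dies.count("P") / len(dies)


# Hypothetical CP map of a very small wafer.
cp_map = [
    ".PP.",
    "PFPP",
    "PPPF",
    ".PP.",
]

print(defective_locations(cp_map))  # [(1, 1), (2, 3)]
print(round(wafer_yield(cp_map), 3))  # 0.833
```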
[0040] In some embodiments, the CP testing operation includes multiple test items, a first portion of which corresponds to a first product, and a second portion of which corresponds to a second product. The present invention does not limit the types of the first and second products. In one possible embodiment, both the first and second products are functional circuits used to provide a preset function. Taking the first product as an example, the first product may be a conversion circuit (such as an ADC or DAC) providing a conversion function. In another possible embodiment, the first product is a control circuit (such as a DMA controller) used to provide a control function. In some embodiments, step 123 further predicts the yield of the first and second products in a batch of wafers.
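One way to picture a per-product yield is to group the CP test items by product and count a die as good for a product only if all of that product's items pass. That pass rule, the test item names, the result values, and the function name `product_yields` are assumptions for this sketch.

```python
def product_yields(die_results, item_to_product):
    """Yield per product: share of dies passing all of its test items."""
    if not die_results:
        raise ValueError("no die results given")
    yields = {}
    for product in sorted(set(item_to_product.values())):
        items = [i for i, p in item_to_product.items() if p == product]
        passing = sum(all(die[i] for i in items) for die in die_results)
        yields[product] = passing / len(die_results)
    return yields


# Hypothetical mapping: the first part of the test items belongs to the
# first product (an ADC), the second part to the second product (a DAC).
item_to_product = {"T1": "ADC", "T2": "ADC", "T3": "DAC"}

# Hypothetical pass/fail results for four dies.
die_results = [
    {"T1": True, "T2": True, "T3": True},
    {"T1": True, "T2": False, "T3": True},
    {"T1": False, "T2": True, "T3": False},
    {"T1": True, "T2": True, "T3": True},
]

print(product_yields(die_results, item_to_product))
# {'ADC': 0.5, 'DAC': 0.75}
```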
[0041] Figure 2 is a schematic diagram of the prediction device of the present invention. As shown, the prediction device 200 is used to predict the yield of a batch of wafers and includes a processing circuit 210 and a storage circuit 220. The storage circuit 220 stores a machine learning model ML. In one possible embodiment, the storage circuit 220 has a non-volatile memory for storing the machine learning model ML.
[0042] The processing circuit 210 reads from the storage circuit 220 to load the machine learning model ML. In this embodiment, the processing circuit 210 trains the machine learning model ML using training data DTR. The training data DTR includes multiple WAT sample data SD_W1~SD_Wn and multiple CP sample data SD_C1~SD_Cn. During a training period, the processing circuit 210 loads the machine learning model ML and inputs the WAT sample data SD_W1~SD_Wn and the CP sample data SD_C1~SD_Cn into the machine learning model ML to train it. The machine learning model ML processes the WAT sample data SD_W1~SD_Wn and the CP sample data SD_C1~SD_Cn to generate a correlation CORR. In one possible embodiment, the processing circuit 210 writes the trained machine learning model ML and the correlation CORR into the storage circuit 220.
[0043] In other embodiments, the training data DTR further includes production line data PR. The processing circuit 210 selects different machine learning models based on different production line data PRs. In this example, the storage circuit 220 has multiple machine learning models for predicting wafer yield for different production line data PRs. In some embodiments, the production line data PRs are recorded in WAT sample data SD_W1 to SD_Wn.
[0044] In one possible embodiment, the training data DTR further includes product data NM, such as product name (IP name). Product data NM may be recorded in CP sample data SD_C1 to SD_Cn. In this example, CP sample data SD_C1 to SD_Cn represent multiple electrical test results from different batches of wafers. A portion of these electrical test results pertains to a first product (e.g., an ADC circuit), while another portion pertains to a second product (e.g., a DAC circuit). In this example, each of the CP sample data SD_C1 to SD_Cn records both first product data and second product data. The first product data corresponds to the first product (e.g., an ADC circuit). The second product data corresponds to the second product (e.g., a DAC circuit).
[0045] In other embodiments, the prediction device 200 further includes an input / output interface 230. The input / output interface 230 receives training data DTR and provides the training data DTR to the processing circuitry 210. In one possible embodiment, the WAT sample data SD_W1 to SD_Wn are provided by the wafer manufacturer. In this example, the wafer manufacturer provides the WAT test data for each batch of wafers to the backend testing vendor. In this embodiment, the WAT sample data SD_W1 to SD_Wn are WAT test data from different batches of wafers. The backend testing vendor performs CP testing operations on different batches of wafers to generate CP sample data SD_C1 to SD_Cn.
[0046] During a prediction period, the processing circuit 210 provides input data DIN to the trained machine learning model ML. The machine learning model ML predicts the yield YL of a batch of wafers based on the correlation CORR and the input data DIN. In this embodiment, the input data DIN includes WAT measurement data SD_M. Like the WAT sample data SD_W1 to SD_Wn, the WAT measurement data SD_M is provided by the wafer manufacturer. In this example, the wafer manufacturer tests a specific batch of wafers to generate the WAT measurement data SD_M. The machine learning model ML predicts the yield of this specific batch of wafers based on the correlation CORR and the WAT measurement data SD_M.
[0047] In some embodiments, the input / output interface 230 further receives input data DIN and provides the input data DIN to the processing circuitry 210. In other embodiments, the prediction device 200 further includes an input / output interface 240 for outputting the predicted yield rate YL. In one possible embodiment, the processing circuitry 210 uses the input / output interface 230 to output the predicted yield rate YL.
[0048] In some embodiments, during the training period, the input / output interface 230 further receives a CP map SD_MP. In this example, the processing circuit 210 provides the CP map SD_MP, the WAT sample data SD_W1~SD_Wn, and the CP sample data SD_C1~SD_Cn to the machine learning model ML to train it. During the prediction period, the machine learning model ML predicts not only the wafer yield YL but also the location LOC of the defective dies on each wafer. In one possible embodiment, the processing circuit 210 writes the prediction result (the location LOC of the defective dies) generated by the machine learning model ML to the storage circuit 220.
[0049] In some embodiments, the machine learning model (ML) further predicts the yield of a specific product for each wafer in a batch of wafers. This specific product is a functional circuit, such as an ADC, DAC, etc. Each die of each wafer has at least one functional circuit. In this example, the machine learning model (ML) further predicts the yield of at least one functional circuit for each die.
[0050] The prediction method, or a specific form or part thereof, of the present invention may exist in the form of code. The code may be stored on physical media, such as floppy disks, optical disks, hard disks, or any other machine-readable (e.g., computer-readable) storage media, or may be a computer program product, not limited to an external form. When the code is loaded and executed by a machine, such as a computer, that machine becomes the prediction apparatus for the present invention. The code may also be transmitted via some transmission medium, such as wires or cables, optical fibers, or any transmission method. When the code is received, loaded, and executed by a machine, such as a computer, that machine becomes the prediction apparatus for the present invention. When implemented in a general-purpose processing unit, the code, combined with the processing unit, provides a unique apparatus that operates similarly to an application-specific logic circuit.
[0051] Unless otherwise defined, all terms herein (including technical and scientific terms) are as commonly understood by one of ordinary skill in the art to which this invention pertains. Furthermore, unless expressly stated otherwise, definitions of terms in general dictionaries should be interpreted as consistent with their meaning in the context of their respective technical fields, and not as idealized or overly formal expressions. While terms such as “first” and “second” may be used to describe various elements, these elements should not be limited by these terms. These terms are merely used to distinguish one element from another. In the claims, terms such as “first” and “second” are used as designations and are not intended to impose numerical requirements on their contents.
[0052] While the present invention has been disclosed above with reference to preferred embodiments, it is not intended to limit the invention. Those skilled in the art can make modifications and refinements without departing from the spirit and scope of the invention. For example, the systems, apparatus, or methods described in the embodiments of the present invention can be implemented in physical embodiments of hardware, software, or a combination of hardware and software. Therefore, the scope of protection of the present invention shall be determined by the scope defined in the appended claims.
Claims
1. A prediction method for predicting the yield of a batch of wafers, characterized by comprising: during a training period: receiving multiple wafer acceptance test (WAT) sample data from multiple wafers; performing a chip probing (CP) test on the multiple wafers to generate multiple CP sample data; and inputting the multiple WAT sample data and the multiple CP sample data into a machine learning model, wherein the machine learning model processes the multiple WAT sample data and the multiple CP sample data to generate a correlation; and during a prediction period: inputting WAT measurement data into the machine learning model, whereby the machine learning model predicts the yield of the batch of wafers based on the correlation and the WAT measurement data.
2. The prediction method as described in claim 1, characterized in that, based on the correlation and the WAT measurement data, the machine learning model predicts the location of defective dies in each wafer in the batch.
3. The prediction method as described in claim 2, characterized in that the multiple CP sample data are the electrical test results of the multiple wafers.
4. The prediction method as described in claim 2, characterized in that the machine learning model processes the abnormal electrical test results of the multiple WAT sample data and the multiple CP sample data to generate the correlation.
5. The prediction method as described in claim 4, characterized in that the CP testing operation includes multiple test items, a first part of which corresponds to a first product, and a second part of which corresponds to a second product.
6. The prediction method as described in claim 5, characterized in that, based on the correlation and the WAT measurement data, the machine learning model predicts the yield of the first and second products in the batch of wafers.
7. A prediction device for predicting the yield of a batch of wafers, characterized by comprising: a storage circuit that stores a machine learning model; an input / output interface for receiving multiple wafer acceptance test (WAT) sample data and multiple wafer probe test (CP) sample data; and a processing circuit that reads the storage circuit to load the machine learning model; wherein: during a training period, the processing circuit provides the multiple WAT sample data and the multiple CP sample data to the machine learning model, which processes the multiple WAT sample data and the multiple CP sample data to generate a correlation; and during a prediction period, the input / output interface receives WAT measurement data, and the processing circuit inputs the WAT measurement data to the machine learning model, wherein the machine learning model predicts the yield of the batch of wafers based on the correlation and the WAT measurement data.
8. The prediction device as claimed in claim 7, characterized in that the machine learning model predicts the location of defective dies in each wafer in the batch based on the correlation and the WAT measurement data and generates a prediction result, and the processing circuit writes the prediction result into the storage circuit.
9. The prediction device as claimed in claim 7, characterized in that the machine learning model predicts the yield of a product in the batch of wafers based on the correlation and the WAT measurement data, each of the multiple CP sample data including multiple test results, at least one of the multiple test results being for the product.
10. A computer-readable storage medium for storing code, characterized in that, when the code is executed, it performs the following steps: during a training period: receiving multiple wafer acceptance test (WAT) sample data from multiple wafers; performing a chip probing (CP) test on the multiple wafers to generate multiple CP sample data; and inputting the multiple WAT sample data and the multiple CP sample data into a machine learning model, wherein the machine learning model processes the multiple WAT sample data and the multiple CP sample data to generate a correlation; and during a prediction period: inputting WAT measurement data into the machine learning model, wherein the machine learning model predicts the yield of the plurality of wafers based on the correlation and the WAT measurement data.