Information processing method and information processing apparatus
By acquiring and segmenting time-series data in semiconductor manufacturing processes, calculating statistical values, and generating models, the problem of low feature extraction accuracy in existing technologies is solved, achieving higher-precision feature extraction and process status reflection.
Patent Information
- Authority / Receiving Office
- CN · China
- Patent Type
- Patents(China)
- Current Assignee / Owner
- TOKYO ELECTRON LTD
- Filing Date
- 2021-06-18
- Publication Date
- 2026-06-26
AI Technical Summary
In semiconductor manufacturing processes, existing technologies struggle to accurately extract features from time-series data sets measured during repeated processing. The extracted features reflect the processing status poorly, and it is difficult to determine which repetition's data to use. As a result, setting process-related parameters takes significant time and expertise.
By acquiring time-series data of the substrate processing device, calculating the statistical values of the periodic processing, dividing the data into specified intervals, and performing multivariate analysis and machine learning based on the segmented data, a model is generated to improve the accuracy of feature extraction.
It improves the feature extraction accuracy of time series data sets measured during repeated processing, accurately reflects the processing status, simplifies process-related settings, and reduces reliance on human experience.
Abstract
Description
Technical Field
[0001] This disclosure relates to information processing methods and information processing apparatus.
Background Technology
[0002] Semiconductor device manufacturing processes include atomic layer deposition (ALD), which repeatedly deposits thin unit films, each close to a monolayer, onto a substrate by switching between multiple process gases. They also include atomic layer etching (ALE), which repeatedly etches thin unit films, each close to a monolayer, from a substrate. Both ALD and ALE repeat the same process on a single substrate to carry out a defined process.
[0003] Patent Document 1: Japanese Patent Application Publication No. 2012-209593
Summary of the Invention
[0004] This disclosure provides an information processing method and an information processing apparatus that can improve the accuracy of feature extraction from time series data sets measured during repeated processing.
[0005] According to one aspect of the present disclosure, an information processing method acquires a set of time-series data measured during periodic processing of a substrate. The method calculates statistical values for each period of the periodic processing for each time-series data contained in the acquired set of time-series data. The method generates statistical data based on the calculated statistical values. The method divides the generated statistical data or time-series data into predetermined intervals. Based on the divided statistical data or time-series data, the method calculates a representative value for each interval.
[0006] According to this disclosure, the accuracy of feature extraction from time series data sets measured during repeated processing can be improved.
Attached Figure Description
[0007] Figure 1 is a block diagram illustrating an example of an information processing system in one embodiment of the present disclosure.
[0008] Figure 2 is a block diagram illustrating an example of the hardware structure of an information processing apparatus in one embodiment of the present disclosure.
[0009] Figure 3 is a functional block diagram illustrating an example of the functional structure of an information processing apparatus in one embodiment of the present disclosure.
[0010] Figure 4 is a graph representing an example of time series data.
[0011] Figure 5 is a graph representing an example of magnified time series data.
[0012] Figure 6 is a graph illustrating an example of the calculation of statistical values from time series data.
[0013] Figure 7 is a graph representing an example of an interval defined by Bayesian optimization.
[0014] Figure 8 is a graph that illustrates an example of the relationship between representative values and measured data for an interval.
[0015] Figure 9 is a graph representing an example of how representative values for a range of statistical data are derived from time series data.
[0016] Figure 10 is a graph illustrating an example of a comparison between the detection of process anomalies and previous data.
[0017] Figure 11 is a flowchart illustrating an example of the feature extraction process in this embodiment.
[0018] Figure 12 is a flowchart illustrating an example of the prediction processing in this embodiment.
Detailed Implementation
[0019] Hereinafter, embodiments of the disclosed information processing method and information processing apparatus will be described in detail based on the accompanying drawings. Furthermore, the disclosed technology is not limited by the following embodiments.
[0020] In processes involving repeated processing, such as ALD and ALE, the time-series data representing the process state becomes extremely voluminous, because process gas input, energy input such as heat, and process gas purging are repeated over hundreds of cycles within a short period. Because the time-series data repeats very fine cycles that exhibit similar tendencies, it is difficult to extract the important characteristics that contribute to process defects and completion status by examining the data as-is. For example, in Patent Document 1, when a sub-recipe is executed repeatedly, feature quantities are extracted from the time-series data using the data from a specific execution of the sub-recipe. However, the extracted feature quantities do not characterize the repeated processing as a whole. It is therefore difficult to extract feature quantities that accurately reflect the processing status when the same cycle is repeated hundreds of times. It is also difficult to determine which of the hundreds of repetitions should supply the data. In other words, because the accuracy of feature quantity extraction is low, process-related settings require extensive knowledge and time. It is therefore desirable to improve the accuracy of feature quantity extraction from time-series data sets measured during repeated processing.
[0021] [Structure of Information Processing System 1]
[0022] Figure 1 is a block diagram illustrating an example of an information processing system in one embodiment of the present disclosure. The information processing system 1 shown in Figure 1 includes a substrate processing apparatus 10, a result data acquisition apparatus 20, and an information processing apparatus 100. Furthermore, there may be multiple substrate processing apparatus 10, result data acquisition apparatus 20, and information processing apparatus 100.
[0023] The substrate processing apparatus 10 is, for example, a film deposition apparatus or an etching apparatus configured to perform ALD (Atomic Layer Deposition) or ALE (Atomic Layer Etching) processes on a substrate (semiconductor wafer, hereinafter referred to as a wafer) to be processed. The substrate processing apparatus 10 performs the wafer processing and sends the time-series data set measured in the process to the information processing apparatus 100.
[0024] The result data acquisition device 20 performs a prescribed inspection (e.g., film deposition speed) on the substrate that has finished processing in the substrate processing device 10 and acquires result data. The result data acquisition device 20 sends the acquired result data to the information processing device 100 as model creation data.
[0025] The information processing apparatus 100 receives time-series data sets from the substrate processing apparatus 10 and result data from the result data acquisition apparatus 20. Based on the received time-series data sets and other information, the information processing apparatus 100 extracts feature quantities and generates a model for outputting prediction results related to the process results. Furthermore, the information processing apparatus 100 receives new time-series data sets from the substrate processing apparatus 10 and, based on these new time-series data sets, outputs prediction results related to the process results in the substrate processing apparatus 10. These prediction results may include, for example, process anomaly detection information, wafer information, and various prediction information from the substrate processing apparatus.
[0026] [Hardware structure of information processing device 100]
[0027] Figure 2 is a block diagram illustrating an example of the hardware structure of an information processing apparatus in one embodiment of the present disclosure. As shown in Figure 2, the information processing device 100 includes a CPU (Central Processing Unit) 101, a ROM (Read Only Memory) 102, and a RAM (Random Access Memory) 103. Furthermore, the processor (processing circuit) such as the CPU 101 and the memory such as the ROM 102 and RAM 103 form what is called a computer.
[0028] Furthermore, the information processing device 100 includes: an auxiliary storage device 104, a display device 105, an operation device 106, an I / F (Interface) device 107, and a drive device 108. In addition, the various hardware components of the information processing device 100 are interconnected via a bus 109.
[0029] CPU 101 is a computing device that executes various programs (e.g., prediction programs) installed in auxiliary storage device 104.
[0030] ROM 102 is a non-volatile memory that functions as the main storage device. ROM 102 stores various programs and data required by CPU 101 to execute various programs installed on auxiliary storage device 104. Specifically, ROM 102 stores boot programs such as BIOS (Basic Input / Output System) and EFI (Extensible Firmware Interface).
[0031] RAM 103 is a volatile memory such as DRAM (Dynamic Random Access Memory) or SRAM (Static Random Access Memory), and functions as the main storage device. RAM 103 provides a working area for the CPU 101 to execute various programs installed on the auxiliary storage device 104.
[0032] The auxiliary storage device 104 stores various programs and various data used by the CPU 101 when executing these programs. For example, the time-series data storage unit, which will be described later, is implemented in the auxiliary storage device 104.
[0033] Display device 105 is a display device that displays the internal status of information processing device 100. Operating device 106 is an input device used by the administrator of information processing device 100 to input various instructions to information processing device 100. I / F device 107 is a connection device that connects to a network (not shown) and is used for communication.
[0034] The drive device 108 is a device into which the recording medium 110 is loaded. The recording medium 110 referred to here includes media that record information optically, electrically, or magnetically, such as CD-ROMs, floppy disks, and optical disks. In addition, the recording medium 110 may also include semiconductor memories that record information electrically, such as ROMs and flash memory.
[0035] The various programs installed on the auxiliary storage device 104 are installed, for example, by loading a distributed recording medium 110 into the drive device 108 and having the drive device 108 read the programs recorded on the recording medium 110. Alternatively, the various programs installed on the auxiliary storage device 104 can also be installed by downloading them from a network (not shown).
[0036] [Functional Structure of Information Processing Device 100]
[0037] Figure 3 is a functional block diagram illustrating an example of the functional structure of an information processing apparatus in one embodiment of the present disclosure. The information processing apparatus 100 includes a storage unit 220 and a control unit 230.
[0038] The storage unit 220 is implemented, for example, using a semiconductor memory element such as RAM 103, flash memory, or a storage device such as a hard disk or optical disk. The storage unit 220 includes a time-series data group storage unit 221 and a result data storage unit 222. Furthermore, the storage unit 220 stores information used in the processing within the control unit 230.
[0039] The time-series data group storage unit 221 stores each time-series data group measured during the process of periodically processing multiple wafers in the substrate processing apparatus 10. The time-series data group storage unit 221 stores, for example, information such as the voltage (RF Vpp) of the high-frequency power supply of the substrate processing apparatus 10 as the time-series data contained in the time-series data group. Figure 4 is a graph representing an example of time series data. The chart 150 shown in Figure 4, an example of time series data, plots the voltage of the high-frequency power supply against the elapsed time of the process, i.e., of the periodic processing.
[0040] Figure 5 is a graph representing an example of magnified time series data. The chart 151 shown in Figure 5 is an enlarged view of a portion of the chart 150 in Figure 4. As shown in chart 151, the voltage of the high-frequency power supply repeats periods that each contain a peak. In addition, each time series data group is stored in the time series data group storage unit 221 in association with the wafer No. of the corresponding wafer.
[0041] Returning to Figure 3, the result data storage unit 222 stores result data related to the process results for each wafer. As result data, various measurement data related to the completion status of the wafer after the process, such as film thickness, can be used. The result data storage unit 222 stores data input from the operation device 106 or the I / F device 107.
[0042] Storage unit 220 also stores statistical data, interval information, models, etc. Statistical data is data that arranges the statistical values of each period of the periodic processing calculated for each time series data in a time series manner. In other words, statistical data makes it easy to grasp the overall trend of the time series data. Interval information is information used to divide statistical data or time series data into specified intervals. In interval division, by adjusting the division method, the characteristics of the process can be accurately grasped. Furthermore, by using representative values of statistical data or time series data based on appropriately divided intervals, the accuracy of the model can be improved. The model is a model generated based on multivariate analysis or machine learning using statistical data or time series data. Result data can also be used in model generation. For example, the model can be generated using the 3σ Mahalanobis distance based on the normal distribution of the data. For example, in the case of anomaly detection, a model can be used that detects anomalies if the Mahalanobis distance continuously exceeds a threshold. Alternatively, other models such as linear regression models generated using PLS (Partial Least Squares) regression can also be used.
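The Mahalanobis-distance model with a "continuously exceeds a threshold" rule can be illustrated with a minimal sketch. It is only an illustration under stated assumptions, not the implementation of the disclosure: the feature vector is assumed to have exactly two components so that the covariance matrix can be inverted by hand, the threshold of 3.0 stands in for the 3σ criterion, and the names `fit_reference`, `mahalanobis`, `is_anomalous`, and `run_length` are hypothetical.

```python
from statistics import mean

def fit_reference(samples):
    """Estimate the mean vector and 2x2 covariance from normal (x1, x2) samples."""
    n = len(samples)
    m1 = mean(s[0] for s in samples)
    m2 = mean(s[1] for s in samples)
    c11 = sum((s[0] - m1) ** 2 for s in samples) / (n - 1)
    c22 = sum((s[1] - m2) ** 2 for s in samples) / (n - 1)
    c12 = sum((s[0] - m1) * (s[1] - m2) for s in samples) / (n - 1)
    return (m1, m2), (c11, c12, c22)

def mahalanobis(x, ref):
    """Mahalanobis distance of a 2-component point from the reference distribution."""
    (m1, m2), (c11, c12, c22) = ref
    det = c11 * c22 - c12 * c12  # assumed non-zero (non-degenerate covariance)
    d1, d2 = x[0] - m1, x[1] - m2
    # d^T C^-1 d, with C^-1 = (1/det) * [[c22, -c12], [-c12, c11]]
    q = (c22 * d1 * d1 - 2 * c12 * d1 * d2 + c11 * d2 * d2) / det
    return max(q, 0.0) ** 0.5    # clamp tiny negative rounding errors

def is_anomalous(distances, threshold=3.0, run_length=2):
    """Flag an anomaly when the distance exceeds the threshold run_length times in a row."""
    run = 0
    for d in distances:
        run = run + 1 if d > threshold else 0
        if run >= run_length:
            return True
    return False
```

Here `fit_reference` would be given representative values from wafers known to be normal, and `is_anomalous` would be applied to the sequence of distances computed for newly processed wafers.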
[0043] The control unit 230 can be implemented, for example, by using a CPU 101, MPU (Micro Processing Unit), GPU (Graphics Processing Unit), or the like, to execute programs stored in the internal memory device using RAM 103 as a working area. Alternatively, the control unit 230 can also be implemented using integrated circuits such as ASIC (Application Specific Integrated Circuit) or FPGA (Field Programmable Gate Array).
[0044] The control unit 230 includes an acquisition unit 231, a first calculation unit 232, a first generation unit 233, a segmentation unit 234, a second calculation unit 235, a second generation unit 236, and a prediction unit 237, and implements or executes the information processing functions described below. Furthermore, the internal structure of the control unit 230 is not limited to the structure shown in Figure 3 and can be any other structure, as long as it performs the information processing described later.
[0045] In the case of feature extraction processing, the acquisition unit 231 acquires each time series data set corresponding to each wafer from the substrate processing apparatus 10. Alternatively, the acquisition unit 231 can also acquire result data of substrate process treatments, such as inspection data, from the result data acquisition apparatus 20. Furthermore, in the case of prediction processing, the acquisition unit 231 acquires the time series data set corresponding to the new wafer to be predicted from the substrate processing apparatus 10. The acquisition unit 231 stores the acquired time series data sets in the time series data set storage unit 221 and stores the acquired result data in the result data storage unit 222.
[0046] The first calculation unit 232, referring to the time series data group storage unit 221, calculates the statistical values for each period of the periodic processing for each time series data contained in the time series data group. The statistical values can be, for example, values such as average, minimum, maximum, variance, and slope. The first calculation unit 232 outputs the set of calculated statistical values for each time series data to the first generation unit 233. Furthermore, in the feature extraction processing, the statistical values for each time series data are calculated in the same way for each of the time series data groups of the multiple wafers. In the following description, when a processing unit handles each time series data within the time series data groups of multiple wafers, or each time series data within the time series data group of a single wafer, one time series data is described as a representative and the description of the others is omitted. Here, the calculation of the statistical values is explained using Figure 6.
[0047] Figure 6 is a graph illustrating an example of the calculation of statistical values from time series data. For example, as shown in Figure 6, the first calculation unit 232 extracts a specific period 152 from the chart 150 of the time series data. For the extracted period 152, the first calculation unit 232 calculates, for example, values such as the maximum value 152a, the median value 152b, the average value 152c, and the minimum value 152d as statistical values.
[0048] Furthermore, the first calculation unit 232 can also remove data excluded from the calculation of statistical values for each period contained in the time series data. The excluded data can be, for example, data such as the switching portion of steps within the period. For instance, for period 152, the first calculation unit 232 can also extract only the second step interval 152-2 at the step switching time, excluding other data, and calculate the statistical value.
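The per-period calculation described above can be sketched as follows. This is a minimal illustration, assuming that every period contains the same number of samples and that the portion to keep is given as sample offsets within the period; the function name `cycle_statistics` and the parameters `cycle_length` and `keep` are hypothetical and do not appear in the disclosure.

```python
from statistics import mean, median

def cycle_statistics(series, cycle_length, keep=None):
    """Compute per-period statistics for one time series.

    series: list of samples; cycle_length: samples per period (assumed constant);
    keep: optional (start, stop) offsets inside each period to retain, for
    example one step of the period; samples outside it are excluded.
    """
    stats = []
    for start in range(0, len(series) - cycle_length + 1, cycle_length):
        cycle = series[start:start + cycle_length]
        if keep is not None:
            cycle = cycle[keep[0]:keep[1]]
        stats.append({
            "max": max(cycle), "median": median(cycle),
            "mean": mean(cycle), "min": min(cycle),
        })
    return stats

# Two periods of four samples each; keep only the middle two samples.
voltage = [0, 10, 12, 0, 0, 20, 22, 0]
per_cycle = cycle_statistics(voltage, cycle_length=4, keep=(1, 3))
mean_series = [s["mean"] for s in per_cycle]  # one value per period, in time order
```

Arranging one statistic per period in time order, as `mean_series` does, yields a much shorter series than the raw samples while keeping the overall trend.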
[0049] Returning to the explanation of Figure 3: when a set of statistical values for each time series data is input from the first calculation unit 232, the first generation unit 233 generates statistical data for each time series data based on the set of statistical values. For example, the first generation unit 233 generates statistical data corresponding to the time series data by arranging the statistical values in time-series order. The first generation unit 233 outputs the generated statistical data to the segmentation unit 234. Furthermore, the first generation unit 233 may also be integrated with the first calculation unit 232.
[0050] If statistical data is input from the first generation unit 233, the segmentation unit 234 divides the input statistical data into one or more intervals. Furthermore, to avoid a decrease in accuracy due to repeated statistical processing, the segmentation unit 234 may also refer to the time series data group storage unit 221 to divide the time series data contained in the time series data group into one or more intervals. In the case of feature extraction processing, the segmentation unit 234 divides the statistical data or time series data into intervals based on a predetermined interval segmentation method. Additionally, if the second generation unit 236 instructs a change in the interval segmentation method, the segmentation unit 234 may, for example, change the segmentation ratio, the number of segments, etc., to divide the statistical data or time series data into intervals. For example, the segmentation unit 234 may divide the statistical data or time series data into two intervals in the first half of the process, into one interval in the middle, and into two intervals in the second half. In this case, for example, interval I1 is set as the first cycle of the process, interval I2 is set as the second to tenth cycle from the beginning, and interval I3 is set as the eleventh cycle from the beginning to the eleventh cycle from the end. In other words, interval I3 encompasses most of the hundreds of cycles constituting the process. Interval I4 is defined as the interval from the 2nd cycle to the 10th cycle from the end, and interval I5 is defined as the last cycle. By adjusting the segmentation method within such intervals, the characteristics of the process can be accurately grasped. In predictive processing, the segmentation unit 234 segments statistical data or time series data into intervals based on the interval information stored in the storage unit 220. 
The segmentation unit 234 outputs the segmented statistical data or time series data to the second calculation unit 235.
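The example division into intervals I1 to I5 can be written down as index ranges. The sketch below assumes that the statistical data holds one value per cycle, uses 0-based indices with an exclusive stop, and requires at least 21 cycles so that no interval is empty; `five_intervals` and `split` are hypothetical helper names.

```python
def five_intervals(n_cycles):
    """Return (start, stop) cycle index ranges for I1..I5 (0-based, stop exclusive)."""
    if n_cycles < 21:
        raise ValueError("need at least 21 cycles for five non-empty intervals")
    return [
        (0, 1),                         # I1: first cycle
        (1, 10),                        # I2: 2nd to 10th cycle from the beginning
        (10, n_cycles - 10),            # I3: 11th from the beginning to 11th from the end
        (n_cycles - 10, n_cycles - 1),  # I4: 10th to 2nd cycle from the end
        (n_cycles - 1, n_cycles),       # I5: last cycle
    ]

def split(data, intervals):
    """Cut one per-cycle data series into the given intervals."""
    return [data[a:b] for a, b in intervals]

segments = split(list(range(300)), five_intervals(300))
```

For a 300-cycle process the five segments contain 1, 9, 280, 9, and 1 cycles, so I3 covers most of the process while the start-up and shut-down cycles are isolated.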
[0051] Furthermore, when the segmentation unit 234 uses an interval calculated in advance through Bayesian optimization as the information for segmenting statistical data or time series data, it calculates the interval in advance with reference to the time series data group storage unit 221 and the result data storage unit 222. Here, the definition of the interval based on Bayesian optimization is explained using Figure 7 and Figure 8.
[0052] Figure 7 is a graph representing an example of an interval defined through Bayesian optimization. For example, as shown in Figure 7, the time series data group 170 and the measurement data group 171 are associated with each other by wafer No. Specifically, time series data 170a, 170b, 170c, 170d, ... and measurement data 171a, 171b, 171c, 171d, ... are associated respectively. The time series data group 170 and the measurement data group 171 use data from multiple wafers, for example, from three to several dozen wafers. The segmentation unit 234 performs Bayesian optimization using time series data group 170 as the explanatory variable and measurement data group 171 as the objective variable.
[0053] The segmentation unit 234, for example, uses the range (interval) of the extracted period as a parameter to perform Bayesian optimization. The segmentation unit 234 calculates the representative value of the interval in the same way as the second calculation unit 235 described later. The segmentation unit 234, for example, uses the coefficient of determination R^2 to evaluate the relationship between the calculated representative value of the interval and the measured data. The coefficient of determination R^2 takes values from 0 to 1.
[0054] Figure 8 is a graph illustrating an example of the relationship between representative values of an interval and measured data. In the example of chart 173 shown in Figure 8, the coefficient of determination R^2 is 0.7955. In this case, the segmentation unit 234 continues the search while changing the parameter described above, for example, until R^2 reaches 0.8 or higher, or until a predetermined number of calculations or a predetermined time is reached. That is, the segmentation unit 234 sets the interval so as to reduce the model's prediction error. Furthermore, besides the coefficient of determination R^2, RMSE (Root Mean Square Error) or PLS can also be used to evaluate the relationship between the representative value of the interval and the measured data.
[0055] The segmentation unit 234 obtains, as the result of Bayesian optimization, for example the interval 172 shown in Figure 7 as the interval information for segmenting statistical data or time series data. The segmentation unit 234 stores the calculated interval 172 in the storage unit 220. In the example of interval 172, compared to the intervals I1 to I5 mentioned above, the number of prediction objects in the model is reduced from five to one. In other words, using Bayesian optimization shortens the search time. Furthermore, the segmentation unit 234 can also use other parameter search methods instead of Bayesian optimization to find the intervals for segmenting statistical data or time series data.
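The interval search can be illustrated with a deliberately simple stand-in. The sketch below does not implement Bayesian optimization; it uses an exhaustive search over contiguous cycle ranges, which is one of the "other parameter search methods" the text allows. It assumes one per-cycle statistical series per wafer, the interval mean as the representative value, and R^2 computed as the squared Pearson correlation of a one-variable least-squares fit. The names `r_squared`, `search_interval`, and `target` are hypothetical.

```python
from statistics import mean

def r_squared(xs, ys):
    """Coefficient of determination of a one-variable least-squares fit."""
    mx, my = mean(xs), mean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        return 0.0  # a constant series explains nothing
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    return sxy * sxy / (sxx * syy)

def search_interval(wafers, measurements, target=0.8):
    """Try contiguous cycle ranges; stop as soon as R^2 reaches the target.

    wafers: one per-cycle series per wafer (equal lengths);
    measurements: one measured result per wafer.
    Returns (best R^2 found, (start, stop)) with stop exclusive.
    """
    n = len(wafers[0])
    best = (0.0, (0, n))
    for start in range(n):
        for stop in range(start + 1, n + 1):
            reps = [mean(w[start:stop]) for w in wafers]
            score = r_squared(reps, measurements)
            if score >= target:
                return score, (start, stop)
            if score > best[0]:
                best = (score, (start, stop))
    return best
```

Exhaustive search evaluates on the order of n^2 candidate ranges for n cycles, which is exactly the cost that a guided search such as Bayesian optimization is meant to avoid for processes with hundreds of cycles.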
[0056] Returning to the explanation of Figure 3: when segmented statistical data or time series data is input from the segmentation unit 234, the second calculation unit 235 calculates a representative value (summary) for each interval based on the segmented statistical data or time series data. The second calculation unit 235 calculates values such as the average, minimum, maximum, variance, and slope as the representative value for each interval. For example, when the statistical data or time series data is segmented into the aforementioned intervals I1 to I5, the second calculation unit 235 calculates the average value for each interval I1 to I5 as the representative value. The second calculation unit 235 outputs the calculated representative value for each interval to the second generation unit 236 in the feature extraction process and to the prediction unit 237 in the prediction process.
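The representative values listed above can be computed per interval as in the following sketch. Two choices here are assumptions rather than statements of the disclosure: the variance is taken as the population variance, and the slope is taken as the least-squares slope of the values against their position within the interval. The names `slope` and `representative_values` are hypothetical.

```python
from statistics import mean, pvariance

def slope(values):
    """Least-squares slope of values against their index (0.0 for a single point)."""
    n = len(values)
    if n < 2:
        return 0.0
    x_mean = (n - 1) / 2
    y_mean = mean(values)
    num = sum((i - x_mean) * (v - y_mean) for i, v in enumerate(values))
    den = sum((i - x_mean) ** 2 for i in range(n))
    return num / den

def representative_values(segment):
    """Summarize one interval of statistical data or time series data."""
    return {
        "mean": mean(segment), "min": min(segment), "max": max(segment),
        "variance": pvariance(segment), "slope": slope(segment),
    }
```

Applying `representative_values` to each of the five segments of one wafer gives a short feature vector that replaces hundreds of per-cycle values.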
[0057] Here, Figure 9 is used to explain how the representative values of the intervals of statistical data are determined from time series data. Figure 9 is a graph illustrating an example of how representative values for a range of statistical data are derived from time series data. For example, as shown in Figure 9, the information processing device 100 calculates statistical data 190 from the statistical values of each period of the chart 150 of the time series data. Next, the information processing device 100 calculates representative values 191 to 195, for example, for the intervals 181 to 185 corresponding to the intervals I1 to I5 mentioned above. In the example of Figure 9, within statistical data 190, the representative value 191 is lower than the other representative values 192 to 195. This indicates that the voltage of the high-frequency power supply in the first cycle of the process (i.e., the interval 181 of representative value 191) is low, resulting in poor plasma rise. In other words, the wafer with statistical data 190 was processed with an abnormal plasma rise and thus became a defective product, and an anomaly is therefore detected.
[0058] Returning to the explanation of Figure 3, in the feature extraction processing the second generation unit 236 receives the representative value of each interval from the second calculation unit 235. The second generation unit 236 generates a model by performing multivariate analysis on the representative values of each interval, which are based on statistical data or time series data. The model is, for example, a prediction function f(x). The prediction function f(x) may be, for example, a function using Mahalanobis distance, PLS regression, etc. When result data is used, the second generation unit 236 refers to the result data storage unit 222 and generates the model by performing multivariate analysis on both the representative values of each interval and the result data. The second generation unit 236 inputs the representative values of each interval, which are the feature quantities, as x into the generated model, i.e., the prediction function f(x), and calculates y = f(x), where y is the prediction result. The second generation unit 236 uses an evaluation function such as RMSE to determine whether the prediction accuracy of the prediction result is at or above a threshold. If the second generation unit 236 determines that the prediction accuracy is below the threshold, it instructs the segmentation unit 234 to change the method of segmenting the intervals. If the second generation unit 236 determines that the prediction accuracy is at or above the threshold, it stores the interval information and the model in the storage unit 220.
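The generate, evaluate, and re-segment loop can be sketched as follows. The sketch makes several simplifying assumptions: a one-variable ordinary least-squares fit stands in for the multivariate analysis (the text names PLS regression and Mahalanobis distance as options), each candidate segmentation contributes a single representative value per wafer, and "accuracy at or above the threshold" is interpreted as RMSE at or below a tolerance. The names `fit_linear`, `rmse`, `generate_model`, and `max_rmse` are hypothetical.

```python
from statistics import mean

def fit_linear(xs, ys):
    """Fit y = a*x + b by ordinary least squares and return f(x)."""
    mx, my = mean(xs), mean(ys)
    a = (sum((x - mx) * (y - my) for x, y in zip(xs, ys))
         / sum((x - mx) ** 2 for x in xs))  # xs must not all be equal
    b = my - a * mx
    return lambda x: a * x + b

def rmse(predicted, actual):
    """Root mean square error between predictions and measured results."""
    return (sum((p - t) ** 2 for p, t in zip(predicted, actual)) / len(actual)) ** 0.5

def generate_model(candidates, results, max_rmse):
    """Try candidate segmentations in order until one predicts well enough.

    candidates: mapping from a segmentation label to one representative
    value per wafer; results: one measured result per wafer.
    """
    for label, xs in candidates.items():
        model = fit_linear(xs, results)
        error = rmse([model(x) for x in xs], results)
        if error <= max_rmse:
            return label, model, error  # keep this interval information and model
    return None                         # every candidate segmentation was rejected

candidates = {
    "whole process": [3.0, 1.0, 4.0, 2.0],  # unrelated to the results
    "first cycle": [1.0, 2.0, 3.0, 4.0],    # tracks the results exactly
}
chosen = generate_model(candidates, [10.0, 20.0, 30.0, 40.0], max_rmse=1.0)
```

In this toy data the first candidate is rejected because its RMSE is about 11.2, and the second is accepted, which mirrors the instruction to the segmentation unit to change the segmentation method when accuracy is insufficient.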
[0059] In the prediction process, the prediction unit 237 receives the representative value of each interval from the second calculation unit 235. The prediction unit 237 inputs the representative values of each interval, which are the feature quantities, as x into the model stored in the storage unit 220 during feature extraction, i.e., the prediction function f(x), to obtain the prediction result y = f(x). The prediction unit 237 determines whether the prediction result is above or below a threshold. If the prediction unit 237 determines that the prediction result is above or below the threshold, it outputs the prediction result and executes a preset action, such as changing a setting value of the recipe in the substrate processing apparatus 10, notifying the substrate processing apparatus 10 of an alarm, or sending an email to the operator. If the prediction unit 237 determines that the prediction result is not above or below the threshold, it outputs the prediction result and does not execute the preset action.
[0060] Depending on the model used, the prediction results include information such as process anomaly detection information, prediction information related to process results, prediction information on the maintenance period of the substrate processing apparatus 10, correction information for the setpoints of the substrate processing apparatus 10, and correction information for the process setpoints. Additionally, information classifying process anomalies can also be output as a prediction result. Furthermore, the prediction results can be used for various purposes, such as storing them in the storage unit 220 for statistical processing or other processing, or sending them to the substrate processing apparatus 10 for setpoint correction.
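The threshold check in the prediction process can be sketched in a few lines. The sketch assumes an acceptance band with a lower and an upper limit and a caller-supplied callback for the preset action; `predict_and_act`, `lower`, `upper`, and `action` are hypothetical names, and the lambda model is a placeholder for the stored prediction function f(x).

```python
def predict_and_act(model, features, lower, upper, action):
    """Compute y = f(x); run the preset action only when y leaves the band."""
    y = model(features)
    triggered = y < lower or y > upper
    if triggered:
        action(y)          # e.g. raise an alarm or adjust a setting
    return y, triggered    # the prediction result is output in both cases

alarms = []
result = predict_and_act(lambda x: 2 * x, 10, lower=0, upper=15, action=alarms.append)
```

Passing the action as a callback keeps the decision logic independent of whether the response is an alarm, an email, or a setting change.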
[0061] Here, an example of a prediction result is explained using Figure 10. Figure 10 is a graph illustrating an example of a comparison between the detection of process anomalies and previous data. For example, as shown in Figure 10, when process anomaly detection information is used as the prediction result, compared with summary 196, which is generated from the time series data as a whole in the existing manner, the second and seventh wafers deviate from the overall result in summary 197, the prediction result generated in this embodiment, so anomalies can be detected.
[0062] [Feature Extraction Methods]
[0063] Next, the operation of the information processing apparatus 100 of this embodiment will be described. First, the feature extraction process is explained using Figure 11. Figure 11 is a flowchart illustrating an example of the feature extraction process in this embodiment. In the explanation of Figure 11, the case of dividing statistical data into intervals is used as an example.
[0064] The acquisition unit 231 of the information processing apparatus 100 acquires time-series data sets corresponding to each wafer from the substrate processing apparatus 10 (step S1). Furthermore, when using result data, the acquisition unit 231 acquires result data for each wafer from the result data acquisition apparatus 20. The acquisition unit 231 stores the acquired time-series data sets in the time-series data set storage unit 221 and stores the acquired result data in the result data storage unit 222.
[0065] The first calculation unit 232 refers to the time series data group storage unit 221 and calculates the statistical values of each period for each time series data contained in the time series data group (step S2). The first calculation unit 232 outputs the set of calculated statistical values of each time series data to the first generation unit 233.
[0066] If a set of statistical values for each time series data is input from the first calculation unit 232, the first generation unit 233 generates statistical data for each time series data based on the set of statistical values (step S3). The first generation unit 233 outputs the generated statistical data to the segmentation unit 234.
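Steps S2 and S3 can be illustrated with a minimal Python sketch. It assumes that each time series is a plain list of samples and that every period of the periodic processing has the same number of samples; the source does not state how periods are delimited, so `cycle_length`, `per_cycle_statistics`, and the sample trace are hypothetical names and data used only for illustration.

```python
from statistics import fmean, pvariance


def slope(values):
    """Least-squares slope of the samples against their index 0..n-1."""
    n = len(values)
    if n < 2:
        return 0.0
    x_mean = (n - 1) / 2
    y_mean = fmean(values)
    num = sum((i - x_mean) * (v - y_mean) for i, v in enumerate(values))
    den = sum((i - x_mean) ** 2 for i in range(n))
    return num / den


# The five statistical values named in the description.
STATISTICS = {
    "mean": fmean,
    "min": min,
    "max": max,
    "variance": pvariance,
    "slope": slope,
}


def per_cycle_statistics(series, cycle_length, statistic="mean"):
    """Step S2/S3: one statistical value per period, collected in order.

    Assumption: periods are consecutive, equal-length slices of the series.
    """
    func = STATISTICS[statistic]
    n_cycles = len(series) // cycle_length
    return [
        func(series[c * cycle_length:(c + 1) * cycle_length])
        for c in range(n_cycles)
    ]


# Hypothetical trace: 3 periods of 4 samples each.
trace = [1.0, 2.0, 3.0, 2.0, 1.0, 3.0, 5.0, 3.0, 2.0, 2.0, 2.0, 2.0]
stat_mean = per_cycle_statistics(trace, cycle_length=4, statistic="mean")
stat_max = per_cycle_statistics(trace, cycle_length=4, statistic="max")
```

The returned list plays the role of the statistical data handed to the segmentation unit 234: one value per period, in processing order.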
[0067] If statistical data is input from the first generation unit 233, the segmentation unit 234 divides the input statistical data into one or more intervals (step S4). The segmentation unit 234 outputs the segmented statistical data to the second calculation unit 235.
[0068] If segmented statistical data is input from the segmentation unit 234, the second calculation unit 235 calculates the representative value of each interval based on the segmented statistical data (step S5). The second calculation unit 235 outputs the calculated representative value of each interval to the second generation unit 236.
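Steps S4 and S5 can be sketched as follows. The sketch assumes that intervals are described by a sorted list of cut indices into the statistical data and that the representative value is the mean; `split_into_intervals`, `representative_values`, and the sample data are hypothetical and not taken from the source.

```python
from statistics import fmean


def split_into_intervals(stat_data, boundaries):
    """Step S4: cut the statistical data at the given indices.

    boundaries=[2, 8] yields the slices [0:2], [2:8], and [8:].
    """
    edges = [0, *boundaries, len(stat_data)]
    return [stat_data[a:b] for a, b in zip(edges, edges[1:])]


def representative_values(stat_data, boundaries, representative=fmean):
    """Step S5: one representative value (feature quantity) per interval."""
    return [
        representative(segment)
        for segment in split_into_intervals(stat_data, boundaries)
    ]


# Hypothetical statistical data for one wafer: 10 periods.
stat_data = [5.0, 4.0, 3.0, 3.0, 3.0, 3.0, 3.0, 3.0, 2.0, 1.0]
segments = split_into_intervals(stat_data, boundaries=[2, 8])
features = representative_values(stat_data, boundaries=[2, 8])
```

With these boundaries the early and late periods, where this hypothetical signal changes, each get their own feature instead of being averaged away with the stable middle section.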
[0069] If a representative value for each interval is input from the second calculation unit 235, the second generation unit 236 performs multivariate analysis based on the representative value for each interval to generate a model (prediction function f(x)) (step S6). Furthermore, when using result data, the second generation unit 236 refers to the result data storage unit 222 and performs multivariate analysis based on the representative value for each interval and the result data to generate a model. The second generation unit 236 inputs the representative value (feature quantity x) for each interval into the generated model (f(x)) to obtain the prediction result. That is, it calculates y = f(x) (step S7).
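As one possible instance of the multivariate analysis in steps S6 and S7, the sketch below fits a linear prediction function by ordinary least squares. The choice of ordinary least squares is an assumption (this passage does not name a specific method), and `fit_linear_model`, `solve_linear_system`, and the wafer data are hypothetical.

```python
def solve_linear_system(a, b):
    """Solve a * w = b by Gaussian elimination with partial pivoting.

    Raises ZeroDivisionError if the system is singular.
    """
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    w = [0.0] * n
    for r in reversed(range(n)):
        tail = sum(m[r][c] * w[c] for c in range(r + 1, n))
        w[r] = (m[r][n] - tail) / m[r][r]
    return w


def fit_linear_model(features, results):
    """Step S6: fit y = w0 + w1*x1 + ... via the normal equations."""
    rows = [[1.0, *x] for x in features]  # prepend an intercept column
    k = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * y for r, y in zip(rows, results)) for i in range(k)]
    w = solve_linear_system(xtx, xty)
    return lambda x: w[0] + sum(wi * xi for wi, xi in zip(w[1:], x))


# Hypothetical data: two representative values per wafer and one result
# value per wafer, generated from y = 1 + 2*x1 - x2.
features = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [2.0, 1.0]]
results = [3.0, 0.0, 2.0, 4.0]

f = fit_linear_model(features, results)  # step S6
y = f([3.0, 2.0])                        # step S7: y = f(x)
```

The returned callable stands in for the prediction function f(x) that is later stored together with the interval information.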
[0070] The second generation unit 236 uses an evaluation function such as RMSE to determine whether the prediction accuracy is above a threshold (step S8). If the second generation unit 236 determines that the prediction accuracy is not above the threshold (step S8: No), it returns to step S4 and restarts from the interval segmentation. If the second generation unit 236 determines that the prediction accuracy is above the threshold (step S8: Yes), it stores the interval information and the model in the storage unit 220 and ends the feature extraction process. Therefore, the information processing apparatus 100 can improve the accuracy of feature extraction from time series data sets measured during repeated processing.
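The loop from step S4 to step S8 can be sketched as below. Two assumptions are made: because a smaller RMSE means a better fit, "prediction accuracy above the threshold" is interpreted here as the RMSE being at or below a limit, and the candidate intervals are simply tried in order rather than proposed by Bayesian optimization. The single-feature model and all names (`search_interval`, `rmse_limit`, the wafer data) are hypothetical.

```python
from math import sqrt
from statistics import fmean


def rmse(predicted, actual):
    """Root mean squared error between predictions and measured results."""
    return sqrt(fmean([(p - a) ** 2 for p, a in zip(predicted, actual)]))


def fit_simple_regression(xs, ys):
    """Closed-form least-squares line y = a + b*x for a single feature."""
    x_mean, y_mean = fmean(xs), fmean(ys)
    sxx = sum((x - x_mean) ** 2 for x in xs)
    sxy = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
    b = sxy / sxx if sxx else 0.0
    a = y_mean - b * x_mean
    return lambda x: a + b * x


def search_interval(stat_data_per_wafer, results, candidates, rmse_limit):
    """Try candidate (start, end) intervals until one meets the RMSE limit.

    Returns (error, interval, model) for the accepted candidate, or for
    the best candidate seen if none meets the limit.
    """
    best = None
    for start, end in candidates:
        xs = [fmean(s[start:end]) for s in stat_data_per_wafer]  # S4, S5
        model = fit_simple_regression(xs, results)               # S6
        error = rmse([model(x) for x in xs], results)            # S7, S8
        if best is None or error < best[0]:
            best = (error, (start, end), model)
        if error <= rmse_limit:  # S8: Yes, keep interval and model
            break
    return best


# Hypothetical statistical data (6 periods per wafer); the result value
# is driven by the last two periods only.
wafers = [
    [1.0, 5.0, 2.0, 2.0, 1.0, 1.0],
    [4.0, 1.0, 3.0, 1.0, 2.0, 2.0],
    [2.0, 2.0, 5.0, 4.0, 3.0, 3.0],
    [3.0, 3.0, 1.0, 5.0, 4.0, 4.0],
]
results = [2.0, 4.0, 6.0, 8.0]

error, interval, model = search_interval(
    wafers, results, candidates=[(0, 2), (2, 4), (4, 6)], rmse_limit=0.1
)
```

Here the first two candidates fail the limit and the search returns to segmentation, mirroring the "step S8: No" branch; the third candidate is accepted.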
[0071] [Prediction Method]
[0072] Next, the prediction process is explained using Figure 12. Figure 12 is a flowchart illustrating an example of the prediction process in this embodiment. In Figure 12, the case of dividing statistical data into intervals is used as an example.
[0073] The acquisition unit 231 of the information processing apparatus 100 acquires a time-series data group corresponding to the wafer from the substrate processing apparatus 10 (step S11). The acquisition unit 231 stores the acquired time-series data group in the time-series data group storage unit 221.
[0074] Referring to the time series data group storage unit 221, the first calculation unit 232 calculates the statistical value of each period of the periodic processing for each time series data contained in the time series data group using the same method as when extracting the feature quantity (step S12). The first calculation unit 232 outputs the set of calculated statistical values of each time series data to the first generation unit 233.
[0075] If a set of statistical values for each time series data is input from the first calculation unit 232, the first generation unit 233 generates statistical data for each time series data based on the set of statistical values (step S13). The first generation unit 233 outputs the generated statistical data to the segmentation unit 234.
[0076] The segmentation unit 234 segments the input statistical data into the same intervals as when extracting the feature quantities (step S14). The segmentation unit 234 outputs the segmented statistical data to the second calculation unit 235.
[0077] If segmented statistical data is input from the segmentation unit 234, the second calculation unit 235 calculates the representative value of each interval based on the segmented statistical data using the same method as when extracting the feature quantity (step S15). The second calculation unit 235 outputs the calculated representative value (feature quantity x) of each interval to the prediction unit 237.
[0078] If a representative value for each interval is input from the second calculation unit 235, the prediction unit 237 inputs the representative value for each interval into the model used in feature extraction to obtain the prediction result (step S16). That is, the feature quantity x is substituted into y = f(x). The prediction unit 237 determines whether the prediction result is above a threshold (step S17). If the prediction unit 237 determines that the prediction result is above the threshold (step S17: Yes), it executes a preset action (step S18) and ends the prediction process. Examples of the preset action include changing a setting value of the recipe in the substrate processing apparatus 10, notifying the substrate processing apparatus 10 of an alarm, and sending an email to the operator. On the other hand, if the prediction unit 237 determines that the prediction result is not above the threshold (step S17: No), it ends the prediction process without executing the preset action. As a result, the information processing apparatus 100 can improve the accuracy of feature extraction from time series data groups measured during repeated processing, and can use the prediction results for anomaly detection, prediction, and the like.
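Steps S16 to S18 can be sketched as follows. The model, threshold, and action shown are hypothetical placeholders; in the embodiment the model would be the one stored during feature extraction, and the action would be a recipe change, an alarm, or an email.

```python
def predict_and_act(features, model, threshold, actions):
    """Evaluate y = f(x) and run the preset actions only if y > threshold."""
    y = model(features)          # step S16
    triggered = y > threshold    # step S17
    if triggered:
        for action in actions:   # step S18
            action(y)
    return y, triggered


# Hypothetical stored model and preset action, for illustration only.
model = lambda x: 0.5 * x[0] + 2.0 * x[1] - 1.0 * x[2]
log = []
actions = [lambda y: log.append(f"alarm: y={y:.2f}")]

y_ok, hit_ok = predict_and_act([4.5, 3.0, 1.5], model, 8.0, actions)
y_bad, hit_bad = predict_and_act([9.0, 3.0, 1.5], model, 8.0, actions)
```

Passing the actions as a list of callables keeps the threshold check separate from what happens on a hit, so the same check can drive any of the preset actions.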
[0079] According to this embodiment, the information processing apparatus 100 acquires a set of time-series data measured during the periodic processing of a substrate. Furthermore, the information processing apparatus 100 calculates statistical values for each period of the periodic processing for each time-series data contained in the acquired time-series data set. Additionally, the information processing apparatus 100 generates statistical data based on the calculated statistical values. Furthermore, the information processing apparatus 100 divides the generated statistical data or time-series data into predetermined intervals. Furthermore, the information processing apparatus 100 calculates a representative value for each interval based on the divided statistical data or time-series data. The calculated representative value represents a characteristic of the process during the periodic processing of the substrate. As a result, the accuracy of feature extraction from the time-series data set measured during repeated processing can be improved.
[0080] Furthermore, according to this embodiment, the information processing apparatus 100 also acquires result data related to the results of the substrate processing. Additionally, the information processing apparatus 100 generates a model based on the calculated representative values for each interval and the result data. As a result, a model can be generated that improves the accuracy of feature extraction from time-series data sets measured during repeated processing.
[0081] Furthermore, according to this embodiment, the information processing device 100 sets a predetermined interval to reduce the prediction error of the model. As a result, the accuracy of feature extraction from the time series data set can be further improved.
[0082] Furthermore, according to this embodiment, the information processing device 100 divides statistical data or time series data into at least three intervals: an early part, a middle part, and a late part. As a result, it is possible to extract feature values from the beginning and the end of the time series data with high precision.
[0083] Furthermore, according to this embodiment, the information processing device 100 obtains the interval through Bayesian optimization. As a result, the interval can be obtained without relying on previous insights.
[0084] Furthermore, according to this embodiment, the information processing apparatus 100 uses at least one of multivariate analysis or a neural network. As a result, the accuracy of feature extraction from time series data sets can be further improved.
[0085] Furthermore, according to this embodiment, the statistical value is any one of the average, minimum, maximum, variance, and slope for each period. As a result, since feature quantities can be extracted based on the characteristics of time series data, accuracy can be further improved.
[0086] Furthermore, according to this embodiment, the representative value is any one of the average, minimum, maximum, variance, and slope within a specified interval. As a result, since feature quantities can be extracted based on the characteristics of time series data, accuracy can be further improved.
[0087] Furthermore, according to this embodiment, the information processing apparatus 100 acquires a set of time-series data measured during the periodic processing of a new substrate. The information processing apparatus 100 also calculates statistical values for each period of the periodic processing for each time-series data contained in the acquired time-series data set. Additionally, the information processing apparatus 100 generates statistical data based on the calculated statistical values. Furthermore, the information processing apparatus 100 divides the generated statistical data or time-series data into predetermined intervals. Furthermore, the information processing apparatus 100 calculates a representative value for each interval based on the divided statistical data or time-series data. Finally, the information processing apparatus 100 inputs the calculated representative value for each interval into a model and outputs a prediction result. As a result, predictions can be made with higher accuracy.
[0088] Furthermore, according to this embodiment, the prediction result is one or more of the following: process anomaly detection information, prediction information related to process results, prediction information of the maintenance period of the substrate processing apparatus 10, correction information of the setting values of the substrate processing apparatus 10, and correction information of the process setting values. As a result, anomalies in the process can be detected. Furthermore, wafer processing plans can be easily established. Furthermore, the maintenance period of the substrate processing apparatus 10 can be easily determined. Furthermore, the setting values of the substrate processing apparatus 10 and the process can be corrected.
[0089] The embodiments disclosed herein should be considered illustrative in all respects and not restrictive. The above embodiments may also be omitted, substituted, or modified in various forms without departing from the appended claims and their spirit.
[0090] Furthermore, in the above embodiments, the voltage of the high-frequency power supply of the substrate processing apparatus 10 was listed as an example of time-series data, but it is not limited to this. For example, information related to the completion status of the wafer, such as the flow rate of the processing gas and the pressure inside the chamber, may also be set as time-series data.
[0091] Furthermore, while multivariate analysis is used to generate the model in the above embodiments, this is not a limitation. For example, in the case of anomaly detection, sets consisting of multiple statistical data, measurement data, and information indicating abnormal or normal can be used as training data. A trained model can be generated using machine learning such as a CNN (Convolutional Neural Network), and the generated trained model can be used as the model for detecting anomalies. Anomaly detection based on a trend chart focusing on a single measurement data item can also be combined.
[0092] Furthermore, in the above embodiments, the prescribed intervals for segmenting statistical data were described both for the case where they are defined in advance and for the case where they are obtained through Bayesian optimization, but this is not a limitation. For example, combinations of statistical data and intervals obtained through Bayesian optimization in various processes can be used as training data, and a trained model can be generated using machine learning methods such as a CNN. The generated trained model can then be used to determine the prescribed intervals in the statistical data for a new process.
[0093] Furthermore, in the above embodiments, data processing such as feature extraction processing and prediction processing is performed in the information processing apparatus 100 that acquires time-series data from the substrate processing apparatus 10, but it is not limited to this. For example, the above-mentioned feature extraction processing and prediction processing and other data processing can also be performed in the control unit of the substrate processing apparatus 10.
[0094] Furthermore, in the above embodiment, a semiconductor wafer was described as an example of the substrate to be processed in the substrate processing apparatus 10, but it is not limited to this. For example, time-series data may also be obtained from the substrate processing apparatus that processes substrates such as FPDs (Flat Panel Displays).
[0095] Furthermore, all or any part of the various processing functions performed in each device may be executed on a CPU (or a microcomputer such as an MPU or MCU). Alternatively, all or any part of these processing functions may be implemented as a program that is analyzed and executed by a CPU (or a microcomputer such as an MPU or MCU), or as hardware based on wired logic.
[0096] Explanation of reference numerals in the attached figures
[0097] 1…Information processing system; 10…Substrate processing apparatus; 20…Result data acquisition apparatus; 100…Information processing apparatus; 220…Storage unit; 221…Time series data group storage unit; 222…Result data storage unit; 230…Control unit; 231…Acquisition unit; 232…First calculation unit; 233…First generation unit; 234…Segmentation unit; 235…Second calculation unit; 236…Second generation unit; 237…Prediction unit.
Claims
1. An information processing method, comprising:
acquiring a time-series data set measured during the periodic processing of a substrate;
calculating, for each time series data contained in the time-series data set, a statistical value for each period of the periodic processing;
generating statistical data based on the calculated statistical values;
dividing the generated statistical data into specified intervals; and
calculating a representative value for each of the intervals based on the segmented statistical data,
wherein the acquiring also acquires result data related to the results of the process on the substrate, and
the method further comprises generating a model based on the calculated representative value of each of the intervals and the result data.
2. The information processing method according to claim 1, wherein the dividing sets the specified intervals so as to reduce the prediction error of the model.
3. The information processing method according to claim 1 or 2, wherein the dividing divides the statistical data into at least an early part, a middle part, and a late part.
4. The information processing method according to any one of claims 1 to 3, wherein the dividing uses Bayesian optimization to obtain the intervals.
5. The information processing method according to claim 1, wherein the generating of the model uses at least one of multivariate analysis or a neural network.
6. The information processing method according to any one of claims 1 to 5, wherein the statistical value is any one of the average, minimum, maximum, variance, and slope for each of the periods.
7. The information processing method according to any one of claims 1 to 6, wherein the representative value is any one of the average, minimum, maximum, variance, and slope within the specified interval.
8. An information processing method, comprising:
acquiring a time-series data set measured during the periodic processing of a new substrate;
calculating, for each time series data contained in the time-series data set, a statistical value for each period of the periodic processing;
generating statistical data based on the calculated statistical values;
dividing the generated statistical data or the time series data into specified intervals;
calculating a representative value for each of the intervals based on the segmented statistical data or time series data; and
inputting the calculated representative value of each of the intervals into a model and outputting a prediction result.
9. The information processing method according to claim 8, wherein the prediction result is one or more of the following: anomaly detection information of the process, prediction information related to the results of the process, prediction information of the maintenance period of the substrate processing device, correction information of the setting values of the substrate processing device, and correction information of the setting values of the process.
10. An information processing apparatus, comprising:
an acquisition unit that acquires a time-series data set measured during the periodic processing of a substrate;
a first calculation unit that calculates, for each time series data contained in the acquired time-series data set, a statistical value for each period of the periodic processing;
a first generation unit that generates statistical data based on the calculated statistical values;
a segmentation unit that divides the generated statistical data into specified intervals; and
a second calculation unit that calculates a representative value for each of the intervals based on the segmented statistical data,
wherein the acquisition unit also acquires result data related to the results of the process on the substrate, and
the information processing apparatus further comprises a second generation unit that generates a model based on the calculated representative value of each of the intervals and the result data.
11. The information processing apparatus according to claim 10, wherein the segmentation unit sets the specified intervals so as to reduce the prediction error of the model.
12. The information processing apparatus according to claim 10 or 11, wherein the segmentation unit divides the statistical data into at least an early part, a middle part, and a late part.
13. An information processing apparatus, comprising:
an acquisition unit that acquires a time-series data set measured during the periodic processing of a new substrate to be processed;
a first calculation unit that calculates, for each time series data contained in the acquired time-series data set, a statistical value for each period of the periodic processing;
a generation unit that generates statistical data based on the calculated statistical values;
a segmentation unit that divides the generated statistical data or the time series data into specified intervals;
a second calculation unit that calculates a representative value for each of the intervals based on the segmented statistical data or time series data; and
a prediction unit that inputs the calculated representative value of each of the intervals into a model and outputs a prediction result.
14. The information processing apparatus according to claim 13, wherein the prediction result is one or more of the following: anomaly detection information of the process, prediction information related to the results of the process, prediction information of the maintenance period of the substrate processing device, correction information of the setting values of the substrate processing device, and correction information of the setting values of the process.